Introducing a Hierarchical Decision-Making Method into Reinforcement Learning Agents: The Pursuit Problem as an Example (強化学習エージェントへの階層化意志決定法の導入―追跡問題を例に―)

Bibliographic Information

Alternate Title
  • Reinforcement Learning Agents with Analytic Hierarchy Process: A Case Study of Pursuit Problem


Abstract

Reinforcement Learning (RL) is a promising technique for creating agents that can be applied to real-world problems. The most important features of RL are trial-and-error search and delayed reward, so agents act randomly in the early stages of learning. Such random actions, however, are impractical for real-world problems.

This paper presents a novel model of RL agents. The key feature of our learning agent model is the integration of the Analytic Hierarchy Process (AHP) into the standard RL agent model, which consists of three modules: state recognition, learning, and action selection. In our model, the AHP module is designed around the "primary knowledge" that humans intrinsically draw on in the process of reaching a goal state. This integration aims to increase the proportion of promising actions, replacing the completely random actions of standard RL algorithms.

Profit Sharing (PS) is adopted as the RL method for our model, since PS is known to be useful even in multi-agent environments. To evaluate our approach in a multi-agent environment, we test a PS-based RL method with our agent model on a pursuit problem in a grid world. Computational results show that our approach outperforms standard PS in terms of learning speed in the earlier stages of learning. We also show that the learning performance of our approach is at least competitive with, if not superior to, that of the standard method in the final stages of learning.
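The abstract describes the approach only at a high level. The sketch below is a minimal, hypothetical illustration (not the authors' code) of how an AHP-style prior might bias action selection in a Profit Sharing learner for a grid-world pursuit task. The class and method names, the Manhattan-distance "primary knowledge", the geometric reinforcement function, and the additive combination of AHP priorities with learned weights are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical sketch: a Profit Sharing (PS) learner whose action selection
# is biased by AHP-style priority weights derived from simple "primary
# knowledge" (here: prefer moves that shorten the distance to the prey).
# Rewards are assumed nonnegative so the roulette selection stays valid.

ACTIONS = ["up", "down", "left", "right"]
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

class AHPProfitSharingAgent:
    def __init__(self, learning_rate=0.1, decay=0.5, ahp_weight=1.0):
        self.weights = defaultdict(float)  # PS weights w(s, a)
        self.episode = []                  # visited (state, action) pairs
        self.decay = decay                 # geometric reinforcement function
        self.lr = learning_rate
        self.ahp_weight = ahp_weight       # strength of the AHP prior

    def ahp_priorities(self, hunter_pos, prey_pos):
        """Toy stand-in for the AHP module: score each action by how close
        it brings the hunter to the prey, then normalize to sum to 1."""
        scores = {}
        for a, (dx, dy) in MOVES.items():
            nx, ny = hunter_pos[0] + dx, hunter_pos[1] + dy
            dist = abs(nx - prey_pos[0]) + abs(ny - prey_pos[1])
            scores[a] = 1.0 / (1.0 + dist)
        total = sum(scores.values())
        return {a: s / total for a, s in scores.items()}

    def select_action(self, state, hunter_pos, prey_pos):
        """Roulette selection over PS weights combined with AHP priorities,
        so early (untrained) behaviour already favours promising moves."""
        prior = self.ahp_priorities(hunter_pos, prey_pos)
        combined = {a: self.weights[(state, a)] + self.ahp_weight * prior[a]
                    for a in ACTIONS}
        total = sum(combined.values())
        r, acc = random.uniform(0, total), 0.0
        for a in ACTIONS:
            acc += combined[a]
            if r <= acc:
                self.episode.append((state, a))
                return a
        self.episode.append((state, ACTIONS[-1]))
        return ACTIONS[-1]

    def end_episode(self, reward):
        """Profit Sharing credit assignment: distribute the terminal reward
        backwards over the episode with a geometrically decaying share."""
        share = reward
        for state, action in reversed(self.episode):
            self.weights[(state, action)] += self.lr * share
            share *= self.decay
        self.episode.clear()
```

Under these assumptions, the AHP prior dominates selection while the learned weights are still near zero, and the Profit Sharing weights gradually take over as rewarded episodes accumulate, which mirrors the stated aim of replacing purely random early exploration with promising actions.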


Cited By (6)


References (25)

