Reinforcement Learning by GA Using Importance Sampling (重点サンプリングを用いた GA による強化学習)

Kimura Hajime, Tsuchiya Chikao

Transactions of the Japanese Society for Artificial Intelligence 20:1-10 (2005)

Abstract

Reinforcement learning (RL) addresses policy search problems: finding a mapping from the state space to the action space. However, RL is based on gradient methods and therefore cannot deal with problems that have a multimodal landscape. Genetic algorithms (GAs), by contrast, are promising for such problems but appear unsuitable for policy search because of the cost of evaluation. Minimal Generation Gap (MGG), a generation-alternation model used in GAs, generates many offspring from two or more parents selected from a population. Evaluating the policies of the generated offspring therefore requires much trial and error (i.e., interaction between an agent and an environment). In this paper, we incorporate importance sampling into the MGG framework to reduce the cost of evaluation in policy search. The proposed techniques are applied to a Markov decision process (MDP) with a multimodal landscape. The experimental results show that these techniques reduce the number of interactions between an agent and an environment, and indicate that MGG and importance sampling work well together.
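The abstract does not spell out how importance sampling lowers the evaluation cost, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a toy two-state, two-action MDP, tabular stochastic policies, and hypothetical function names (`sample_episode`, `is_estimate`). Episodes are collected once with a "parent" policy, and the expected return of a "child" policy is then estimated by reweighting those same episodes, without any further interaction with the environment.

```python
import random

def sample_episode(policy, rng, horizon=3):
    """Roll out one episode in a toy 2-state, 2-action MDP (an assumption)."""
    state, steps, ret = 0, [], 0.0
    for _ in range(horizon):
        # policy[state][a] is the probability of taking action a in this state.
        action = 0 if rng.random() < policy[state][0] else 1
        steps.append((state, action))
        # Toy reward: 1 when the action index matches the state index.
        ret += 1.0 if action == state else 0.0
        # Toy deterministic dynamics: the next state is the action taken.
        state = action
    return steps, ret

def is_estimate(episodes, behavior, target):
    """Ordinary importance-sampling estimate of the target policy's expected return."""
    total = 0.0
    for steps, ret in episodes:
        weight = 1.0
        for s, a in steps:
            # Likelihood ratio of the episode under target vs. behavior policy.
            weight *= target[s][a] / behavior[s][a]
        total += weight * ret
    return total / len(episodes)

rng = random.Random(0)
parent = [[0.5, 0.5], [0.5, 0.5]]   # policy that actually interacted with the MDP
child = [[0.9, 0.1], [0.2, 0.8]]    # offspring policy evaluated without new episodes

episodes = [sample_episode(parent, rng) for _ in range(20000)]
estimate = is_estimate(episodes, parent, child)
print(f"estimated return of child from parent episodes: {estimate:.3f}")
```

For this toy MDP the exact expected return of the child policy over three steps is 2.673, so the printed estimate should land close to that value. The estimator's variance grows with the mismatch between the two policies, which is one reason reuse of this kind is most plausible between closely related policies, such as parents and their offspring.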

Keywords

GA, reinforcement learning, direct policy search, importance sampling

DOI

10.1527/tjsai.20.1

Similar books and articles

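Saving MGG: 実数値 GA/MGG における適応度評価回数の削減 [Saving MGG: Reducing the Number of Fitness Evaluations in Real-Coded GA/MGG]. Tsuchiya Chikao, Tanaka Masaharu - 2006 - Transactions of the Japanese Society for Artificial Intelligence 21 (6):547-555.

最適解の位置にロバストな実数値 GA を実現する Toroidal Search Space Conversion の提案 [Proposal of Toroidal Search Space Conversion for a Real-Coded GA Robust to the Location of the Optimum]. Yamamura Masayuki, Someya Hiroshi - 2001 - Transactions of the Japanese Society for Artificial Intelligence 16 (3):333-343.

認知距離学習による問題解決器の実行時探索削減の評価と学習プロセスの解析 [Evaluation of Run-Time Search Reduction and Analysis of the Learning Process in a Problem Solver Based on Cognitive Distance Learning]. 宮本裕司, 山川宏 - 2002 - Transactions of the Japanese Society for Artificial Intelligence 17:1-13.

強化学習ロボットによるアフォーダンスの利用 [Use of Affordances by a Reinforcement Learning Robot]. 公文誠, 李銘義 - 2001 - Transactions of the Japanese Society for Artificial Intelligence 16 (1):152.

強化学習エージェントへの階層化意志決定法の導入―追跡問題を例に― [Introducing a Hierarchical Decision-Making Method into Reinforcement Learning Agents: The Pursuit Problem as an Example]. 輿石尚宏, 片山謙吾 - 2004 - Transactions of the Japanese Society for Artificial Intelligence 19:279-291.

距離に依存せずに多様性を制御する GA による高次元関数最適化 [High-Dimensional Function Optimization by a GA That Controls Diversity Independently of Distance]. Konagaya Akihiko, Kimura Shuhei - 2003 - Transactions of the Japanese Society for Artificial Intelligence 18:193-202.

実数値 GA におけるサンプリングバイアスを考慮した外挿的交叉 EDX [EDX: An Extrapolative Crossover Accounting for Sampling Bias in Real-Coded GA]. Kobayashi Shigenobu, Sakuma Jun - 2002 - Transactions of the Japanese Society for Artificial Intelligence 17:699-707.

強化学習を用いた自律移動型ロボットの行動計画法の提案 [Proposal of an Action Planning Method for Autonomous Mobile Robots Using Reinforcement Learning]. 五十嵐治一 - 2001 - Transactions of the Japanese Society for Artificial Intelligence 16:501-509.

GA により探索空間の動的生成を行う Q 学習 [Q-Learning with Dynamic Generation of the Search Space by GA]. Matsuno Fumitoshi, Ito Kazuyuki - 2001 - Transactions of the Japanese Society for Artificial Intelligence 16:510-520.

免疫系を用いた遺伝的プログラミングによる多峰性探索 [Multimodal Search by Genetic Programming Using an Immune System]. 伊庭斉志, 長谷川禎彦 - 2006 - Transactions of the Japanese Society for Artificial Intelligence 21:176-183.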
