Proposal of a <i>Q</i>-learning Algorithm Using the Agent's Action History

Bibliographic Information

Alternative Titles
  • Improving <i>Q</i>-learning by Using the Agent's Action History
  • エージェントの行動履歴を活用したQ-learningアルゴリズムの提案
  • エージェント ノ コウドウ リレキ オ カツヨウ シタ Q-learning アルゴリズム ノ テイアン


Abstract

<p>Q-learning learns the optimal policy by updating the action-state value function (Q-value) through trial-and-error search so as to maximize the expected reward. However, a major issue is its slow learning speed. We therefore add a mechanism in which the agent memorizes environmental information and uses it to update the Q-values of many states. By updating the Q-values of a larger number of states, more information is given to the agent per step, and the learning time can be reduced. Furthermore, by incorporating the stored environmental information into the action-selection method so that failure behaviors, such as stagnation of learning, are avoided, the learning speed in the initial stage of learning is improved. In addition, we design a new action-area value function in order to explore many more states from the initial stage of learning. Finally, numerical experiments on a maze problem show the usefulness of the proposed method.</p>
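The abstract combines the standard Q-value update with a reuse of the agent's stored action history. The exact algorithm is not given in this record, so the following is a minimal sketch: the standard tabular update Q(s,a) ← Q(s,a) + α[r + γ max<sub>a'</sub> Q(s',a') − Q(s,a)], plus an assumed history-replay step in which the stored (s, a, r, s') transitions are re-applied in reverse order so that reward information propagates to earlier states in a single pass. The replay scheme, constants, and toy environment are all illustrative assumptions, not the paper's method.

```python
from collections import defaultdict

# Illustrative constants (assumptions, not from the paper).
ALPHA, GAMMA = 0.1, 0.9  # learning rate and discount factor

def q_update(q, s, a, r, s_next, actions):
    """Standard tabular Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(s_next, a2)] for a2 in actions)
    q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])

def q_update_with_history(q, history, actions):
    """Replay the stored (s, a, r, s') action history in reverse so that
    the reward reaches earlier states in one pass (an assumed way of
    'updating the Q-value in many states' from the stored history)."""
    for s, a, r, s_next in reversed(history):
        q_update(q, s, a, r, s_next, actions)

# Tiny usage example on a 3-state chain: reaching state 2 yields reward 1.
actions = [0, 1]                              # 0 = stay, 1 = move right
q = defaultdict(float)                        # Q-table, default 0.0
history = [(0, 1, 0.0, 1), (1, 1, 1.0, 2)]    # one recorded episode
q_update_with_history(q, history, actions)
```

With plain one-transition-at-a-time updates, the first episode would only raise Q at the state adjacent to the goal; replaying the history in reverse also lifts Q(0, right) in the same episode, which matches the abstract's claim that reusing stored information reduces learning time.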

