Bibliographic Information
- Alternative titles
  - Improving Q-learning by Using the Agent's Action History
  - エージェントの行動履歴を活用したQ-learningアルゴリズムの提案 (Proposal of a Q-learning Algorithm Utilizing the Agent's Action History)
Abstract
Q-learning learns the optimal policy by updating an action-state value function (Q-value) to maximize the expected reward through trial-and-error search. However, a major issue is its slow learning speed. We therefore add a technique in which the agent memorizes environmental information and uses it to update the Q-values of many states. Updating the Q-values of a larger number of states gives the agent more information per experience and thus reduces learning time. Furthermore, by incorporating the stored environmental information into the action selection method so that the agent avoids failure behaviors such as stagnation of learning, the learning speed in the initial stage of learning is improved. In addition, we design a new action area value function in order to explore many more states from the initial stage of learning. Finally, numerical examples solving a maze problem show the usefulness of the proposed method.
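The abstract builds on the standard tabular Q-learning update rule, sketched below for reference. This is a minimal, hypothetical sketch: the `env` interface (reset, step, actions) and the parameters `alpha`, `gamma`, and `epsilon` are illustrative assumptions, not the paper's implementation, and the proposed action-history and action area value mechanisms are not detailed in the abstract, so they are not reproduced here.

```python
# Minimal sketch of baseline tabular Q-learning with epsilon-greedy
# action selection. The paper's extension (re-using memorized
# environmental information to update Q-values of many states and to
# steer action selection) is NOT shown; only the standard rule is.
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1):
    """Learn Q(s, a) by trial-and-error interaction with `env`.

    `env` is an assumed interface exposing reset() -> state,
    step(action) -> (next_state, reward, done), and
    actions(state) -> list of available actions.
    """
    Q = defaultdict(float)  # Q[(state, action)], defaults to 0.0

    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            actions = env.actions(state)
            # Epsilon-greedy: explore randomly with probability epsilon,
            # otherwise pick the greedy action under the current Q-values.
            if random.random() < epsilon:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)
            # Q-learning update: move Q(s, a) toward the TD target
            # r + gamma * max_a' Q(s', a').
            best_next = max(
                (Q[(next_state, a)] for a in env.actions(next_state)),
                default=0.0,
            )
            Q[(state, action)] += alpha * (
                reward + gamma * best_next - Q[(state, action)]
            )
            state = next_state
    return Q
```

In the maze setting the abstract describes, each update touches only the single state just visited, which is exactly the slowness the proposed method targets by propagating memorized information to many states at once.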
Published in
- 電気学会論文誌C(電子・情報・システム部門誌) (IEEJ Transactions on Electronics, Information and Systems), 136 (8), 1209-1217, 2016
- 一般社団法人 電気学会 (The Institute of Electrical Engineers of Japan)
Details
- CRID
  - 1390282679583936384
- NII Article ID
  - 130005254603
- NII Bibliographic ID
  - AN10065950
- ISSN
  - 13488155
  - 03854221
- NDL Bibliographic ID
  - 027601872
- Text Language Code
  - ja
- Data Source Type
  - JaLC
  - NDL
  - Crossref
  - CiNii Articles
- Abstract License Flag
  - Not permitted