確率的傾斜法とメモリベース的な手法を組み合わせた強化学習法 A Reinforcement Learning Using a Stochastic Gradient Method with Memory-Based Learning

Abstract

This paper proposes a learning algorithm for agents operating in partially observable Markov decision processes (POMDPs) that combines memory-less learning with memory-based learning. In the first stage of the proposed algorithm, memory-less learning is applied, with the stochastic gradient method employed as the memory-less learning algorithm. During the first stage, each series of state-action sets that accomplishes the task is stored in memory. In the second stage, memory-based learning is applied. This stage uses only the series obtained in the first stage, so the method effectively reduces the amount of memory required.

The proposed algorithm is applied to three kinds of simulations and compared with a memory-less learning algorithm. The computer simulations show that the proposed algorithm works more effectively in POMDPs than ordinary memory-less learning.
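The abstract does not give the update rules, so the two-stage idea can only be illustrated under assumptions. The following is a minimal sketch, not the authors' implementation: the five-cell corridor with two aliased cells, the REINFORCE-style softmax update standing in for the stochastic gradient method, and the Q-learning replay over (previous observation, observation) keys standing in for the memory-based stage are all hypothetical choices, as are every function and parameter name.

```python
import math
import random

GOAL, LEFT, RIGHT = 2, 0, 1   # hypothetical corridor: cells 0..4, goal in the middle
GAMMA = 0.9                   # assumed discount factor

def observe(pos):
    # Cells 1 and 3 are aliased: both emit observation 2.
    return {0: 0, 4: 1}.get(pos, 2)

def step(pos, action):
    # Deterministic move, clipped at the corridor ends.
    return max(0, min(4, pos + (1 if action == RIGHT else -1)))

def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def stage1(episodes=400, alpha=0.2, max_steps=20, seed=0):
    """Memory-less stage: stochastic gradient ascent on a softmax policy over
    observations. Series that reach the goal are stored in memory."""
    rng = random.Random(seed)
    theta = [[0.0, 0.0] for _ in range(3)]    # theta[observation][action]
    memory = []                               # successful (obs, action) series
    for _ in range(episodes):
        pos, series = rng.choice([0, 4]), []
        while pos != GOAL and len(series) < max_steps:
            obs = observe(pos)
            action = rng.choices([LEFT, RIGHT], weights=softmax(theta[obs]))[0]
            series.append((obs, action))
            pos = step(pos, action)
        if pos != GOAL:
            continue                          # failed episodes are discarded
        memory.append(series)
        ret = GAMMA ** len(series)            # shorter successes earn more
        probs = [softmax(row) for row in theta]   # policy that generated the episode
        for obs, action in series:
            for a in (LEFT, RIGHT):
                indicator = 1.0 if a == action else 0.0
                theta[obs][a] += alpha * ret * (indicator - probs[obs][a])
    return theta, memory

def stage2(memory, sweeps=50):
    """Memory-based stage: Q-learning over (previous obs, obs) keys,
    replaying only the series stored during stage 1."""
    q = {}
    for _ in range(sweeps):
        for series in memory:
            prev = None
            for i, (obs, action) in enumerate(series):
                key = (prev, obs)
                if i + 1 < len(series):
                    nxt = (obs, series[i + 1][0])
                    target = GAMMA * max(q.get((nxt, a), 0.0) for a in (LEFT, RIGHT))
                else:
                    target = 1.0              # last stored step reaches the goal
                q[(key, action)] = target     # deterministic toy world: step size 1
                prev = obs
    return q

def run_greedy(q, start, max_steps=20):
    """Follow the greedy memory-based policy and return the number of steps taken."""
    pos, prev, steps = start, None, 0
    while pos != GOAL and steps < max_steps:
        obs = observe(pos)
        action = max((LEFT, RIGHT), key=lambda a: q.get(((prev, obs), a), 0.0))
        pos, prev, steps = step(pos, action), obs, steps + 1
    return steps

theta, memory = stage1()
q = stage2(memory)
print(softmax(theta[2]))                      # memory-less policy at the aliased cells
print(run_greedy(q, 0), run_greedy(q, 4))     # steps taken by the memory-based policy
```

In this toy setting the memory-less policy has to share one action distribution between the two aliased cells, whereas the second stage can tell them apart by the previous observation. Because it replays only the stored successful series, its table covers only the histories that actually occurred on the way to the goal.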

Published in

  • The Transactions of the Institute of Electrical Engineers of Japan. C, A Publication of Electronics, Information and System Society (電気学会論文誌. C, 電子・情報・システム部門誌) 128(7), 1123-1130, 2008-07-01

    The Institute of Electrical Engineers of Japan

References: 14

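Identifiers

  • NII Article ID (NAID)
    10021133129
  • NII Bibliographic ID (NCID)
    AN10065950
  • Text language code
    JPN
  • Material type
    ART
  • ISSN
    03854221
  • NDL Article ID
    9564339
  • NDL journal classification
    ZN31 (Science and technology -- Electrical engineering and electrical machinery industry)
  • NDL call number
    Z16-795
  • Data source
    CJP bibliography  NDL  J-STAGE 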