実例に基づく強化学習法

Bibliographic Information

Alternative title
  • Instance-Based Reinforcement Learning Method

Abstract

<p>This paper proposes a reinforcement learning method based on an instance-based learning approach. The learning task is assumed to be as follows. On each learning cycle, the input is a vector of real numbers, the output is a symbol selected from a finite set known a priori, and the reinforcement from the environment is +1, 0, or -1. The reinforcement is usually 0, that is, it is delayed. This last assumption makes it difficult to apply conventional supervised concept learning schemes, because the output is not evaluated on every cycle. The key idea is to propagate reinforcement backward in time through the memorized experiences. Scanning all past experiences, which are stored verbatim in memory, the learner tends to select the output that is associated with an input similar to the current situation and that is likely to lead to high positive reinforcement. In addition to this basic mechanism, two extensions are proposed. The first restricts the capacity of memory, replacing the oldest data with new data on each cycle, so that time and space complexity do not grow without bound. The second embeds a feedback mechanism for the reliability of each memorized experience: the reliability of the experiences used to decide the outputs of the most recent cycles is increased when the learner receives positive reinforcement and decreased when it receives negative reinforcement. Experimental results show that these learning algorithms work well in a domain that simulates adaptive behavior, and that the extensions are effective.</p>
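The mechanism described in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration rather than the paper's actual algorithm: the inverse-distance similarity, the geometric `discount` used when propagating reinforcement backward, the additive reliability `step`, the `trace` length (how many recent cycles count as "most recent"), and all class and method names.

```python
import math
from collections import deque
from dataclasses import dataclass


@dataclass
class Experience:
    x: tuple                  # input vector of real numbers
    action: str               # output symbol chosen on that cycle
    value: float = 0.0        # reinforcement propagated back to this cycle
    reliability: float = 1.0  # feedback weight (second extension)


class InstanceBasedLearner:
    """Hypothetical sketch; parameters and update rules are assumptions."""

    def __init__(self, actions, capacity=500, discount=0.9, step=0.1, trace=3):
        self.actions = list(actions)
        # First extension: bounded memory, oldest experience replaced first.
        self.memory = deque(maxlen=capacity)
        self.discount = discount
        self.step = step
        # Stored experiences that decided the outputs of the last `trace` cycles.
        self.used = deque(maxlen=trace)

    @staticmethod
    def _similarity(a, b):
        # Assumed similarity measure: 1 for identical inputs, decaying with distance.
        return 1.0 / (1.0 + math.dist(a, b))

    def act(self, x):
        x = tuple(x)
        scores = {a: 0.0 for a in self.actions}
        nearest = {}  # per action: (weight, most similar stored experience)
        for e in self.memory:  # scan every memorized experience
            w = self._similarity(x, e.x) * e.reliability
            scores[e.action] += w * e.value
            if e.action not in nearest or w > nearest[e.action][0]:
                nearest[e.action] = (w, e)
        # Ties are broken by the order of `actions` (no exploration in this sketch).
        action = max(self.actions, key=lambda a: scores[a])
        if action in nearest:
            self.used.append(nearest[action][1])
        self.memory.append(Experience(x, action))
        return action

    def reinforce(self, r):
        if r == 0:  # the usual case: nothing to propagate on this cycle
            return
        # Propagate reinforcement backward through memory in time order.
        for k, e in enumerate(reversed(self.memory)):
            e.value += r * self.discount ** k
        # Second extension: adjust reliability of recently used experiences.
        for e in self.used:
            e.reliability = max(0.0, e.reliability + self.step * r)


learner = InstanceBasedLearner(["left", "right"], capacity=3)
first = learner.act([0.0])
learner.reinforce(-1)          # the first choice is punished
second = learner.act([0.0])    # the learner now avoids the punished output
```

In this sketch a single negative reinforcement is enough to make the learner switch outputs for an identical input, because the punished experience is the only one in memory. A practical learner would also need some form of exploration, which the abstract does not describe and which is omitted here.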

Journal

  • 人工知能

    人工知能 7 (4), 697-707, 1992-07-01

    The Japanese Society for Artificial Intelligence (一般社団法人 人工知能学会)

Cited by (35)

References (24)

Details

  • CRID
    1390004222628654976
  • NII Article ID
    110002807614
  • NII Book ID
    AN10067140
  • DOI
    10.11517/jjsai.7.4_697
  • ISSN
    24358614
    21882266
  • Text language code
    ja
  • Data source type
    • JaLC
    • CiNii Articles
  • Abstract license flag
    Not permitted
