Adjustment of Discount Rate Using Index for Progress of Learning (学習進度を反映した割引率の調整) [in Japanese]

Author(s)

Abstract

We show that adjusting the discount rate in reinforcement learning according to an index of learning progress can be effective. In the strategy we propose, the discount rate is kept small while learning has not yet progressed far, so that immediate rewards are emphasized, and is gradually increased as learning advances, so that future rewards are also taken into account. We also propose three methods for this adjustment: exponential adjustment, adjustment by TD error, and adjustment by reliability. These are verified by numerical experiments on a windy gridworld task.
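As a rough illustration of the exponential variant, the sketch below raises the discount rate from a small value toward a larger one as episodes accumulate, and passes it into a standard tabular Q-learning update. The record does not give the paper's formulas, so the schedule shape, the function names `exponential_discount` and `q_update`, and the parameters `gamma_min`, `gamma_max`, and `time_constant` are all assumptions made here for illustration, not the authors' method.

```python
import math


def exponential_discount(episode, gamma_min=0.5, gamma_max=0.95, time_constant=50.0):
    """Hypothetical schedule: discount rate rises from gamma_min toward gamma_max.

    The episode count stands in for the "index of learning progress";
    the exact index used in the paper is not stated in this record.
    """
    if episode < 0:
        raise ValueError("episode must be non-negative")
    progress = 1.0 - math.exp(-episode / time_constant)  # 0 at start, approaches 1
    return gamma_min + (gamma_max - gamma_min) * progress


def q_update(q, state, action, reward, next_state, gamma, alpha=0.1):
    """One tabular Q-learning update with the supplied discount rate.

    q is a dict of dicts: q[state][action] -> value. Returns the TD error.
    """
    best_next = max(q[next_state].values())
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return td_error
```

In a training loop, one would call `exponential_discount(episode)` once per episode and pass the result as `gamma` to every `q_update` in that episode. A TD-error-based or reliability-based variant would replace the episode count with a different progress index, which this sketch does not attempt to reproduce.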

Journal

  • IEICE technical report. Neurocomputing

    IEICE technical report. Neurocomputing 102(628), 73-78, 2003-01-28

    The Institute of Electronics, Information and Communication Engineers

References:  17

Cited by:  7

Codes

  • NII Article ID (NAID)
    110003232277
  • NII NACSIS-CAT ID (NCID)
    AN10091178
  • Text Lang
    JPN
  • Article Type
    Journal Article
  • ISSN
    0913-5685
  • NDL Article ID
    6505500
  • NDL Source Classification
    ZN33 (Science and technology -- Electrical engineering and electrical machinery industry -- Electronics and telecommunications)
  • NDL Call No.
    Z16-940
  • Data Source
    CJP  CJPref  NDL  NII-ELS 