A Meta-Parameter Learning Method in Reinforcement Learning Based on Temporal Difference Error

Bibliographic Information

Other Title
  • TD誤差に基づく強化学習のメタパラメータ学習法

Abstract

In a reinforcement learning system, meta-parameters such as the learning rate are generally determined empirically and held fixed during learning. As a result, when the external environment changes, the system cannot adjust to the change. Meanwhile, it has been suggested that the biological brain may conduct reinforcement learning and adapt to the external environment by controlling neuromodulators that correspond to meta-parameters. Based on this suggestion, the present paper proposes a method that adjusts meta-parameters using the TD-error. Computer simulations on a maze problem and an inverted pendulum control problem verify that the meta-parameters are adjusted appropriately according to the amplitude of the TD-error.
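
The abstract does not state the update rule, so the following is only a minimal sketch of the general idea of tying a meta-parameter to the TD-error, not the paper's method. It assumes tabular TD(0) and a hypothetical mapping in which the learning rate grows with a running average of the absolute TD-error. The names adapt_learning_rate and td0_step, the tanh mapping, and all constants (alpha_min, alpha_max, gain, decay) are illustrative assumptions.

```python
import math


def adapt_learning_rate(avg_abs_td, alpha_min=0.05, alpha_max=0.5, gain=1.0):
    """Map a running average of |TD error| to a learning rate.

    Hypothetical rule (not from the paper): a small error keeps the rate
    near alpha_min, a large error pushes it toward alpha_max.
    """
    return alpha_min + (alpha_max - alpha_min) * math.tanh(gain * avg_abs_td)


def td0_step(values, state, reward, next_state, avg_abs_td,
             gamma=0.9, decay=0.9):
    """One tabular TD(0) update with a TD-error-driven learning rate."""
    # Standard TD error: r + gamma * V(s') - V(s)
    td_error = reward + gamma * values[next_state] - values[state]
    # Exponential moving average of the error amplitude (assumed smoothing)
    avg_abs_td = decay * avg_abs_td + (1.0 - decay) * abs(td_error)
    # The meta-parameter follows the error amplitude
    alpha = adapt_learning_rate(avg_abs_td)
    values[state] += alpha * td_error
    return td_error, avg_abs_td, alpha


if __name__ == "__main__":
    values = [0.0, 0.0]
    avg = 0.0
    # The reward switches from 1.0 to -1.0 halfway through, imitating
    # a change in the environment.
    for step in range(40):
        reward = 1.0 if step < 20 else -1.0
        td_error, avg, alpha = td0_step(values, 0, reward, 1, avg)
        print(step, round(td_error, 3), round(alpha, 3))
```

In this sketch the learning rate decays toward alpha_min while the value estimate settles and rises again after the reward switches, because the TD-error amplitude jumps at that point. Other meta-parameters, such as the discount factor or an exploration temperature, could be driven in a similar way, but which ones the paper adjusts and how is not stated in the abstract.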
