A Meta-Parameter Learning Method in Reinforcement Learning Based on Temporal Difference Error
-
- Mizoue Hiroyuki
- Graduate School of Science and Engineering, Yamaguchi University
-
- Kobayashi Kunikazu
- Graduate School of Science and Engineering, Yamaguchi University
-
- Kuremoto Takashi
- Graduate School of Science and Engineering, Yamaguchi University
-
- Obayashi Masanao
- Graduate School of Science and Engineering, Yamaguchi University
Bibliographic Information
- Other Title
-
- TD誤差に基づく強化学習のメタパラメータ学習法
- TD ゴサ ニ モトズク キョウカ ガクシュウ ノ メタパラメータ ガクシュウホウ
Abstract
In general, meta-parameters of a reinforcement learning system, such as the learning rate, are determined empirically and held fixed during learning. Consequently, when the external environment changes, the system cannot adapt to the change. Meanwhile, it has been suggested that the biological brain may conduct reinforcement learning and adapt to its external environment by controlling neuromodulators that correspond to meta-parameters. Based on this suggestion, the present paper proposes a method that adjusts meta-parameters using the TD-error. Computer simulations on a maze problem and an inverted pendulum control problem verify that the meta-parameters are adjusted appropriately according to the amplitude of the TD-error.
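The abstract does not state the update rule itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes tabular Q-learning on a toy corridor task and a hypothetical rule in which the learning rate grows with a smoothed absolute TD-error. The names `adapt_learning_rate` and `run_corridor`, the `tanh` mapping, and all constants are illustrative assumptions.

```python
import math
import random


def td_error(reward, gamma, q_next_max, q_current):
    """One-step Q-learning TD error: r + gamma * max_a' Q(s', a') - Q(s, a)."""
    return reward + gamma * q_next_max - q_current


def adapt_learning_rate(avg_abs_td, alpha_min=0.05, alpha_max=0.5, gain=2.0):
    """Hypothetical rule (an assumption, not taken from the paper):
    a larger smoothed |TD error| gives a larger learning rate."""
    return alpha_min + (alpha_max - alpha_min) * math.tanh(gain * avg_abs_td)


def run_corridor(n_states=6, episodes=200, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy corridor with reward 1 at the right end."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right
    avg_abs_td = 1.0  # start high so that early learning is fast
    alphas = []
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap on steps per episode
            # Epsilon-greedy action selection; ties are broken toward "right".
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 1 if q[s][1] >= q[s][0] else 0
            s_next = max(0, s - 1) if a == 0 else s + 1
            done = s_next == n_states - 1
            r = 1.0 if done else 0.0
            q_next_max = 0.0 if done else max(q[s_next])
            delta = td_error(r, gamma, q_next_max, q[s][a])
            # Exponential moving average of the TD-error amplitude.
            avg_abs_td = 0.99 * avg_abs_td + 0.01 * abs(delta)
            alpha = adapt_learning_rate(avg_abs_td)
            q[s][a] += alpha * delta
            alphas.append(alpha)
            s = s_next
            if done:
                break
    return q, alphas


if __name__ == "__main__":
    q, alphas = run_corridor()
    print("first alpha:", round(alphas[0], 3), "last alpha:", round(alphas[-1], 3))
```

In this sketch the learning rate falls as the TD-error shrinks. If the environment changed and the TD-error grew again, the same rule would raise the learning rate, which is the kind of adaptation the abstract describes.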
Journal
-
- IEEJ Transactions on Electronics, Information and Systems
-
IEEJ Transactions on Electronics, Information and Systems 129 (9), 1730-1736, 2009
The Institute of Electrical Engineers of Japan
Details
-
- CRID
- 1390282679582797824
-
- NII Article ID
- 10025102012
-
- NII Book ID
- AN10065950
-
- ISSN
- 13488155
- 03854221
-
- NDL BIB ID
- 10421449
-
- Text Lang
- ja
-
- Data Source
-
- JaLC
- NDL
- Crossref
- CiNii Articles
- KAKEN
-
- Abstract License Flag
- Disallowed