Bibliographic Information
- Alternative titles
- An Actor-Critic Algorithm Using a Binary Tree Action Selector
- 確率的2分木の行動選択を用いたActor-Criticアルゴリズム:多数の行動を扱う強化学習 (original Japanese title)
- Kakuritsuteki 2-bungi no kōdō sentaku o mochiita Actor-Critic arugorizumu: tasū no kōdō o atsukau kyōka gakushū (romanized reading)
- Reinforcement Learning to Cope with Enormous Actions
- 多数の行動を扱う強化学習 (original Japanese subtitle)
Abstract
In real-world applications, learning algorithms often have to handle several dozen actions that are related by some distance metric. Epsilon-greedy and Boltzmann-distribution exploration strategies, which have been applied to Q-learning and SARSA, are popular, simple, and effective in problems with only a few actions; however, their efficiency decreases as the number of actions increases. We propose a policy function representation that consists of a stochastic binary decision tree, and we apply it to an actor-critic algorithm for problems with a very large number of similar actions. Simulation results show that increasing the number of actions does not affect the learning curves of the proposed method at all.
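To make the idea in the abstract concrete, the following is a minimal sketch of a stochastic binary decision tree used as a policy over 2^depth discrete actions. It is not the paper's formulation: the class name `BinaryTreePolicy`, the single-logit-per-node parameterization, the sigmoid branch probability, the state-independent parameters, and the learning rate are all assumptions chosen for illustration. The TD error is assumed to come from a separate critic that is not shown.

```python
import math
import random


class BinaryTreePolicy:
    """Stochastic binary decision tree over 2**depth discrete actions.

    Hypothetical sketch: each internal node holds one logit, and the
    probability of branching right is sigmoid(logit).
    """

    def __init__(self, depth, seed=0):
        self.depth = depth
        self.n_actions = 2 ** depth
        # Heap layout: node 1 is the root; children of node i are 2i and 2i+1.
        self.logits = [0.0] * self.n_actions  # index 0 is unused
        self.rng = random.Random(seed)

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def sample(self):
        """Walk from the root to a leaf; return (action, [(node, branch), ...])."""
        node, path = 1, []
        for _ in range(self.depth):
            p_right = self._sigmoid(self.logits[node])
            branch = 1 if self.rng.random() < p_right else 0
            path.append((node, branch))
            node = 2 * node + branch
        return node - self.n_actions, path

    def prob(self, action):
        """Action probability: product of branch probabilities along its path."""
        node, p = 1, 1.0
        for level in reversed(range(self.depth)):
            branch = (action >> level) & 1
            p_right = self._sigmoid(self.logits[node])
            p *= p_right if branch else 1.0 - p_right
            node = 2 * node + branch
        return p

    def update(self, path, td_error, lr=0.1):
        """Actor step on visited nodes: d log pi / d logit = branch - sigmoid(logit)."""
        for node, branch in path:
            grad = branch - self._sigmoid(self.logits[node])
            self.logits[node] += lr * td_error * grad
```

In this sketch, choosing an action takes `depth` binary decisions rather than a comparison across all 2^depth actions, and neighbouring actions share most of their path, so an update for one action also shifts probability toward nearby ones. A caller would use `action, path = policy.sample()`, observe the reward, compute a TD error with its own critic, and then call `policy.update(path, td_error)`.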
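Published in
- Transactions of the Society of Instrument and Control Engineers (計測自動制御学会論文集)
Transactions of the Society of Instrument and Control Engineers 37 (12), 1147-1155, 2001
The Society of Instrument and Control Engineers (公益社団法人 計測自動制御学会)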
Details
- CRID: 1390282679477869056
- NII Article ID: 130003970998, 10007403471
- NII Bibliographic ID: AN00072392
- ISSN: 18838189, 04534654 (http://id.crossref.org/issn/04534654)
- NDL Bibliographic ID: 6020326
- Data source types: JaLC, NDL, Crossref, CiNii Articles
- Abstract license flag: Not permitted