An Actor-Critic Algorithm Using a Binary Tree Action Selector (確率的2分木の行動選択を用いたActor-Criticアルゴリズム)

Hajime Kimura (木村 元), Shigenobu Kobayashi (小林 重信)

doi:10.9746/sicetr1965.37.1147

Bibliographic information

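Alternative titles

An Actor-Critic Algorithm Using a Binary Tree Action Selector
確率的2分木の行動選択を用いたActor-Criticアルゴリズム:多数の行動を扱う強化学習 (original Japanese title with subtitle)
Kakuritsuteki 2-bungi no kōdō sentaku o mochiita Actor-Critic arugorizumu: tasū no kōdō o atsukau kyōka gakushū (romanized reading)
Reinforcement Learning to Cope with Enormous Actions
多数の行動を扱う強化学習 (original Japanese subtitle)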

Abstract

In real-world applications, learning algorithms often have to handle several dozen actions that are related by some distance metric. Epsilon-greedy and Boltzmann-distribution exploration strategies, which have been applied to Q-learning and SARSA, are popular, simple, and effective in problems with only a few actions; however, their efficiency decreases as the number of actions increases. We propose a policy function representation consisting of a stochastic binary decision tree, and we apply it to an actor-critic algorithm for problems that have a very large number of similar actions. Simulation results show that increasing the number of actions does not affect the learning curves of the proposed method at all.
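As a rough illustration of the idea in the abstract, the sketch below selects one of 2**depth actions by walking a complete binary tree and making one random left/right choice per level. The abstract does not give the parameterization, so everything here is an assumption for illustration: the class name `BinaryTreeSelector`, the sigmoid branch probabilities, the state-independent logits `theta`, and the update rule (a plain log-likelihood gradient scaled by a TD error that a critic would supply). It is not the authors' implementation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class BinaryTreeSelector:
    """Hypothetical stochastic binary tree over 2**depth actions."""

    def __init__(self, depth):
        self.depth = depth
        self.n_actions = 2 ** depth
        # One logit per internal node, heap-indexed from 1 (index 0 unused).
        # Assumption: logits are state-independent to keep the sketch short.
        self.theta = [0.0] * self.n_actions

    def sample(self, rng=random):
        """Walk root to leaf; return (action, [(node, bit), ...])."""
        node = 1
        path = []
        for _ in range(self.depth):
            p_right = sigmoid(self.theta[node])
            bit = 1 if rng.random() < p_right else 0
            path.append((node, bit))
            node = 2 * node + bit
        # Leaves occupy heap indices n_actions .. 2 * n_actions - 1.
        return node - self.n_actions, path

    def action_prob(self, action):
        """Probability of an action: product of branch probabilities."""
        node = action + self.n_actions
        prob = 1.0
        while node > 1:
            parent, bit = divmod(node, 2)
            p_right = sigmoid(self.theta[parent])
            prob *= p_right if bit else 1.0 - p_right
            node = parent
        return prob

    def update(self, path, td_error, lr=0.1):
        """Actor step: move the visited logits along grad log-prob * TD error."""
        for node, bit in path:
            grad = bit - sigmoid(self.theta[node])  # d/dtheta of log Bernoulli
            self.theta[node] += lr * td_error * grad
```

With this layout, choosing an action costs `depth` random draws rather than a pass over all 2**depth actions, and actions with nearby indices share most of their branch decisions, which is one way the similarity between actions mentioned in the abstract could be exploited.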

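Published in

Transactions of the Society of Instrument and Control Engineers

Transactions of the Society of Instrument and Control Engineers 37 (12), 1147-1155, 2001

The Society of Instrument and Control Engineers (SICE)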

Details

CRID: 1390282679477869056

NII Article ID: 130003970998; 10007403471

NII Bibliographic ID: AN00072392

DOI: 10.9746/sicetr1965.37.1147

ISSN: 18838189; 04534654; http://id.crossref.org/issn/04534654

NDL Bibliographic ID: 6020326

Web Site: https://ndlsearch.ndl.go.jp/books/R000000004-I6020326; https://www.jstage.jst.go.jp/article/sicetr1965/37/12/37_12_1147/_pdf

Data source type

JaLC
NDL
Crossref
CiNii Articles

Abstract license flag: Use not permitted

Cited by: 3

References: 12
