An Actor-Critic Algorithm Using a Binary Tree Action Selector (確率的2分木の行動選択を用いたActor-Criticアルゴリズム)

Bibliographic Information

Alternative Titles
  • An Actor-Critic Algorithm Using a Binary Tree Action Selector
  • 確率的2分木の行動選択を用いたActor-Criticアルゴリズム:多数の行動を扱う強化学習
  • カクリツテキ 2ブンギ ノ コウドウ センタク オ モチイタ Actor Critic アルゴリズム タスウ ノ コウドウ オ アツカウ キョウカ ガクシュウ
  • Reinforcement Learning to Cope with Enormous Actions
  • 多数の行動を扱う強化学習

Abstract

In real-world applications, learning algorithms often have to handle several dozen actions that are related by some distance metric. Exploration strategies based on epsilon-greedy selection or the Boltzmann distribution, which have been applied to Q-learning and SARSA, are popular, simple, and effective in problems with only a few actions; however, their efficiency decreases as the number of actions grows. We propose a policy representation consisting of a stochastic binary decision tree and apply it to an actor-critic algorithm for problems with a very large number of similar actions. Simulation results show that increasing the number of actions does not affect the learning curves of the proposed method at all.
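
To make the idea of a stochastic binary decision tree policy concrete, the following is a minimal Python sketch, not the formulation used in the paper. It assumes a complete tree whose leaves are the actions, one sigmoid-parameterized branching probability per internal node, and a policy that ignores the state for brevity. The class name `BinaryTreePolicy`, the heap-style node indexing, and the `update` rule (a plain log-likelihood gradient step scaled by a TD error that a critic is assumed to supply) are all illustrative choices.

```python
import math
import random


class BinaryTreePolicy:
    """Illustrative stochastic binary tree over 2**depth actions.

    Assumption: the policy is state-independent here; a real actor would
    condition each branching probability on the current state.
    """

    def __init__(self, depth):
        self.depth = depth
        self.num_actions = 2 ** depth
        # One logit per internal node in heap order (root = 1, index 0 unused).
        self.logits = [0.0] * self.num_actions

    def _p_right(self, node):
        # Probability of taking the right branch at an internal node.
        return 1.0 / (1.0 + math.exp(-self.logits[node]))

    def sample(self, rng=random):
        # Walk from the root to a leaf with `depth` Bernoulli decisions.
        node, path = 1, []
        for _ in range(self.depth):
            go_right = rng.random() < self._p_right(node)
            path.append((node, go_right))
            node = 2 * node + int(go_right)
        return node - self.num_actions, path

    def action_prob(self, action):
        # Product of branch probabilities along the leaf-to-root path.
        node, prob = action + self.num_actions, 1.0
        while node > 1:
            parent = node // 2
            p = self._p_right(parent)
            prob *= p if node % 2 == 1 else 1.0 - p
            node = parent
        return prob

    def update(self, path, td_error, lr=0.1):
        # Gradient of log Bernoulli(b; sigmoid(logit)) w.r.t. logit is b - p.
        for node, go_right in path:
            grad = float(go_right) - self._p_right(node)
            self.logits[node] += lr * td_error * grad
```

In this sketch, sampling an action and updating the actor each touch only `depth` nodes, that is, log2 of the number of actions, and neighbouring leaves share most of their path, so similar actions share parameters. This is one plausible reading of why such a representation could scale to many similar actions; the abstract itself does not spell out the mechanism.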
