Proposal of the Failure Probability Propagation Algorithm EFPA and Verification of Its Effectiveness in Multi-agent Environments

Bibliographic Information

Alternative Titles
  • Proposal of a Propagation Algorithm of the Expected Failure Probability and the Effectiveness on Multi-agent Environments
  • シッパイ カクリツ デンパ アルゴリズム EFPA ノ テイアン ト マルチエージェント カンキョウ カ デ ノ ユウコウセイ ノ ケンショウ

Abstract

It is known that the Improved Penalty Avoiding Rational Policy Making algorithm (IPARP) can learn policies from rewards and penalties. IPARP aims to identify penalty rules, that is, rules that are highly likely to lead to a penalty. Although IPARP is effective in many cases, memory constraints force it to perform many trial-and-error searches. In this paper, we propose a method called the Expected Failure Probability Algorithm (EFPA) to speed up learning. In addition, we extend EFPA to multi-agent environments. In multi-agent learning, it is important to avoid the concurrent learning problem, which occurs when multiple agents learn simultaneously. We also propose a method to avoid this problem and confirm its effectiveness through numerical experiments.
