Analysis of Distributed Thompson Sampling based on Consensus Control

Bibliographic Information

Other Title
  • 合意制御に基づく協調型トンプソン抽出の検討
  • ゴウイ セイギョ ニ モトズク キョウチョウガタ トンプソン チュウシュツ ノ ケントウ

Abstract

<p>Distributed control of multi-agent systems, in which each agent makes decisions through interaction over a communication network, has recently attracted much attention. Such decision making generally involves a trade-off between exploring for the best choice and exploiting the knowledge already obtained, and this trade-off can be formulated as a bandit problem. In this paper, we investigate a distributed bandit problem in which a group of agents cooperatively searches for the best choice. We propose a cooperative Thompson sampling method based on the consensus algorithm for multi-agent systems. A theoretical analysis of the regret bound is carried out for the case in which the communication network is represented by a complete graph. Numerical examples show that the proposed cooperative Thompson sampling reduces the regret compared with the case in which agents search for the best choice individually, without cooperation.</p>
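The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details are not given here. It assumes Bernoulli rewards, Beta(1, 1) priors, a complete communication graph, and a single consensus step per round with gain 1/N, which on a complete graph yields the exact average. The names `consensus_step` and `run` and all parameter values are hypothetical.

```python
import numpy as np


def consensus_step(x, eps):
    """One consensus update on a complete graph: x_i <- x_i + eps * sum_j (x_j - x_i)."""
    n = x.shape[0]
    total = x.sum(axis=0, keepdims=True)
    return x + eps * (total - n * x)


def run(means, n_agents, horizon, cooperate, seed=0):
    """Return the cumulative pseudo-regret summed over all agents.

    Assumption (not from the source): agents average their newest
    observations by consensus and rescale by n_agents, so every agent's
    Beta posterior reflects the whole group's data.
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    k = means.size
    agents = np.arange(n_agents)
    succ = np.zeros((n_agents, k))  # per-agent success counts for each arm
    fail = np.zeros((n_agents, k))  # per-agent failure counts for each arm
    regret = 0.0
    for _ in range(horizon):
        # Thompson sampling: draw from each Beta posterior, play the argmax.
        theta = rng.beta(succ + 1.0, fail + 1.0)
        arms = theta.argmax(axis=1)
        rewards = rng.random(n_agents) < means[arms]
        new_s = np.zeros((n_agents, k))
        new_f = np.zeros((n_agents, k))
        new_s[agents, arms] = rewards
        new_f[agents, arms] = ~rewards
        if cooperate:
            # With eps = 1/N, one step on a complete graph gives the average.
            new_s = n_agents * consensus_step(new_s, 1.0 / n_agents)
            new_f = n_agents * consensus_step(new_f, 1.0 / n_agents)
        succ += new_s
        fail += new_f
        regret += float((means.max() - means[arms]).sum())
    return regret


if __name__ == "__main__":
    arm_means = [0.3, 0.5, 0.7]  # hypothetical arm success probabilities
    print("cooperative:", run(arm_means, n_agents=5, horizon=500, cooperate=True))
    print("independent:", run(arm_means, n_agents=5, horizon=500, cooperate=False))
```

Running the script prints the cumulative pseudo-regret for the cooperative and independent settings under the same seed, which gives a rough feel for the comparison reported in the numerical examples. A sparser graph would need several consensus steps per round, or would have to tolerate inexact averages.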
