Analysis of Distributed Thompson Sampling based on Consensus Control
- Kamimura Motoki, Graduate School of Engineering, Osaka University
- Hayashi Naoki, Graduate School of Engineering, Osaka University
- Takai Shigemasa, Graduate School of Engineering, Osaka University
Bibliographic Information
- Other Title
- 合意制御に基づく協調型トンプソン抽出の検討
- ゴウイ セイギョ ニ モトズク キョウチョウガタ トンプソン チュウシュツ ノ ケントウ
Abstract
Distributed control of multi-agent systems has recently attracted much attention. In such systems, each agent makes decisions through interaction over a communication network. In general, there is a trade-off between exploring for the best choice and exploiting the knowledge already obtained, and this trade-off can be formulated as a bandit problem. In this paper, we investigate a distributed bandit problem in which a group of agents cooperatively searches for the best choice. We propose a cooperative Thompson sampling method based on the consensus algorithm for multi-agent systems. A regret bound is analyzed theoretically for the case where the communication network is represented by a complete graph. Numerical examples show that the proposed cooperative Thompson sampling reduces the regret compared with the case where agents search for the best choice individually, without cooperation.
Journal
- Transactions of the Institute of Systems, Control and Information Engineers
Transactions of the Institute of Systems, Control and Information Engineers 33 (2), 57-65, 2020-02-15
THE INSTITUTE OF SYSTEMS, CONTROL AND INFORMATION ENGINEERS (ISCIE)
Details
- CRID: 1390285300157940864
- NII Article ID: 130007843319
- NII Book ID: AN1013280X
- ISSN: 2185811X, 13425668
- NDL BIB ID: 030247159
- Text Lang: ja
- Data Source: JaLC, NDL, Crossref, CiNii Articles, KAKEN
- Abstract License Flag: Disallowed