Go Algorithm Based on Deep Learning and Playout (深層学習とプレイアウトに基づく囲碁アルゴリズム)
Bibliographic Information
- Alternative titles
- シンソウ ガクシュウ ト プレイ アウト ニ モトズク イゴ アルゴリズム
- Go Algorithm Based on Deep Learning and Playout
Abstract
This paper describes a Go algorithm based on deep learning and playouts. The algorithm runs in a resource-constrained environment consisting of one CPU and one GPU. The best next move is chosen by a Value-Monte-Carlo tree search, which is a best-first search method. The proposed method omits the tree policy used in AlphaGo. Instead, when expanding a leaf node, it adds as children of that node the 20 candidate moves with the highest probability, obtained in synchronization with the SL policy network. The win/loss outcome that AlphaGo obtains with its rollout policy is replaced by a playout, as commonly used in ordinary Monte-Carlo tree search. Nodes are evaluated not with the ordinary UCB1 value but with the action value used in AlphaGo. Numerical experiments confirmed the statistical significance of the proposed method and identified both the best mixing parameter value and the node expansion threshold.
identifier:http://repository.aitech.ac.jp/dspace/handle/11133/3491
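The search components named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The top-20 expansion comes from the abstract. Everything else is an assumption made for illustration: the selection score follows the published AlphaGo form Q + u, the mixing parameter is assumed to blend a value-network estimate with the playout result as in AlphaGo, and the names `Node`, `expand`, `select_child`, `leaf_value`, `should_expand` and the constants `EXPAND_THRESHOLD`, `MIX_LAMBDA`, `C_PUCT` are hypothetical placeholders (the record does not state the tuned values). Sign flipping between the two players during backup is omitted.

```python
import math
from dataclasses import dataclass, field

TOP_K = 20            # candidates added per expansion (stated in the abstract)
EXPAND_THRESHOLD = 8  # hypothetical value; the paper tunes this experimentally
MIX_LAMBDA = 0.5      # hypothetical value; the paper tunes this experimentally
C_PUCT = 5.0          # hypothetical exploration constant


@dataclass
class Node:
    prior: float                 # policy-network probability of the move leading here
    visits: int = 0
    total_value: float = 0.0
    children: dict = field(default_factory=dict)

    def q(self) -> float:
        """Mean backed-up value (the action value Q)."""
        return self.total_value / self.visits if self.visits else 0.0


def should_expand(node: Node) -> bool:
    """Expand an unexpanded leaf once it has been visited often enough."""
    return not node.children and node.visits >= EXPAND_THRESHOLD


def expand(node: Node, move_probs: dict) -> None:
    """Attach the TOP_K most probable moves as children of the node."""
    ranked = sorted(move_probs.items(), key=lambda kv: kv[1], reverse=True)
    for move, prob in ranked[:TOP_K]:
        node.children[move] = Node(prior=prob)


def select_child(node: Node):
    """Return (move, child) maximizing Q + u instead of a UCB1 score."""
    sqrt_n = math.sqrt(max(node.visits, 1))

    def score(item):
        child = item[1]
        u = C_PUCT * child.prior * sqrt_n / (1 + child.visits)
        return child.q() + u

    return max(node.children.items(), key=score)


def leaf_value(value_estimate: float, playout_result: float,
               lam: float = MIX_LAMBDA) -> float:
    """Blend a value-network estimate with a playout outcome."""
    return (1 - lam) * value_estimate + lam * playout_result


if __name__ == "__main__":
    root = Node(prior=1.0)
    # Made-up move probabilities for 30 candidate moves.
    expand(root, {m: (m + 1) / 100 for m in range(30)})
    move, child = select_child(root)
    print(len(root.children), move, leaf_value(0.2, 1.0))
```

With a mixing parameter of 0 the leaf evaluation relies only on the value estimate, and with 1 only on the playout result, which is why a single scalar is enough to sweep in an experiment.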
Published in
- 愛知工業大学研究報告 = Bulletin of Aichi Institute of Technology 54, 110-117, 2019-03-31
- Aichi Institute of Technology (愛知工業大学)
Details
- CRID: 1050282813204890624
- NII Article ID: 120006607321
- NII Bibliographic ID: AA12337561
- ISSN: 18833217
- Web Site: http://hdl.handle.net/11133/3491
- Text language code: ja
- Resource type: departmental bulletin paper
- Data source type: IRDB, CiNii Articles, KAKEN