Go Algorithm Based on Deep Learning and Playout (深層学習とプレイアウトに基づく囲碁アルゴリズム)

Institutional repository, open access

Bibliographic information

Alternative titles
  • シンソウ ガクシュウ ト プレイ アウト ニ モトズク イゴ アルゴリズム
  • Go Algorithm Based on Deep Learning and Playout

Abstract

This paper describes a Go algorithm based on deep learning and playouts. The algorithm runs in a resource-constrained environment consisting of one CPU and one GPU. The best next move is obtained with a Value-Monte-Carlo tree search, which is a best-first search method. The proposed method omits the tree policy used in AlphaGo. Instead, when a leaf node is expanded, the 20 candidate moves with the highest probability under the SL policy network are added as children of that node, synchronously with the network evaluation. The win/loss outcome that AlphaGo obtains with its rollout policy is replaced by a playout of the kind commonly used in ordinary Monte-Carlo tree search. Nodes are evaluated not with the usual UCB1 value but with the action value used in AlphaGo. Numerical experiments confirmed the statistical significance of the proposed method and identified the best values of both the mixing parameter and the node expansion threshold.
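The search steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Node`, `expand_top_k`, `should_expand`, `mixed_value`, `select_child`, and `backup` are hypothetical, the exploration constant `c_puct` and the threshold value are placeholders, and the formulas (leaf value as a mix of value-network output and playout result, selection by action value plus a prior-weighted bonus) follow the published AlphaGo description rather than anything stated in this record. Values are assumed to lie in [-1, 1] and to alternate sign between plies.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    prior: float = 1.0        # P(s, a): move probability from the policy network
    visits: int = 0           # N(s, a): number of simulations through this node
    total_value: float = 0.0  # W(s, a): sum of backed-up leaf values
    children: dict = field(default_factory=dict)

    @property
    def q(self) -> float:
        """Action value Q = W / N (0 for an unvisited node)."""
        return self.total_value / self.visits if self.visits else 0.0


def expand_top_k(node, move_probs, k=20):
    """Attach the k most probable moves as children (k=20 as in the abstract)."""
    ranked = sorted(move_probs.items(), key=lambda kv: kv[1], reverse=True)
    for move, p in ranked[:k]:
        node.children[move] = Node(prior=p)


def should_expand(node, threshold):
    """Assumed rule: expand an unexpanded leaf once it reaches the visit threshold."""
    return not node.children and node.visits >= threshold


def mixed_value(value_net_out, playout_result, lam):
    """Leaf value: (1 - lam) * value network output + lam * playout result."""
    return (1.0 - lam) * value_net_out + lam * playout_result


def select_child(node, c_puct=5.0):
    """Pick the child maximising Q + u, with u proportional to prior / (1 + N)."""
    total = sum(c.visits for c in node.children.values())

    def score(item):
        child = item[1]
        u = c_puct * child.prior * math.sqrt(total) / (1 + child.visits)
        return child.q + u

    return max(node.children.items(), key=score)


def backup(path, leaf_value):
    """Propagate the leaf value from leaf to root, flipping sign at each ply."""
    value = leaf_value
    for n in reversed(path):
        n.visits += 1
        n.total_value += value
        value = -value
```

In this sketch `lam` plays the role of the mixing parameter and `threshold` the node expansion threshold mentioned in the abstract; the record does not state the values the experiments found.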

identifier:http://repository.aitech.ac.jp/dspace/handle/11133/3491

Details

  • CRID
    1050282813204890624
  • NII Article ID
    120006607321
  • NII Bibliographic ID
    AA12337561
  • ISSN
    18833217
  • Web Site
    http://hdl.handle.net/11133/3491
  • Text language code
    ja
  • Resource type
    departmental bulletin paper
  • Data source type
    • IRDB
    • CiNii Articles
    • KAKEN
