Acceleration of reinforcement learning by policy evaluation using nonstationary iterative method.


Abstract

Typical methods for solving reinforcement learning problems alternate between two steps: policy evaluation and policy improvement. This paper proposes policy evaluation algorithms that improve learning efficiency. The proposed algorithms are based on the Krylov Subspace Method (KSM), a nonstationary iterative method, and are tens to hundreds of times more efficient than existing algorithms based on stationary iterative methods. KSM-based algorithms are far more efficient than has generally been expected. This paper clarifies what makes KSM-based algorithms more efficient, through numerical examples and theoretical discussion.
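The following sketch illustrates the general idea the abstract refers to, not the authors' specific algorithms: for a fixed policy, policy evaluation reduces to the linear system (I - gamma * P_pi) v = r_pi, which a stationary method solves by repeated fixed-point updates and a Krylov subspace method solves by GMRES. The random MDP, tolerances, and SciPy routine used here are illustrative assumptions.

```python
# Minimal sketch (assumed example, not the paper's algorithm): compare a
# stationary iterative method (fixed-point / successive approximation) with a
# nonstationary Krylov subspace method (GMRES) on the policy evaluation
# problem (I - gamma * P_pi) v = r_pi for a randomly generated MDP.
import numpy as np
from scipy.sparse.linalg import gmres

rng = np.random.default_rng(0)
n_states, gamma = 200, 0.99

# Random row-stochastic transition matrix P_pi and reward vector r_pi.
P = rng.random((n_states, n_states))
P /= P.sum(axis=1, keepdims=True)
r = rng.random(n_states)

A = np.eye(n_states) - gamma * P   # system matrix of the Bellman equation

# Stationary iterative method: v <- r + gamma * P v until the Bellman
# residual is small.
v = np.zeros(n_states)
stationary_iters = 0
while np.linalg.norm(r + gamma * P @ v - v, ord=np.inf) > 1e-8:
    v = r + gamma * P @ v
    stationary_iters += 1

# Nonstationary (Krylov subspace) method: GMRES on the same linear system.
krylov_iters = [0]
v_gmres, info = gmres(A, r, callback=lambda _: krylov_iters.__setitem__(0, krylov_iters[0] + 1))

print(f"fixed-point iterations: {stationary_iters}")
print(f"GMRES iterations:       {krylov_iters[0]}  (info={info})")
```

With a discount factor close to 1, the fixed-point iteration typically needs on the order of a thousand sweeps, while GMRES converges in far fewer iterations, which is the kind of efficiency gap the abstract describes.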


Details

  • CRID
    1050001335802359680
  • NII Article ID
    120005522521
  • ISSN
    21682275
  • HANDLE
    2433/192769
  • Text Lang
    en
  • Article Type
    journal article
  • Data Source
    • IRDB
    • CiNii Articles

