An Analysis of Actor-Critic Algorithms Using Eligibility Traces : Reinforcement Learning with Imperfect Value Functions

KIMURA Hajime, KOBAYASHI Shigenobu

doi:10.11517/jjsai.15.2_267

Bibliographic Information

Other Title

Actorに適正度の履歴を用いたActor-Criticアルゴリズム : 不完全なValue-Functionのもとでの強化学習
Actor ニテキセイドノリレキオモチイタ Actor Critic アルゴリズムフカンゼンナ Value Function ノモトデノキョウカガクシュウ

Abstract

We present an analysis of actor-critic algorithms in which the actor updates its policy using eligibility traces of the policy parameters. Most theoretical results on eligibility traces so far concern only the critic's value iteration algorithms. This paper investigates what the actor's eligibility trace does. The results show that the algorithm is an extension of Williams' REINFORCE algorithms to infinite-horizon reinforcement tasks, and that the critic provides an appropriate reinforcement baseline for the actor. Thanks to the actor's eligibility trace, the actor improves its policy using a gradient of the actual return rather than a gradient of the return estimated by the critic. This enables the agent to learn a fairly good policy even when the critic's approximated value function is too inaccurate for conventional actor-critic algorithms to work. Conversely, when the critic estimates an accurate value function, the actor's learning is dramatically accelerated in our test cases. The behavior of the algorithm is demonstrated through simulations of a linear quadratic control problem and a pole-balancing problem.

Journal

Journal of the Japanese Society for Artificial Intelligence

Journal of the Japanese Society for Artificial Intelligence 15 (2), 267-275, 2000-03-01

The Japanese Society for Artificial Intelligence

Details

CRID: 1390848647556017024

NII Article ID: 110002808264

NII Book ID: AN10067140

ISSN: 09128085; 24358614; 21882266

DOI: 10.11517/jjsai.15.2_267

NDL BIB ID: 5297968

Web Site: https://ndlsearch.ndl.go.jp/books/R000000004-I5297968

Text Lang: ja

Data Source

JaLC
NDL
CiNii Articles

Abstract License Flag: Disallowed
