Search Results 1-20 of 30

  • Compliant Locomotion Control for a Quadruped Robot with Damper Coefficients Assigned by Reinforcement Learning  [in Japanese]

    Narikiyo Tatsuo , Matsumoto Taiga , Ugurlu Barkan , Kawanishi Michihiro

    … Due to this adaptive mechanism, dynamically balanced jumping motion and trotting quadruped locomotion on the rough terrain can be realized. …

    Journal of the Robotics Society of Japan 35(5), 414-423, 2017


  • Incremental Natural Actor Critic with Importance Weight Aware Update  [in Japanese]

    岩城 諒 , 横山 裕樹 , 浅田 稔

    電子情報通信学会技術研究報告 = IEICE technical report : 信学技報 116(300), 251-257, 2016-11-16

  • Adaptive Critic Design with Local Gaussian Process Models

    Wang Wei , Chen Xin , He Jianxin

    … In this paper, local Gaussian process (GP) approximation is introduced to build the critic network of adaptive dynamic programming (ADP). … With the two-phase value iteration method for a Gaussian-kernel (GK)-based critic network which realizes the update of the hyper-parameters and value functions simultaneously, fast value function approximation can be achieved. … Combining this critic network with an actor network, we present a local GK-based ADP approach. …

    Journal of Advanced Computational Intelligence and Intelligent Informatics 20(7), 1135-1140, 2016


  • A Robust Cooperated Control Method with Reinforcement Learning and Adaptive H_∞ Control  [in Japanese]

    OBAYASHI Masanao , UCHIYAMA Shogo , KUREMOTO Takashi , KOBAYASHI Kunikazu

    … We employ both the actor-critic method, which is a kind of reinforcement learning that controls continuous-valued actions with a minimal amount of computation, and traditional robust control, that is, H_∞ … The proposed system was compared with the conventional control method, that is, one using the actor-critic method only, through computer simulation of controlling the angle and the position of a crane system, and the simulation result showed the effectiveness of the proposed method. …

    IEEJ Transactions on Electronics, Information and Systems 131(8), 1467-1474, 2011-08-01

    J-STAGE  References (11) Cited by (1)

  • Design of Autonomous Trading Agent using Natural Actor-Critic Method to Realize a Locally Produced and Consumed Electric Energy Network  [in Japanese]

    TANIGUCHI Tadahiro , TAKAGI Keita , SAKAKIBARA Kazutoshi , NISHIKAWA Ikuko

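    … we construct an adaptive trading agent using a policy gradient method, which is considered robust to those problems, and in particular one of its variants, Natural Actor-Critic. To demonstrate the effectiveness of the proposed method, we conducted simulation experiments on a local cluster composed of six minimal clusters. The simulation experiments showed that the agent can learn appropriate trading with Natural Actor-Critic, and at the same …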

    Journal of Japan Society for Fuzzy Theory and Intelligent Informatics 21(6), 1078-1091, 2009-12-15

    J-STAGE  References (28)

  • Design of autonomous trading agent using natural actor-critic method to realize a locally produced and consumed electric energy network  [in Japanese]

    Taniguchi Tadahiro , Takagi Keita , Sakakibara Kazutoshi , Nishikawa Ikuko

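    … we examine a mechanism for automation. Aiming to build an adaptive trading agent that reduces power loss and maximizes profit by applying reinforcement learning to the agents that trade electricity, this paper constructs an adaptive trading agent using Natural Actor-Critic. To demonstrate the effectiveness of the proposed method, we conduct simulation experiments on a local cluster composed of six minimal clusters, and multi-agent …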

    Proceedings of the Fuzzy System Symposium 25(0), 229-229, 2009


  • Acquiring vermicular motion of a Looper-like robot based on the CPG-Actor-Critic method  [in Japanese]

    MAKINO Kenji , NAKAMURA Yutaka , SHIBATA Tomohiro , ISHII Shin

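    … although they have the potential for high adaptability to the environment, learning a controller with many degrees of freedom is generally difficult. In this study, we applied a reinforcement learning method called the CPG-Actor-Critic method to an earthworm-type robot, a robot with many degrees of freedom that we developed. With the CPG-Actor-Critic method, the earthworm-type robot adapted to its environment and was able to acquire motion adapted to the performance of its actuators, as well as forward motion adapted to partial actuator failures …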

    IEICE technical report 106(588), 203-208, 2007-03-07

    References (16)

  • Learning of a robust controller for a biped robot based on a sample-reuse reinforcement learning method  [in Japanese]

    UENO Tsuyoshi , NAKAMURA Yutaka , TAKUMA Takashi , SHIBATA Tomohiro , HOSODA Koh , ISHII Shin

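    … because learning is slow, the robot may break down before it acquires an appropriate controller. In this study, to accelerate learning, we adopt the off-policy Natural Actor-Critic method (off-NAC), which can reuse samples obtained with past controllers, and apply it to the problem of acquiring a stable controller for quasi-passive walking. We further propose a method for adaptively adjusting the learning coefficient. With this method, in both simulation and real-robot experiments, stable and fast …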

    IEICE technical report 106(588), 197-202, 2007-03-07

    References (12)

  • A single network adaptive critic (SNAC) architecture for optimal control synthesis for a class of nonlinear systems

    PADHI Radhakant , UNNIKRISHNAN Nishant , WANG Xiaohua , BALAKRISHNAN S. N.

    Neural networks : the official journal of the International Neural Network Society 19(10), 1648-1660, 2006-12-01

    References (24)

  • Proper orthogonal decomposition based optimal neurocontrol synthesis of a chemical reactor process using approximate dynamic programming

    PADHI Radhakant , BALAKRISHNAN S. N.

    Neural Networks 16(5), 719-728, 2003-06-01

    References (23) Cited by (1)

  • Adaptive Actor-Critic Control with a Model-based Actor and Multi-Step Simulated Experiences

    SYAM Rafiuddin , WATANABE Keigo , IZUMI Kiyotaka , KIGUCHI Kazuo

    … This paper describes a multi-step prediction of actor critic method, as a kind of temporal difference (TD) algorithm. … The effectiveness of the method for adaptive control is tested on a nonholonomic mobile robot by some simulations. …

    インテリジェント・システム・シンポジウム講演論文集 = FAN Symposium : fuzzy, artificial intelligence, neural networks and computational intelligence 12, 33-38, 2002-11-14

    NDL Digital Collections  References (11)

  • Reinforcement Learning Using Adaptive Search Method  [in Japanese]

    UMESAKO Kosuke , OBAYASHI Masanao , KOBAYASHI Kunikazu

    The Transactions of the Institute of Electrical Engineers of Japan. C 122(3), 374-380, 2002-03-01

    References (11) Cited by (1)

  • Control of Nonholonomic Mobile Robot by an Adaptive Actor-Critic Method with Simulated Experience Based Value-Functions

    SYAM R.

    Proc. of 2002 IEEE Int. Conf. on Robotics and Automation (ICRA2002), Washington D. C., May, 2002

    Cited by (2)

  • Robot Control by an Adaptive Actor-Critic Method Based on Multi-Step Simulated Experiences

    Watanabe Keigo , Syam Rafiuddin , Izumi Kiyotaka , Kiguchi Kazuo

    … An adaptive actor-critic method with multi-step simulated experiences is described as a kind of temporal difference (TD) method. … The value-function is generated from the critic formulated by a radial basis function neural network (RBFNN), which has a simulated experience as an input, generated from a predictive model based on a kinematic model. …

    SICE Division Conference Program and Abstracts ssi02(0), 73-73, 2002


  • Reinforcement Learning Using Adaptive Search Method

    Umesako Kosuke , Obayashi Masanao , Kobayashi Kunikazu

    … We propose an adaptive probability density function (PDF) to select an effective action on reinforcement learning (RL). … Furthermore, the proposed method can be applied easily to various methods of RL, for example, actor-critic, stochastic gradient ascent method. …

    IEEJ Transactions on Electronics, Information and Systems 122(3), 374-380, 2002


  • A Multi-agent Reinforcement Learning Method Based on the Model Inference of the Other Agents  [in Japanese]

    MATSUNO Yoichiro , YAMAZAKI Tatsuya , MATSUDA Jun , ISHII Shin

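    … In this paper, we take the card game Hearts as an example of a multi-agent system and, for learning the agents' behavior in it, propose a model that combines an actor-critic reinforcement learning algorithm with learning of opponent models. …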

    The Transactions of the Institute of Electronics, Information and Communication Engineers. 84(8), 1150-1159, 2001-08-01

    References (15) Cited by (2)

  • A multi-agent reinforcement learning method with learning of other agents for competitive game  [in Japanese]

    MATSUNO Yoichiro , YAMAZAKI Tatsuya , MATSUDA Jun , ISHII Shin

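    … Here, partial observability is handled by computing the expected TD error using the state value function approximated by the critic and the state transition probabilities estimated from the opponents' strategies. …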

    IEICE technical report. Neurocomputing 100(688), 91-98, 2001-03-16

    References (11)

  • A Behavior Acquisition for a Mobile Robot using View Information

    Kawabata Kuniaki , Ishikawa Tatsuya , Fujii Teruo , Asama Hajime , Endo Isao

    … For training behavior generation network, we utilize Actor-Critic method as a sort of unsupervised learning scheme. … As the result, the mobile robot generates adaptive behaviors utilizing visual information, autonomously. …

    IEEJ Transactions on Electronics, Information and Systems 121(4), 762-768, 2001

