Application of recurrent neural networks to reinforcement learning under incomplete perception

Author

- Ahmet Onat アーメトオナト

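Bibliographic information

Title: Application of recurrent neural networks to reinforcement learning under incomplete perception

Alternative title: リカレントニューラルネットワークの不完全知覚下での強化学習への応用

Author name: Ahmet Onat

Author alias: アーメトオナト

Degree-granting university: Kyoto University

Degree: Doctor of Engineering

Degree number: Kō No. 7842 (甲第7842号)

Date of degree conferral: 1999-03-23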

Notes and abstract

Doctoral thesis

Thesis catalog
Contents / p1
1 Introduction / p1
2 Reinforcement Learning / p5
2.1 Overview / p5
2.2 Background / p6
2.3 Mathematical definition of Reinforcement Learning / p7
2.4 Markovian decision processes / p11
2.5 Dynamic programming / p12
2.6 Exploration methods / p14
2.7 Reinforcement learning algorithms / p16
3 Reinforcement Learning under Incomplete Perception / p25
3.1 Overview / p25
3.2 Observation based solution methods / p27
3.3 Solution methods with a dynamic environment model / p28
4 Recurrent Neural Networks / p35
4.1 Overview / p35
4.2 The neuron model / p37
4.3 Recurrent neural network architecture / p37
4.4 Supervised training algorithms / p42
5 Q-learning with Recurrent Neural Networks / p48
5.1 Overview / p48
5.2 Learning agent structure / p49
5.3 The learning procedure / p54
5.4 Implementation of the learning agent / p55
5.5 Differences between the proposed structure and Recurrent-Q / p56
6 Q-learning with Recurrent Neural Networks in Symbolic Environments / p58
6.1 Overview / p58
6.2 The symbolic environments / p59
6.3 Results for the house environment / p63
6.4 Propagation of the Q values / p86
6.5 Summary / p90
7 Q-learning with Recurrent Neural Networks in a Numeric Control Problem / p93
7.1 Overview / p93
7.2 The inverted pendulum problem / p94
7.3 Results for controlling the inverted pendulum / p97
7.4 Summary / p103
8 Stochastic Gradient Ascent with Recurrent Neural Networks / p106
8.1 Overview / p106
8.2 The architecture / p108
8.3 Details of the learning algorithm / p110
8.4 The simulation environments / p112
8.5 Results of simulations / p114
8.6 Summary / p126
9 Conclusion / p129