Parallel Reinforcement Learning for Tasks with Weighted Sum of Partial Rewards [in Japanese]
Unlike ordinary reinforcement learning (RL) for a single task, RL for a family of tasks is desirable in time-varying environments, multi-criteria problems, and inverse RL. In this paper, a family of tasks is defined by a weighted sum of partial rewards, and a parallel learning method is proposed for this family. In this case, the expected reward of the optimal policy is not linear in the weights; it is a piecewise-linear convex function of the weight values. Computing convex hulls and Minkowski sums makes it possible to run Q-learning in parallel for all possible weight values at once, even though there are infinitely many of them.
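The role of convex hulls and Minkowski sums can be illustrated with a small sketch. It is not the paper's Q-learning algorithm: it is a model-based value iteration on a hypothetical two-state MDP with two partial rewards, written only to show the set operations. Every name here (`P`, `R`, `solve_all_weights`, and so on) is an assumption made for illustration. Each state keeps a set of 2-D vectors of expected discounted partial rewards; the value for a weight vector `w` is the maximum of `w · v` over that set, which is piecewise-linear and convex in `w`.

```python
from itertools import product

def convex_hull(points):
    """Vertices of the 2-D convex hull (Andrew's monotone chain)."""
    pts = sorted(set((round(x, 9), round(y, 9)) for x, y in points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    def half(seq):
        chain = []
        for p in seq:
            # Drop points that do not make a strict left turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]
    return half(pts) + half(reversed(pts))

def minkowski_sum(A, B):
    """Hull of all pairwise sums; this realizes an expectation over next states."""
    return convex_hull([(a[0] + b[0], a[1] + b[1]) for a, b in product(A, B)])

def scale(A, c):
    return [(c * x, c * y) for x, y in A]

def translate(A, r):
    return [(x + r[0], y + r[1]) for x, y in A]

def value(hull, w):
    """Value for weight vector w: the maximum of w . v over the stored vectors."""
    return max(w[0] * x + w[1] * y for x, y in hull)

def solve_all_weights(P, R, gamma=0.9, sweeps=15):
    """Value iteration on vector sets, covering every weight vector at once."""
    V = {s: [(0.0, 0.0)] for s in P}
    for _ in range(sweeps):
        Q = {}
        for s in P:
            Q[s] = {}
            for a in P[s]:
                acc = [(0.0, 0.0)]
                for p, s2 in P[s][a]:
                    acc = minkowski_sum(acc, scale(V[s2], gamma * p))
                Q[s][a] = translate(acc, R[s][a])
        # Maximizing over actions corresponds to the hull of the union.
        V = {s: convex_hull([v for a in Q[s] for v in Q[s][a]]) for s in P}
    return V

def scalar_value_iteration(P, R, w, gamma=0.9, sweeps=15):
    """Ordinary value iteration for one fixed weight vector (for comparison)."""
    V = {s: 0.0 for s in P}
    for _ in range(sweeps):
        V = {s: max(w[0] * R[s][a][0] + w[1] * R[s][a][1]
                    + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                    for a in P[s])
             for s in P}
    return V

# Hypothetical MDP: P[s][a] lists (probability, next_state) pairs,
# R[s][a] is the vector of two partial rewards.
P = {0: {0: [(1.0, 0)], 1: [(0.5, 0), (0.5, 1)]},
     1: {0: [(1.0, 1)], 1: [(1.0, 0)]}}
R = {0: {0: (1.0, 0.0), 1: (0.0, 0.0)},
     1: {0: (0.0, 1.0), 1: (0.5, 0.5)}}

hulls = solve_all_weights(P, R)
```

After a single run, `value(hulls[s], w)` can be evaluated for any weight vector `w`, and it should agree with `scalar_value_iteration(P, R, w)[s]` up to rounding. The sketch is limited to two partial rewards because the hull routine is two-dimensional; more partial rewards would need a general convex hull algorithm.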
The Brain & Neural Networks 13(4), 137-145, 2006-12-05
Japanese Neural Network Society