The MAXQ Method for Hierarchical Reinforcement Learning (1998) [77 citations — 2 self]

by Thomas G. Dietterich
In Proceedings of the Fifteenth International Conference on Machine Learning

Abstract:

This paper presents a new approach to hierarchical reinforcement learning based on the MAXQ decomposition of the value function. The MAXQ decomposition has both a procedural semantics (as a subroutine hierarchy) and a declarative semantics (as a representation of the value function of a hierarchical policy). MAXQ unifies and extends previous work on hierarchical reinforcement learning by Singh, Kaelbling, and Dayan and Hinton. Conditions under which the MAXQ decomposition can represent the optimal value function are derived. The paper defines a hierarchical Q learning algorithm, proves its convergence, and shows experimentally that it can learn much faster than ordinary "flat" Q learning. Finally, the paper discusses some interesting issues that arise in hierarchical reinforcement learning including the hierarchical credit assignment problem and non-hierarchical execution of the MAXQ hierarchy.

1 Introduction

Hierarchical approaches to reinforcement learning (RL) problems promise ma...
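
The abstract names the MAXQ decomposition but gives no equations. As a rough illustration only, the sketch below assumes the form in which the decomposition is usually stated: the value of choosing a child action inside a parent task is the child's own value plus a "completion" value for finishing the parent afterwards. The task names, state label, and numbers are hypothetical toy values, not taken from the paper.

```python
# Toy, hypothetical hierarchy: one composite task ("root") over two primitive actions.
CHILDREN = {"root": ["left", "right"]}

# Assumed expected one-step rewards V(a, s) for the primitive actions.
V_PRIMITIVE = {("left", "s0"): -1.0, ("right", "s0"): 0.5}

# Assumed completion values C(parent, s, a): expected reward for finishing
# the parent task after child a has terminated.
COMPLETION = {("root", "s0", "left"): 2.0, ("root", "s0", "right"): 0.0}


def value(task, state):
    """V(task, s): stored reward for a primitive, best child Q-value otherwise."""
    if task not in CHILDREN:
        return V_PRIMITIVE[(task, state)]
    return max(q_value(task, state, child) for child in CHILDREN[task])


def q_value(parent, state, child):
    """Q(parent, s, child) = V(child, s) + C(parent, s, child)."""
    return value(child, state) + COMPLETION[(parent, state, child)]


def greedy_child(parent, state):
    """Child action with the highest decomposed Q-value under the parent."""
    return max(CHILDREN[parent], key=lambda child: q_value(parent, state, child))
```

In this toy setup, `greedy_child("root", "s0")` picks `"left"`: its immediate reward is worse, but its assumed completion value is larger. The recursion in `value` is what lets the same tables serve both as a subroutine hierarchy and as a representation of the value function.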

Citations

222 Dynamic Programming and Optimal Control. Athena Scientific – Bertsekas - 1995
187 Feudal Reinforcement Learning – Dayan, Hinton - 1993
173 Reinforcement Learning with Hierarchies of Machines – Parr, Russell - 1998
165 On the convergence of stochastic iterative dynamic programming algorithms – Jaakkola, Jordan, et al. - 1994
142 Transfer of learning by composing solutions for elemental sequential tasks – Singh - 1992
93 Decomposition techniques for planning in stochastic domains – Dean - 1995
74 Incremental multi-step Q-learning – Peng, Williams - 1994
61 Convergence results for single-step on-policy reinforcement-learning algorithms – Singh, Jaakkola, et al.
20 Hierarchical reinforcement learning: Preliminary results – Kaelbling - 1993