Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning (1999)

by Richard Sutton, Doina Precup, Satinder Singh
Venue: Artificial Intelligence
Citations: 426 - 29 self

Documents Related by Co-Citation

3760 Reinforcement Learning: An Introduction – Richard S. Sutton, Andrew G. Barto - 1998
240 Reinforcement learning with hierarchies of machines – Ronald Parr, Stuart Russell - 1998
367 Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition – Thomas G. Dietterich - 2000
275 Self-improving reactive agents based on reinforcement learning, planning and teaching – Long-ji Lin - 1992
108 Hierarchical Control and Learning for Markov Decision Processes – Ronald Edward Parr - 1998
285 On-Line Q-Learning Using Connectionist Systems – G. A. Rummery, M. Niranjan - 1994
122 The MAXQ Method for Hierarchical Reinforcement Learning – Thomas G. Dietterich - 1998
1298 Reinforcement learning: a survey – Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore - 1996
9 Temporal abstraction in reinforcement learning. Doctoral dissertation – D Precup - 2000
81 Discovering hierarchy in reinforcement learning with HEXQ – Bernhard Hengst - 2002
102 Programmable reinforcement learning agents – David Andre, Stuart J. Russell - 2001
114 Reinforcement Learning Methods for Continuous-Time Markov Decision Problems – Steven J. Bradtke, Michael O. Duff - 1994
278 Improving Elevator Performance Using Reinforcement Learning – Robert Crites, Andrew Barto - 1996
224 TD-Gammon, a self-teaching backgammon program, achieves master-level play – G J Tesauro - 1994
207 Convergence of Stochastic Iterative Dynamic Programming Algorithms – Tommi Jaakkola, Michael I. Jordan, Satinder P. Singh - 1994
46 Learning hierarchical control structures for multiple tasks and changing environments – B Digney - 1998
118 On representations of problems of reasoning about actions – S Amarel - 1968
174 Policy invariance under reward transformations: Theory and application to reward shaping – Andrew Y. Ng, Daishi Harada, Stuart Russell - 1999
46 Autonomous Discovery of Temporal Abstractions from Interaction with an Environment – Elizabeth Amy McGovern - 2002