Residual Algorithms: Reinforcement Learning with Function Approximation (1995)

by Leemon Baird
Venue: In Proceedings of the Twelfth International Conference on Machine Learning
Citations: 237 - 5 self

Documents Related by Co-Citation

1226 Learning to predict by the methods of temporal differences – Richard S. Sutton - 1988
208 Stable Function Approximation in Dynamic Programming – Geoffrey J. Gordon - 1995
251 Generalization in Reinforcement Learning: Safely Approximating the Value Function – Justin A. Boyan, Andrew W. Moore - 1995
1309 Learning from Delayed Rewards – C Watkins - 1989
278 Improving Elevator Performance Using Reinforcement Learning – Robert Crites, Andrew Barto - 1996
455 Dynamic Programming and Optimal Control, Athena Scientific – D Bertsekas - 2001
2593 On the theory of dynamic programming – Richard E Bellman - 1952
151 Asynchronous Stochastic Approximation and Q-Learning – John N. Tsitsiklis - 1994
513 Dynamic Programming and Markov Processes – R A Howard - 1960
207 Convergence of Stochastic Iterative Dynamic Programming Algorithms – Tommi Jaakkola, Michael I. Jordan, Satinder P. Singh - 1994
527 Learning to act using real-time dynamic programming – Andrew G. Barto, Steven J. Bradtke, Satinder P. Singh - 1993
1298 Reinforcement learning: a survey – Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore - 1996
127 Gradient Descent for General Reinforcement Learning – Leemon Baird, Andrew Moore - 1998
274 Reinforcement Learning with Selective Perception and Hidden State – A McCallum - 1995
3760 Reinforcement Learning: An Introduction – Richard S. Sutton, Andrew G. Barto - 1998
226 Exploiting structure in policy construction – Craig Boutilier, Richard Dearden, Moisés Goldszmidt - 1995
111 Reinforcement Learning with Soft State Aggregation – Satinder P. Singh, Tommi Jaakkola, Michael I. Jordan - 1995
355 Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding – Richard S. Sutton - 1996
187 Reinforcement learning for robots using neural networks – L-J Lin - 1992