Practical Issues in Temporal Difference Learning (1992)

by Gerald Tesauro
Venue: Machine Learning
Citations: 363 (2 self)

Documents Related by Co-Citation

1226 Learning to predict by the methods of temporal differences – Richard S. Sutton - 1988
1309 Learning from Delayed Rewards – C Watkins - 1989
207 Convergence of Stochastic Iterative Dynamic Programming Algorithms – Tommi Jaakkola, Michael I. Jordan, Satinder P. Singh - 1994
62 Practical Issues in Temporal Difference Learning – G Tesauro - 1992
612 Some studies in machine learning using the game of Checkers – Arthur L. Samuel - 1959
61 The convergence of TD(λ) for general λ – Peter Dayan - 1992
151 Asynchronous Stochastic Approximation and Q-Learning – John N. Tsitsiklis - 1994
334 Automatic Programming of Behavior-based Robots using Reinforcement Learning – S. Mahadevan, J. Connell - 1991
278 Improving Elevator Performance Using Reinforcement Learning – Robert Crites, Andrew Barto - 1996
373 Dynamic Programming: Deterministic and Stochastic Models – D P Bertsekas - 1987
242 Temporal credit assignment in reinforcement learning. Doctoral dissertation – R S Sutton - 1984
2593 On the theory of dynamic programming – Richard E Bellman - 1952
55 Reinforcement Learning Applied to Linear Quadratic Regulation – Steven J. Bradtke - 1993
175 Neuronlike adaptive elements that can solve difficult learning control problems – A Barto, R Sutton, C Anderson - 1983
251 Generalization in Reinforcement Learning: Safely Approximating the Value Function – Justin A. Boyan, Andrew W. Moore - 1995
208 Stable Function Approximation in Dynamic Programming – Geoffrey J. Gordon - 1995
55 TD(λ) Converges with Probability 1 – Peter Dayan, Terrence J. Sejnowski - 1994
527 Learning to act using real-time dynamic programming – Andrew G. Barto, Steven J. Bradtke, Satinder P. Singh - 1993