Technical update: Least-squares temporal difference learning (2002)

by Justin A. Boyan
Venue: Machine Learning
Citations: 87 (2 self)

Documents Related by Co-Citation

182 Linear least-squares algorithms for temporal difference learning – Steven J. Bradtke, Andrew G. Barto - 1996
1226 Learning to predict by the methods of temporal differences – Richard S. Sutton - 1988
3760 Reinforcement Learning: An Introduction – Richard S. Sutton, Andrew G. Barto - 1998
218 An analysis of temporal-difference learning with function approximation – John N. Tsitsiklis, Benjamin Van Roy - 1997
301 Least-Squares Policy Iteration – Michail G. Lagoudakis, Ronald Parr - 2003
174 Actor-Critic Algorithms – Vijay R. Konda, John N. Tsitsiklis - 2001
65 Least Squares Policy Evaluation Algorithms With Linear Function Approximation – A. Nedic, D. P. Bertsekas - 2002
237 Residual Algorithms: Reinforcement Learning with Function Approximation – Leemon Baird - 1995
74 Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path – András Antos, Csaba Szepesvári, Rémi Munos - 2008
25 Incremental least-squares temporal difference learning – Alborz Geramifard, Michael Bowling, Richard S. Sutton - 2006
187 Reinforcement learning for robots using neural networks – L-J Lin - 1992
38 The convergence of TD(λ) for general λ – P Dayan - 1992
741 Neuro-Dynamic Programming (Athena Scientific) – D P Bertsekas, J Tsitsiklis - 1996
208 Stable Function Approximation in Dynamic Programming – Geoffrey J. Gordon - 1995
471 Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems – A G Barto, R S Sutton, C W Anderson - 1983
1298 Reinforcement learning: a survey – Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore - 1996
1309 Learning from Delayed Rewards – C Watkins - 1989
111 Reinforcement Learning with Soft State Aggregation – Satinder P. Singh, Tommi Jaakkola, Michael I. Jordan - 1995
19 On the Convergence of Temporal-Difference Learning with Linear Function Approximation – V Tadic - 2001