Least Squares Policy Evaluation Algorithms With Linear Function Approximation (2002)

by A. Nedic, D. P. Bertsekas
Venue: Theory and Applications
Citations: 63 - 9 self

Documents Related by Co-Citation

179 Linear least-squares algorithms for temporal difference learning – Steven J. Bradtke, Andrew G. Barto - 1996
1222 Learning to predict by the methods of temporal differences – Richard S. Sutton - 1988
174 Actor-Critic Algorithms – Vijay R. Konda, John N. Tsitsiklis - 2001
216 An analysis of temporal-difference learning with function approximation – John N. Tsitsiklis, Benjamin Van Roy - 1997
88 Technical update: Least-squares temporal difference learning – Justin A. Boyan - 2002
95 Least-Squares Temporal Difference Learning – Justin A. Boyan - 1999
40 Temporal differences-based policy iteration and applications in neuro-dynamic programming – Dimitri P. Bertsekas, Sergey Ioffe - 1996
75 Optimal Stopping of Markov Processes: Hilbert Space Theory, Approximation Algorithms, and an Application to Pricing High-Dimensional Financial Derivatives – John N. Tsitsiklis, Benjamin Van Roy - 1997
25 Improved Temporal Difference Methods with Linear Function Approximation – Dimitri P. Bertsekas, Angelia Nedich, Vivek S. Borkar
3746 Reinforcement Learning: An Introduction – Richard S. Sutton, Andrew G. Barto - 1998
317 Policy Gradient Methods for Reinforcement Learning with Function Approximation – Richard S. Sutton, David McAllester, Satinder Singh, Yishay Mansour - 1999
737 Nonlinear Programming, Athena Scientific – D. Bertsekas - 1995
24 On the existence of fixed points for approximate value iteration and temporal-difference learning – D. P. de Farias, B. Van Roy - 2000
17 A least squares Q-learning algorithm for optimal stopping problems – H. Yu, D. P. Bertsekas - 2007
742 Neuro-Dynamic Programming – D. Bertsekas, John N. Tsitsiklis - 1996
53 Error Bounds for Approximate Policy Iteration – Rémi Munos
318 Simple statistical gradient-following algorithms for connectionist reinforcement learning – Ronald J. Williams - 1992
33 A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference – David Choi, Benjamin Van Roy - 2001
21 Projected equation methods for approximate solution of large linear systems – Dimitri P. Bertsekas, et al.