Least Squares Policy Evaluation Algorithms With Linear Function Approximation (2002)

by A. Nedic, D. P. Bertsekas
Venue: Theory and Applications
Citations: 64 - 9 self

Active Bibliography

26 Improved Temporal Difference Methods with Linear Function Approximation – Dimitri P. Bertsekas, Angelia Nedich, Vivek S. Borkar
Preconditioned Temporal Difference Learning – unknown authors - 704
8 Performance loss bounds for approximate value iteration with state aggregation – Benjamin Van Roy - 2005
36 A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference – David Choi, Benjamin Van Roy - 2001
18 Convergence Results for Some Temporal Difference Methods Based on Least Squares, Lab. for Information and Decision Systems Report 2697 – Huizhen Yu, Dimitri P. Bertsekas - 2006
8 Gradient Temporal-Difference Learning Algorithms – Hamid Reza Maei - 2011
62 Automatic basis function construction for approximate dynamic programming and reinforcement learning – Shie Mannor, Doina Precup - 2006
5 Q-learning and enhanced policy iteration in discounted dynamic programming – Dimitri P. Bertsekas, Huizhen Yu - 2012
Synthesis Lectures on Artificial Intelligence and Machine Learning – Csaba Szepesvári - 2009
3 Reinforcement learning algorithms for MDPs – Csaba Szepesvári - 2009
63 Algorithms for Reinforcement Learning – Csaba Szepesvári - 2009
An Algorithmic Survey of Parametric Value Function Approximation – Matthieu Geist, Olivier Pietquin
7 Projected Equations, Variational Inequalities, and Temporal Difference Methods – Dimitri P. Bertsekas - 2009
5 Temporal Difference Methods for General Projected Equations – Dimitri P. Bertsekas
56 Basis function adaptation in temporal difference reinforcement learning – Ishai Menache, Shie Mannor, Nahum Shimkin - 2005