Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions (1993)

by Ronald Williams, Leemon C. Baird
Citations: 84 - 1 self

Active Bibliography

28 Analysis of Some Incremental Variants of Policy Iteration: First Steps Toward Understanding Actor-Critic Learning Systems – Ronald J. Williams, Leemon C. Baird, III - 1993
1309 Reinforcement learning: a survey – Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore - 1996
1 A Study on Architecture, Algorithms, and Applications of Approximate Dynamic Programming Based Approach to Optimal Control – Jong Min Lee - 2004
177 Algorithms for Sequential Decision Making – Michael Lederman Littman - 1996
1 C3 Reinforcement Learning – S. Sathiya Keerthi, B. Ravindran
10 A Tutorial Survey of Reinforcement Learning – S. Sathiya Keerthi, B. Ravindran
94 Efficient Learning and Planning Within the Dyna Framework – Jing Peng, Ronald J. Williams - 1993
532 Learning to act using real-time dynamic programming – Andrew G. Barto, Steven J. Bradtke, Satinder P. Singh - 1993
48 Learning to Solve Markovian Decision Processes – Satinder P. Singh - 1994
Solution of Delayed Reinforcement Learning Problems Having Continuous Action Spaces – B. Ravindran - 1996
A Tutorial Survey of Reinforcement Learning – S. Sathiya Keerthi, B. Ravindran
89 Incremental Multi-Step Q-Learning – Jing Peng, Ronald J. Williams - 1996
10 Modular On-line Function Approximation for Scaling up Reinforcement Learning – Chen Khong Tham - 1994
8 The Sensorimotor Foundations of Phonology: A Computational Model of Early Childhood Articulatory and Phonetic Development – Kevin Lee Markey - 1994
160 Locally Weighted Learning for Control – Christopher G. Atkeson, Andrew W. Moore, Stefan Schaal - 1996
47 Problem Solving With Reinforcement Learning – Gavin Adrian Rummery - 1995
Chapter 2 Reinforcement Learning – unknown authors
20 Incremental Dynamic Programming for On-Line Adaptive Optimal Control – Steven J. Bradtke - 1994
4 The interaction of representations and planning objectives for decision-theoretic planning tasks – Sven Koenig, Yaxin Liu - 2002