Prioritized sweeping: Reinforcement learning with less data and less time (1993)

by Andrew W. Moore, Christopher G. Atkeson
Venue: Machine Learning
Citations: 336 - 5 self

Documents Related by Co-Citation

11073 Computers and Intractability: A Guide to the Theory of NP-Completeness – M R Garey, D S Johnson - 1979
2684 Dynamic Programming – R Bellman - 1957
3892 Reinforcement Learning: An Introduction – Richard S. Sutton, Andrew G. Barto - 1998
1342 Learning from delayed rewards – C Watkins - 1989
1246 Learning to predict by the methods of temporal differences – Richard S. Sutton - 1988
486 Integrated architectures for learning, planning, and reacting based on approximating dynamic programming – Richard S. Sutton - 1990
540 Learning to act using real-time dynamic programming – Andrew G. Barto, Steven J. Bradtke, Satinder P. Singh - 1993
1324 Reinforcement learning: a survey – Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore - 1996
531 Dynamic Programming and Markov Processes – R Howard - 1960