Decision-Theoretic Planning: Structural Assumptions and Computational Leverage (1999)

by Craig Boutilier, Thomas Dean, Steve Hanks
Venue: Journal of Artificial Intelligence Research
Citations: 417 - 4 self

Documents Related by Co-Citation

3760 Reinforcement Learning: An Introduction – Richard S. Sutton, Andrew G. Barto - 1998
2593 On the theory of dynamic programming – Richard E Bellman - 1952
367 Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition – Thomas G. Dietterich - 2000
334 The Optimal Control of Partially Observable Markov Processes – E J Sondik - 1971
171 A sparse sampling algorithm for near-optimal planning in large Markov decision processes – Michael Kearns - 1999
1202 Markov Decision Processes: Discrete Stochastic Dynamic Programming – M L Puterman - 1994
93 Solving POMDPs by Searching in Policy Space – Eric A. Hansen - 1998
157 Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes – Anthony Cassandra, Michael L. Littman, Nevin L. Zhang - 1997
114 Reinforcement Learning Methods for Continuous-Time Markov Decision Problems – Steven J. Bradtke, Michael O. Duff - 1994
513 Dynamic Programming and Markov Processes – R A Howard - 1960
226 Exploiting structure in policy construction – Craig Boutilier, Richard Dearden, Moisés Goldszmidt - 1995
822 Planning and acting in partially observable stochastic domains – Leslie Pack Kaelbling, Michael L. Littman, Anthony R. Cassandra - 1998
306 The complexity of Markov decision processes – C Papadimitriou, J Tsitsiklis - 1987
30 Incremental model-based learners with formal learning-time guarantees – Alexander L. Strehl - 2006
30 R-max - a general polynomial time algorithm for near-optimal reinforcement learning – R I Brafman, M Tennenholtz - 2003
95 Markov Decision Processes: Discrete Stochastic Dynamic Programming – M L Puterman - 1994
461 A model for reasoning about persistence and causation – T Dean, K Kanazawa - 1989
236 R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning – Ronen I. Brafman, Moshe Tennenholtz, Pack Kaelbling - 2001
237 Near-optimal reinforcement learning in polynomial time – Michael Kearns - 1998