Planning and Acting in Partially Observable Stochastic Domains (1995) [453 citations — 15 self]

Abstract:

In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov decision processes (MDPs) and partially observable MDPs (POMDPs). We then outline a novel algorithm for solving POMDPs offline and show how, in some cases, a finite-memory controller can be extracted from the solution to a POMDP. We conclude with a discussion of the complexity of finding exact solutions to POMDPs and of some possibilities for finding approximate solutions.

Consider the problem of a robot navigating in a large office building. The robot can move from one hallway intersection to the next and can make local observations of its world. Its actions are not completely reliable, however. Sometimes, when it intends to move, it stays where it is or goes too far; sometimes, when it intends to turn, it overshoots. It has similar problems with observation. Sometimes a corridor looks...
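The robot scenario above, with unreliable motion and unreliable sensing, is the kind of setting in which a POMDP agent tracks a probability distribution over its possible locations (a belief state) instead of a single known location. The following is a minimal sketch of the standard Bayesian belief update, not code from the paper. The three hallway cells, the `forward` action, the `door`/`wall` observations, and every probability below are hypothetical values chosen only for illustration.

```python
from typing import Dict, Tuple

State = str
Belief = Dict[State, float]
# T[(s, a)][s2] = probability of landing in s2 after doing a in s (assumed model)
Transition = Dict[Tuple[State, str], Dict[State, float]]
# O[(s2, a)][o] = probability of observing o after arriving in s2 via a (assumed model)
Observation = Dict[Tuple[State, str], Dict[str, float]]


def belief_update(belief: Belief, action: str, observation: str,
                  T: Transition, O: Observation) -> Belief:
    """Return b'(s2) proportional to O(o | s2, a) * sum_s T(s2 | s, a) * b(s)."""
    states = list(belief)
    unnormalized = {}
    for s2 in states:
        # Prediction step: where could the unreliable action have taken us?
        predicted = sum(T[(s, action)].get(s2, 0.0) * belief[s] for s in states)
        # Correction step: weight by how well s2 explains the observation.
        unnormalized[s2] = O[(s2, action)].get(observation, 0.0) * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical corridor A -> B -> C. Moving forward usually advances one cell,
# but sometimes the robot stays put or overshoots.
T: Transition = {
    ("A", "forward"): {"A": 0.1, "B": 0.8, "C": 0.1},
    ("B", "forward"): {"B": 0.1, "C": 0.9},
    ("C", "forward"): {"C": 1.0},
}
# Only cell B has a door, but the sensor is noisy.
O: Observation = {
    ("A", "forward"): {"door": 0.1, "wall": 0.9},
    ("B", "forward"): {"door": 0.8, "wall": 0.2},
    ("C", "forward"): {"door": 0.1, "wall": 0.9},
}

start: Belief = {"A": 1.0, "B": 0.0, "C": 0.0}
after = belief_update(start, "forward", "door", T, O)
print(after)
```

Starting from certainty about cell A, one noisy move followed by a `door` reading leaves most, but not all, of the probability mass on cell B. The residual mass on A and C reflects the chance that the move failed or the sensor misread.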

Citations

2372 A tutorial on hidden Markov Models and selected applications in speech recognition – Rabiner - 1989
1921 Genetic Programming I : On the Programming of Computers by Means of Natural Selection – Koza - 1992
986 Theory of Linear and Integer Programming – Schrijver - 1986
954 A new approach to linear filtering and prediction problems – Kalman - 1960
759 Fast planning through planning graph analysis – Blum, Furst - 1997
376 UCPOP: A sound, complete, partial order planner for ADL – Penberthy, Weld - 1992
374 Markov Decision Processes – Puterman - 1994
359 Dynamic Programming and Markov Processes – Howard - 1960
353 Systematic nonlinear planning – McAllester, Rosenblitt - 1991
295 Universal plans for reactive robots in unpredictable environments – Schoppers - 1987
265 A formal theory of knowledge and action – Moore - 1985
224 An algorithm for probabilistic planning – Kushmerick, Hanks, et al. - 1995
222 Dynamic Programming and Optimal Control. Athena Scientific – Bertsekas - 1995
221 The optimal control of partially observable Markov processes – Sondik - 1971
209 Acting optimally in partially observable stochastic domains – Cassandra, Kaelbling - 1994
187 Probabilistic planning with information gathering and contingent execution – Draper, Hanks, et al. - 1994
186 Conditional non-linear planning – Peot, Smith - 1992
172 Learning policies for partially observable environments: Scaling up – Littman, Cassandra, Kaelbling - 1995
165 The optimal control of partially observable Markov decision processes over a finite horizon – Smallwood, Sondik - 1973
157 Reinforcement Learning with Perceptual Aliasing: The Predictive Distinctions Approach – Chrisman - 1992
141 A survey of algorithmic methods for partially observable Markov decision processes – Lovejoy - 1991
136 Planning under time constraints in stochastic domains – Dean, Kaelbling, et al. - 1995
134 A survey of partially observable Markov decision processes: Theory, models and algorithms, Management Science 28 – Monahan - 1982
130 Algorithms for sequential decision making – Littman - 1996
120 Hidden Markov Model induction by Bayesian model merging – Stolcke, Omohundro - 1993
114 Incremental pruning: A simple, fast, exact method for partially observable Markov decision processes – Cassandra, Littman - 1997
100 Anytime synthetic projection: Maximizing probability of goal satisfaction – Drummond, Bresina - 1990
95 The complexity of stochastic games – Condon - 1992
89 Overcoming incomplete perception with utile distinction memory – McCallum - 1993
86 Utility models for goal-directed decision-theoretic planners – Haddawy, Hanks - 1993
81 Optimal control of Markov decision processes with incomplete state estimation – Åström - 1965
81 Exact and Approximate Algorithms for Partially Observable Markov Decision Processes – Cassandra - 1998
81 Planning for contingencies: A decision-based approach – Pryor, Collins - 1996
80 The frame problem and knowledge-producing actions – Scherl, Levesque - 1993
79 Memoryless policies: Theoretical limitations and practical results – Littman - 1994
78 Computing optimal policies for partially observable decision processes using compact representations – Boutilier, Poole - 1996
78 Instance-based utile distinctions for reinforcement learning with hidden state – McCallum - 1995
74 Information value theory – Howard - 1966
62 Algorithms for partially observable Markov decision processes – Cheng - 1988
57 Maxplan: A new approach to probabilistic planning – Majercik, Littman - 1998
56 Knowledge preconditions for actions and plans – Morgenstern
54 Tight performance bounds on greedy policies based on imperfect value functions – Williams, Baird - 1993
52 Markov Decision Processes-Discrete Stochastic Dynamic Programming – Puterman - 1994
47 Planning with external events – Blythe - 1994
35 Conditional linear planning – Goldman, Boddy - 1994
34 Control strategies for a stochastic planner (AAAI-94) – Tash, Russell - 1994
34 The complexity of mean payoff games on graphs – Zwick, Paterson - 1996
34 The witness algorithm: solving partially observable Markov decision processes – Littman - 1994
31 Epsilon-safe planning – Goldman, Boddy - 1994
25 Rewarding behaviors – Bacchus, Boutilier, et al. - 1996