## A Heuristic Variable Grid Solution Method for POMDPs (1997)

Venue: AAAI

Citations: 56 (1 self)

### BibTeX

```bibtex
@INPROCEEDINGS{Brafman97aheuristic,
  author    = {Ronen I. Brafman},
  title     = {A Heuristic Variable Grid Solution Method for POMDPs},
  booktitle = {AAAI},
  year      = {1997},
  pages     = {727--733}
}
```

### Abstract

Partially observable Markov decision processes (POMDPs) are an appealing tool for modeling planning problems under uncertainty. They incorporate stochastic action and sensor descriptions and easily capture goal-oriented and process-oriented tasks. Unfortunately, POMDPs are very difficult to solve: exact methods cannot handle problems with much more than 10 states, so approximate methods must be used. In this paper, we describe a simple variable-grid solution method which yields good results on relatively large problems with modest computational effort.

From the paper's introduction: Markov decision processes (MDPs) (Bellman 1962) provide a mathematically elegant model of planning problems where actions have stochastic effects and tasks can be process oriented or have a more complex, graded notion of goal state. Partially observable MDPs (POMDPs) enhance this model, allowing for noisy and imperfect sensing as well. Unfortunately, solving POMDPs, i.e., obtaining the optimal prescription for action choic...
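The "stochastic action and sensor descriptions" the abstract mentions are transition and observation probability tables, and a POMDP agent tracks a belief (a distribution over states) that is updated by Bayes' rule after each action and observation. A minimal sketch of that update; the array shapes and toy numbers below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """One Bayes-filter step for a POMDP belief state.

    b : (n,)      current belief, b[s] = P(state = s)
    T : (A, n, n) transitions,   T[a, s, s2] = P(s2 | s, a)
    O : (A, n, Z) observations,  O[a, s2, o] = P(o | s2, a)
    """
    b_pred = b @ T[a]            # predict the next-state distribution
    b_new = b_pred * O[a, :, o]  # weight by the observation likelihood
    return b_new / b_new.sum()   # renormalize

# Toy 2-state model with one action and two observations (made-up numbers)
T = np.array([[[0.9, 0.1], [0.1, 0.9]]])
O = np.array([[[0.8, 0.2], [0.3, 0.7]]])
b = belief_update(np.array([0.5, 0.5]), a=0, o=0, T=T, O=O)
```

Observing the signal that is more likely in state 0 shifts the belief toward state 0, which is all the update needs to do.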

### Citations

2626 | Dynamic Programming - Bellman - 1957 |
Citation context: ...this paper, we describe a simple variable-grid solution method which yields good results on relatively large problems with modest computational effort. Introduction Markov decision processes (MDPs) (Bellman 1962) provide a mathematically elegant model of planning problems where actions have stochastic effects and tasks can be process oriented or have a more complex, graded notion of goal state. Partially obs...

832 | Planning and Acting in Partially Observable Stochastic Domains - Kaelbling, Littman, et al. - 1998 |

339 | The Optimal Control of Partially Observable Markov Decision Processes - Sondik - 1971 |
Citation context: ...ions for realistic POMDPs. In this paper, we describe a variable grid algorithm for obtaining approximate solutions to POMDPs in the infinite horizon case, which shows some promise. It is well known (Sondik 1978) that an optimal policy for a POMDP with n states can be obtained by solving the belief-space MDP whose state space consists of all probability distributions over the state space of the POMDP, i.e., ...
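The belief-space MDP in this context has a continuous state space, so grid methods evaluate the value function only at a finite set of belief points. A rough sketch of value iteration on such a grid, using nearest-grid-point lookup as a crude stand-in for the interpolation a real grid method would use (the function name and the toy model are assumptions for illustration, not the paper's algorithm):

```python
import numpy as np

def grid_value_iteration(grid, R, T, O, gamma=0.9, iters=200):
    """Value iteration restricted to a finite grid of belief points.

    grid : (G, n)    belief points (rows sum to 1)
    R    : (A, n)    expected reward R[a, s]
    T    : (A, n, n) transition probabilities
    O    : (A, n, Z) observation probabilities
    """
    A, n, Z = O.shape
    V = np.zeros(len(grid))
    # snap an updated belief to its nearest grid point (L1 distance)
    nearest = lambda b: np.argmin(np.abs(grid - b).sum(axis=1))
    for _ in range(iters):
        newV = np.empty_like(V)
        for g, b in enumerate(grid):
            q = np.empty(A)
            for a in range(A):
                q[a] = b @ R[a]
                b_pred = b @ T[a]
                for o in range(Z):
                    p_o = b_pred @ O[a, :, o]      # P(o | b, a)
                    if p_o > 1e-12:
                        b_next = b_pred * O[a, :, o] / p_o
                        q[a] += gamma * p_o * V[nearest(b_next)]
            newV[g] = q.max()
        V = newV
    return V

# Toy 2-state, 2-action, 2-observation model on a 3-point grid (made-up numbers)
grid = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
R = np.array([[1.0, 0.0], [0.0, 1.0]])
T = np.array([[[0.9, 0.1], [0.2, 0.8]], [[0.8, 0.2], [0.1, 0.9]]])
O = np.array([[[0.8, 0.2], [0.3, 0.7]], [[0.7, 0.3], [0.2, 0.8]]])
V = grid_value_iteration(grid, R, T, O)
```

With rewards in [0, 1] and discount 0.9, every grid value must land in [0, 10]; the interesting design question, which the paper addresses, is where to place the grid points.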

233 | Learning policies for partially observable environments: Scaling up - Littman, Cassandra, et al. - 1995 |

183 | Acting under uncertainty: discrete Bayesian models for mobile-robot navigation - Cassandra, Kaelbling, et al. - 1996 |

119 | Approximating optimal policies for partially observable stochastic domains - Parr, Russell - 1995 |
Citation context: ...Cassandra, & Kaelbling 1995a). Consequently, attempts have been made to come up with methods for obtaining approximately optimal policies (e.g., (Lovejoy 1991a; Littman, Cassandra, & Kaelbling 1995a; Parr & Russell 1995)). It is hoped that a combination of good approximation algorithms, structured problems, and clever encodings will lead to acceptable solutions for realistic POMDPs. In this paper, we describe a vari...

74 | Algorithms for Partially Observable Markov Decision Processes - Cheng - 1988 |
Citation context: ...entrate our effort where it is most needed and to obtain good approximations with a much smaller grid. However, interpolation is harder to perform, and some variable grid construction methods (e.g., (Cheng 1988)) are quite complex, requiring considerable computation time and space. In this paper, we describe a simple variable grid method that performs well on a number of test problems from (Littman, Cassand...

2 | Computationally feasible bounds for POMDPs - Lovejoy - 1991 |
Citation context: ...t a realistic alternative in larger domains (Littman, Cassandra, & Kaelbling 1995a). Consequently, attempts have been made to come up with methods for obtaining approximately optimal policies (e.g., (Lovejoy 1991a; Littman, Cassandra, & Kaelbling 1995a; Parr & Russell 1995)). It is hoped that a combination of good approximation algorithms, structured problems, and clever encodings will lead to acceptable solu...

2 | Information seeking in Markov decision processes - Sondik, Mendelssohn - 1979 |
Citation context: ...set of belief states reachable from the initial belief state. If this set of states is small enough, it would be an ideal candidate for the set of grid points. This is essentially the method used in (Sondik & Mendelssohn 1979) for solving small POMDPs. Unfortunately, when the POMDP is large and there is sufficient uncertainty in actions' effects, the set of reachable states is very large. In that case, methods must be dev...
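The reachable-belief idea in this context is easy to sketch: breadth-first expand beliefs under every action-observation pair, keeping only beliefs not already seen. The function below is an illustrative assumption of how such an enumeration might look, not the cited authors' procedure; it also makes the quoted problem visible, since with noisy dynamics the set grows quickly:

```python
import numpy as np

def reachable_beliefs(b0, T, O, depth=2, tol=1e-6):
    """Beliefs reachable from b0 within `depth` steps, deduplicated.

    T : (A, n, n) transition probabilities
    O : (A, n, Z) observation probabilities
    """
    A, Z = T.shape[0], O.shape[2]
    found = [np.asarray(b0, dtype=float)]
    frontier = list(found)
    for _ in range(depth):
        nxt = []
        for b in frontier:
            for a in range(A):
                b_pred = b @ T[a]
                for o in range(Z):
                    p_o = b_pred @ O[a, :, o]
                    if p_o < tol:                 # observation impossible here
                        continue
                    b_new = b_pred * O[a, :, o] / p_o
                    # keep only beliefs not numerically close to a known one
                    if not any(np.allclose(b_new, f, atol=tol) for f in found):
                        found.append(b_new)
                        nxt.append(b_new)
        frontier = nxt
    return np.array(found)

# Two-state toy model with one noisy action and two observations (made-up numbers)
beliefs = reachable_beliefs(np.array([0.5, 0.5]),
                            T=np.array([[[0.9, 0.1], [0.1, 0.9]]]),
                            O=np.array([[[0.8, 0.2], [0.3, 0.7]]]))
```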

1 | A Course in Triangulations for Solving Differential Equations with Deformations - Eaves - 1984 |
Citation context: ...rywhere. Moreover, if the estimate on grid points is a lower bound, the value function obtained is a lower bound for the entire space. For the upper bound, Lovejoy uses the Freudenthal triangulation (Eaves 1984) to construct a particular grid which partitions the belief space, the simplex of order |S|, into sub-simplices. Using dynamic programming updates, the value function is estimated on the grid points...
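Lovejoy's use of the Freudenthal triangulation amounts to expressing an arbitrary belief as a convex combination of the grid beliefs at the corners of its enclosing sub-simplex; interpolating the value function with those weights preserves the bound that holds at the grid points. A sketch of the standard coordinate computation under that scheme (the function name is mine; see Lovejoy 1991 for the precise construction):

```python
import numpy as np

def freudenthal_coords(b, M):
    """Express belief b as a convex combination of Freudenthal-grid
    beliefs at resolution M.  Returns (vertices, weights) with
    weights @ vertices == b and the weights summing to 1.
    """
    n = len(b)
    # "staircase" coordinates: x[i] = M * (b[i] + b[i+1] + ... + b[n-1])
    x = M * np.cumsum(b[::-1])[::-1]
    v = np.floor(x)
    d = x - v
    order = np.argsort(-d)          # fractional parts, largest first
    ds = d[order]
    # the n+1 vertices of the enclosing sub-simplex, in x-coordinates
    verts_x = [v.copy()]
    for k in range(n):
        nxt = verts_x[-1].copy()
        nxt[order[k]] += 1.0
        verts_x.append(nxt)
    # barycentric weights (nonnegative because ds is sorted descending)
    w = np.empty(n + 1)
    w[0] = 1.0 - ds[0]
    w[1:n] = ds[:-1] - ds[1:]
    w[n] = ds[-1]
    # map x-coordinates back to beliefs: b[i] = (x[i] - x[i+1]) / M
    verts_b = []
    for vx in verts_x:
        ext = np.append(vx, 0.0)
        verts_b.append((ext[:-1] - ext[1:]) / M)
    return np.array(verts_b), w

# Resolution-2 grid on the 2-state simplex; 0.25/0.75 are exact in binary,
# which keeps the floor/fraction split clean in this illustration.
verts, w = freudenthal_coords(np.array([0.25, 0.75]), M=2)
```

Here the belief (0.25, 0.75) comes out as an even mix of the grid beliefs (0.5, 0.5) and (0, 1), so an upper (or lower) bound at those grid points transfers to the interpolated point.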

1 | A survey of algorithmic techniques for POMDPs - Lovejoy - 1991 |
Citation context: ...t a realistic alternative in larger domains (Littman, Cassandra, & Kaelbling 1995a). Consequently, attempts have been made to come up with methods for obtaining approximately optimal policies (e.g., (Lovejoy 1991a; Littman, Cassandra, & Kaelbling 1995a; Parr & Russell 1995)). It is hoped that a combination of good approximation algorithms, structured problems, and clever encodings will lead to acceptable solu...