Results 1  10
of
233
DecisionTheoretic Planning: Structural Assumptions and Computational Leverage
 JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH
, 1999
"... Planning under uncertainty is a central problem in the study of automated sequential decision making, and has been addressed by researchers in many different fields, including AI planning, decision analysis, operations research, control theory and economics. While the assumptions and perspectives ..."
Abstract

Cited by 515 (4 self)
 Add to MetaCart
Planning under uncertainty is a central problem in the study of automated sequential decision making, and has been addressed by researchers in many different fields, including AI planning, decision analysis, operations research, control theory and economics. While the assumptions and perspectives adopted in these areas often differ in substantial ways, many planning problems of interest to researchers in these fields can be modeled as Markov decision processes (MDPs) and analyzed using the techniques of decision theory. This paper presents an overview and synthesis of MDPrelated methods, showing how they provide a unifying framework for modeling many classes of planning problems studied in AI. It also describes structural properties of MDPs that, when exhibited by particular classes of problems, can be exploited in the construction of optimal or approximately optimal policies or plans. Planning problems commonly possess structure in the reward and value functions used to de...
Efficient Solution Algorithms for Factored MDPs
, 2003
"... This paper addresses the problem of planning under uncertainty in large Markov Decision Processes (MDPs). Factored MDPs represent a complex state space using state variables and the transition model using a dynamic Bayesian network. This representation often allows an exponential reduction in the re ..."
Abstract

Cited by 172 (3 self)
 Add to MetaCart
(Show Context)
This paper addresses the problem of planning under uncertainty in large Markov Decision Processes (MDPs). Factored MDPs represent a complex state space using state variables and the transition model using a dynamic Bayesian network. This representation often allows an exponential reduction in the representation size of structured MDPs, but the complexity of exact solution algorithms for such MDPs can grow exponentially in the representation size. In this paper, we present two approximate solution algorithms that exploit structure in factored MDPs. Both use an approximate value function represented as a linear combination of basis functions, where each basis function involves only a small subset of the domain variables. A key contribution of this paper is that it shows how the basic operations of both algorithms can be performed efficiently in closed form, by exploiting both additive and contextspecific structure in a factored MDP. A central element of our algorithms is a novel linear program decomposition technique, analogous to variable elimination in Bayesian networks, which reduces an exponentially large LP to a provably equivalent, polynomialsized one. One algorithm uses approximate linear programming, and the second approximate dynamic programming. Our dynamic programming algorithm is novel in that it uses an approximation based on maxnorm, a technique that more directly minimizes the terms that appear in error bounds for approximate MDP algorithms. We provide experimental results on problems with over 10^40 states, demonstrating a promising indication of the scalability of our approach, and compare our algorithm to an existing stateoftheart approach, showing, in some problems, exponential gains in computation time.
Symbolic Dynamic Programming for Firstorder MDPs
 In IJCAI
, 2001
"... We present a dynamic programming approach for the solution of firstorder Markov decisions processes. This technique uses an MDP whose dynamics is represented in a variant of the situation calculus allowing for stochastic actions. It produces a logical description of the optimal value function and p ..."
Abstract

Cited by 148 (4 self)
 Add to MetaCart
We present a dynamic programming approach for the solution of firstorder Markov decisions processes. This technique uses an MDP whose dynamics is represented in a variant of the situation calculus allowing for stochastic actions. It produces a logical description of the optimal value function and policy by constructing a set of firstorder formulae that minimally partition state space according to distinctions made by the value function and policy. This is achieved through the use of an operation known as decisiontheoretic regression. In effect, our algorithm performs value iteration without explicit enumeration of either the state or action spaces of the MDP. This allows problems involving relational fluents and quantification to be solved without requiring explicit state space enumeration or conversion to propositional form. 1
Heuristic search value iteration for pomdps
 In UAI
, 2004
"... We present a novel POMDP planning algorithm called heuristic search value iteration (HSVI). HSVI is an anytime algorithm that returns a policy and a provable bound on its regret with respect to the optimal policy. HSVI gets its power by combining two wellknown techniques: attentionfocusing search ..."
Abstract

Cited by 137 (3 self)
 Add to MetaCart
We present a novel POMDP planning algorithm called heuristic search value iteration (HSVI). HSVI is an anytime algorithm that returns a policy and a provable bound on its regret with respect to the optimal policy. HSVI gets its power by combining two wellknown techniques: attentionfocusing search heuristics and piecewise linear convex representations of the value function. HSVI’s soundness and convergence have been proven. On some benchmark problems from the literature, HSVI displays speedups of greater than 100 with respect to other stateoftheart POMDP value iteration algorithms. We also apply HSVI to a new rover exploration problem 10 times larger than most POMDP problems in the literature. 1
Equivalence notions and model minimization in Markov decision processes
, 2003
"... Many stochastic planning problems can be represented using Markov Decision Processes (MDPs). A difficulty with using these MDP representations is that the common algorithms for solving them run in time polynomial in the size of the state space, where this size is extremely large for most realworld ..."
Abstract

Cited by 117 (2 self)
 Add to MetaCart
Many stochastic planning problems can be represented using Markov Decision Processes (MDPs). A difficulty with using these MDP representations is that the common algorithms for solving them run in time polynomial in the size of the state space, where this size is extremely large for most realworld planning problems of interest. Recent AI research has addressed this problem by representing the MDP in a factored form. Factored MDPs, however, are not amenable to traditional solution methods that call for an explicit enumeration of the state space. One familiar way to solve MDP problems with very large state spaces is to form a reduced (or aggregated) MDP with the same properties as the original MDP by combining “equivalent ” states. In this paper, we discuss applying this approach to solving factored MDP problems—we avoid enumerating the state space by describing large blocks of “equivalent” states in factored form, with the block descriptions being inferred directly from the original factored representation. The resulting reduced MDP may have exponentially fewer states than the original factored MDP, and can then be solved using traditional methods. The reduced MDP found depends on the notion of equivalence between states used in the aggregation. The notion of equivalence chosen will be fundamental in designing and analyzing
Exploiting Structure to Efficiently Solve Large Scale Partially Observable Markov Decision Processes
, 2005
"... Partially observable Markov decision processes (POMDPs) provide a natural and principled framework to model a wide range of sequential decision making problems under uncertainty. To date, the use of POMDPs in realworld problems has been limited by the poor scalability of existing solution algorithm ..."
Abstract

Cited by 91 (6 self)
 Add to MetaCart
Partially observable Markov decision processes (POMDPs) provide a natural and principled framework to model a wide range of sequential decision making problems under uncertainty. To date, the use of POMDPs in realworld problems has been limited by the poor scalability of existing solution algorithms, which can only solve problems with up to ten thousand states. In fact, the complexity of finding an optimal policy for a finitehorizon discrete POMDP is PSPACEcomplete. In practice, two important sources of intractability plague most solution algorithms: large policy spaces and large state spaces. On the other hand,
Valuedirected Compression of POMDPs
 In NIPS 15
, 2002
"... We examine the problem of generating statespace compressions of POMDPs in a way that minimally impacts decision quality. We analyze the impact of compressions on decision quality, observing that compressions that allow accurate policy evaluation (prediction of expected future reward) will not af ..."
Abstract

Cited by 72 (4 self)
 Add to MetaCart
(Show Context)
We examine the problem of generating statespace compressions of POMDPs in a way that minimally impacts decision quality. We analyze the impact of compressions on decision quality, observing that compressions that allow accurate policy evaluation (prediction of expected future reward) will not affect decision quality.
Implementation of Symbolic Model Checking for Probabilistic Systems
, 2002
"... In this thesis, we present ecient implementation techniques for probabilistic model checking, a method which can be used to analyse probabilistic systems such as randomised distributed algorithms, faulttolerant processes and communication networks. A probabilistic model checker inputs a probabilist ..."
Abstract

Cited by 72 (21 self)
 Add to MetaCart
(Show Context)
In this thesis, we present ecient implementation techniques for probabilistic model checking, a method which can be used to analyse probabilistic systems such as randomised distributed algorithms, faulttolerant processes and communication networks. A probabilistic model checker inputs a probabilistic model and a speci cation, such as \the message will be delivered with probability 1", \the probability of shutdown occurring is at most 0.02" or \the probability of a leader being elected within 5 rounds is at least 0.98", and can automatically verify if the speci cation is true in the model.
Contingent Planning Under Uncertainty via Stochastic Satisfiability
 Artificial Intelligence
, 1999
"... We describe two new probabilistic planning techniques cmaxplan and zanderthat generate contingent plans in probabilistic propositional domains. Both operate by transforming the planning problem into a stochastic satisfiability problem and solving that problem instead. cmaxplan encodes t ..."
Abstract

Cited by 71 (11 self)
 Add to MetaCart
(Show Context)
We describe two new probabilistic planning techniques cmaxplan and zanderthat generate contingent plans in probabilistic propositional domains. Both operate by transforming the planning problem into a stochastic satisfiability problem and solving that problem instead. cmaxplan encodes the problem as an EMajsat instance, while zander encodes the problem as an SSat instance. Although SSat problems are in a higher complexity class than EMajsat problems, the problem encodings produced by zander are substantially more compact and appear to be easier to solve than the corresponding EMajsat encodings. Preliminary results for zander indicate that it is competitive with existing planners on a variety of problems. Introduction When planning under uncertainty, any information about the state of the world is precious. A contingent plan is one that can make action choices contingent on such information. In this paper, we present an implemented framework for contingent pl...
Dynamic Programming for POMDPs using a Factored State Representation
 In Proceedings of the Fifth International Conference on AI Planning Systems
, 2000
"... Contingent planning  constructing a plan in which action selection is contingent on imperfect information received during plan execution  can be formalized as the problem of solving a partially observable Markov decision process (POMDP). Traditional dynamic programming algorithms for POMDPs ..."
Abstract

Cited by 66 (4 self)
 Add to MetaCart
(Show Context)
Contingent planning  constructing a plan in which action selection is contingent on imperfect information received during plan execution  can be formalized as the problem of solving a partially observable Markov decision process (POMDP). Traditional dynamic programming algorithms for POMDPs use a flat state representation that enumerates all possible states and state transitions. By contrast, AI planning algorithms use a factored state representation that supports state abstraction and allows problems with large state spaces to be represented and solved more efficiently. Boutilier and Poole (1996) have recently described how a factored state representation can be exploited by a dynamic programming algorithm for POMDPs. We extend their framework, describe an implementation and test its performance, and assess how much this approach improves the computational efficiency of dynamic programming for POMDPs. Introduction Many AI planning researchers have adopted Markov...