Results 1 - 6 of 6
Efficient Solution Algorithms for Factored MDPs
, 2003
Cited by 129 (4 self)
This paper addresses the problem of planning under uncertainty in large Markov Decision Processes (MDPs). Factored MDPs represent a complex state space using state variables and the transition model using a dynamic Bayesian network. This representation often allows an exponential reduction in the representation size of structured MDPs, but the complexity of exact solution algorithms for such MDPs can grow exponentially in the representation size. In this paper, we present two approximate solution algorithms that exploit structure in factored MDPs. Both use an approximate value function represented as a linear combination of basis functions, where each basis function involves only a small subset of the domain variables. A key contribution of this paper is that it shows how the basic operations of both algorithms can be performed efficiently in closed form, by exploiting both additive and context-specific structure in a factored MDP. A central element of our algorithms is a novel linear program decomposition technique, analogous to variable elimination in Bayesian networks, which reduces an exponentially large LP to a provably equivalent, polynomial-sized one. One algorithm uses approximate linear programming, and the second approximate dynamic programming. Our dynamic programming algorithm is novel in that it uses an approximation based on max-norm, a technique that more directly minimizes the terms that appear in error bounds for approximate MDP algorithms. We provide experimental results on problems with over 10^40 states, demonstrating a promising indication of the scalability of our approach, and compare our algorithm to an existing state-of-the-art approach, showing, in some problems, exponential gains in computation time.
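The abstract above compares its LP decomposition to variable elimination. The following is a minimal sketch of that underlying idea only, not the paper's algorithm: it maximizes a sum of functions that each depend on a few binary variables by eliminating one variable at a time, and checks the result against brute-force enumeration. The function names, the variable names A, B, C, and the table values are all hypothetical and chosen for illustration.

```python
from itertools import product


def max_by_elimination(factors, order):
    """Maximize a sum of local functions over binary variables.

    factors: list of (scope, table) pairs, where scope is a tuple of
    variable names and table maps value tuples to floats.
    order: elimination order covering every variable.
    """
    factors = list(factors)
    for var in order:
        # Split factors into those that mention `var` and the rest.
        touching = [f for f in factors if var in f[0]]
        rest = [f for f in factors if var not in f[0]]
        # The new factor ranges over the other variables in `touching`.
        new_scope = tuple(sorted({v for scope, _ in touching for v in scope} - {var}))
        new_table = {}
        for values in product((0, 1), repeat=len(new_scope)):
            assignment = dict(zip(new_scope, values))
            best = float("-inf")
            for x in (0, 1):
                assignment[var] = x
                total = sum(table[tuple(assignment[v] for v in scope)]
                            for scope, table in touching)
                best = max(best, total)
            new_table[values] = best  # `var` has been maximized out
        factors = rest + [(new_scope, new_table)]
    # Only empty-scope constants remain once every variable is eliminated.
    return sum(table[()] for _, table in factors)


def max_by_enumeration(factors, variables):
    """Reference answer: enumerate all 2^n assignments."""
    best = float("-inf")
    for values in product((0, 1), repeat=len(variables)):
        assignment = dict(zip(variables, values))
        total = sum(table[tuple(assignment[v] for v in scope)]
                    for scope, table in factors)
        best = max(best, total)
    return best


# Hypothetical chain-structured example: f1(A, B) + f2(B, C) + f3(C).
example_factors = [
    (("A", "B"), {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 2.0}),
    (("B", "C"), {(0, 0): 0.5, (0, 1): 3.0, (1, 0): 1.0, (1, 1): 0.0}),
    (("C",), {(0,): 0.0, (1,): -1.0}),
]
eliminated = max_by_elimination(example_factors, ["A", "B", "C"])
enumerated = max_by_enumeration(example_factors, ["A", "B", "C"])
```

Each elimination step touches only the factors that mention the variable being removed, so the cost depends on the size of the intermediate scopes rather than on the total number of joint assignments.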
Approximation in Normed Linear Spaces
, 2000
Cited by 4 (0 self)
A historical account is given of the development of methods for solving approximation problems set in normed linear spaces. Approximation of both real functions and real data is considered, with particular reference to L_p (or l_p) and Chebyshev norms. As well as coverage of methods for the usual linear problems, an account is given of the development of methods for approximation by functions which are nonlinear in the free parameters, and special attention is paid to some particular nonlinear approximating families.
1 Introduction
The purpose of this paper is to give a historical account of the development of numerical methods for a range of problems in best approximation, that is, problems which involve the minimization of a norm. A treatment is given of approximation of both real functions and data. For the approximation of functions, the emphasis is on the use of the Chebyshev norm, while for data approximation, we consider a wider range of criteria, including the other l ...
Max-norm Projections for Factored MDPs
 In IJCAI
, 2001
Markov Decision Processes (MDPs) provide a coherent mathematical framework for planning under uncertainty.
unknown title
Markov Decision Processes (MDPs) provide a coherent mathematical framework for planning under uncertainty. However, exact MDP solution algorithms require the manipulation of a value function, which specifies a value for each state in the system. Most real-world MDPs are too large for such a representation to be feasible, preventing the use of exact MDP algorithms. Various approximate solution algorithms have been proposed, many of which use a linear combination of basis functions as a compact approximation to the value function. Almost all of these algorithms use an approximation based on the (weighted) L_2-norm (Euclidean distance); this approach prevents the application of standard convergence results for MDP algorithms, all of which are based on max-norm. This paper makes two contributions. First, it presents the first approximate MDP solution algorithms (both value and policy iteration) that use max-norm projection, thereby directly optimizing the quantity required to obtain the best error bounds. Second, it shows how these algorithms can be applied efficiently in the context of factored MDPs, where the transition model is specified using a dynamic Bayesian network.
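To illustrate the distinction this abstract draws between Euclidean and max-norm projection, here is a small sketch under a strong simplifying assumption: there is a single constant basis function, so both projections have closed forms (the mean for the Euclidean case, the midpoint of the range for the max-norm case). With several basis functions the max-norm projection can in general be posed as a linear program, which is not shown here. The function names and target values are hypothetical.

```python
def l2_fit_constant(targets):
    """Least-squares fit of one constant weight: the mean."""
    return sum(targets) / len(targets)


def max_norm_fit_constant(targets):
    """Max-norm fit of one constant weight: the midpoint of the range."""
    return (min(targets) + max(targets)) / 2.0


def max_error(weight, targets):
    """Largest absolute deviation between the constant and any target."""
    return max(abs(weight - t) for t in targets)


# Hypothetical target values with one outlier state.
targets = [0.0, 1.0, 1.0, 10.0]

w_l2 = l2_fit_constant(targets)          # mean of the targets
w_max = max_norm_fit_constant(targets)   # midpoint of min and max

err_l2 = max_error(w_l2, targets)
err_max = max_error(w_max, targets)
```

On this data the least-squares weight is 3.0 with a worst-case error of 7.0, while the max-norm weight is 5.0 with a worst-case error of 5.0. The worst-case error is the quantity the abstract says appears in the error bounds.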
koller @ cs.stanford.edu
parr @ cs.duke.edu
Markov Decision Processes (MDPs) provide a coherent mathematical framework for planning under uncertainty. However, exact MDP solution algorithms require the manipulation of a value function, which specifies a value for each state in the system. Most real-world MDPs are too large for such a representation to be feasible, preventing the use of exact MDP algorithms. Various approximate solution algorithms have been proposed, many of which use a linear combination of basis functions to provide a compact approximation to the value function. Almost all of these algorithms use an approximation based on the (weighted) L_2-norm (Euclidean distance); this approach prevents the application of standard convergence results for MDP algorithms, all of which use max-norm. This paper makes two contributions. First, it presents the first approximate MDP solution algorithms (both value and policy iteration) that use max-norm projection, thereby directly optimizing the quantity required to obtain the best error bounds. Second, it shows how these algorithms can be applied efficiently in the context of factored MDPs, where the transition model is specified using a dynamic Bayesian network and actions may be taken sequentially or in parallel.