Results 1-10 of 33
Exploiting Structure to Efficiently Solve Large Scale Partially Observable Markov Decision Processes
, 2005
Cited by 91 (6 self)
Partially observable Markov decision processes (POMDPs) provide a natural and principled framework to model a wide range of sequential decision making problems under uncertainty. To date, the use of POMDPs in real-world problems has been limited by the poor scalability of existing solution algorithms, which can only solve problems with up to ten thousand states. In fact, the complexity of finding an optimal policy for a finite-horizon discrete POMDP is PSPACE-complete. In practice, two important sources of intractability plague most solution algorithms: large policy spaces and large state spaces. On the other hand,
Value-directed Compression of POMDPs
 In NIPS 15
, 2002
Cited by 71 (4 self)
We examine the problem of generating state-space compressions of POMDPs in a way that minimally impacts decision quality. We analyze the impact of compressions on decision quality, observing that compressions that allow accurate policy evaluation (prediction of expected future reward) will not affect decision quality.
Solving factored MDPs with hybrid state and action variables
 J. Artif. Intell. Res. (JAIR)
Cited by 29 (4 self)
Efficient representations and solutions for large decision problems with continuous and discrete variables are among the most important challenges faced by the designers of automated decision support systems. In this paper, we describe a novel hybrid factored Markov decision process (MDP) model that allows for a compact representation of these problems, and a new hybrid approximate linear programming (HALP) framework that permits their efficient solutions. The central idea of HALP is to approximate the optimal value function by a linear combination of basis functions and optimize its weights by linear programming. We analyze both theoretical and computational aspects of this approach, and demonstrate its scale-up potential on several hybrid optimization problems.
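For reference, the discrete-state approximate linear program that this kind of approach builds on is commonly written as follows. This is a sketch of the standard textbook formulation, not the hybrid formulation from the paper itself; the symbols (basis functions f_i, weights w_i, state-relevance weights \alpha, reward R, transition model P, discount factor \gamma) are notation assumed here for illustration.

```latex
\begin{align*}
\min_{w} \quad & \sum_{s} \alpha(s) \sum_{i} w_i f_i(s) \\
\text{s.t.} \quad & \sum_{i} w_i f_i(s) \;\ge\; R(s,a) + \gamma \sum_{s'} P(s' \mid s, a) \sum_{i} w_i f_i(s')
\qquad \forall s,\ a
\end{align*}
```

The weights w are the only optimization variables, so the number of variables equals the number of basis functions rather than the number of states. With continuous state components, the sums over those components would presumably become integrals.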
Piecewise Linear Value Function Approximation for Factored MDPs
 In Proceedings of the Eighteenth National Conference on AI
, 2002
Cited by 28 (6 self)
A number of proposals have been put forth in recent years for the solution of Markov decision processes (MDPs) whose state (and sometimes action) spaces are factored.
Practical solution techniques for first-order MDPs
 Artificial Intelligence
Cited by 25 (1 self)
Many traditional solution approaches to relationally specified decision-theoretic planning problems (e.g., those stated in the probabilistic planning domain description language, or PPDDL) ground the specification with respect to a specific instantiation of domain objects and apply a solution approach directly to the resulting ground Markov decision process (MDP). Unfortunately, the space and time complexity of these grounded solution approaches are polynomial in the number of domain objects and exponential in the predicate arity and the number of nested quantifiers in the relational problem specification. An alternative to grounding a relational planning problem is to tackle the problem directly at the relational level. In this article, we propose one such approach that translates an expressive subset of the PPDDL representation to a first-order MDP (FOMDP) specification and then derives a domain-independent policy without grounding at any intermediate step. However, such generality does not come without its own set of challenges; the purpose of this article is to explore practical solution techniques for solving FOMDPs. To demonstrate the applicability of our techniques, we present proof-of-concept results of our first-order approximate linear programming (FOALP) planner on problems from the probabilistic track
Approximate Dynamic Programming with Applications in Multi-Agent Systems
, 2007
Cited by 15 (1 self)
This thesis presents the development and implementation of approximate dynamic programming methods used to manage multi-agent systems. The purpose of this thesis is to develop an architectural framework and theoretical methods that enable an autonomous mission system to manage real-time multi-agent operations. To meet this goal, we begin by discussing aspects of the real-time multi-agent mission problem. Next, we formulate this problem as a Markov Decision Process (MDP) and present a system architecture designed to improve mission-level functional reliability through system self-awareness and adaptive mission planning. Since most multi-agent mission problems are computationally difficult to solve in real-time, approximation techniques are needed to find policies for these large-scale problems. Thus, we have developed
Factored value iteration converges
 Acta Cyb
Cited by 12 (2 self)
In this paper we propose a novel algorithm, factored value iteration (FVI), for the approximate solution of factored Markov decision processes (fMDPs). The traditional approximate value iteration algorithm is modified in two ways. For one, the least-squares projection operator is modified so that it does not increase max-norm, and thus preserves convergence. The other modification is that we uniformly sample polynomially many samples from the (exponentially large) state space. This way, the complexity of our algorithm becomes polynomial in the size of the fMDP description length. We prove that the algorithm is convergent. We also derive an upper bound on the difference between our approximate solution and the optimal one, and also on the error introduced by sampling. We analyze various projection operators with respect to their computation complexity and their convergence when combined with approximate value iteration. Keywords: factored Markov decision process, value iteration, reinforcement learning
Solving factored MDPs with exponential-family transition models
 In Proceedings of the 16th International Conference on Automated Planning and Scheduling (ICAPS
, 2006
Cited by 12 (9 self)
Markov decision processes (MDPs) with discrete and continuous state and action components can be solved efficiently by hybrid approximate linear programming (HALP). The main idea of the approach is to approximate the optimal value function by a linear combination of basis functions and optimize it by linear programming. In this paper, we extend the existing HALP paradigm beyond the mixture of beta transition model. As a result, we permit modeling of other transition functions, such as normal and gamma densities, without approximating them. To allow for efficient solutions to the expectation terms in HALP, we identify a rich class of conjugate basis functions. Finally, we demonstrate the generalized HALP framework on a rover planning problem, which exhibits continuous time and resource uncertainty.
First-order decision-theoretic planning in structured relational environments
, 2008
Cited by 10 (2 self)
We consider the general framework of first-order decision-theoretic planning in structured relational environments. Most traditional solution approaches to these planning problems ground the relational specification w.r.t. a specific domain instantiation and apply a solution approach directly to the resulting ground Markov decision process (MDP). Unfortunately, the space and time complexity of these solution algorithms scale linearly with the domain size in the best case and exponentially in the worst case. An alternate approach to grounding a relational planning problem is to lift it to a first-order MDP (FOMDP) specification. This FOMDP can then be solved directly, resulting in a domain-independent solution whose space and time complexity either do not scale with domain size or can scale sublinearly in the domain size. However, such generality does not come without its own set of challenges and the first purpose of this thesis is to explore exact and approximate solution techniques for practically solving FOMDPs. The second purpose of this thesis is to extend the FOMDP specification to succinctly capture factored actions and additive rewards while extending the exact and approximate solution techniques to directly exploit this structure. In addition, we provide a proof of correctness of the first-order symbolic dynamic programming approach w.r.t. its well-studied ground MDP
Symmetric primal-dual approximate linear programming for factored MDPs
 In Proceedings of the Ninth International Symposium on Artificial Intelligence and Mathematics (AI&M 2006)
, 2006
Cited by 10 (2 self)
A weakness of classical Markov decision processes is that they scale very poorly due to the flat state-space representation. Factored MDPs address this representational problem by exploiting problem structure to specify the transition and reward functions of an MDP in a compact manner. However, in general, solutions to factored MDPs do not retain the structure and compactness of the problem representation, forcing approximate solutions, with approximate linear programming (ALP) emerging as a very promising MDP-approximation technique. To date, most ALP work has focused on the primal-LP formulation, while the dual LP, which forms the basis for solving constrained Markov problems, has received much less attention. We show that a straightforward linear approximation of the dual optimization variables is problematic, because some of the required computations cannot be carried out efficiently. Nonetheless, we develop a composite approach that symmetrically approximates the primal and dual optimization variables (effectively approximating both the objective function and the feasible region of the LP) that is computationally feasible and suitable for solving constrained MDPs. We empirically show that this new ALP formulation also performs well on unconstrained problems.