Results 1–10 of 34
Approximate linear programming for first-order MDPs
In Proc. UAI-05, 509–517, 2005
Cited by 27 (9 self)
Abstract:
We introduce a new approximate solution technique for first-order Markov decision processes (FOMDPs). Representing the value function linearly w.r.t. a set of first-order basis functions, we compute suitable weights by casting the corresponding optimization as a first-order linear program and show how off-the-shelf theorem-prover and LP software can be effectively used. This technique allows one to solve FOMDPs independent of a specific domain instantiation; furthermore, it allows one to determine bounds on approximation error that apply equally to all domain instantiations. We apply this solution technique to the task of elevator scheduling with a rich feature space and multi-criteria additive reward, and demonstrate that it outperforms a number of intuitive, heuristically guided policies.
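The abstract's core representation, a value function that is linear in a set of basis functions, can be sketched in a few lines. This is only a toy illustration of the representation (the actual FOALP weight computation via a first-order LP is elided), and the "delivery" state and basis functions below are hypothetical.

```python
# Toy sketch of a linearly represented value function V(s) = sum_i w_i * b_i(s),
# the representation described in the abstract. All domain names are made up.

def make_basis():
    # Hypothetical relational-style basis functions on a tiny "delivery" state:
    # a state is a frozenset of (package, location) facts.
    return [
        lambda s: sum(1 for (p, loc) in s if loc == "depot"),  # packages at depot
        lambda s: 1.0,                                         # constant feature
    ]

def value(weights, basis, state):
    # The linear value representation: a weighted sum of basis-function values.
    return sum(w * b(state) for w, b in zip(weights, basis))

basis = make_basis()
s = frozenset({("pkg1", "depot"), ("pkg2", "truck")})
print(value([2.0, 0.5], basis, s))  # 2.0*1 + 0.5*1 = 2.5
```

In FOALP the weights would come from solving a first-order linear program rather than being fixed by hand as here.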
Practical solution techniques for first-order MDPs
 Artificial Intelligence
Cited by 18 (1 self)
Abstract:
Many traditional solution approaches to relationally specified decision-theoretic planning problems (e.g., those stated in the probabilistic planning domain description language, or PPDDL) ground the specification with respect to a specific instantiation of domain objects and apply a solution approach directly to the resulting ground Markov decision process (MDP). Unfortunately, the space and time complexity of these grounded solution approaches are polynomial in the number of domain objects and exponential in the predicate arity and the number of nested quantifiers in the relational problem specification. An alternative to grounding a relational planning problem is to tackle the problem directly at the relational level. In this article, we propose one such approach that translates an expressive subset of the PPDDL representation to a first-order MDP (FOMDP) specification and then derives a domain-independent policy without grounding at any intermediate step. However, such generality does not come without its own set of challenges; the purpose of this article is to explore practical solution techniques for solving FOMDPs. To demonstrate the applicability of our techniques, we present proof-of-concept results of our first-order approximate linear programming (FOALP) planner on problems from the probabilistic track
First order decision diagrams for relational MDPs
In Proceedings of the International Joint Conference on Artificial Intelligence, 2007
Cited by 17 (6 self)
Abstract:
Markov decision processes capture sequential decision making under uncertainty, where an agent must choose actions so as to optimize long-term reward. The paper studies efficient reasoning mechanisms for Relational Markov Decision Processes (RMDPs), where world states have an internal relational structure that can be naturally described in terms of objects and relations among them. Two contributions are presented. First, the paper develops First Order Decision Diagrams (FODDs), a new compact representation for functions over relational structures, together with a set of operators to combine FODDs and novel reduction techniques to keep the representation small. Second, the paper shows how FODDs can be used to develop solutions for RMDPs, where reasoning is performed at the abstract level and the resulting optimal policy is independent of domain size (number of objects) or instantiation. In particular, a variant of the value iteration algorithm is developed using special operations over FODDs, and the algorithm is shown to converge to the optimal policy.
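The diagram structure the abstract describes, internal nodes that test atoms and leaves that hold values, combined by operators, can be sketched minimally. This is a heavily simplified, propositionalized stand-in (real FODDs test first-order atoms with variables and use sound reduction operators); the atoms below are illustrative.

```python
# Minimal sketch of a decision-diagram evaluator in the spirit of FODDs:
# internal nodes test an atom, leaves hold numeric values, and diagrams are
# combined with an "apply"-style operator. Atom names are hypothetical.

class Leaf:
    def __init__(self, value):
        self.value = value

class Node:
    def __init__(self, atom, high, low):
        self.atom, self.high, self.low = atom, high, low

def evaluate(d, interpretation):
    # Walk the diagram under a truth assignment to atoms.
    while isinstance(d, Node):
        d = d.high if interpretation.get(d.atom, False) else d.low
    return d.value

def apply_max(a, b, interpretation):
    # Crude stand-in for combining two diagrams: pointwise max of their values
    # under a single interpretation (the real operator builds a new diagram).
    return max(evaluate(a, interpretation), evaluate(b, interpretation))

d1 = Node("on(a,b)", Leaf(10.0), Leaf(0.0))
d2 = Node("clear(a)", Leaf(4.0), Leaf(1.0))
print(apply_max(d1, d2, {"on(a,b)": True, "clear(a)": False}))  # 10.0
```

Value iteration over FODDs repeatedly combines such diagrams and then applies reductions to keep them small.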
ReTrASE: Integrating Paradigms for Approximate Probabilistic Planning
Cited by 16 (11 self)
Abstract:
Past approaches for solving MDPs have several weaknesses: 1) Decision-theoretic computation over the state space can yield optimal results but scales poorly. 2) Value-function approximation typically requires human-specified basis functions and has not been shown successful on nominal (“discrete”) domains such as those in the ICAPS planning competitions. 3) Replanning by applying a classical planner to a determinized domain model can generate approximate policies for very large problems but has trouble handling probabilistic subtlety [Little and Thiebaux, 2007]. This paper presents ReTrASE, a novel MDP solver, which combines decision theory, function approximation and classical planning in a new way. ReTrASE uses classical planning to create basis functions for value-function approximation and applies expected-utility analysis to this compact space. Our algorithm is memory-efficient and fast (due to its compact, approximate representation), returns high-quality solutions (due to the decision-theoretic framework) and does not require additional knowledge from domain engineers (since we apply classical planning to automatically construct the basis functions). Experiments demonstrate that ReTrASE outperforms winners from the past three probabilistic-planning competitions on many hard problems.
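The central idea, basis functions derived automatically from classical plans rather than specified by hand, can be sketched as follows. This is a speculative simplification (the real system's weight learning and expected-utility analysis are elided), and all literals and values are made up for illustration.

```python
# Illustrative sketch: each classical plan contributes one basis function,
# the set of literals the plan relies on; a basis "fires" in a state where
# those literals hold. Weights here are given, not learned. Names hypothetical.

def extract_basis(plans):
    # Each plan contributes one basis function: the literals it needs.
    return [frozenset(needed) for needed in plans]

def approx_value(state, bases, weights):
    # V(s) is approximated from the bases that apply in s (here: their max weight).
    applicable = [w for b, w in zip(bases, weights) if b <= state]
    return max(applicable, default=0.0)

bases = extract_basis([{"at(truck, depot)"}, {"loaded(pkg1)", "fuel_ok"}])
state = {"loaded(pkg1)", "fuel_ok", "raining"}
print(approx_value(state, bases, [5.0, 8.0]))  # only the second basis applies
```

The point of the construction is that the domain engineer supplies nothing: the bases come from plans the classical planner finds on its own.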
Approximate solution techniques for factored first-order MDPs
In Proc. ICAPS-07, 288, 2007
Cited by 11 (3 self)
Abstract:
Most traditional approaches to probabilistic planning in relationally specified MDPs rely on grounding the problem w.r.t. specific domain instantiations, thereby incurring a combinatorial blowup in the representation. An alternative approach is to lift a relational MDP to a first-order MDP (FOMDP) specification and develop solution approaches that avoid grounding. Unfortunately, state-of-the-art FOMDPs are inadequate for specifying factored transition models or additive rewards that scale with the domain size: structure that is very natural in probabilistic planning problems. To remedy these deficiencies, we propose an extension of the FOMDP formalism known as a factored FOMDP and present generalizations of symbolic dynamic programming and linear-value approximation solutions to exploit its structure. Along the way, we also make contributions to the field of first-order probabilistic inference (FOPI) by demonstrating novel first-order structures that can be exploited without domain grounding. We present empirical results to demonstrate that we can obtain solutions whose complexity scales polynomially in the logarithm of the domain size, results that are impossible to obtain with any previously proposed solution method.
Classical Planning in MDP Heuristics: with a Little Help from Generalization
Cited by 11 (7 self)
Abstract:
Heuristic functions make MDP solvers practical by reducing their time and memory requirements. Some of the most effective heuristics (e.g., the FF heuristic function) first determinize the MDP and then solve a relaxation of the resulting classical planning problem (e.g., by ignoring delete effects). While these heuristic functions are fast to compute, they frequently yield overly optimistic value estimates. It is natural to wonder, then, whether the improved estimates from using a full classical planner on the (non-relaxed) determinized domain will provide enough gains to compensate for the vastly increased cost of computation. This paper shows that the answer is “No and Yes”. If one uses a full classical planner in the obvious way, the cost of the heuristic function’s computation outweighs the benefits. However, we show that one can make the idea practical by generalizing the results of classical planning successes and failures. Specifically, we introduce a novel heuristic function called GOTH that amortizes the cost of classical planning by 1) extracting basis functions from the plans discovered during heuristic computation, 2) using these basis functions to generalize the heuristic value of one state to cover many others, and 3) thus invoking the classical planner many fewer times than there are states. Experiments show that GOTH can provide vast time and memory savings compared to the FF heuristic function, especially on large problems.
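The amortization scheme in steps 1)–3) above amounts to: call the expensive planner rarely, cache a generalized condition from each result, and answer later states from the cache. A minimal sketch, with the classical planner replaced by a stub and all names hypothetical:

```python
# Sketch of amortizing an expensive heuristic: one planner call yields both a
# value and a condition under which that value generalizes to other states.

class AmortizedHeuristic:
    def __init__(self, plan_fn):
        self.plan_fn = plan_fn          # expensive classical planner (stubbed)
        self.cache = []                 # (condition, value) pairs
        self.planner_calls = 0

    def value(self, state):
        # Generalization step: reuse a cached value if its condition holds.
        for condition, v in self.cache:
            if condition <= state:
                return v
        # Otherwise pay for a planner call and cache its generalized result.
        self.planner_calls += 1
        condition, v = self.plan_fn(state)
        self.cache.append((condition, v))
        return v

def stub_planner(state):
    # Stand-in: "plan cost" = state size; the condition is the literals used.
    return frozenset(state), float(len(state))

h = AmortizedHeuristic(stub_planner)
h.value(frozenset({"a", "b"}))
h.value(frozenset({"a", "b", "c"}))   # condition holds: answered from cache
print(h.planner_calls)                # 1
```

The real GOTH extracts basis functions from the discovered plans rather than this crude subset test, but the call-counting economics are the same.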
A Heuristic Search Algorithm for Solving First-Order MDPs
In Proc. Conference on Uncertainty in Artificial Intelligence (UAI), 2005
Cited by 10 (0 self)
Abstract:
We present a heuristic search algorithm for solving first-order MDPs (FOMDPs). Our approach combines first-order state abstraction, which avoids evaluating states individually, with heuristic search, which avoids evaluating all states. First, we apply state abstraction directly to the FOMDP, avoiding propositionalization; this kind of abstraction is referred to as first-order state abstraction. Second, guided by an admissible heuristic, the search is restricted to only those states that are reachable from the initial state. We demonstrate the usefulness of these techniques for solving FOMDPs in a system, referred to as FCPlanner, that entered the probabilistic track of the International Planning Competition (IPC’2004).
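The second ingredient, admissible-heuristic search that only ever touches states reachable from the initial state, can be illustrated with ordinary best-first search on a deterministic toy problem (the FOMDP and abstraction machinery are elided; the "line world" below is made up).

```python
# Toy illustration of heuristic search that expands only reachable states:
# best-first search from the initial state under an admissible heuristic.

import heapq

def heuristic_search(start, goal, successors, h):
    # h must be admissible (never overestimate) for the returned cost to be optimal.
    frontier = [(h(start), 0, start)]
    best_cost = {start: 0}
    while frontier:
        f, g, s = heapq.heappop(frontier)
        if s == goal:
            return g, len(best_cost)      # cost, and how many states were touched
        for nxt, step in successors(s):
            ng = g + step
            if ng < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = ng
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
    return None, len(best_cost)

# A tiny line world 0..9: only states reachable toward the goal get expanded.
succ = lambda s: [(s + 1, 1)] if s < 9 else []
cost, touched = heuristic_search(0, 3, succ, lambda s: abs(3 - s))
print(cost, touched)  # 3 4
```

The point of the combination in the paper is that each "state" here would be an abstract first-order state covering many ground states at once.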
SixthSense: Fast and reliable recognition of dead ends in MDPs
In submission, 2010
Cited by 10 (9 self)
Abstract:
The results of the latest International Probabilistic Planning Competition (IPPC-2008) indicate that the presence of dead ends, states with no trajectory to the goal, makes MDPs hard for modern probabilistic planners. Implicit dead ends, states with executable actions but no path to the goal, are particularly challenging; existing MDP solvers spend much time and memory identifying these states. As a first attempt to address this issue, we propose a machine learning algorithm called SixthSense. SixthSense helps existing MDP solvers by finding nogoods, conjunctions of literals whose truth in a state implies that the state is a dead end. Importantly, our learned nogoods are sound, and hence the states they identify are true dead ends. SixthSense is very fast, needs little training data, and takes only a small fraction of total planning time. While IPPC problems may have millions of dead ends, they can typically be represented with only a dozen or two nogoods. Thus, nogood learning efficiently produces a quick and reliable means of dead-end recognition. Our experiments show that the nogoods found by SixthSense routinely reduce planning space and time on IPPC domains, enabling some planners to solve problems they could not previously handle.
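The nogood check itself is simple: a nogood is a conjunction of literals, and a state matching any learned nogood is declared a dead end. A minimal sketch, with made-up nogoods from a travel-style domain:

```python
# Minimal sketch of nogood-based dead-end detection: a state is flagged as a
# dead end iff some learned nogood (a conjunction of literals) holds in it.
# The nogoods below are hypothetical examples, not learned ones.

def is_dead_end(state, nogoods):
    # Soundness rests on the nogoods themselves being sound: flagged states
    # really have no path to the goal.
    return any(nogood <= state for nogood in nogoods)

nogoods = [
    frozenset({"tire_flat", "no_spare"}),
    frozenset({"fuel_empty", "no_station"}),
]
print(is_dead_end({"tire_flat", "no_spare", "at_city"}, nogoods))  # True
print(is_dead_end({"tire_flat", "at_city"}, nogoods))              # False
```

This is why a dozen or two nogoods can stand in for millions of dead ends: each conjunction covers every state in which its literals hold.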
Self-taught decision-theoretic planning with first-order decision diagrams
In Proceedings of ICAPS-10, 2010
Cited by 9 (8 self)
Abstract:
We present a new paradigm for planning by learning, where the planner is given a model of the world and a small set of states of interest, but no indication of optimal actions in these states. The additional information can help focus the planner on regions of the state space that are of interest and lead to improved performance. We demonstrate this idea by introducing novel model-checking reduction operations for First Order Decision Diagrams (FODDs), a representation that has been used to implement decision-theoretic planning with Relational Markov Decision Processes (RMDPs). Intuitively, these reductions modify the construction of the value function by removing any complex specifications that are irrelevant to the set of training examples, thereby focusing on the region of interest. We show that such training examples can be constructed on the fly from a description of the planning problem; thus we can bootstrap to get a self-taught planning system. Additionally, we provide a new heuristic to embed universal and conjunctive goals within the framework of RMDP planners, expanding the scope and applicability of such systems. We show that these ideas lead to significant improvements in performance in terms of both speed and coverage of the planner, yielding state-of-the-art planning performance on problems from the International Planning Competition.
Policy Iteration for Relational MDPs
, 2007
"... Relational Markov Decision Processes are a useful ..."