Results 1–10 of 42
Learning symbolic models of stochastic domains
 Journal of Artificial Intelligence Research
"... In this article, we work towards the goal of developing agents that can learn to act in complex worlds. We develop a probabilistic, relational planning rule representation that compactly models noisy, nondeterministic action effects, and show how such rules can be effectively learned. Through experi ..."
Abstract

Cited by 86 (3 self)
In this article, we work towards the goal of developing agents that can learn to act in complex worlds. We develop a probabilistic, relational planning rule representation that compactly models noisy, nondeterministic action effects, and show how such rules can be effectively learned. Through experiments in simple planning domains and a 3D simulated blocks world with realistic physics, we demonstrate that this learning algorithm allows agents to effectively model world dynamics.
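The rule representation described in this abstract can be sketched as an action with preconditions and a probability distribution over alternative effect sets. The following is a minimal illustrative sketch, not the paper's actual formalism; the class name, literals, and probabilities are invented for the example.

```python
import random

# Hypothetical sketch of a noisy, nondeterministic planning rule:
# one action, a set of precondition literals, and a distribution
# over alternative effect sets.
class NoisyRule:
    def __init__(self, action, preconds, outcomes):
        self.action = action          # e.g. "pickup(a)"
        self.preconds = preconds      # literals that must hold in the state
        self.outcomes = outcomes      # list of (probability, effect-set)

    def applies(self, state):
        return self.preconds <= state

    def sample_effects(self, rng=random):
        # Sample one effect set according to the outcome distribution.
        r, acc = rng.random(), 0.0
        for p, effects in self.outcomes:
            acc += p
            if r < acc:
                return effects
        return self.outcomes[-1][1]   # numerical safety fallback

rule = NoisyRule(
    "pickup(a)",
    preconds={"clear(a)", "handempty"},
    outcomes=[(0.8, {"holding(a)"}),   # successful pickup
              (0.2, set())],           # noisy no-op
)
state = {"clear(a)", "handempty", "on(a,b)"}
assert rule.applies(state)
```

Learning such rules then amounts to inducing the preconditions, effect sets, and outcome probabilities from observed state transitions.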
Exploiting First-Order Regression in Inductive Policy Selection
 Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI’04)
, 2004
"... We consider the problem of computing optimal generalised policies for relational Markov decision processes. We describe an approach combining some of the benefits of purely inductive techniques with those of symbolic dynamic programming methods. The latter reason about the optimal value function usi ..."
Abstract

Cited by 47 (2 self)
We consider the problem of computing optimal generalised policies for relational Markov decision processes. We describe an approach combining some of the benefits of purely inductive techniques with those of symbolic dynamic programming methods. The latter reason about the optimal value function using first-order decision-theoretic regression and formula rewriting, while the former, when provided with a suitable hypotheses language, are capable of generalising value functions or policies for small instances. Our idea is to use reasoning, and in particular classical first-order regression, to automatically generate a hypotheses language dedicated to the domain at hand, which is then used as input by an inductive solver. This approach avoids the more complex reasoning of symbolic dynamic programming while focusing the inductive solver’s attention on concepts that are specifically relevant to the optimal value function for the domain considered.
Sequential Monte Carlo in probabilistic planning reachability heuristics
 Artificial Intelligence
, 2008
"... The current best conformant probabilistic planners encode the problem as a bounded length CSP or SAT problem. While these approaches can find optimal solutions for given plan lengths, they often do not scale for large problems or plan lengths. As has been shown in classical planning, heuristic searc ..."
Abstract

Cited by 27 (15 self)
The current best conformant probabilistic planners encode the problem as a bounded length CSP or SAT problem. While these approaches can find optimal solutions for given plan lengths, they often do not scale for large problems or plan lengths. As has been shown in classical planning, heuristic search outperforms CSP/SAT techniques (especially when a plan length is not given a priori). The problem with applying heuristic search in probabilistic planning is that effective heuristics are as yet lacking. In this work, we apply heuristic search to conformant probabilistic planning by adapting planning graph heuristics developed for nondeterministic planning. We evaluate a straightforward application of these planning graph techniques, which amounts to exactly computing the distribution over reachable relaxed planning graph layers. Computing these distributions is costly, so we apply Sequential Monte Carlo to approximate them. We demonstrate on several domains how our approach enables our planner to far outscale existing (optimal) probabilistic planners and still find reasonable quality solutions.
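The core Monte Carlo idea in this abstract can be illustrated with a small sketch: rather than computing the exact distribution over reachable planning-graph layers, sample many deterministic trajectories of the relaxed (delete-free) dynamics and estimate the probability that the goal is reachable within a given number of layers. The domain, action encoding, and probabilities below are invented for the example; this is not the paper's planner.

```python
import random

# Monte Carlo estimate of goal reachability in a relaxed planning graph.
# actions: list of (precondition-set, [(prob, add-effect-set), ...]).
def sample_layer_reachability(init, goal, actions, horizon, n=1000, rng=random):
    hits = 0
    for _ in range(n):
        props = set(init)
        for _ in range(horizon):
            new = set(props)               # relaxed: propositions only accumulate
            for pre, outcomes in actions:
                if pre <= props:           # action applicable in this layer
                    r, acc = rng.random(), 0.0
                    for p, adds in outcomes:   # sample one joint outcome
                        acc += p
                        if r < acc:
                            new |= adds
                            break
            props = new
        if goal <= props:
            hits += 1
    return hits / n   # estimate of P(goal reachable within `horizon` layers)

actions = [({"a"}, [(0.9, {"b"}), (0.1, set())]),
           ({"b"}, [(0.5, {"g"}), (0.5, set())])]
est = sample_layer_reachability({"a"}, {"g"}, actions, horizon=3, n=2000)
```

Scanning `horizon` upward until the estimate crosses a threshold yields the kind of layer-based distance heuristic the abstract describes.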
Properties of Planning with Non-Markovian Rewards
 Journal of Artificial Intelligence Research
, 2006
"... We examine technologies designed to solve decision processes with nonMarkovian rewards (NMRDPs). More specifically, target decision processes exhibit Markovian dynamics, called grounded dynamics, and desirable behaviours are modelled as state trajectories specified in a temporal logic. ..."
Abstract

Cited by 15 (4 self)
We examine technologies designed to solve decision processes with non-Markovian rewards (NMRDPs). More specifically, the target decision processes exhibit Markovian dynamics, called the grounded dynamics, and desirable behaviours are modelled as state trajectories specified in a temporal logic.
Exploration of the robustness of plans
 In AAAI
, 2006
"... This paper considers the problem of stochastic robustness testing for plans. Although plan generation systems might be proven sound the resulting plans are valid only with respect to the abstract domain model. It is wellunderstood that unforseen executiontime variations, both in the effects of act ..."
Abstract

Cited by 14 (1 self)
This paper considers the problem of stochastic robustness testing for plans. Although plan generation systems might be proven sound, the resulting plans are valid only with respect to the abstract domain model. It is well-understood that unforeseen execution-time variations, both in the effects of actions and in the times at which they occur, can result in a valid plan failing to execute correctly. Other authors have investigated the stochastic validity of plans with nondeterministic action outcomes. In this paper we focus on the uncertainty that arises as a result of inaccuracies in the measurement of time and other numeric quantities. We describe a probing strategy that produces a stochastic estimate of the robustness of a temporal plan. This strategy is based on Gupta, Henzinger and Jagadeesan’s (1997) notion of the “fuzzy” robustness of traces through timed hybrid automata.
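The probing strategy described here can be sketched in miniature: sample timing noise on each action's duration and report the fraction of perturbed executions that still satisfy the plan's temporal constraints. The plan model below (a plain sequence of durations checked against a single deadline) is a deliberate simplification invented for illustration, not the paper's hybrid-automata machinery.

```python
import random

# Monte Carlo probe of temporal-plan robustness: perturb each action's
# duration by a bounded relative jitter and estimate the probability that
# the plan still finishes by its deadline.
def estimate_robustness(durations, deadline, jitter=0.1, n=5000, rng=random):
    ok = 0
    for _ in range(n):
        total = sum(d * (1 + rng.uniform(-jitter, jitter)) for d in durations)
        if total <= deadline:
            ok += 1
    return ok / n   # estimated probability the plan executes within the deadline

# Example: a three-action plan with nominal makespan 10 and 10% timing noise.
r = estimate_robustness([2, 3, 5], deadline=10.5)
```

A plan whose nominal makespan sits close to the deadline will score well below 1.0 under this probe, which is exactly the kind of fragility the paper's stochastic estimate is meant to expose.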
Sequential Monte Carlo in reachability heuristics for probabilistic planning
 Artificial Intelligence
, 2008
"... Some of the current best conformant probabilistic planners focus on finding a fixed length plan with maximal probability. While these approaches can find optimal solutions, they often do not scale for large problems or plan lengths. As has been shown in classical planning, heuristic search outperfor ..."
Abstract

Cited by 4 (3 self)
Some of the current best conformant probabilistic planners focus on finding a fixed length plan with maximal probability. While these approaches can find optimal solutions, they often do not scale for large problems or plan lengths. As has been shown in classical planning, heuristic search outperforms bounded length search (especially when an appropriate plan length is not given a priori). The problem with applying heuristic search in probabilistic planning is that effective heuristics are as yet lacking. In this work, we apply heuristic search to conformant probabilistic planning by adapting planning graph heuristics developed for nondeterministic planning. We evaluate a straightforward application of these planning graph techniques, which amounts to exactly computing a distribution over many relaxed planning graphs (one planning graph for each joint outcome of uncertain actions at each time step). Computing this distribution is costly, so we apply Sequential Monte Carlo (SMC) to approximate it. One important issue that we explore in this work is how to automatically determine the number of samples required for effective heuristic computation. We empirically demonstrate on several domains how our efficient, but sometimes suboptimal, approach enables our planner to solve much larger problems than an existing optimal bounded length probabilistic planner and still find reasonable quality solutions.
Using Interaction to Compute Better Probability Estimates in Plan Graphs
 Paper Presented at the Sixteenth International Conference on Automated Planning and Scheduling Workshop on Planning Under Uncertainty and Execution Control for Autonomous Systems, 6–10 June, The English Lake District, Cumbria
"... heuristic "distance" estimates between states and goals. A few authors have also attempted to use plan graphs in probabilistic planning to compute estimates of the probability that propositions can be achieved and actions can be performed. ..."
Abstract

Cited by 3 (3 self)
heuristic "distance" estimates between states and goals. A few authors have also attempted to use plan graphs in probabilistic planning to compute estimates of the probability that propositions can be achieved and actions can be performed.
First-Order Markov Decision Processes
, 2007
"... Relational Markov Decision Processes (RMDP) are a useful abstraction for complex reinforcement solutions for them that are independent of domain size or instantiation. This thesis develops compact representations for RMDPs and exact solution methods for RMDPs using such representations. One of the c ..."
Abstract

Cited by 3 (3 self)
Relational Markov Decision Processes (RMDPs) are a useful abstraction for complex reinforcement learning problems, admitting solutions that are independent of domain size or instantiation. This thesis develops compact representations for RMDPs and exact solution methods for RMDPs using such representations. One of the core contributions of the thesis is the development of the First-Order Decision Diagram (FODD), a representation that captures functions over relational structures, together with a set of operators to manipulate FODDs. FODDs offer a potentially compact representation for complex functions over relational structures and can therefore serve as the underlying engine for efficient algorithms over relational structures. The second core contribution is the development of exact solution methods for RMDPs based on FODD representations. In particular, FODDs are used to represent value functions, transition probabilities, and domain dynamics of RMDPs. Special operations are developed to implement exact value iteration and a novel variant of policy iteration, and the algorithms are shown to calculate optimal solutions for RMDPs. Finally, we show how the algorithms for RMDPs using FODDs can be extended to handle relational Partially Observable MDPs.
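The decision-diagram idea behind FODDs can be illustrated with a stripped-down propositional sketch: internal nodes test a condition, edges branch on true/false, and leaves hold numeric values, so the diagram compactly encodes a value function over states. Real FODDs label nodes with first-order atoms and aggregate over variable bindings; the simplified version below is purely illustrative and not the thesis's construction.

```python
# Minimal propositional decision-diagram evaluator. A node is either a
# leaf carrying a value, or an internal node testing one literal with
# true (hi) and false (lo) branches.
class Node:
    def __init__(self, test=None, hi=None, lo=None, value=None):
        self.test, self.hi, self.lo, self.value = test, hi, lo, value

def evaluate(node, state):
    if node.test is None:
        return node.value                         # leaf: numeric value
    branch = node.hi if node.test in state else node.lo
    return evaluate(branch, state)

# Value function: 10 if on(a,b) holds, else 5 if clear(a), else 0.
leaf = lambda v: Node(value=v)
diagram = Node("on(a,b)", hi=leaf(10.0),
               lo=Node("clear(a)", hi=leaf(5.0), lo=leaf(0.0)))
assert evaluate(diagram, {"on(a,b)"}) == 10.0
assert evaluate(diagram, {"clear(a)"}) == 5.0
```

Value iteration over such a representation then manipulates diagrams directly (combining, maximising, and simplifying them) instead of enumerating states.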
Fast Approximate Hierarchical Solution of MDPs
"... In this thesis, we present an efficient algorithm for creating and solving hierarchical models of large Markov decision processes (MDPs). As the size of the MDP increases, finding an exact solution becomes intractable, so we expect only to find an approximate solution. We also assume that the hierar ..."
Abstract

Cited by 3 (1 self)
In this thesis, we present an efficient algorithm for creating and solving hierarchical models of large Markov decision processes (MDPs). As the size of the MDP increases, finding an exact solution becomes intractable, so we expect only to find an approximate solution. We also assume that the hierarchies we create are not necessarily applicable to more than one problem, so we must be able to construct and solve the hierarchical model in less time than it would have taken to simply solve the original, flat model. Our approach works in two stages. We first create the hierarchical MDP by forming clusters of states that can transition easily among themselves. We then solve the hierarchical MDP, using a quick bottom-up pass, based on a deterministic approximation of the expected cost of moving from one state to another, to derive a policy from the top down; this avoids solving low-level MDPs for multiple objectives.
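The first stage described above, forming clusters of states that can transition easily among themselves, can be sketched as taking connected components of an "easy transition" graph. The threshold, the treatment of a single high-probability transition as an undirected edge, and the toy transition table below are all invented simplifications for illustration, not the thesis's actual clustering criterion.

```python
from collections import defaultdict

# Cluster states by connected components of the graph whose edges are
# transitions with probability at or above `threshold`. Any such transition
# is treated as an undirected "easy" edge (a simplification).
def cluster_states(states, trans_prob, threshold=0.5):
    """trans_prob: dict (s, t) -> max over actions of P(t | s, a)."""
    adj = defaultdict(set)
    for (s, t), p in trans_prob.items():
        if p >= threshold:
            adj[s].add(t)
            adj[t].add(s)
    seen, clusters = set(), []
    for s in states:
        if s in seen:
            continue
        stack, comp = [s], set()
        while stack:                      # depth-first component search
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(adj[u] - comp)
        seen |= comp
        clusters.append(comp)
    return clusters

probs = {("s0", "s1"): 0.9, ("s1", "s2"): 0.8, ("s2", "s3"): 0.1}
clusters = cluster_states(["s0", "s1", "s2", "s3"], probs)
# s0, s1, s2 cluster together; s3 is isolated under the 0.5 threshold.
```

Each resulting cluster becomes an abstract state of the hierarchical MDP, and the second stage solves the abstract problem before refining policies within clusters.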