## Relevance Grounding for Planning in Relational Domains

Citations: 4 (3 self)

### BibTeX

```bibtex
@misc{Lang_relevancegrounding,
  author = {Tobias Lang and Marc Toussaint},
  title  = {Relevance Grounding for Planning in Relational Domains},
  year   = {}
}
```

### Abstract

Probabilistic relational models are an efficient way to learn and represent the dynamics in realistic environments consisting of many objects. Autonomous intelligent agents that ground this representation for all objects need to plan in exponentially large state spaces and large sets of stochastic actions. A key insight for computational efficiency is that successful planning typically involves only a small subset of relevant objects. In this paper, we introduce a probabilistic model to represent planning with subsets of objects and provide a definition of object relevance. Our definition is sufficient to prove consistency between repeated planning in partially grounded models restricted to relevant objects and planning in the fully grounded model. We propose an algorithm that exploits object relevance to plan efficiently in complex domains. Empirical results in a simulated 3D blocksworld with an articulated manipulator and realistic physics prove the effectiveness of our approach.
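The abstract's central idea, repeatedly planning in partially grounded models restricted to relevant objects and falling back to the full grounding only when needed, can be sketched as a generic loop. All names below (`ground`, `plan`, the candidate subsets) are illustrative placeholders under assumed interfaces, not the paper's actual API:

```python
# Hedged sketch of the planning scheme described in the abstract: try
# planning in models grounded only for candidate relevant-object subsets,
# and fall back to the fully grounded model if no subset yields a plan.
# `ground` and `plan` are hypothetical callables, not the authors' code.

def plan_with_relevance_grounding(ground, plan, candidate_subsets,
                                  all_objects, state, goal):
    for subset in candidate_subsets:
        # A partially grounded model is typically exponentially smaller
        # than the model grounded for all objects in the domain.
        model = ground(subset)
        result = plan(model, state, goal)
        if result is not None:
            return result
    # The paper's relevance definition is what guarantees consistency
    # with the full model; here we simply fall back to it.
    return plan(ground(all_objects), state, goal)
```

The loop structure mirrors the abstract's claim: if the heuristic subset already contains the relevant objects, the cheap partial grounding suffices; otherwise correctness is preserved by the final full grounding.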

### Citations

535 | Rules of the Mind
- Anderson
- 1993
Citation Context: ...nition provide an inspiring idea of how one may plan in a highly complex world. Humans are often assumed to possess declarative world knowledge about the types of objects they encounter in daily life [1], which is presumably stored in long-term memory. For instance, they know that piling dishes succeeds better the more exactly these dishes are aligned. This abstract knowledge is independent of...

421 | Decision-Theoretic Planning: Structural Assumptions and Computational Leverage
- Boutilier, Dean, et al.
- 1999
Citation Context: ... to deriving plans for a given start state. When grounding the full model, one might in principle use any of the traditional A.I. planning methods used for propositional representations, see [20] and [4]. An interesting strategy to work in a grounded model in a principled way is to consider only a small relevant subset of the state space which is derived from the start state and the planning goal. In...

171 | The episodic buffer: a new component of working memory
- Baddeley
- 2000
Citation Context: ...ng memory, a cognitive system functioning as a work-space in which recently acquired sensory information and information from long-term memory are processed for further action such as decision-making [2], [16]. This system has limited capacity and humans can only take some selected objects into account – those they deem relevant for the problem at hand. For example, when planning to prepare a cup of t...

134 | Symbolic dynamic programming for first-order MDPs
- Boutilier, Reiter, et al.
- 2001
Citation Context: ... in the “lifted” abstract representation without grounding or referring to particular problem instances. This requires the learned (or prespecified) model to be complete. Symbolic Dynamic Programming [5] investigates exact solution methods for relational MDPs. The idea is to construct minimal logical partitions of the state space required to make all necessary value function distinctions. For example...

108 | Recent advances in AI planning
- Weld
- 1999
Citation Context: ...t oneself to deriving plans for a given start state. When grounding the full model, one might in principle use any of the traditional A.I. planning methods used for propositional representations, see [20] and [4]. An interesting strategy to work in a grounded model in a principled way is to consider only a small relevant subset of the state space which is derived from the start state and the planning ...

103 | Relational reinforcement learning
- Džeroski, De Raedt, et al.
- 1998
Citation Context: ...is to describe important world features in terms of abstract logical formulas enabling generalization over objects and situations. Examples of model-free approaches employ relational regression trees [8] or instance-based regression using distance metrics between relational states, such as graph kernels [7], to learn Q-functions. Model-free approaches have the disadvantage of being inflexible, as they enab...

44 | Learning symbolic models of stochastic domains
- Pasula, Zettlemoyer, et al.
- 2007
Citation Context: ...d functions based on abstract logical formulae [6] and probabilistic relational rules, e.g. in the form of STRIPS operators. An example of the latter are the noisy indeterministic deictic (NID) rules [15], which will be our running example in this paper and which we briefly review here. A NID rule r is given as follows:

$$a_r(\mathcal{X}) : \; \Phi_r(\mathcal{X}) \;\rightarrow\; \begin{cases} p_{r,1} : \Omega_{r,1}(\mathcal{X}) \\ \;\;\vdots \\ p_{r,m_r} : \Omega_{r,m_r}(\mathcal{X}) \\ p_{r,0} : \Omega_{r,0} \end{cases} \qquad (1)$$

whe...
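The structure of Eq. (1) in the quoted context, an action and context paired with a probability distribution over outcomes plus a noise outcome with probability p₀, can be encoded directly as a small data structure. This is an illustrative sketch, not the authors' implementation; all names are assumptions:

```python
import random
from dataclasses import dataclass

# Illustrative encoding of a noisy indeterministic deictic (NID) rule,
# following Eq. (1): a context, a list of (probability, outcome) pairs,
# and a noise outcome with probability p_0. States and outcomes are
# modeled as sets of ground literals (a simplifying assumption).

@dataclass
class NIDRule:
    action: str
    context: frozenset     # literals that must hold for the rule to apply
    outcomes: list         # [(p_i, effect_literals), ...] for i = 1..m_r
    p_noise: float         # probability p_0 of the unmodeled noise outcome

    def applies(self, state):
        """The rule covers the state iff its context literals hold."""
        return self.context <= state

    def sample_outcome(self, rng=random):
        """Sample an outcome index according to (p_1, ..., p_m, p_0);
        index len(outcomes) denotes the noise outcome."""
        probs = [p for p, _ in self.outcomes] + [self.p_noise]
        r, acc = rng.random(), 0.0
        for i, p in enumerate(probs):
            acc += p
            if r < acc:
                return i
        return len(probs) - 1
```

In the papers cited, the outcome probabilities p_{r,1}, ..., p_{r,m_r}, p_{r,0} sum to one; the sampler above assumes that normalization.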

43 | Probabilistic inference for solving discrete and continuous state markov decision processes
- Toussaint, Storkey
- 2006
Citation Context: ...P(a; M). (2) (Footnote 1: When we assume a geometric prior on the trial length, the expected reward is equivalent to the sum of discounted rewards when rewards are given in each time step; see Toussaint et al. [18] for details.) Note that all conditional distributions depend on the model M. In the absence of goals or rewards, we assume a uniform prior over plans P(a; M), discounted by t...
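The equivalence mentioned in the footnote of this context can be spelled out as a short, standard derivation (with $\gamma$ the parameter of the geometric prior and $r_T$ the reward collected at the final step $T$; the notation here is an assumption, not taken from the paper):

```latex
% Geometric prior on the trial length: P(T) = (1-\gamma)\,\gamma^T.
% Taking the expectation of the final-step reward over T gives
\mathbb{E}[r]
  = \sum_{T=0}^{\infty} P(T)\,\mathbb{E}[r_T]
  = (1-\gamma) \sum_{T=0}^{\infty} \gamma^T\,\mathbb{E}[r_T],
```

which, up to the constant factor $(1-\gamma)$, is exactly the expected discounted return when a reward is given in each time step.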

40 | Graph kernels and Gaussian processes for relational reinforcement learning
- Driessens, Ramon, et al.
- 2006
Citation Context: ...r objects and situations. Examples of model-free approaches employ relational regression trees [8] or instance-based regression using distance metrics between relational states, such as graph kernels [7], to learn Q-functions. Model-free approaches have the disadvantage of being inflexible, as they enable planning only for the specific problem type used in the training examples. In contrast, model-based RR...

24 | Envelope-based planning in relational MDPs
- Gardiol, Kaelbling
- 2003
Citation Context: ...t estimate the value of an action by taking samples of the corresponding successor state distribution [15]. Another idea is to maintain an envelope of states, a high-utility subset of the state space [9], which can be used to define a relational MDP. This envelope can be further refined by incorporating nearby states in order to improve planning quality. A crucial part of this approach is the initiali...
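The envelope refinement described in this context can be sketched generically: keep a high-utility subset of states and grow it by pulling in neighboring states. `successors` and `utility` are hypothetical callables, and the fixed utility threshold is one simple admission rule among many:

```python
# Minimal sketch of envelope-based refinement: start from a high-utility
# subset of states and repeatedly incorporate nearby states whose
# estimated utility clears a threshold. Illustrative only.

def refine_envelope(envelope, successors, utility, threshold, rounds=1):
    env = set(envelope)
    for _ in range(rounds):
        fringe = set()
        for s in env:
            # Candidate nearby states: successors not yet in the envelope.
            fringe |= {t for t in successors(s)
                       if t not in env and utility(t) >= threshold}
        if not fringe:
            break  # envelope has stabilized
        env |= fringe
    return env
```

As the quoted context notes, the quality of the initial envelope is crucial; refinement only expands around whatever seed it is given.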

18 | Working memory retention systems: A state of activated long-term memory
- Ruchkin, Grafman, et al.
- 2003
Citation Context: ...emory, a cognitive system functioning as a work-space in which recently acquired sensory information and information from long-term memory are processed for further action such as decision-making [2], [16]. This system has limited capacity and humans can only take some selected objects into account – those they deem relevant for the problem at hand. For example, when planning to prepare a cup of tea, t...

15 | Online learning and exploiting relational models in reinforcement learning
- Croonenborghs, Ramon, et al.
- 2007
Citation Context: ...ground T with respect to some of the objects in the domain. Examples of abstract transition models include relational probability trees for predicates and functions based on abstract logical formulae [6] and probabilistic relational rules, e.g. in the form of STRIPS operators. An example of the latter are the noisy indeterministic deictic (NID) rules [15], which will be our running example in this pap...

12 | Goal-directed decision making in prefrontal cortex: a computational framework
- Botvinick, An
- 2008
Citation Context: ... objects (such as lamps and cars) and is akin to the abstract probabilistic relational models in A.I. When planning, human beings may reason about objects according to their abstract world knowledge [3], i.e., they ground their abstract world model with respect to these objects. Such reasoning is often assumed to take place in the working memory, a cognitive system functioning as a work-space in whi...

11 | Bellman goes relational
- Kersting, van Otterlo, De Raedt
- 2004
Citation Context: ...t solution methods for relational MDPs. The idea is to construct minimal logical partitions of the state space required to make all necessary value function distinctions. For example, Kersting et al. [13] present an exact value iteration for relational MDPs. Sanner et al. [17] exploit factored transition models of first-order MDPs to approximate the value function based on linear combinations of abstr...

11 | Approximate solution techniques for factored first-order MDPs
- Sanner, Boutilier
- 2007
Citation Context: ...logical partitions of the state space required to make all necessary value function distinctions. For example, Kersting et al. [13] present an exact value iteration for relational MDPs. Sanner et al. [17] exploit factored transition models of first-order MDPs to approximate the value function based on linear combinations of abstract first-order value functions. Their work shows that under certain assu...

10 | Approximate inference for planning in stochastic relational worlds
- Lang, Toussaint
- 2009
Citation Context: ...or the planning problem into account. In the next section, we will introduce Relevance Grounding, which formalizes this idea in a systematic way. To plan in grounded models, we use the PRADA algorithm [14] in this paper. PRADA converts NID rules into dynamic Bayesian networks, predicts the effects of action sequences on states and rewards by means of approximate inference, and samples action sequences i...

8 | Action-space partitioning for planning
- Gardiol, Kaelbling
- 2007
Citation Context: ...ion space complexity can be decreased by noting that if the identities of objects do not matter but only their relationships, then different equivalent actions may lead to equivalent successor states [10]. These are states where the same relationships hold, but not necessarily with the same objects. Relevance grounding accounts for this idea by defining different object su...

4 | Learning models of relational MDPs using graph kernels
- Halbritter, Geibel
- 2007
Citation Context: ...transition experiences, for example in form of relational probability trees for individual state properties [6] or SVMs using graph kernels [12]. One way to make use of the resulting model is to sample look-ahead trees of state transitions in Q-learning, i.e., to work with ground states. All approaches discussed thus far make use of ground st...

2 | Adaptive envelope MDPs for relational equivalence-based planning
- Gardiol, Kaelbling
- 2008
Citation Context: ...approach does not yield significant improvements. Another way to reduce the state space complexity is to look only at a subset of the logical vocabulary, i.e., ignore certain predicates and functions [11]. This helps when combined with the action equivalence approach, as state descriptions become shorter and more approximate and the number of state equivalences increases. All these methods just discuss...

2 | The Logic of Adaptive Behavior
- van Otterlo
- 2009
Citation Context: ...(from Section 5, Related Work) The problem of planning in stochastic relational domains has been approached in quite different ways. The field of Relational Reinforcement Learning (RRL) [19] investigates value functions and Q-functions that are defined over all possible ground states and actions of a relational domain. The idea is to describe important world features in terms of abstract...