Results 1 - 5 of 5
Exploration in Relational Domains for Model-based Reinforcement Learning, 2012
"... A fundamental problem in reinforcement learning is balancing exploration and exploitation. We address this problem in the context of model-based reinforcement learning in large stochastic relational domains by developing relational extensions of the concepts of the E 3 and R-MAX algorithms. Efficien ..."
Abstract (Cited by 10, 1 self):
A fundamental problem in reinforcement learning is balancing exploration and exploitation. We address this problem in the context of model-based reinforcement learning in large stochastic relational domains by developing relational extensions of the concepts of the E^3 and R-MAX algorithms. Efficient exploration in exponentially large state spaces needs to exploit the generalization of the learned model: what in a propositional setting would be considered a novel situation worth exploring may in the relational setting be a well-known context in which exploitation is promising. To address this we introduce relational count functions, which generalize the classical notion of state and action visitation counts. We provide guarantees on the exploration efficiency of our framework using count functions under the assumption of a relational KWIK learner and a near-optimal planner. We propose a concrete exploration algorithm which integrates a practically efficient probabilistic rule learner and a relational planner (for which there are no guarantees, however) and employs the contexts of learned relational rules as features to model the novelty of states and actions. Our results in noisy 3D simulated robot manipulation problems and in domains of the international planning competition demonstrate that our approach is more effective than existing propositional and factored exploration techniques.
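As a rough illustration of the count-function idea in the abstract above, the sketch below shows the propositional R-MAX principle that the paper generalizes: a count function decides whether a state-action pair is sufficiently known, and unknown pairs receive an optimistic reward. All names (count_fn, KNOWN_THRESHOLD, R_MAX) and the threshold value are illustrative assumptions, not taken from the paper; a relational count function would replace the exact visitation count with a score that generalizes over structurally similar states and actions.

```python
from collections import defaultdict

# Minimal sketch of R-MAX-style optimism driven by a count function.
# In the propositional case the count is an exact visitation count; the
# paper's relational count functions would instead score how "known" a
# state-action pair is via the contexts of learned relational rules.

R_MAX = 1.0            # optimistic reward for insufficiently explored pairs
KNOWN_THRESHOLD = 5    # visits needed before a pair counts as "known"

visits = defaultdict(int)

def count_fn(state, action):
    """Propositional baseline: exact visitation count.
    A relational version would generalize over structurally similar pairs."""
    return visits[(state, action)]

def effective_reward(state, action, observed_reward):
    """R-MAX principle: be optimistic about what is not yet well known."""
    if count_fn(state, action) < KNOWN_THRESHOLD:
        return R_MAX                 # encourage exploration
    return observed_reward           # exploit the learned model

def record_transition(state, action):
    visits[(state, action)] += 1
```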
Mean field inference in dependency networks: An empirical study
- In Proceedings of the Twenty-Fifth National Conference on Artificial Intelligence, 2011
"... Dependency networks are a compelling alternative to Bayesian networks for learning joint probability distributions from data and using them to compute probabilities. A dependency network consists of a set of conditional probability distributions, each representing the probability of a single variabl ..."
Abstract (Cited by 3, 3 self):
Dependency networks are a compelling alternative to Bayesian networks for learning joint probability distributions from data and using them to compute probabilities. A dependency network consists of a set of conditional probability distributions, each representing the probability of a single variable given its Markov blanket. Running Gibbs sampling with these conditional distributions produces a joint distribution that can be used to answer queries, but suffers from the traditional slowness of sampling-based inference. In this paper, we observe that the mean field update equation can be applied to dependency networks, even though the conditional probability distributions may be inconsistent with each other. In experiments with learning and inference on 12 datasets, we demonstrate that mean field inference in dependency networks offers similar accuracy to Gibbs sampling but with orders of magnitude improvements in speed. Compared to Bayesian networks learned on the same data, dependency networks offer higher accuracy at greater amounts of evidence. Furthermore, mean field inference is consistently more accurate in dependency networks than in Bayesian networks learned on the same data.
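To make the update concrete, here is a minimal sketch of naive mean field fixed-point iteration on a tiny binary dependency network, where each variable's conditional distribution given its Markov blanket is an explicit table. The two-variable network, its conditional tables, and the iteration count are illustrative assumptions; enumerating blanket configurations as done here is only feasible for small Markov blankets.

```python
import itertools
import math

# Naive mean field in a tiny binary dependency network.
# Each variable i has a conditional P(x_i | Markov blanket), and the
# mean field marginals q_i are updated via the fixed point
#   q_i(x_i) proportional to exp( E_{q_blanket}[ log P(x_i | blanket) ] ).

blankets = {0: (1,), 1: (0,)}      # two variables, each other's Markov blanket
cond = {
    0: {(0,): 0.2, (1,): 0.8},     # P(x0 = 1 | x1)
    1: {(0,): 0.3, (1,): 0.9},     # P(x1 = 1 | x0)
}

q = {0: 0.5, 1: 0.5}               # q_i(x_i = 1), initialized uniformly

for _ in range(50):                # fixed-point iteration
    for i in blankets:
        expected_logp = {0: 0.0, 1: 0.0}
        # Enumerate blanket configurations, weighted by current marginals.
        for config in itertools.product([0, 1], repeat=len(blankets[i])):
            weight = 1.0
            for j, v in zip(blankets[i], config):
                weight *= q[j] if v == 1 else 1.0 - q[j]
            p1 = cond[i][config]
            expected_logp[1] += weight * math.log(p1)
            expected_logp[0] += weight * math.log(1.0 - p1)
        # Normalize the exponentiated expected log-probabilities.
        z = math.exp(expected_logp[1]) + math.exp(expected_logp[0])
        q[i] = math.exp(expected_logp[1]) / z

print(q)  # approximate single-variable marginals
```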
Statistical relational learning to predict primary myocardial infarction from electronic health records
- AAAI Conference on Innovative Applications in AI (IAAI), 2012
"... Electronic health records (EHRs) are an emerging re-lational domain with large potential to improve clin-ical outcomes. We apply two statistical relational learning (SRL) algorithms to the task of predicting primary myocardial infarction. We show that one SRL algorithm, relational functional gradien ..."
Abstract (Cited by 2, 2 self):
Electronic health records (EHRs) are an emerging relational domain with large potential to improve clinical outcomes. We apply two statistical relational learning (SRL) algorithms to the task of predicting primary myocardial infarction. We show that one SRL algorithm, relational functional gradient boosting, outperforms propositional learners particularly in the medically relevant high-recall region. We observe that both SRL algorithms predict outcomes better than their propositional analogs and suggest how our methods can augment current epidemiological practices.
Imitation Learning in Relational Domains Using Functional Gradient Boosting
"... It is common knowledge that both humans and animals learn new skills by observing others. This problem, which is called imitation learning, can be formulated as learning a representation of a policy – a mapping from states to actions – from examples of that policy. Our focus is on relational domains ..."
Abstract:
It is common knowledge that both humans and animals learn new skills by observing others. This problem, which is called imitation learning, can be formulated as learning a representation of a policy – a mapping from states to actions – from examples of that policy. Our focus is on relational domains where states are naturally described by relations among an indefinite number of objects. Examples include real-time strategy games such as Warcraft, regulation of traffic lights, logistics, and a variety of planning domains. In this work, we employ two ideas. First, instead of learning a deterministic policy to imitate the expert, we learn a stochastic policy where the probability of an action given a state is represented by a sum of potential functions. Second, we leverage the recently developed functional-gradient boosting approach to learn a set of regression trees, each of which represents a potential function. The functional-gradient approach has been found to give state-of-the-art results in many relational problems [6, 5]. Together, these two ideas allow us to overcome the limited representational capacity of earlier approaches, while also giving us an effective learning algorithm. Indeed, the functional-gradient approach to boosting has already been found to yield excellent results in imitation learning in robotics in the propositional setting [7].

Functional Gradient Boosting: Assume that the training examples are of the form (x_i, y_i) for i = 1, ..., N and y_i ∈ {1, ..., K}. The goal is to fit a model P(y | x) ∝ exp(ψ(y, x)). The standard method of supervised learning is based on gradient descent, where the learning algorithm starts with initial parameters θ_0 and computes the gradient of the likelihood with respect to those parameters.
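The sketch below illustrates the boosting loop this abstract describes: at each round, per-class regression trees are fit to the pointwise functional gradient I(y_i = k) - P(k | x_i) and added to ψ. It uses plain numeric features and scikit-learn regression trees in place of the paper's relational regression trees, and all function names, hyperparameters, and the toy data are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Sketch of functional-gradient boosting for P(y | x) proportional to exp(psi(y, x)).
# Plain numeric features stand in for the paper's relational features, so this
# only illustrates the boosting loop, not the relational learner itself.

def softmax(psi):
    e = np.exp(psi - psi.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def boost(X, y, n_classes, n_rounds=20, depth=2):
    n = len(y)
    psi = np.zeros((n, n_classes))            # psi starts at zero (uniform policy)
    trees = []                                # one list of trees per boosting round
    for _ in range(n_rounds):
        probs = softmax(psi)
        round_trees = []
        for k in range(n_classes):
            # Pointwise functional gradient of the log-likelihood:
            # I(y_i = k) - P(k | x_i)
            grad = (y == k).astype(float) - probs[:, k]
            tree = DecisionTreeRegressor(max_depth=depth).fit(X, grad)
            psi[:, k] += tree.predict(X)      # gradient step in function space
            round_trees.append(tree)
        trees.append(round_trees)
    return trees

# Toy usage: 100 examples, 3 features, 2 actions (classes).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = (X[:, 0] > 0).astype(int)
boost(X, y, n_classes=2)
```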