Active learning of dynamic bayesian networks in markov decision processes (2007)

by A Jonsson, A G Barto
Venue: SARA

Results 1 - 4 of 4

Autonomous Hierarchical Skill Acquisition in Factored MDPs

by Christopher M. Vigorito, Andrew G. Barto
Cited by 2 (0 self)
Abstract: Learning hierarchies of reusable skills is essential for efficiently solving multiple tasks in a given domain. Understanding the causal relationships between one’s actions and various dimensions of one’s environment can facilitate learning of abstract skills that may be used subsequently in related tasks. Using Bayesian network structure-learning techniques and structured dynamic programming algorithms, we show that reinforcement learning agents can learn incrementally and autonomously both the causal structure of their environment and useful skills that exploit this structure. As new structure is discovered, more complex skills are learned, which in turn allow the agent to discover more structure, and so on. Because of this bootstrapping property, our approach can be considered a developmental process that results in steadily increasing domain knowledge and behavioral complexity.

Autonomous Qualitative Learning of Distinctions and Actions in a Developing Agent

by Jonathan Mugan
How can an agent bootstrap up from a pixel-level representation to autonomously learn high-level states and actions using only domain-general knowledge? This thesis attacks a piece of this problem. It assumes that an agent has a set of continuous variables describing the environment and a set of continuous motor primitives, and it proposes a solution to the problem of how an agent can learn a set of useful states and effective higher-level actions through autonomous experience with the environment. There exist methods for learning models of the environment, and there also exist methods for planning. However, for autonomous learning, these methods have been used almost exclusively in discrete environments. This thesis proposes attacking the problem of learning high-level states and actions in continuous environments by using a qualitative representation to bridge the gap between continuous and discrete variable representations. In this approach, the agent begins with a broad discretization and initially can only tell if the value of each variable is increasing, decreasing, or remaining steady. The agent then simultaneously learns a qualitative representation (discretization) and a set of predictive models of the environment. It converts these models into plans to form actions, and then uses those learned actions to explore the environment. The method is evaluated using a simulated robot with realistic physics. The robot is sitting at a table that contains one or two blocks, as well as other distractor objects that are out of reach. The agent autonomously explores the environment without being given a task. After learning, the agent is given various tasks to determine if it learned the necessary states and actions to complete them. The results show that the agent was able to use this method to autonomously learn to perform the tasks.
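
The starting representation described in this abstract, in which the agent can only tell whether each variable is increasing, decreasing, or remaining steady, can be illustrated with a minimal sketch. The function name `qualitative_direction`, the label strings, and the tolerance `eps` are assumptions made for illustration; the thesis's actual discretization procedure is not given in the abstract and may differ.

```python
from typing import List


def qualitative_direction(values: List[float], eps: float = 1e-3) -> List[str]:
    """Label each consecutive change in a continuous signal.

    Hypothetical helper: `eps` is an assumed tolerance below which a
    change counts as "steady". Returns one label per pair of
    consecutive samples, so the result has len(values) - 1 entries.
    """
    labels = []
    for prev, curr in zip(values, values[1:]):
        delta = curr - prev
        if delta > eps:
            labels.append("inc")
        elif delta < -eps:
            labels.append("dec")
        else:
            labels.append("steady")
    return labels


if __name__ == "__main__":
    # A made-up trajectory of one continuous variable.
    print(qualitative_direction([0.0, 0.5, 0.5, 0.2]))
```

A learned discretization would refine this coarse three-way split by adding thresholds on the variable's value itself; that step is not sketched here.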

Intrinsically Motivated Hierarchical Skill Learning in Structured Environments

by Christopher M. Vigorito, Andrew G. Barto
Venue: IEEE Transactions on Autonomous Mental Development (T-AMD)
Abstract: We present a framework for intrinsically motivated developmental learning of abstract skill hierarchies by reinforcement learning agents in structured environments. Long-term learning of skill hierarchies can drastically improve an agent’s efficiency in solving ensembles of related tasks in a complex domain. In structured domains composed of many features, understanding the causal relationships between actions and their effects on different features of the environment can greatly facilitate skill learning. Using Bayesian network structure-learning techniques and structured dynamic programming algorithms, we show that reinforcement learning agents can learn incrementally and autonomously both the causal structure of their environment and a hierarchy of skills that exploit this structure. Furthermore, we present a novel active learning scheme that employs intrinsic motivation to maximize the efficiency with which this structure is learned. As new structure is acquired using an agent’s current set of skills, more complex skills are learned, which in turn allow the agent to discover more structure, and so on. This bootstrapping property makes our approach a developmental learning process that results in steadily increasing domain knowledge and behavioral complexity as an agent continues to explore its environment.
Index Terms: reinforcement learning, planning, options, structure learning, active learning, intrinsic motivation

Active Learning of MDP models

by Mauricio Araya-López, Olivier Buffet, Vincent Thomas, François Charpillet
Abstract: We consider the active learning problem of inferring the transition model of a Markov Decision Process by acting and observing transitions. This is particularly useful when no reward function is a priori defined. Our proposal is to cast the active learning task as a utility maximization problem using Bayesian reinforcement learning with belief-dependent rewards. After presenting three possible performance criteria, we derive from them the belief-dependent rewards to be used in the decision-making process. As computing the optimal Bayesian value function is intractable for large horizons, we use a simple algorithm to approximately solve this optimization problem. Despite the sub-optimality of this technique, we show experimentally that our proposal is efficient in a number of domains.
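
To make the idea of acting in order to learn a transition model more concrete, here is a minimal sketch that keeps a Dirichlet belief over next-state probabilities for each state-action pair and greedily picks the action whose belief is currently most uncertain. Everything here is an assumption for illustration: the class name `DirichletTransitionBelief`, the uniform prior, the use of total posterior variance as the uncertainty score, and the one-step greedy rule. The abstract does not specify the paper's three criteria or its approximate solver, so this should not be read as the authors' method.

```python
from typing import List


class DirichletTransitionBelief:
    """Hypothetical belief over MDP transitions: one Dirichlet per (s, a)."""

    def __init__(self, n_states: int, n_actions: int, prior: float = 1.0) -> None:
        self.n_actions = n_actions
        # alpha[s][a][s'] holds the Dirichlet pseudo-count for s -> s' under a.
        self.alpha = [
            [[prior] * n_states for _ in range(n_actions)]
            for _ in range(n_states)
        ]

    def update(self, s: int, a: int, s_next: int) -> None:
        """Bayesian update after observing the transition (s, a, s_next)."""
        self.alpha[s][a][s_next] += 1.0

    def mean(self, s: int, a: int) -> List[float]:
        """Posterior mean estimate of P(s' | s, a)."""
        total = sum(self.alpha[s][a])
        return [x / total for x in self.alpha[s][a]]

    def total_variance(self, s: int, a: int) -> float:
        """Sum over s' of the Dirichlet variance of P(s' | s, a)."""
        al = self.alpha[s][a]
        a0 = sum(al)
        return sum(x * (a0 - x) for x in al) / (a0 * a0 * (a0 + 1.0))

    def most_uncertain_action(self, s: int) -> int:
        """Greedy one-step choice: the action with the largest total variance."""
        return max(range(self.n_actions), key=lambda a: self.total_variance(s, a))


if __name__ == "__main__":
    belief = DirichletTransitionBelief(n_states=2, n_actions=2)
    belief.update(0, 0, 1)  # one observation makes action 0 less uncertain
    print(belief.most_uncertain_action(0))
```

A greedy rule like this ignores how an action affects which states become reachable later, which is the kind of long-horizon effect a Bayesian reinforcement learning formulation is meant to account for.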