Learning partially observable deterministic action models
In Proc. Nineteenth International Joint Conference on Artificial Intelligence (IJCAI ’05), 2005
Cited by 34 (2 self)
We present exact algorithms for identifying deterministic actions’ effects and preconditions in dynamic partially observable domains. They apply when one does not know the action model (the way actions affect the world) of a domain and must learn it from partial observations over time. Such scenarios are common in real-world applications. They are challenging for AI tasks because traditional domain structures that underlie tractability (e.g., conditional independence) fail there (e.g., world features become correlated). Our work departs from traditional assumptions about partial observations and action models. In particular, it focuses on problems in which actions are deterministic and of simple logical structure, and observation models have all features observed with some frequency. We obtain tractable algorithms for the modified problem for such domains. Our algorithms take sequences of partial observations over time as input, and output deterministic action models that could have led to those observations. The algorithms output all or one of those models (depending on our choice), and are exact in that no model is misclassified given the observations. Our algorithms take polynomial time in the number of time steps and state features for some traditional action classes examined in the AI-planning literature, e.g., STRIPS actions. In contrast, traditional approaches for HMMs and Reinforcement Learning are inexact and exponentially intractable for such domains. Our experiments verify the theoretical tractability guarantees, and show that we identify action models exactly. Several applications in planning, autonomous exploration, and adventure-game playing already use these results. They are also promising for probabilistic settings, partially observable reinforcement learning, and diagnosis.
Exponential family predictive representations of state
In Neural Information Processing Systems (NIPS)
Cited by 10 (1 self)
2008. To my wife, Martha. Acknowledgments. This work would not have been possible without generous help, both intellectually and financially. I am grateful to my advisor, Satinder Singh, for the long discussions we have had as he has patiently taught me to think clearly through my own ideas, sharpen my writing, and to raise my sights. A special thanks also to my lab mates, Matt Rudary, Britton Wolfe, Vishal Soni, Erik Talviti, Jonathan Sorg and Ishan Chaudhuri for always letting me bounce ideas around, for listening, and for patient tutoring. Thanks to Andrew Nuxoll for being a kindred spirit, to Nick Gorski for the occasional foosball game and to my collaborators at the University of Alberta. Finally, I would like to gratefully acknowledge the National Science Foundation for financially supporting me through most of my studies with a Graduate Research Fellowship. Finally, a special thank you to my wife Martha for her love, her constancy, her feistiness and for always keeping me on the straight and narrow. Thank you, Grace, Peterson and Andrew for reminding
PAC-Learning of Markov Models with Hidden State
In Proceedings of ECML, 2006
Cited by 8 (4 self)
The standard approach for learning Markov Models with Hidden State uses the Expectation-Maximization framework. While this approach has had a significant impact on several practical applications (e.g., speech recognition, biological sequence alignment), it has two major limitations: it requires a known model topology, and learning is only locally optimal. We propose a new PAC framework for learning both the topology and the parameters in partially observable Markov models. Our algorithm learns a Probabilistic Deterministic Finite Automaton (PDFA) which approximates a Hidden Markov Model (HMM) up to some desired degree of accuracy. We discuss theoretical conditions under which the algorithm produces an optimal solution (in the PAC sense) and demonstrate promising performance on simple dynamical systems.
Learning to make predictions in partially observable environments without a generative model
Journal of Artificial Intelligence Research, 2011
Cited by 5 (1 self)
When faced with the problem of learning a model of a high-dimensional environment, a common approach is to limit the model to make only a restricted set of predictions, thereby simplifying the learning problem. These partial models may be directly useful for making decisions or may be combined together to form a more complete, structured model. However, in partially observable (non-Markov) environments, standard model-learning methods learn generative models, i.e., models that provide a probability distribution over all possible futures (such as POMDPs). It is not straightforward to restrict such models to make only certain predictions, and doing so does not always simplify the learning problem. In this paper we present prediction profile models: non-generative partial models for partially observable systems that make only a given set of predictions, and are therefore far simpler than generative models in some cases. We formalize the problem of learning a prediction profile model as a transformation of the original model-learning problem, and show empirically that one can learn prediction profile models that make a small set of important predictions even in systems that are too complex for standard generative models.
Constructivist Anticipatory Learning Mechanism (CALM) – dealing with partially deterministic and partially observable environments
Cognitive Development in Robotic Systems. Lund University Cognitive Studies, 135, 2007
Cited by 4 (2 self)
This paper presents CALM (Constructivist Anticipatory Learning Mechanism), an agent learning mechanism based on a constructivist approach. It is designed to deal dynamically and interactively with environments that are at the same time partially deterministic and partially observable. We describe the mechanism in detail, explaining how it represents knowledge and how the learning methods operate. We analyze the kinds of environmental regularities that CALM can discover, and argue that our proposal is a step towards the construction of more abstract, high-level representational concepts.
A Multiple Representation Approach to Learning Dynamical Systems
AAAI Fall Symposium on Representation. In Proceedings of the 18th European Conference on Machine Learning (ECML07), 2007
Cited by 2 (2 self)
We examine the problem of learning a model of a deterministic dynamical system from experience. A handful of representation schemes have been proposed for capturing such systems, including POMDPs, PSRs, EPSRs, Diversity, and PSTs. We argue that no single representation should be expected to be ideal in all situations and describe an approach for learning the most succinct representation of an unknown dynamical system.
Maintaining Predictions Over Time Without a Model
Cited by 2 (1 self)
A common approach to the control problem in partially observable environments is to perform a direct search in policy space, as defined over some set of features of history. In this paper we consider predictive features, whose values are conditional probabilities of future events, given history. Since predictive features provide direct information about the agent’s future, they have a number of advantages for control. However, unlike more typical features defined directly over past observations, it is not clear how to maintain the values of predictive features over time. A model could be used, since a model can make any prediction about the future, but in many cases learning a model is infeasible. In this paper we demonstrate that in some cases it is possible to learn to maintain the values of a set of predictive features even when learning a model is infeasible, and that natural predictive features can be useful for policy-search methods.
Learning Algorithms for Automata with Observations
2007
Cited by 1 (0 self)
We consider the problem of learning the behavior of a POMDP (Partially Observable Markov Decision Process) with deterministic actions and observations. This is a challenging problem because the observations can only partially identify the states. Recent work by Holmes and Isbell offers an approach for inferring the hidden states from experience in deterministic POMDP environments. We propose an alternative algorithm that ensures more accurate predictions, and we show that it in fact produces the minimal predicting machine.
Discovering and Characterizing Hidden Variables
Theoretical entities are aspects of the world that cannot be sensed directly but that nevertheless are causally relevant. Scientific inquiry has uncovered many such entities, such as black holes and dark matter. We claim that theoretical entities are important for the development of concepts within the lifetime of an individual, and present a novel neural network architecture that solves three problems related to theoretical entities: (1) discovering that they exist, (2) determining their number, and (3) computing their values. Experiments show the utility of the proposed approach using a discrete time dynamical system in which some of the state variables are hidden, and sensor data obtained from the camera of a mobile robot in which the sizes and locations of objects in the visual field are observed but their sizes and locations (distances) in the three-dimensional world are not.