Results 1–10 of 22
Learning partially observable deterministic action models
In Proc. Nineteenth International Joint Conference on Artificial Intelligence (IJCAI ’05), 2005
Abstract

Cited by 55 (2 self)
We present exact algorithms for identifying deterministic actions’ effects and preconditions in dynamic partially observable domains. They apply when one does not know the action model (the way actions affect the world) of a domain and must learn it from partial observations over time. Such scenarios are common in real-world applications. They are challenging for AI tasks because traditional domain structures that underlie tractability (e.g., conditional independence) fail there (e.g., world features become correlated). Our work departs from traditional assumptions about partial observations and action models. In particular, it focuses on problems in which actions are deterministic and of simple logical structure, and observation models have all features observed with some frequency. We obtain tractable algorithms for the modified problem in such domains. Our algorithms take sequences of partial observations over time as input, and output deterministic action models that could have led to those observations. The algorithms output all or one of those models (depending on our choice), and are exact in that no model is misclassified given the observations. Our algorithms take polynomial time in the number of time steps and state features for some traditional action classes examined in the AI-planning literature, e.g., STRIPS actions. In contrast, traditional approaches for HMMs and Reinforcement Learning are inexact and exponentially intractable for such domains. Our experiments verify the theoretical tractability guarantees, and show that we identify action models exactly. Several applications in planning, autonomous exploration, and adventure-game playing already use these results. They are also promising for probabilistic settings, partially observable reinforcement learning, and diagnosis.
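The flavor of this kind of exact identification can be illustrated with a much simpler, fully observed special case: a version-space-style learner that intersects candidate preconditions and effects across observed executions of each action. The sketch below is a hypothetical illustration in Python (the action name and propositions are invented), not the paper's partial-observation algorithm:

```python
# Illustrative sketch only: exact STRIPS-style model identification in the
# *fully observed* special case. Each state is a frozenset of the
# propositions true in it.

def learn_action_model(transitions):
    """transitions: dict mapping action name -> list of
    (state_before, state_after) pairs from observed executions."""
    model = {}
    for action, pairs in transitions.items():
        # Candidate preconditions: propositions true before every execution.
        pre = frozenset.intersection(*(before for before, _ in pairs))
        # Candidate add effects: propositions made true by every execution.
        add = frozenset.intersection(*(after - before for before, after in pairs))
        # Candidate delete effects: propositions made false by every execution.
        dele = frozenset.intersection(*(before - after for before, after in pairs))
        model[action] = {"pre": pre, "add": add, "del": dele}
    return model

# Example: two observed executions of a hypothetical "pickup" action.
transitions = {"pickup": [
    (frozenset({"handempty", "onfloor"}), frozenset({"holding"})),
    (frozenset({"handempty", "onfloor", "light"}), frozenset({"holding", "light"})),
]}
model = learn_action_model(transitions)
```

With more executions the intersections shrink, so the learned sets only ever move toward the true model; the paper's contribution is achieving this exactness under partial observability, where individual transitions are not fully visible.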
Exponential family predictive representations of state
In Neural Information Processing Systems (NIPS)
Abstract

Cited by 15 (2 self)
2008. To my wife, Martha. Acknowledgments: This work would not have been possible without generous help, both intellectual and financial. I am grateful to my advisor, Satinder Singh, for the long discussions we have had as he has patiently taught me to think clearly through my own ideas, to sharpen my writing, and to raise my sights. A special thanks also to my lab mates, Matt Rudary, Britton Wolfe, Vishal Soni, Erik Talviti, Jonathan Sorg and Ishan Chaudhuri for always letting me bounce ideas around, for listening, and for patient tutoring. Thanks to Andrew Nuxoll for being a kindred spirit, to Nick Gorski for the occasional foosball game, and to my collaborators at the University of Alberta. I would also like to gratefully acknowledge the National Science Foundation for financially supporting me through most of my studies with a Graduate Research Fellowship. Finally, a special thank you to my wife Martha for her love, her constancy, her feistiness, and for always keeping me on the straight and narrow. Thank you, Grace, Peterson and Andrew for reminding
PAC-Learning of Markov Models with Hidden State
In Proceedings of ECML, 2006
Abstract

Cited by 12 (3 self)
Abstract. The standard approach for learning Markov Models with Hidden State uses the Expectation-Maximization framework. While this approach has had a significant impact on several practical applications (e.g. speech recognition, biological sequence alignment), it has two major limitations: it requires a known model topology, and learning is only locally optimal. We propose a new PAC framework for learning both the topology and the parameters in partially observable Markov models. Our algorithm learns a Probabilistic Deterministic Finite Automaton (PDFA) which approximates a Hidden Markov Model (HMM) up to some desired degree of accuracy. We discuss theoretical conditions under which the algorithm produces an optimal solution (in the PAC sense) and demonstrate promising performance on simple dynamical systems.
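The object such an algorithm outputs is a PDFA: a machine whose transitions are deterministic given the emitted symbol, but whose emissions are probabilistic. A minimal sketch of that structure (the two-state machine below is a hypothetical example, not the paper's learner):

```python
import math

# Minimal PDFA sketch: deterministic transitions on symbols, with a
# per-state probability distribution over the next symbol.
class PDFA:
    def __init__(self, start, trans, emit):
        self.start = start   # initial state
        self.trans = trans   # (state, symbol) -> next state
        self.emit = emit     # state -> {symbol: probability}

    def log_likelihood(self, string):
        """Log-probability the PDFA assigns to a symbol sequence."""
        state, ll = self.start, 0.0
        for sym in string:
            ll += math.log(self.emit[state][sym])
            state = self.trans[(state, sym)]  # deterministic successor
        return ll

# Hypothetical two-state machine for a process that tends to repeat
# its last symbol.
pdfa = PDFA(
    start="a",
    trans={("a", "0"): "a", ("a", "1"): "b", ("b", "0"): "a", ("b", "1"): "b"},
    emit={"a": {"0": 0.9, "1": 0.1}, "b": {"0": 0.1, "1": 0.9}},
)
```

Because the transition on each emitted symbol is unique, the likelihood of a string is a single product of emission probabilities, with no summation over hidden paths as in a general HMM.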
Constructivist Anticipatory Learning Mechanism (CALM) – dealing with partially deterministic and partially observable environments
In Cognitive Development in Robotic Systems, Lund University Cognitive Studies 135, 2007
Abstract

Cited by 7 (2 self)
This paper presents CALM (Constructivist Anticipatory Learning Mechanism), an agent learning mechanism based on a constructivist approach. It is designed to deal dynamically and interactively with environments which are at the same time partially deterministic and partially observable. We describe the mechanism in detail, explaining how it represents knowledge and how the learning methods operate. We analyze the kinds of environmental regularities that CALM can discover, aiming to show that our proposal points the way toward the construction of more abstract, high-level representational concepts.
Learning to make predictions in partially observable environments without a generative model
Journal of Artificial Intelligence Research, 2011
Abstract

Cited by 5 (1 self)
When faced with the problem of learning a model of a high-dimensional environment, a common approach is to limit the model to make only a restricted set of predictions, thereby simplifying the learning problem. These partial models may be directly useful for making decisions or may be combined together to form a more complete, structured model. However, in partially observable (non-Markov) environments, standard model-learning methods learn generative models, i.e. models that provide a probability distribution over all possible futures (such as POMDPs). It is not straightforward to restrict such models to make only certain predictions, and doing so does not always simplify the learning problem. In this paper we present prediction profile models: non-generative partial models for partially observable systems that make only a given set of predictions, and are therefore far simpler than generative models in some cases. We formalize the problem of learning a prediction profile model as a transformation of the original model-learning problem, and show empirically that one can learn prediction profile models that make a small set of important predictions even in systems that are too complex for standard generative models.
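The core idea of labeling states directly with prediction values, rather than with a generative belief state, can be sketched as a small finite-state machine over (action, observation) pairs. The door example and all names below are hypothetical illustrations, not taken from the paper:

```python
# Hypothetical sketch of a prediction profile model: a finite-state machine
# whose states carry the current values of a fixed set of predictions,
# updated on each (action, observation) pair without any generative model.
class PredictionProfileModel:
    def __init__(self, start, trans, profile):
        self.state = start
        self.trans = trans      # (state, action, observation) -> next state
        self.profile = profile  # state -> {prediction name: value}

    def update(self, action, observation):
        self.state = self.trans[(self.state, action, observation)]

    def predict(self):
        return self.profile[self.state]

# Toy domain: a door toggled by "toggle" (which clicks); the tracked
# prediction is the probability of seeing light if the agent looks.
ppm = PredictionProfileModel(
    start="closed",
    trans={
        ("closed", "toggle", "click"): "open",
        ("open", "toggle", "click"): "closed",
        ("closed", "wait", "silence"): "closed",
        ("open", "wait", "silence"): "open",
    },
    profile={
        "closed": {"P(light | look)": 0.0},
        "open": {"P(light | look)": 1.0},
    },
)
```

The machine never represents a distribution over all futures; it only maintains the small set of prediction values of interest, which is what makes such partial models potentially far simpler than a full generative model.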
Skip Context Tree Switching
Abstract

Cited by 3 (1 self)
Context Tree Weighting is a powerful probabilistic sequence prediction technique that efficiently performs Bayesian model averaging over the class of all prediction suffix trees of bounded depth. In this paper we show how to generalize this technique to the class of K-skip prediction suffix trees. Unlike regular prediction suffix trees, K-skip prediction suffix trees are permitted to ignore up to K contiguous portions of the context. This allows for significant improvements in predictive accuracy when irrelevant variables are present, a case which often occurs within record-aligned data and images. We provide a regret-based analysis of our approach, and empirically evaluate it on the Calgary corpus and a set of Atari 2600 screen prediction tasks.
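At each node of the context tree, Context Tree Weighting relies on the Krichevsky-Trofimov (KT) estimator for binary symbol counts. A minimal sketch applying KT per fixed-depth context, without the Bayesian tree-averaging or the K-skip generalization the paper adds, might look like:

```python
from collections import defaultdict

# Sketch: KT (Krichevsky-Trofimov) estimation per fixed-depth context.
# Full Context Tree Weighting averages such estimates over all suffix-tree
# depths; this toy version uses a single depth for clarity.
class KTContextPredictor:
    def __init__(self, depth):
        self.depth = depth
        # context string -> [count of 0s, count of 1s]
        self.counts = defaultdict(lambda: [0, 0])

    def predict_one(self, context):
        """KT estimate of P(next bit = 1) given the trailing context."""
        zeros, ones = self.counts[context[-self.depth:]]
        return (ones + 0.5) / (zeros + ones + 1.0)

    def observe(self, context, bit):
        self.counts[context[-self.depth:]][bit] += 1

# Feed the bit sequence "110110", training on each bit given its history.
seq = "110110"
predictor = KTContextPredictor(depth=2)
for i in range(2, len(seq)):
    predictor.observe(seq[:i], int(seq[i]))
```

The add-one-half smoothing gives an unseen context probability 0.5 and concentrates toward the empirical frequency as counts grow, which is the property CTW's regret bounds are built on.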
Maintaining Predictions Over Time Without a Model
Abstract

Cited by 2 (1 self)
A common approach to the control problem in partially observable environments is to perform a direct search in policy space, as defined over some set of features of history. In this paper we consider predictive features, whose values are conditional probabilities of future events, given history. Since predictive features provide direct information about the agent’s future, they have a number of advantages for control. However, unlike more typical features defined directly over past observations, it is not clear how to maintain the values of predictive features over time. A model could be used, since a model can make any prediction about the future, but in many cases learning a model is infeasible. In this paper we demonstrate that in some cases it is possible to learn to maintain the values of a set of predictive features even when learning a model is infeasible, and that natural predictive features can be useful for policy-search methods.
On Learning with Imperfect Representations
Abstract

Cited by 2 (0 self)
Abstract—In this paper we present a perspective on the relationship between learning and representation in sequential decision-making tasks. We undertake a brief survey of existing real-world applications, which demonstrates that the classical “tabular” representation seldom applies in practice. Specifically, several practical tasks suffer from state aliasing, and most demand some form of generalization and function approximation. Coping with these representational aspects thus becomes an important direction for furthering the adoption of reinforcement learning in practice. The central thesis we present in this position paper is that in practice, learning methods specifically developed to work with imperfect representations are likely to perform better than those developed for perfect representations and then applied in imperfect-representation settings. We specify an evaluation criterion for learning methods in practice, and propose a framework for their synthesis. In particular, we highlight the degrees of “representational bias” prevalent in different learning methods. We reference a variety of relevant literature as background for this introspective essay.
A multiple representation approach to learning dynamical systems
In AAAI Fall Symposium on Representation; also in Proceedings of the 18th European Conference on Machine Learning (ECML ’07), 2007
Abstract

Cited by 2 (2 self)
We examine the problem of learning a model of a deterministic dynamical system from experience. A handful of representation schemes have been proposed for capturing such systems, including POMDPs, PSRs, EPSRs, Diversity, and PSTs. We argue that no single representation should be expected to be ideal in all situations, and describe an approach for learning the most succinct representation of an unknown dynamical system.