Looping Suffix Tree-Based Inference of Partially Observable Hidden State (2006)

by M Holmes, C Isbell
Venue: Proc. 23rd ICML
Results 1 - 8 of 8

Learning partially observable deterministic action models

by Eyal Amir - In Proc. Nineteenth International Joint Conference on Artificial Intelligence (IJCAI ’05), 2005
Cited by 21 (0 self)
We present exact algorithms for identifying deterministic actions’ effects and preconditions in dynamic partially observable domains. They apply when one does not know the action model (the way actions affect the world) of a domain and must learn it from partial observations over time. Such scenarios are common in real world applications. They are challenging for AI tasks because traditional domain structures that underlie tractability (e.g., conditional independence) fail there (e.g., world features become correlated). Our work departs from traditional assumptions about partial observations and action models. In particular, it focuses on problems in which actions are deterministic of simple logical structure and observation models have all features observed with some frequency. We yield tractable algorithms for the modified problem for such domains. Our algorithms take sequences of partial observations over time as input, and output deterministic action models that could have led to those observations. The algorithms output all or one of those models (depending on our choice), and are exact in that no model is misclassified given the observations. Our algorithms take polynomial time in the number of time steps and state features for some traditional action classes examined in the AI-planning literature, e.g., STRIPS actions. In contrast, traditional approaches for HMMs and Reinforcement Learning are inexact and exponentially intractable for such domains. Our experiments verify the theoretical tractability guarantees, and show that we identify action models exactly. Several applications in planning, autonomous exploration, and adventure-game playing already use these results. They are also promising for probabilistic settings, partially observable reinforcement learning, and diagnosis.

Exponential family predictive representations of state

by David Wingate - In Neural Information Processing Systems (NIPS
Cited by 7 (1 self)
2008. To my wife, Martha. Acknowledgments: This work would not have been possible without generous help, both intellectually and financially. I am grateful to my advisor, Satinder Singh, for the long discussions we have had as he has patiently taught me to think clearly through my own ideas, sharpen my writing, and to raise my sights. A special thanks also to my lab mates, Matt Rudary, Britton Wolfe, Vishal Soni, Erik Talviti, Jonathan Sorg and Ishan Chaudhuri for always letting me bounce ideas around, for listening, and for patient tutoring. Thanks to Andrew Nuxoll for being a kindred spirit, to Nick Gorski for the occasional foosball game and to my collaborators at the University of Alberta. Finally, I would like to gratefully acknowledge the National Science Foundation for financially supporting me through most of my studies with a Graduate Research Fellowship. Finally, a special thank you to my wife Martha for her love, her constancy, her feistiness and for always keeping me on the straight and narrow. Thank you, Grace, Peterson and Andrew for reminding

PAC-Learning of Markov Models with Hidden State

by Ricard Gavaldà, Joelle Pineau, Doina Precup - In Proceedings of ECML, 2006
Cited by 3 (1 self)
The standard approach for learning Markov Models with Hidden State uses the Expectation-Maximization framework. While this approach had a significant impact on several practical applications (e.g. speech recognition, biological sequence alignment) it has two major limitations: it requires a known model topology, and learning is only locally optimal. We propose a new PAC framework for learning both the topology and the parameters in partially observable Markov models. Our algorithm learns a Probabilistic Deterministic Finite Automaton (PDFA) which approximates a Hidden Markov Model (HMM) up to some desired degree of accuracy. We discuss theoretical conditions under which the algorithm produces an optimal solution (in the PAC-sense) and demonstrate promising performance on simple dynamical systems.

Maintaining Predictions Over Time Without a Model

by Erik Talvitie, Satinder Singh
Cited by 1 (0 self)
A common approach to the control problem in partially observable environments is to perform a direct search in policy space, as defined over some set of features of history. In this paper we consider predictive features, whose values are conditional probabilities of future events, given history. Since predictive features provide direct information about the agent’s future, they have a number of advantages for control. However, unlike more typical features defined directly over past observations, it is not clear how to maintain the values of predictive features over time. A model could be used, since a model can make any prediction about the future, but in many cases learning a model is infeasible. In this paper we demonstrate that in some cases it is possible to learn to maintain the values of a set of predictive features even when learning a model is infeasible, and that natural predictive features can be useful for policy-search methods.

Approved as to style and content by:

by Alicia Peregrin Wolfe, Sridhar Mahadevan Member, Leslie Kaelbling Member, Bruce Turkington Member, Andrew G. Barto, Department Chair, Alicia Peregrin Wolfe, 2010
Computer Science. To my mother, Mary Anne Schweitzer, for her time and patience. Acknowledgments: Thanks firstly to my committee, in particular for bearing with me through several schedule changes. Also to the members of the Autonomous Learning Laboratory for many interesting discussions, including but not limited to Ozgur Simsek, Amy McGovern, Balaraman Ravindran, Sarah Osentoski and Ashvin Shah. Other members of the UMass Computer Science community I’ve enjoyed many long discussions with include Victoria Manfredi, Jen Neville, Lisa Friedland, Emily Horrell and TJ Brunette. Prof. David Jensen, while not on the committee for my dissertation, was a helpful mentor and collaborator on earlier projects. Supportive friends and family include: Martin Walkow, providing the linguist’s perspective; my sister Rachel Wolfe, who can always make me see the humor in any situation; my father John Wolfe, who taught me to always question, question, question; and my mother Mary Anne Schweitzer, who, in addition to probably hundreds of long phone calls, pitched in at the last minute to transport my shoes into town from Connecticut. Also thanks to the many helpful staff members in the department, including but not

On Learning with Imperfect Representations

by Shivaram Kalyanakrishnan, Peter Stone
In this paper we present a perspective on the relationship between learning and representation in sequential decision making tasks. We undertake a brief survey of existing real-world applications, which demonstrates that the classical “tabular” representation seldom applies in practice. Specifically, several practical tasks suffer from state aliasing, and most demand some form of generalization and function approximation. Coping with these representational aspects thus becomes an important direction for furthering the advent of reinforcement learning in practice. The central thesis we present in this position paper is that in practice, learning methods specifically developed to work with imperfect representations are likely to perform better than those developed for perfect representations and then applied in imperfect-representation settings. We specify an evaluation criterion for learning methods in practice, and propose a framework for their synthesis. In particular, we highlight the degrees of “representational bias” prevalent in different learning methods. We reference a variety of relevant literature as a background for this introspective essay.

Discovering and Characterizing Hidden Variables

by Soumi Ray, Tim Oates
Theoretical entities are aspects of the world that cannot be sensed directly but that nevertheless are causally relevant. Scientific inquiry has uncovered many such entities, such as black holes and dark matter. We claim that theoretical entities are important for the development of concepts within the lifetime of an individual, and present a novel neural network architecture that solves three problems related to theoretical entities: (1) discovering that they exist, (2) determining their number, and (3) computing their values. Experiments show the utility of the proposed approach using a discrete time dynamical system in which some of the state variables are hidden, and sensor data obtained from the camera of a mobile robot in which the sizes and locations of objects in the visual field are observed but their sizes and locations (distances) in the three-dimensional world are not.

Learning Methods for Sequential Decision Making with Imperfect Representations

by Shivaram Kalyanakrishnan, 2011
Abstract not found