CiteSeerX
Results 1 - 10 of 10,408

Continuous-Time Markov Decision Processes: Theory, Approximations and Applications

by Alexey Piunovskiy, Yi Zhang, 2010
Cited by 15 (3 self)
Abstract not found

New Grid-Based Algorithms for Partially Observable Markov Decision Processes: Theory and Practice

by Blai Bonet
"... We present two new algorithms for Partially Observable Markov Decision Processes (pomdps). The first algorithm is a general grid-based algorithm for pomdps with theoretical optimality guarantees. The other algorithm is for the subclass of problems known as Stochastic Shortest-Path problems in belief ..."
Abstract - Cited by 1 (0 self) - Add to MetaCart
We present two new algorithms for Partially Observable Markov Decision Processes (pomdps). The first algorithm is a general grid-based algorithm for pomdps with theoretical optimality guarantees. The other algorithm is for the subclass of problems known as Stochastic Shortest-Path problems
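A minimal sketch of the core idea behind grid-based POMDP methods like these: discretize the belief simplex into a finite grid and restrict value backups to grid points. This is a generic illustration, not Bonet's algorithm; the function name and resolution parameter are assumptions.

```python
from itertools import product

def belief_grid(num_states, resolution):
    """Enumerate a regular grid over the belief simplex: all beliefs whose
    entries are integer multiples of 1/resolution and sum to 1."""
    pts = []
    for combo in product(range(resolution + 1), repeat=num_states):
        if sum(combo) == resolution:
            pts.append(tuple(c / resolution for c in combo))
    return pts

# e.g. a 3-state POMDP at resolution 4 yields C(6, 2) = 15 grid beliefs
print(belief_grid(3, 4))
```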

The Infinite Hidden Markov Model

by Matthew J. Beal, Zoubin Ghahramani, Carl E. Rasmussen - Machine Learning, 2002
"... We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. Th ..."
Abstract - Cited by 637 (41 self) - Add to MetaCart
We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data
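The Dirichlet-process machinery this abstract refers to is commonly realized via the stick-breaking construction; the hedged sketch below draws one (truncated) infinite transition row per state. The full iHMM couples the rows through a hierarchical prior, which this sketch omits; alpha and the truncation level are arbitrary choices.

```python
import numpy as np

def stick_breaking(alpha, truncation, rng):
    """Draw weights over a countably infinite state set, truncated for practicality:
    w_k = beta_k * prod_{j<k} (1 - beta_j), with beta_k ~ Beta(1, alpha)."""
    betas = rng.beta(1.0, alpha, size=truncation)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas[:-1])])
    return betas * remaining

rng = np.random.default_rng(0)
# One DP-distributed transition row per (truncated) hidden state
rows = np.array([stick_breaking(alpha=2.0, truncation=20, rng=rng) for _ in range(5)])
print(rows.sum(axis=1))  # close to 1; mass beyond the truncation is discarded
```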

Decision-Theoretic Planning: Structural Assumptions and Computational Leverage

by Craig Boutilier, Thomas Dean, Steve Hanks - Journal of Artificial Intelligence Research, 1999
"... Planning under uncertainty is a central problem in the study of automated sequential decision making, and has been addressed by researchers in many different fields, including AI planning, decision analysis, operations research, control theory and economics. While the assumptions and perspectives ..."
Abstract - Cited by 515 (4 self) - Add to MetaCart
and perspectives adopted in these areas often differ in substantial ways, many planning problems of interest to researchers in these fields can be modeled as Markov decision processes (MDPs) and analyzed using the techniques of decision theory. This paper presents an overview and synthesis of MDP
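The decision-theoretic analysis of MDPs that this survey synthesizes rests on dynamic-programming backups; here is a minimal value-iteration sketch over a toy transition model (the P[s][a] = list of (prob, next_state, reward) layout is an assumption of this sketch, not from the paper).

```python
def value_iteration(P, gamma=0.95, tol=1e-6):
    """P[s][a] = list of (prob, next_state, reward). Returns V and a greedy policy."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            # Bellman backup: best expected one-step return plus discounted value
            best = max(sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
                       for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {s: max(P[s], key=lambda a: sum(p * (r + gamma * V[s2])
                                             for p, s2, r in P[s][a]))
              for s in P}
    return V, policy

# Two-state toy: "go" moves 0 -> 1 with reward 1; "stay" loops with reward 0.
P = {0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 1.0)]},
     1: {"stay": [(1.0, 1, 0.0)]}}
V, pi = value_iteration(P)
```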

Markov games as a framework for multi-agent reinforcement learning

by Michael L. Littman - In Proceedings of the Eleventh International Conference on Machine Learning, 1994
"... In the Markov decision process (MDP) formalization of reinforcement learning, a single adaptive agent interacts with an environment defined by a probabilistic transition function. In this solipsistic view, secondary agents can only be part of the environment and are therefore fixed in their behavior ..."
Abstract - Cited by 601 (13 self) - Add to MetaCart
In the Markov decision process (MDP) formalization of reinforcement learning, a single adaptive agent interacts with an environment defined by a probabilistic transition function. In this solipsistic view, secondary agents can only be part of the environment and are therefore fixed
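A minimal sketch of the single-agent MDP learning setup the abstract describes, as tabular Q-learning against a probabilistic environment (the env interface with reset/step/actions is assumed here, not taken from the paper):

```python
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.9, eps=0.1):
    """Tabular Q-learning: the single adaptive agent of the MDP formalization.
    Assumes env.reset() -> state and env.step(a) -> (state, reward, done)."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            a = (random.choice(env.actions) if random.random() < eps
                 else max(env.actions, key=lambda a: Q[s, a]))
            s2, r, done = env.step(a)
            # Bootstrap toward the value of the best next action
            Q[s, a] += alpha * (r + gamma * max(Q[s2, b] for b in env.actions) - Q[s, a])
            s = s2
    return Q
```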

The Complexity of Decentralized Control of Markov Decision Processes

by Daniel S. Bernstein, Robert Givan, Neil Immerman, Shlomo Zilberstein - Mathematics of Operations Research , 2000
"... We consider decentralized control of Markov decision processes and give complexity bounds on the worst-case running time for algorithms that find optimal solutions. Generalizations of both the fullyobservable case and the partially-observable case that allow for decentralized control are described. ..."
Abstract - Cited by 411 (46 self) - Add to MetaCart
We consider decentralized control of Markov decision processes and give complexity bounds on the worst-case running time for algorithms that find optimal solutions. Generalizations of both the fullyobservable case and the partially-observable case that allow for decentralized control are described
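A back-of-the-envelope computation hints at why decentralized control is hard: deterministic local policies map observation histories to actions, so their number grows doubly exponentially in the horizon, consistent with the hardness results the paper proves. The sizes below are illustrative only.

```python
# Each agent maps its own observation history to an action. With |O| observations
# and horizon T there are |O|^0 + ... + |O|^(T-1) histories per agent.
def num_joint_policies(num_agents, num_actions, num_obs, horizon):
    histories = sum(num_obs ** t for t in range(horizon))
    return (num_actions ** histories) ** num_agents

print(num_joint_policies(2, 2, 2, 4))  # 2 agents, 2 actions, 2 obs, horizon 4 -> 2^30
```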

A theory of memory retrieval

by Roger Ratcliff - Psychological Review, 1978
"... A theory of memory retrieval is developed and is shown to apply over a range of experimental paradigms. Access to memory traces is viewed in terms of a resonance metaphor. The probe item evokes the search set on the basis of probe-memory item relatedness, just as a ringing tuning fork evokes sympath ..."
Abstract - Cited by 769 (83 self) - Add to MetaCart
sympathetic vibrations in other tuning forks. Evidence is accumulated in parallel from each probe-memory item comparison, and each comparison is modeled by a continuous random walk process. In item recognition, the decision process is self-terminating on matching comparisons and exhaustive on nonmatching
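A hedged simulation sketch of the continuous random walk comparison process described here: evidence drifts toward a match or non-match boundary, and the hitting time gives the decision latency. Drift, noise, and boundary values are illustrative, not Ratcliff's fitted parameters.

```python
import random

def random_walk_comparison(drift, boundary=1.0, dt=0.001, sd=1.0):
    """Accumulate relatedness evidence until an absorbing boundary is hit.
    Returns ('match' or 'nonmatch', decision time)."""
    x, t = 0.0, 0.0
    while abs(x) < boundary:
        x += drift * dt + random.gauss(0.0, sd) * dt ** 0.5  # Euler step of a diffusion
        t += dt
    return ("match" if x >= boundary else "nonmatch"), t

print(random_walk_comparison(drift=0.8))
```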

Planning and acting in partially observable stochastic domains

by Leslie Pack Kaelbling, Michael L. Littman, Anthony R. Cassandra - Artificial Intelligence, 1998
"... In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov decision processes (mdps) and partially observable mdps (pomdps). We then outline a novel algorithm ..."
Abstract - Cited by 1095 (38 self) - Add to MetaCart
In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov decision processes (mdps) and partially observable mdps (pomdps). We then outline a novel algorithm
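The pomdp techniques outlined here operate on belief states updated by Bayes' rule after each action and observation; a minimal sketch, assuming T[s, a, s'] transition and O[s', a, o] observation arrays (that layout is this sketch's convention, not the paper's):

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """Bayes filter over hidden states: b'(s') is proportional to
    O[s', a, o] * sum_s T[s, a, s'] * b(s)."""
    predicted = b @ T[:, a, :]        # predict: sum_s b(s) T[s, a, s']
    unnorm = O[:, a, o] * predicted   # correct: weight by observation likelihood
    return unnorm / unnorm.sum()      # renormalize to a distribution
```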

Reinforcement learning: a survey

by Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore - Journal of Artificial Intelligence Research , 1996
"... This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem ..."
Abstract - Cited by 1714 (25 self) - Add to MetaCart
of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state
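The exploration/exploitation trade-off listed first among the survey's central issues reduces, in its simplest form, to an epsilon-greedy action rule; a generic sketch, not a specific algorithm from the survey:

```python
import random

def epsilon_greedy(q_values, eps=0.1):
    """Explore uniformly with probability eps, otherwise exploit the greedy arm."""
    if random.random() < eps:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```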

Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning

by Richard S. Sutton, Doina Precup, Satinder Singh, 1999
"... Learning, planning, and representing knowledge at multiple levels of temporal abstraction are key, longstanding challenges for AI. In this paper we consider how these challenges can be addressed within the mathematical framework of reinforcement learning and Markov decision processes (MDPs). We exte ..."
Abstract - Cited by 569 (38 self) - Add to MetaCart
Learning, planning, and representing knowledge at multiple levels of temporal abstraction are key, longstanding challenges for AI. In this paper we consider how these challenges can be addressed within the mathematical framework of reinforcement learning and Markov decision processes (MDPs). We
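Under the paper's framework an option is a triple (initiation set, policy, termination condition) that turns an MDP into a semi-MDP; a minimal data-structure sketch under that reading (field names and the env interface are assumptions of this sketch):

```python
from dataclasses import dataclass
from typing import Any, Callable, Set

@dataclass
class Option:
    """An option per Sutton, Precup & Singh: where it may start, how it acts,
    and when it stops."""
    initiation_set: Set[Any]             # states where the option may begin
    policy: Callable[[Any], Any]         # state -> primitive action
    termination: Callable[[Any], float]  # state -> probability of stopping

def run_option(env, state, option, rng):
    """Execute an option until its termination condition fires (one SMDP-level step).
    Assumes env.step(a) -> (state, reward, done)."""
    assert state in option.initiation_set
    total, steps = 0.0, 0
    while True:
        state, reward, done = env.step(option.policy(state))
        total, steps = total + reward, steps + 1
        if done or rng.random() < option.termination(state):
            return state, total, steps
```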