Results 1 – 10 of 115
Maximum entropy inverse reinforcement learning
In Proc. AAAI, 2008
"... Recent research has shown the benefit of framing problems of imitation learning as solutions to Markov Decision Problems. This approach reduces learning to the problem of recovering a utility function that makes the behavior induced by a near-optimal policy closely mimic demonstrated behavior. In this work, we develop a probabilistic approach based on the principle of maximum entropy. Our approach provides a well-defined, globally normalized distribution over decision sequences, while providing the same performance guarantees as existing methods. We develop our technique in the context of modeling ..."
Cited by 112 (22 self)
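The "globally normalized distribution over decision sequences" described in this abstract can be sketched as follows. This is an illustrative toy, not the paper's implementation: each trajectory is scored by a linear reward on its feature counts, and probabilities come from exponentiating and normalizing over all trajectories (the partition function).

```python
import numpy as np

def maxent_trajectory_probs(feature_counts, theta):
    """Globally normalized maximum-entropy distribution over trajectories.

    feature_counts: (n_trajectories, n_features) array of per-trajectory
    feature counts; theta: reward weights. Names here are illustrative."""
    scores = feature_counts @ theta      # linear reward of each trajectory
    scores -= scores.max()               # shift for numerical stability
    expd = np.exp(scores)
    return expd / expd.sum()             # normalize by the partition function

# Toy example: three trajectories described by two features.
f = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
theta = np.array([0.5, 1.0])
p = maxent_trajectory_probs(f, theta)
```

Trajectories with higher reward receive exponentially higher probability, while the distribution remains well-defined because every trajectory gets nonzero mass.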
A unifying framework for computational reinforcement learning theory
, 2009
"... Computational learning theory studies mathematical models that allow one to formally analyze and compare the performance of supervised-learning algorithms, such as their sample complexity. While existing models such as PAC (Probably Approximately Correct) have played an influential role in understanding the nature of supervised learning, they have not been as successful in reinforcement learning (RL). Here, the fundamental barrier is the need for active exploration in sequential decision problems. An RL agent tries to maximize long-term utility by exploiting its knowledge about the problem ..."
Cited by 23 (7 self)
Infinite Time Horizon Maximum Causal Entropy Inverse Reinforcement Learning
"... Abstract—We extend the maximum causal entropy framework for inverse reinforcement learning to the infinite time horizon discounted reward setting. To do so, we maximize discounted future contributions to causal entropy subject to a discounted feature expectation matching constraint. A parameterize ..."
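Read as stated, the extension's objective can be sketched in symbols (our notation, a reading of the abstract rather than the paper's exact statement): maximize discounted causal entropy subject to discounted feature expectation matching,

```latex
\max_{\pi}\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, H\!\left(A_t \mid S_{1:t}, A_{1:t-1}\right)\right]
\quad \text{s.t.} \quad
\mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} f(S_t, A_t)\right]
= \mathbb{E}_{\mathrm{demo}}\!\left[\sum_{t=0}^{\infty} \gamma^{t} f(S_t, A_t)\right]
```

where $f$ are reward features, $\gamma \in (0,1)$ is the discount factor, and the right-hand side is the demonstrated (expert) discounted feature expectation.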
Efficient Reinforcement Learning
In Proceedings of the Seventh Annual ACM Conference on Computational Learning Theory, 1994
"... In this paper we propose a new formal model for studying reinforcement learning, based on Valiant's PAC framework. In our model the learner does not have direct access to every state of the environment. Instead, every sequence of experiments starts in a fixed initial state and the learner is pr ..."
Cited by 35 (3 self)
Training Parsers by Inverse Reinforcement Learning
Machine Learning, 2009
"... One major idea in structured prediction is to assume that the predictor computes its output by finding the maximum of a score function. The training of such a predictor can then be cast as the problem of finding weights of the score function so that the output of the predictor on the inputs matches the corresponding structured labels on the training set. A similar problem is studied in inverse reinforcement learning (IRL), where one is given an environment and a set of trajectories and the problem is to find a reward function such that an agent acting optimally with respect to the reward ..."
Cited by 17 (0 self)
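The score-maximization setup this abstract describes can be sketched with a structured-perceptron-style update — a hypothetical toy instance, not the cited paper's parser: the predictor outputs the candidate maximizing a linear score, and training adjusts the weights whenever the predicted output disagrees with the gold label.

```python
import numpy as np

def features(x, y):
    # Hypothetical joint feature map over input x and candidate output y.
    return np.array([x * y, float(y == 1), 1.0])

def predict(w, x, candidates):
    # The predictor's output is the argmax of the score function w . f(x, y).
    return max(candidates, key=lambda y: w @ features(x, y))

def perceptron_train(data, candidates, epochs=10):
    w = np.zeros(3)
    for _ in range(epochs):
        for x, y_gold in data:
            y_hat = predict(w, x, candidates)
            if y_hat != y_gold:  # mistake-driven update toward the gold output
                w += features(x, y_gold) - features(x, y_hat)
    return w

# Toy training set: binary outputs tied to the sign of the input.
data = [(1.0, 1), (-1.0, 0), (2.0, 1), (-2.0, 0)]
w = perceptron_train(data, candidates=[0, 1])
preds = [predict(w, x, [0, 1]) for x, _ in data]
```

The IRL analogy in the abstract swaps the score function for a reward function and the gold outputs for demonstrated trajectories, but the weight-fitting problem has the same shape.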
A general framework for reinforcement learning
In Proceedings of ICANN'95, 1995
"... Abstract: In this article we propose a general framework for sequential decision making. The framework is based on the observation that the derivation of the optimal behaviour under various decision criteria follows the same pattern: the cost of policies can be decomposed into ..."
Cited by 6 (2 self)
Maximum Entropy Inverse Reinforcement Learning in Continuous State Spaces with Path Integrals
"... Inverse reinforcement learning (IRL) is the problem of recovering a cost function that is consistent with observations of optimal or “expert” trajectories and with a given dynamic model (Ng & Russell, 2000). ..."
Cited by 3 (1 self)
Compositional Models for Reinforcement Learning
"... Abstract. Innovations such as optimistic exploration, function approximation, and hierarchical decomposition have helped scale reinforcement learning to more complex environments, but these three ideas have rarely been studied together. This paper develops a unified framework that formalizes these a ..."
Cited by 4 (1 self)