Results 1–10 of 2,215
Online Planning for Large MDPs with MAXQ Decomposition
"... Markov decision processes (MDPs) provide an expressive framework for planning in stochastic domains. However, exactly solving a large MDP is often intractable due to the curse of dimensionality. Online algorithms help overcome the high computational complexity by avoiding computing a policy for each possible state. Hierarchical decomposition is another promising way to help scale MDP algorithms up to large domains by exploiting their underlying structure. In this paper, we present an effort to combine the benefits of a general hierarchical structure based on the MAXQ value function ..."
Cited by 2 (1 self)
Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition
 Journal of Artificial Intelligence Research
, 2000
"... This paper presents a new approach to hierarchical reinforcement learning based on decomposing the target Markov decision process (MDP) into a hierarchy of smaller MDPs and decomposing the value function of the target MDP into an additive combination of the value functions of the smaller MDPs. The decomposition, known as the MAXQ decomposition, has both a procedural semantics, as a subroutine hierarchy, and a declarative semantics, as a representation of the value function of a hierarchical policy. MAXQ unifies and extends previous work on hierarchical reinforcement learning by Singh, Kaelbling ..."
Cited by 439 (6 self)
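The additive decomposition described in this abstract can be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: the task names, completion values, and one-step rewards below are invented for the example, and only the recursive identity Q(task, s, a) = V(a, s) + C(task, s, a) is taken from the MAXQ formulation.

```python
# Illustrative MAXQ-style additive value decomposition.
# All task names and numeric values here are made up for the sketch.

# Expected one-step rewards for primitive actions: V_prim[(action, state)].
V_prim = {("move", 0): -1.0, ("move", 1): -1.0}

# Completion function C[(parent_task, state, subtask)]: expected reward
# for finishing parent_task after subtask terminates.
C = {("root", 0, "move"): -2.0, ("root", 1, "move"): 0.0}

def V(task, state):
    """Value of executing `task` from `state` under the decomposition."""
    if (task, state) in V_prim:          # primitive action: its reward
        return V_prim[(task, state)]
    # Composite task: best subtask according to the decomposed Q.
    return max(Q(task, state, a)
               for (t, s, a) in C if t == task and s == state)

def Q(task, state, subtask):
    # The MAXQ identity: value of the subtask plus the completion value.
    return V(subtask, state) + C[(task, state, subtask)]

print(Q("root", 0, "move"))  # -1.0 + -2.0 = -3.0
```

The point of the decomposition is that V and C are defined per subtask, so each table can be learned over a smaller, subtask-relevant portion of the state.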
The MAXQ Method for Hierarchical Reinforcement Learning
 In Proceedings of the Fifteenth International Conference on Machine Learning
, 1998
"... This paper presents a new approach to hierarchical reinforcement learning based on the MAXQ decomposition of the value function. The MAXQ decomposition has both a procedural semantics, as a subroutine hierarchy, and a declarative semantics, as a representation of the value function of a hierarchical policy ..."
Cited by 146 (5 self)
Automatic Induction of MAXQ Hierarchies
"... Scaling up reinforcement learning to large domains requires leveraging the structure in the domain. Hierarchical reinforcement learning has been one of the ways in which the domain structure is exploited to constrain the value function space of the learner and speed up learning [10, 3, 1]. In the MAXQ framework, for example, a task hierarchy is defined, and a set of relevant features to represent the completion function for each task-subtask pair is given [3], resulting in decomposed subtask-specific value functions that are easier to learn than the global value function. The MAXQ ..."
Cited by 3 (0 self)
Global Optimization with Polynomials and the Problem of Moments
 SIAM Journal on Optimization
, 2001
"... We consider the problem of finding the unconstrained global minimum of a real-valued polynomial p(x): R^n → R, as well as the global minimum of p(x) in a compact set K defined by polynomial inequalities. It is shown that this problem reduces to solving an (often finite) sequence of convex linear matrix inequality (LMI) problems. A notion of Karush-Kuhn-Tucker polynomials is introduced in a global optimality condition. Some illustrative examples are provided. ..."
Cited by 569 (47 self)
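The reduction described in this abstract can be seen on a tiny hand-checkable instance. For a univariate quadratic, p(x) - λ is a sum of squares exactly when its 2x2 Gram matrix in the basis [1, x] is positive semidefinite, so the largest such λ (found here by bisection with an eigenvalue test, rather than an SDP solver) is the global minimum. The polynomial p(x) = x² - 2x + 3 = (x-1)² + 2 and the tolerance below are choices made for this sketch.

```python
import numpy as np

def p(x):
    # Example polynomial chosen for the sketch; its global minimum is 2 at x = 1.
    return x**2 - 2*x + 3

def gram(lam):
    # Gram matrix Q such that p(x) - lam == [1, x] @ Q @ [1, x]^T.
    return np.array([[3.0 - lam, -1.0],
                     [-1.0,       1.0]])

def is_sos(lam, tol=1e-9):
    # p - lam is a sum of squares iff its Gram matrix is positive semidefinite.
    return np.linalg.eigvalsh(gram(lam)).min() >= -tol

# Bisect for the largest lam with p - lam SOS; this lam is the global minimum.
lo, hi = 0.0, 10.0
for _ in range(60):
    mid = (lo + hi) / 2
    if is_sos(mid):
        lo = mid
    else:
        hi = mid
print(round(lo, 6))  # ≈ 2.0, the global minimum of p
```

In the general multivariate case the PSD condition becomes a genuine LMI over a larger Gram matrix, and the "often finite sequence" in the abstract refers to increasing the degree of the moment/SOS relaxation until it is exact.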
An Overview of MAXQ Hierarchical Reinforcement Learning
 IN ABSTRACTION, REFORMULATION, AND APPROXIMATION
, 2000
"... Reinforcement learning addresses the problem of learning optimal policies for sequential decision-making problems involving stochastic operators and numerical reward functions, rather than the more traditional deterministic operators and logical goal predicates. ... This paper gives an overview of the MAXQ value function decomposition and its support for state abstraction and action abstraction. Reinforcement learning studies the problem of a learning agent that interacts with an unknown, stochastic, but fully observable environment. This problem can ..."
Cited by 40 (0 self)
2.3 MAXQ Value Function Decomposition
, 2005
"... In this paper two approaches to hierarchical reinforcement learning are applied to a complex gridworld navigation problem. The first method is an adaptation of Feudal Reinforcement Learning by Dayan and Hinton, and the other is a novel method called the State Variable Combination approach (SVC), designed for a problem consisting of multiple conflicting subproblems. Feudal Reinforcement Learning was not easily adaptable to the gridworld navigation problem, and proved inefficient. SVC proved successful for most cases, but was erratic in its performance. ..."
State Abstraction in MAXQ Hierarchical Reinforcement Learning
 Advances in Neural Information Processing Systems 12
, 2000
"... Many researchers have explored methods for hierarchical reinforcement learning (RL) with temporal abstractions, in which abstract actions are defined that can perform many primitive actions before terminating. However, little is known about learning with state abstractions, in which aspects of the state space are ignored. In previous work, we developed the MAXQ method for hierarchical RL. In this paper, we define five conditions under which state abstraction can be combined with the MAXQ value function decomposition. We prove that the MAXQ-Q learning algorithm converges under these conditions ..."
Cited by 21 (0 self)
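The kind of state abstraction this abstract refers to can be illustrated with a toy projection. The state layout, the "navigate" subtask, and the value below are all invented for the sketch; the only idea taken from the abstract is that a subtask's completion table may safely ignore state variables irrelevant to it, so one entry covers many full states.

```python
# Illustrative state abstraction for a subtask-specific value table.
# A full state is (taxi_pos, passenger_loc); we assume, for this sketch,
# that a "navigate" subtask's completion value depends only on taxi_pos.

def abstract(state):
    taxi_pos, passenger_loc = state
    return taxi_pos            # passenger_loc is dropped as irrelevant

C_navigate = {}                # completion values keyed by abstracted state
C_navigate[abstract((3, "R"))] = -4.0

# Both full states map to the same abstract entry, so a value learned in
# one is immediately available in the other.
print(C_navigate[abstract((3, "R"))], C_navigate[abstract((3, "B"))])
```

The conditions proved in the paper characterize when such projections leave the decomposed value function, and the convergence of learning, intact; this sketch only shows the bookkeeping benefit.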