Results 1 - 10 of 405
Tree-based batch mode reinforcement learning
JOURNAL OF MACHINE LEARNING RESEARCH, 2005
"... Reinforcement learning aims to determine an optimal control policy from interaction with a system or from observations gathered from a system. In batch mode, it can be achieved by approximating the so-called Q-function based on a set of four-tuples (x_t, u_t, r_t, x_{t+1}) where x_t denotes the system state a ..."
Cited by 224 (42 self)
OPTIMAL SAMPLE SELECTION FOR BATCH-MODE REINFORCEMENT LEARNING
"... Abstract: We introduce the Optimal Sample Selection (OSS) meta-algorithm for solving discrete-time Optimal Control problems. This meta-algorithm maps the problem of finding a near-optimal closed-loop policy to the identification of a small set of one-step system transitions, leading to high-quality policies when used as input of a batch-mode Reinforcement Learning (RL) algorithm. We detail a particular instance of this OSS meta-algorithm that uses tree-based Fitted Q-Iteration as a batch-mode RL algorithm and Cross Entropy search as a method for navigating efficiently in the space of sample sets ..."
Cited by 1 (0 self)
Voronoi model learning for batch mode reinforcement learning
"... We consider deterministic optimal control problems with continuous state spaces where the information on the system dynamics and the reward function is constrained to a set of system transitions. Each system transition gathers a state, the action taken while being in this state, the immediate reward observed and the next state reached. In such a context, we propose a new model learning–type reinforcement learning (RL) algorithm in batch mode, finite-time and deterministic setting. The algorithm, named Voronoi reinforcement learning (VRL), approximates from a sample of system transitions ..."
Relaxation Schemes for Min Max Generalization in Deterministic Batch Mode Reinforcement Learning
"... We study the min-max optimization problem introduced in [6] for computing policies for batch mode reinforcement learning in a deterministic setting. This problem is NP-hard. We focus on the two-stage case for which we provide two relaxation schemes. The first relaxation scheme works by dropping some ..."
Min max generalization for deterministic batch mode reinforcement learning: Relaxation schemes
SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2013
"... We study the min-max optimization problem introduced in Fonteneau et al. [Towards min max reinforcement learning, ICAART 2010, Springer, Heidelberg, 2011, pp. 61–77] for computing policies for batch mode reinforcement learning in a deterministic setting with fixed, finite time horizon. First, we sh ..."
Cited by 3 (2 self)
Adaptive Treatment of Epilepsy via Batch-mode Reinforcement Learning
"... This paper highlights the crucial role that modern machine learning techniques can play in the optimization of treatment strategies for patients with chronic disorders. In particular, we focus on the task of optimizing a deep-brain stimulation strategy for the treatment of epilepsy. The challenge is to choose which stimulation action to apply, as a function of the observed EEG signal, so as to minimize the frequency and duration of seizures. We apply recent techniques from the reinforcement learning literature, namely fitted Q-iteration and extremely randomized trees, to learn an optimal stimulation ..."
Cited by 27 (5 self)
Batch Mode Reinforcement Learning based on the Synthesis of Artificial Trajectories
ANN OPER RES, 2012
On Periodic Reference Tracking Using Batch-Mode Reinforcement Learning with Application to Gene Regulatory Network Control
"... Abstract: In this paper, we consider the periodic reference tracking problem in the framework of batch-mode reinforcement learning, which studies methods for solving optimal control problems from the sole knowledge of a set of trajectories. In particular, we extend an existing batch-mode reinforcemen ..."
On Periodic Reference Tracking Using Batch-Mode Reinforcement Learning with Application to Gene Regulatory Network Control
"... Abstract: In this paper, we consider the periodic reference tracking problem in the framework of batch-mode reinforcement learning, which studies methods for solving optimal control problems from the sole knowledge of a set of trajectories. In particular, we extend an existing batch-mode reinforceme ..."
Experimental Illustration Conclusions Batch Mode Reinforcement Learning Reinforcement Learning Environment Agent
2012
"... Reinforcement Learning (RL) aims at finding a policy maximizing received rewards by interacting with the environment. Batch Mode Reinforcement Learning: All the available information is contained in a batch collection of data. Batch mode RL aims at computing a (near-)optimal policy from this collection ..."