Results 1 - 2 of 2
Approximate Dynamic Programming using Fluid and Diffusion Approximations with Applications to Power Management
Abstract

Cited by 9 (5 self)
Abstract—TD learning and its refinements are powerful tools for approximating the solution to dynamic programming problems. However, the techniques provide the approximate solution only within a prescribed finite-dimensional function class. Thus, the question that always arises is: how should the function class be chosen? The goal of this paper is to propose an approach for TD learning based on choosing the function class using the solutions to associated fluid and diffusion approximations. In order to illustrate this new approach, the paper focuses on an application to dynamic speed scaling for power management.
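To make the abstract's premise concrete, here is a minimal sketch of TD(0) with a linear function class. The toy chain MDP, the discount factor, and the quadratic basis `[1, x, x^2]` are illustrative assumptions standing in for the paper's power-management model and its fluid/diffusion-derived features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy chain MDP: 5 states, a fixed policy steps right w.p. 0.7,
# reward 1 whenever the walk lands in the rightmost state.
n_states = 5

def phi(s):
    """Hypothetical basis [1, x, x^2]; a stand-in for features
    suggested by a fluid/diffusion model of the same system."""
    x = s / (n_states - 1)
    return np.array([1.0, x, x * x])

theta = np.zeros(3)        # coefficients of the linear value approximation
gamma, alpha = 0.9, 0.05   # discount factor, step size

s = 0
for _ in range(100_000):
    s_next = min(s + 1, n_states - 1) if rng.random() < 0.7 else max(s - 1, 0)
    r = 1.0 if s_next == n_states - 1 else 0.0
    # TD(0): move theta along the temporal-difference error.
    td_error = r + gamma * theta @ phi(s_next) - theta @ phi(s)
    theta += alpha * td_error * phi(s)
    s = s_next

print("V(0) ≈", theta @ phi(0), "  V(4) ≈", theta @ phi(n_states - 1))
```

The approximation lives entirely in the span of `phi`; the paper's contribution is a principled way to pick that span, rather than the ad-hoc quadratic used here.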
TD-Learning with Exploration
Abstract
Abstract — We introduce exploration in the TD-learning algorithm to approximate the value function for a given policy. In this way we can modify the norm used for approximation, “zooming in” to a region of interest in the state space. We also provide extensions to SARSA to eliminate the need for numerical integration in policy improvement. Construction of the algorithm and its analysis build on recent general results concerning the spectral theory of Markov chains and positive operators.
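One standard way to sample under an exploratory policy while still estimating the target policy's value function is importance-weighted TD(0); the sketch below uses that textbook construction on a toy chain, not the paper's spectral-theoretic algorithm. The policies, features, and step size are all illustrative assumptions. The point it illustrates is that the behavior policy controls which states are visited, and hence the weighting (the "norm") under which the approximation error is minimized.

```python
import numpy as np

rng = np.random.default_rng(1)

n_states = 5
p_target = 0.7   # target policy: probability of stepping right
p_behave = 0.6   # mildly exploratory behavior policy (hypothetical choice)

def phi(s):
    x = s / (n_states - 1)
    return np.array([1.0, x, x * x])  # illustrative quadratic basis

theta = np.zeros(3)
gamma, alpha = 0.9, 0.02

s = 0
for _ in range(200_000):
    go_right = rng.random() < p_behave
    s_next = min(s + 1, n_states - 1) if go_right else max(s - 1, 0)
    r = 1.0 if s_next == n_states - 1 else 0.0
    # Importance ratio: the fixed point still corresponds to the *target*
    # policy, while state visitation (the approximation norm) comes from
    # the behavior policy.
    rho = p_target / p_behave if go_right else (1 - p_target) / (1 - p_behave)
    td_error = r + gamma * theta @ phi(s_next) - theta @ phi(s)
    theta += alpha * rho * td_error * phi(s)
    s = s_next
```

Choosing `p_behave` to dwell in a region of interest re-weights the fit toward that region; the paper develops exploration with convergence guarantees, which plain importance-weighted TD(0) does not provide in general for linear function approximation.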