## A unified analysis of value-function-based reinforcement-learning algorithms. Neural Computation (1997)

Citations: 32 (7 self)

### BibTeX

@MISC{Szepesvari97aunified,

author = {Csaba Szepesvari and Michael L. Littman},

title = {A unified analysis of value-function-based reinforcement-learning algorithms. Neural Computation},

year = {1997}

}

### Abstract

Reinforcement learning is the problem of generating optimal behavior in a sequential decision-making environment given the opportunity of interacting with it. Many algorithms for solving reinforcement-learning problems work by computing improved estimates of the optimal value function. We extend prior analyses of reinforcement-learning algorithms and present a powerful new theorem that can provide a unified analysis of value-function-based reinforcement-learning algorithms. The usefulness of the theorem lies in how it allows the convergence of a complex asynchronous reinforcement-learning algorithm to be proven by verifying that a simpler synchronous algorithm converges. We illustrate the application of the theorem by analyzing the convergence of Q-learning, model-based reinforcement learning, Q-learning with multi-state updates, Q-learning for Markov games, and risk-sensitive reinforcement learning.
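The asynchronous updates the abstract refers to are the one-state-action-at-a-time value updates used by Q-learning. As a minimal illustrative sketch (not the paper's construction), the following tabular Q-learning loop on a small assumed chain MDP shows the update rule whose convergence such analyses address; the environment, hyperparameters, and function name are all assumptions for illustration.

```python
import random

def q_learning_chain(n=4, episodes=5000, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning on a toy deterministic chain MDP.

    States 0..n-1; action 0 moves left (floored at 0), action 1 moves
    right; reward 1 is received on entering the terminal state n-1.
    """
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n)]
    for _ in range(episodes):
        s = rng.randrange(n - 1)  # random nonterminal start state
        while s != n - 1:
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda act: Q[s][act])
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n - 1 else 0.0
            # asynchronous update: only the visited (s, a) entry changes
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = q_learning_chain()
```

After enough episodes the greedy policy moves right in every nonterminal state, i.e. `Q[s][1] > Q[s][0]` for `s < n-1`, matching the optimal policy of this toy chain.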