A generalization error for Q-Learning (2005)
| Venue: | Journal of Machine Learning Research |
| Citations: | 13 - 5 self |
BibTeX
@ARTICLE{Murphy05ageneralization,
author = {Susan A. Murphy},
title = {A generalization error for Q-Learning},
journal = {Journal of Machine Learning Research},
year = {2005},
volume = {6},
pages = {1073--1097}
}
Abstract
Planning problems that involve learning a policy from a single training set of finite horizon trajectories arise in both social science and medical fields. We consider Q-learning with function approximation for this setting and derive an upper bound on the generalization error. This upper bound is in terms of quantities minimized by a Q-learning algorithm, the complexity of the approximation space and an approximation term due to the mismatch between Q-learning and the goal of learning a policy that maximizes the value function.







