Learning to predict by the methods of temporal differences (1988)

by Richard S. Sutton
Venue: MACHINE LEARNING