Learning to predict by the method of temporal differences (1988)

by R S Sutton
Venue: Machine Learning