On the convergence of temporal-difference learning with linear function approximation (2001)

by V Tadić
Venue: Machine Learning