On the Convergence of Temporal-Difference Learning with Linear Function Approximation (2001)

by V Tadic