An analysis of temporal-difference learning with function approximation (1997)

by J N Tsitsiklis, B Van Roy
Venue: IEEE Transactions on Automatic Control