Off-policy temporal-difference learning with function approximation (2001)

by Doina Precup, Richard S. Sutton, Sanjoy Dasgupta
Venue: Proceedings of the 18th International Conference on Machine Learning
Citations: 62 - 12 self