Temporal difference learning and TD-Gammon (1995)

by G J Tesauro
Venue: Communications of the ACM