On-line policy improvement using Monte-Carlo search (1996)

by G Tesauro, G R Galperin
Venue: In 9th Conference on Advances in Neural Information Processing