Learning to act using real-time dynamic programming (1993)

by Andrew G. Barto, Steven J. Bradtke, Satinder P. Singh
Venue:
Citations: 533 - 18 self