Real-time learning and control using asynchronous dynamic programming (1991)

by A G Barto, S J Bradtke, S P Singh