Learning and sequential decision making (1989)

by A G Barto, R S Sutton, C J C H Watkins