Reinforcement learning is direct adaptive optimal control (1992)

by R S Sutton, A G Barto, R J Williams
Venue: IEEE Control Systems