Gradient descent for general reinforcement learning (1999)

by L Baird, A W Moore
Venue: Advances in Neural Information Processing Systems