Learning from Delayed Reinforcement in a Complex Domain (1991)

by D Chapman, L P Kaelbling
Venue: Proc. of the IJCAI