Value function approximation in average reward reinforcement learning

by S Mahadevan, L Baird
Venue: In preparation