Average reward reinforcement learning: Foundations, algorithms, and empirical results (1996)

by S Mahadevan
Venue: Machine Learning, 22(1–3), 159–195.