A convergent form of approximate policy iteration (2003)

by T J Perkins, D Precup
Venue: In Advances in Neural Information Processing Systems