Optimistic linear programming gives logarithmic regret for irreducible MDPs (2007)

by Ambuj Tewari, Peter L. Bartlett
Venue: In Proceedings of Neural Information Processing Systems Conference (NIPS
Citations: 13 - 0 self