Online regret bounds for undiscounted continuous reinforcement learning (2012)

by Ronald Ortner, Daniil Ryabko
Venue: Advances in Neural Information Processing Systems (NIPS)
Citations: 11 (4 self)