Online Markov decision processes under bandit feedback. (2010)

by Gergely Neu, Csaba Szepesvári, András Antos
Venue: Advances in Neural Information Processing Systems 23, 2010
Citations: 18 (6 self)