Deterministic Better Rates for Any Adversarial Deterministic MDP MDPs with adversarial rewards and bandit feedback. (2012)
by Ofer Dekel, Elad Hazan
Venue: In Proceedings of the 28th Conference on Uncertainty in Artificial Intelligence
Citations: 1 (1 self)