Deterministic Better Rates for Any Adversarial Deterministic MDP MDPs with adversarial rewards and bandit feedback. (2012)

by Ofer Dekel, Elad Hazan
Venue: In Proceedings of the 28th Conference on Uncertainty in Artificial Intelligence
Citations: 1 - 1 self