An asymptotically optimal bandit algorithm for bounded support models (2010)

by Junya Honda, Akimichi Takemura
Venue: In 23rd Conf. on Learning Theory (COLT)