Tracking the best expert. In Proceedings of the 12th International Conference on Machine Learning, 1995.
Abstract

Cited by 246 (20 self)
Abstract. We generalize the recent relative loss bounds for online algorithms where the additional loss of the algorithm on the whole sequence of examples over the loss of the best expert is bounded. The generalization allows the sequence to be partitioned into segments, and the goal is to bound the additional loss of the algorithm over the sum of the losses of the best experts for each segment. This is to model situations in which the examples change and different experts are best for certain segments of the sequence of examples. In the single segment case, the additional loss is proportional to log n, where n is the number of experts and the constant of proportionality depends on the loss function. Our algorithms do not produce the best partition; however, the loss bound shows that our predictions are close to those of the best partition. When the number of segments is k + 1 and the sequence is of length ℓ, we can bound the additional loss of our algorithm over the best partition by O(k log n + k log(ℓ/k)). For the case when the loss per trial is bounded by one, we obtain an algorithm whose additional loss over the loss of the best partition is independent of the length of the sequence. The additional loss becomes O(k log n + k log(L/k)), where L is the loss of the best partition with k + 1 segments. Our algorithms for tracking the predictions of the best expert are simple adaptations of Vovk’s original algorithm for the single best expert case. As in the original algorithms, we keep one weight per expert, and spend O(1) time per weight in each trial.
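The abstract's recipe (one weight per expert, a Vovk-style exponential loss update, and a sharing step so weight can move back to experts that become best in a later segment) can be sketched as follows. This is a minimal illustration of a Fixed-Share-style update, not the paper's exact algorithm; the function name, the learning rate eta, and the share rate alpha are assumptions for the example.

```python
import numpy as np

def fixed_share(losses, alpha, eta=1.0):
    """Sketch of a Fixed-Share-style tracking update.

    losses: (T, n) array of per-trial losses, one column per expert.
    alpha:  share rate in [0, 1]; alpha = 0 recovers the static
            single-best-expert (Vovk-style) weighting.
    eta:    learning rate for the exponential loss update.
    Returns the (T, n) weight vectors used before each trial.
    """
    T, n = losses.shape
    w = np.full(n, 1.0 / n)        # uniform prior over experts
    history = np.empty((T, n))
    for t in range(T):
        history[t] = w
        # Loss update: multiply each weight by exp(-eta * loss).
        v = w * np.exp(-eta * losses[t])
        v /= v.sum()
        # Share update: each expert donates a fraction alpha of its
        # weight to a common pool split evenly over all experts, so
        # no expert's weight can decay to zero between segments.
        w = (1.0 - alpha) * v + alpha / n
    return history
```

Note the O(1) work per weight per trial claimed in the abstract: each step is one multiply, one normalization share, and one mixing step over the n weights.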
A note on parameter tuning for online shifting algorithms, 2003.
Abstract

Cited by 5 (0 self)
ABSTRACT. In this short note, building on ideas of M. Herbster [2], we propose a method for automatically tuning the parameter of the FIXEDSHARE algorithm proposed by Herbster and Warmuth [3] in the context of online learning with shifting experts. We show that this can be done with a memory requirement of O(nT) and that the additional loss incurred by the tuning is the same as the loss incurred for estimating the parameter of a Bernoulli random variable. 1. SETTING. Our setting is the same as in [3]. We consider n experts; at each time period t = 1, ..., T they make predictions and incur a loss L(t, i) (i being the index of the expert), which we model here as a negative log-likelihood, so that the probability that expert i makes a correct prediction at time t is e^{-L(t,i)}. We consider a Bayesian setting which we will use to motivate the updates. In each step one expert is supposed to be better than the other ones (i.e.
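The setting above identifies e^{-L(t,i)} with the likelihood of expert i's prediction, so the Bayesian motivation amounts to a posterior update over which expert is best. A minimal sketch of that update, assuming a fixed best expert (no shifting) purely to illustrate the likelihood interpretation; the function name and arguments are hypothetical:

```python
import math

def bayes_posterior(prior, loss_rows):
    """Posterior over 'which expert is best', one-best-expert sketch.

    Treats e^{-L(t, i)} as the likelihood of expert i's prediction
    at time t, as in the setting above.
    prior:     prior weights over the n experts (sums to 1).
    loss_rows: iterable of per-trial loss vectors L(t, .).
    """
    post = list(prior)
    for row in loss_rows:
        # Bayes step: multiply by the likelihood e^{-L(t, i)} ...
        post = [p * math.exp(-l) for p, l in zip(post, row)]
        # ... and renormalize to a probability vector.
        z = sum(post)
        post = [p / z for p in post]
    return post
```

The FIXEDSHARE algorithm extends this posterior update with a shifting (sharing) step governed by the parameter that the note proposes to tune automatically.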