## Combining Initial Segments of Lists

Citations: 1 (0 self)

### BibTeX

@MISC{Warmuth_combininginitial,
  author = {Manfred K. Warmuth and Wouter M. Koolen and David P. Helmbold},
  title = {Combining Initial Segments of Lists},
  year = {}
}

### Abstract

We propose a new way to build a combined list from K base lists, each containing N items. A combined list consists of top segments of various sizes from each base list, so that the total size of all top segments equals N. A sequence of item requests is processed, and the goal is to minimize the total number of misses. That is, we seek to build a combined list that contains all the frequently requested items. We first consider the special case of disjoint base lists. There, we design an efficient algorithm that computes the best combined list for a given sequence of requests. In addition, we develop a randomized online algorithm whose expected number of misses is close to that of the best combined list chosen in hindsight. We prove lower bounds showing that the expected number of misses of our randomized algorithm is close to the optimum. In the presence of duplicate items, we show that computing the best combined list is NP-hard, but that our algorithms still apply to a linearized notion of loss in this case. We expect that this new way of aggregating lists will find many ranking applications.
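For the disjoint-lists case, the offline problem in the abstract can be stated very concretely. Below is a minimal brute-force sketch of it (exponential in K, purely for illustration; the paper's actual offline algorithm is an efficient one, and the function names and data layout here are assumptions):

```python
from itertools import product

def misses(base_lists, sizes, requests):
    """Count requests falling outside the combined list built from the
    top sizes[k] items of each (disjoint) base list."""
    combined = set()
    for lst, n in zip(base_lists, sizes):
        combined.update(lst[:n])
    return sum(1 for r in requests if r not in combined)

def best_combined(base_lists, requests):
    """Brute-force the segment-size profile (summing to N) that
    minimizes the number of misses on the request sequence."""
    K = len(base_lists)
    N = len(base_lists[0])
    best = None
    for sizes in product(range(N + 1), repeat=K):
        if sum(sizes) != N:
            continue
        m = misses(base_lists, sizes, requests)
        if best is None or m < best[0]:
            best = (m, sizes)
    return best
```

For example, with two disjoint lists of two items each, the best profile allocates one top slot to each list when requests alternate between them.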

### Citations

2492 | A decision-theoretic generalization of on-line learning and an application to boosting
- Freund, Schapire
- 1995
Citation Context: ...perform well experimentally [Sca07] and beat the previous deterministic heuristics developed in [MM03, MM04]. Probabilistic Algorithms Based on Sampling. Our approach is to apply the Hedge algorithm [FS97] to the exponentially many combined lists. It uses the same exponential weights that are proportional to β^M(c) for list c, but now we simply pick a combined list at random according to the mixture coe...
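The context above applies Hedge with weights proportional to β^M(c). Over explicitly enumerated experts (not the paper's implicitly maintained, exponential-size list of combined lists) the sampling step might look like this; the names and toy setup are assumptions:

```python
import random

def hedge_sample(cumulative_losses, beta, rng=random):
    """Draw an expert index with probability proportional to
    beta ** (cumulative loss), i.e. the Hedge mixture weights."""
    weights = [beta ** m for m in cumulative_losses]
    total = sum(weights)
    r = rng.random() * total
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r <= acc:
            return i
    return len(weights) - 1  # guard against floating-point round-off
```

With β < 1 the low-loss experts dominate the sample: an expert with 10 extra units of loss at β = 0.5 is drawn only about once per thousand trials.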

701 | The weighted majority algorithm
- Littlestone, Warmuth
- 1994
Citation Context: ...e item did not appear on either base list is trivial. Since at least half of the total weight is multiplied by β, an analysis paralleling the analysis of the deterministic Weighted Majority algorithm [LW94] gives the following loss bound after tuning β based on N and the budget B, i.e. the loss of the best combined list in hindsight: 2B + O(√(B ln(N+1)) + ln(N+1)). There are two problems with this dete...
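The Weighted Majority analysis referenced here follows the standard multiplicative-weights pattern: predict with the weighted vote, then multiply the weight of every wrong expert by β. A minimal textbook sketch for binary prediction (not the paper's combined-list instantiation):

```python
def weighted_majority(predictions, outcomes, beta=0.5):
    """Deterministic Weighted Majority.
    predictions[t] is the list of expert predictions (0/1) at trial t;
    outcomes[t] is the true label. Returns the algorithm's mistake count."""
    n_experts = len(predictions[0])
    w = [1.0] * n_experts
    mistakes = 0
    for preds, y in zip(predictions, outcomes):
        vote1 = sum(wi for wi, p in zip(w, preds) if p == 1)
        vote0 = sum(wi for wi, p in zip(w, preds) if p == 0)
        guess = 1 if vote1 >= vote0 else 0
        if guess != y:
            mistakes += 1
        # every wrong expert loses a beta factor of its weight
        w = [wi * beta if p != y else wi for wi, p in zip(w, preds)]
    return mistakes
```

Each mistake halves (at β = 1/2) at least half the total weight, which is what drives loss bounds of the 2B + O(√(B ln N) + ln N) form quoted above.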

687 | Learning quickly when irrelevant attributes abound: A new linear threshold algorithm
- Littlestone
- 1988
Citation Context: ...c. This approach parallels how NP-hardness was avoided when learning disjunctions. There the 0/1 loss was replaced by the attribute loss, which decomposes into a sum over the literals in the disjunction [Lit88]. This linearization of the loss is extensively discussed in [TW03]. Our efficient implementation of Hedge is based on the decomposition of the miss count and remains unchanged in the setting when ite...

204 | Tracking the best expert
- Herbster, Warmuth
- 1995
Citation Context: ...GWBA02]. It is a tedious but straightforward exercise to mix in a bit of the uniform distribution into the dynamic programming algorithm of Section 4, thus implementing the Fixed Share algorithm from [HW98]. The method is again based on recurrences, and maintains all weights implicitly. However, it adds an O(T²) factor to the running time of our current algorithm. We don't know how to do the fancier m...
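The Fixed Share step alluded to above (a multiplicative loss update followed by mixing in a bit of the uniform distribution) can be written out for an explicit weight vector; the paper maintains these weights implicitly via recurrences, so this is only an illustrative sketch:

```python
def fixed_share_update(weights, losses, beta, alpha):
    """One Fixed Share step: multiply each weight by beta**loss,
    normalize, then shift a fraction alpha of the mass to uniform
    so the algorithm can track a switching best expert."""
    w = [wi * beta ** li for wi, li in zip(weights, losses)]
    total = sum(w)
    n = len(w)
    return [(1 - alpha) * wi / total + alpha / n for wi in w]
```

The uniform mixing keeps every expert's weight bounded below by α/n, which is what lets the algorithm recover quickly when the identity of the best expert changes.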

138 | Efficient algorithms for online decision problems
- Kalai, Vempala
Citation Context: ...in the conclusion Section 7. Hedge vs. Follow the Perturbed Leader. The two main flavors of efficient online algorithms for dealing with a linear loss are Hedge [FS97] and Follow the Perturbed Leader [KV05]. Hedge-based algorithms usually have slightly better loss bounds, whereas FPL-type algorithms typically have computational advantages. In this paper, completely contrary to those general rules, we pr...

109 | A game of prediction with expert advice - Vovk - 1998

64 | Path kernels and multiplicative updates
- Takimoto, Warmuth
Citation Context: ...orithm in relation to the loss of the best list chosen in hindsight. Our contribution lies in the efficient implementation of the sampling version of the Hedge algorithm for combined lists. Following [TW03], we crucially use the fact that in our application, the loss M(c) of a combined list c decomposes into a sum: M(c) = ∑_{k=1}^{K} M_{c_k,k}. Thus the weight of c is proportional to the product ∏_{k=1}^{K} β^{M_{c_k,k}}. ...
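The decomposition in the context above is easy to check numerically: because the miss count is a sum over lists, the Hedge weight β^M(c) factors into a product of per-list terms. A small sketch (the miss-matrix layout M[k][n], meaning the misses attributed to using the top-n segment of list k, is an assumption for illustration):

```python
def weight_product(M, c, beta):
    """Weight of combined list c = (c_1, ..., c_K) computed factor by
    factor: since M(c) = sum_k M[k][c_k], we have
    beta**M(c) = prod_k beta**M[k][c_k]."""
    w = 1.0
    for k, n in enumerate(c):
        w *= beta ** M[k][n]
    return w

def weight_direct(M, c, beta):
    """The same weight computed from the total miss count."""
    return beta ** sum(M[k][n] for k, n in enumerate(c))
```

This factorization is exactly what makes it possible to maintain the exponentially many combined-list weights implicitly, one factor per list.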

61 | Tracking a small set of experts by mixing past posteriors - Bousquet, Warmuth - 2002

53 | Adaptive disk spin-down for mobile computers - Helmbold, Long, et al.

49 | Competing in the dark: An efficient algorithm for bandit linear optimization - Abernethy, Hazan, et al.

31 | Adaptive caching by refetching
- Gramacy, Warmuth, et al.
- 2003
Citation Context: ...ves the performance of its combined cache. Previous Work. Methods for building a real cache by combining a number of virtually maintained caching strategies using exponential weights were explored in [GWBA02]. The employed heuristics performed well experimentally; however, no performance bounds were proven. The idea for building a real cache by combining the tops of two virtual lists was first developed in...

31 | Outperforming LRU with an Adaptive Replacement Cache Algorithm - Megiddo, Modha - 2004

26 |
An algorithm for the machine calculation of complex Fourier series
- Cooley, Tukey
- 1965

Citation Context: ...<k≤K where ∗ denotes real discrete convolution, i.e. (x ∗ y)_n = ∑_i x_{n−i} y_i. Since the convolution of two vectors of length N each can be performed in time O(N ln N) using the Fast Fourier Transform [CT65], we can tabulate Z_{⋆,⋆} in time O(KN ln N). A similar approach was used in [KM05] to obtain a similar speedup in the completely different context of computing the stochastic complexity of the Multi...
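The FFT-based convolution invoked in the context above can be sketched generically (a plain O(N log N) linear convolution via NumPy, not the paper's full dynamic program for tabulating Z):

```python
import numpy as np

def conv_fft(x, y):
    """Linear convolution (x * y)_n = sum_i x_{n-i} y_i in O(N log N):
    zero-pad to a power of two, multiply pointwise in the frequency
    domain, and transform back."""
    n = len(x) + len(y) - 1
    size = 1 << (n - 1).bit_length()  # next power of two >= n
    fx = np.fft.rfft(x, size)
    fy = np.fft.rfft(y, size)
    return np.fft.irfft(fx * fy, size)[:n]
```

Replacing the naive O(N²) convolution with this routine at each of the K combining steps is what yields the O(KN ln N) tabulation time quoted above.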

22 | Adaptive online prediction by following the perturbed leader
- Hutter, Poland
Citation Context: ...ist chosen in hindsight. At trial t, FPL adds a random perturbation matrix to M_{<t} and chooses the best combined list with respect to that perturbed miss matrix. FPL has slightly weaker regret bounds [HP05] than Hedge, but is usually faster. Surprisingly, we show in the next section that for combined lists the Hedge algorithm is actually faster: it requires O(KN ln N) time instead of O(KN² ln N) for FPL. ...
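The FPL step described above, over explicitly enumerated experts, reduces to perturb-then-argmin. A minimal sketch with exponentially distributed perturbations (a common FPL variant; the paper's version perturbs a whole miss matrix, which this toy version does not capture):

```python
import random

def fpl_choose(cumulative_losses, epsilon, rng=random):
    """Follow the Perturbed Leader: subtract i.i.d. Exp(epsilon) noise
    from each expert's cumulative loss and follow the minimizer."""
    perturbed = [m - rng.expovariate(epsilon) for m in cumulative_losses]
    return min(range(len(perturbed)), key=perturbed.__getitem__)
```

Smaller ε means larger perturbations and more exploration; as ε grows, the choice approaches the plain follow-the-leader argmin.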

21 | A gang of bandits - Cesa-Bianchi, Gentile, et al. - 2013

16 | Hedging structured concepts
- Koolen, Warmuth
- 2010
Citation Context: ...[0,K]. We also discuss an alternate method for solving the problem with duplicates based on the online shortest path problem and using Component Hedge [KWK10]. This algorithm achieves regret O(√(BK log N)) for the problem with duplicates (i.e. no additional range factor). However, we do not know how to bound the running time for the iterative projection com...

14 | Tracking the best of many experts - György, Linder, et al.

5 | Regret minimization for online buffering problems using the weighted majority algorithm
- Geulen, Voecking, et al.
- 2010
Citation Context: ...costs, and this is particularly egregious for the probabilistic algorithms. The Shrinking Dartboard Algorithm, a method for lazily updating the expert followed by the Hedge algorithm, was developed in [GVW10], and seems quite readily applicable to our setting. The Follow-the-Lazy-Leader algorithm [KV05] is another method requiring fewer updates. Also some good practical heuristics for reloading were given...

5 | A fast normalized maximum likelihood algorithm for multinomial data
- Kontkanen, Myllymäki
- 2005
Citation Context: ...the convolution of two vectors of length N each can be performed in time O(N ln N) using the Fast Fourier Transform [CT65], we can tabulate Z_{⋆,⋆} in time O(KN ln N). A similar approach was used in [KM05] to obtain a similar speedup in the completely different context of computing the stochastic complexity of the Multinomial model. ...

5 | One up on LRU. ;login: - Megiddo, Modha - 2003

3 | Adaptive caching by experts - Gramacy - 2003

2 |
Optimal strategies for random walks
- Abernethy, Warmuth, et al.
- 2008
Citation Context: .... The game ends when one expert has made at least B + 1 mistakes. Let g₂(B) be the largest total loss the Environment can force the Algorithm to incur in this game. It can be shown based on results in [AWY08] that for any integer budget B, g₂(B) ≥ B + √(B/π). We now prove a lower bound of B + √(B log₂(N+1)/π) on the regret in the K = 2 list case against any randomized algorithm A. We do this by constructin...

1 |
An adaptive caching algorithm
- Scalisi
- 2007
Citation Context: ...cause we show in this paper that any deterministic algorithm can be forced to incur loss KB, where B is the loss of the best. Nevertheless, algorithms based on this approach perform well experimentally [Sca07] and beat the previous deterministic heuristics developed in [MM03, MM04]. Probabilistic Algorithms Based on Sampling. Our approach is to apply the Hedge algorithm [FS97] to the exponentially many com...

1 |
Making online predictions from k-lists. Project report for
- Sen, Scalisi
- 2007
Citation Context: ...Bounds. In the noise-free case (there is a combined list with no misses), the minimax regret for deterministic algorithms was proven to be Ω(K ln N) with the first author's students in a class project [SS07]. Here we focus on the noisy case. We begin with a simple adversary argument against any deterministic algorithm and then turn to lower bounds for randomized algorithms. We first show a simple lower b...

1 | Necklaces, convolutions, and X + Y - Bremner, Chan, et al. - 2006

1 | Combinatorial bandits - Cesa-Bianchi, Lugosi - 2009

1 | Tracking the best of many experts - György, Linder, Lugosi - 2005