The Weighted Majority Algorithm (1992) [441 citations — 38 self]

by Nick Littlestone ,  Manfred K. Warmuth

Abstract:

We study the construction of prediction algorithms in a situation in which a learner faces a sequence of trials, with a prediction to be made in each, and the goal of the learner is to make few mistakes. We are interested in the case that the learner has reason to believe that one of some pool of known algorithms will perform well, but the learner does not know which one. A simple and effective method, based on weighted voting, is introduced for constructing a compound algorithm in such a circumstance. We call this method the Weighted Majority Algorithm. We show that this algorithm is robust in the presence of errors in the data. We discuss various versions of the Weighted Majority Algorithm and prove mistake bounds for them that are closely related to the mistake bounds of the best algorithms of the pool. For example, given a sequence of trials, if there is an algorithm in the pool A that makes at most m mistakes then the Weighted Majority Algorithm will make at most c(log |A| + m) mi...
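The weighted-voting idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes binary (0/1) predictions, one initial weight of 1 per pool member, and a multiplicative penalty `beta` applied to every member that errs on a trial. The function name `weighted_majority`, the parameter names, and the tie-breaking rule (ties go to 1) are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def weighted_majority(
    expert_predictions: Sequence[Sequence[int]],
    outcomes: Sequence[int],
    beta: float = 0.5,
) -> Tuple[int, List[float]]:
    """Run a basic weighted-majority vote over a pool of predictors.

    expert_predictions[t][i] is the 0/1 prediction of pool member i on trial t.
    outcomes[t] is the correct 0/1 label revealed after trial t.
    beta (assumed here to satisfy 0 <= beta < 1) scales the weight of each
    pool member that predicted incorrectly.
    Returns the number of mistakes made by the compound algorithm and the
    final weights.
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    if not expert_predictions:
        return 0, []

    # Every pool member starts with the same weight.
    weights = [1.0] * len(expert_predictions[0])
    mistakes = 0

    for preds, label in zip(expert_predictions, outcomes):
        # Total weight voting for each of the two possible predictions.
        weight_for_one = sum(w for w, p in zip(weights, preds) if p == 1)
        weight_for_zero = sum(w for w, p in zip(weights, preds) if p == 0)

        # Predict with the heavier side; ties go to 1 (an arbitrary choice).
        guess = 1 if weight_for_one >= weight_for_zero else 0
        if guess != label:
            mistakes += 1

        # Penalize every pool member that was wrong on this trial.
        weights = [
            w * beta if p != label else w for w, p in zip(weights, preds)
        ]

    return mistakes, weights
```

Called with one row of pool predictions per trial and the matching list of outcomes, the function returns the compound algorithm's mistake count together with the final weights. Members that err often end up with weights close to zero, so the vote comes to be dominated by the best-performing members of the pool.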

Citations

1422 Probability and Measure – Billingsley - 1995
528 Learnability and the Vapnik-Chervonenkis dimension – Blumer, Ehrenfeucht, et al. - 1989
503 Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm – Littlestone - 1988
92 Mistake bounds and logarithmic linear-threshold learning algorithms – Littlestone - 1989
88 Bounds on the sample complexity of Bayesian learning using information theory and the VC dimension – Haussler, Kearns, et al. - 1994
60 From on-line to batch learning – Littlestone - 1989
45 On the prediction of general recursive functions – Barzdin, Freivald - 1972
45 A statistical approach to learning and generalization in layered neural networks – Levin, Tishby, et al. - 1990
34 Learning nested differences of intersection-closed classes – Helmbold, Sloan, et al. - 1989
32 Learning Probabilistic Prediction Functions – DeSantis, Markowski, et al. - 1992
14 Calculation of the learning curve of Bayes optimal classification algorithm for learning a perceptron with noise – Opper, Haussler - 1991
9 Trade-off among parameters affecting inductive inference – Freivalds, Smith, et al. - 1989
3 Prognozirovanie i predel'nyi sintez effektivno perechislimykh klassov funktsii (Prediction and limit synthesis of effectively enumerable classes of functions) – Barzdin, Freivalds - 1974