## Potential-based Algorithms in On-line Prediction and Game Theory

### Download Links

- [www.econ.upf.es]
- [mercurio.srv.dsi.unimi.it]
- [mercurio.srv.di.unimi.it]
- [www.neurocolt.org]
- DBLP

### Other Repositories/Bibliography

Citations: 32 (4 self)

### BibTeX

@MISC{Cesa-Bianchi_potential-basedalgorithms,
  author = {Nicolo Cesa-Bianchi and Gabor Lugosi},
  title = {Potential-based Algorithms in On-line Prediction and Game Theory},
  year = {}
}

### Abstract

In this paper we show that several known algorithms for sequential prediction problems (including Weighted Majority and the quasi-additive family of Grove, Littlestone, and Schuurmans), for playing iterated games (including Freund and Schapire's Hedge and MW, as well as the Λ-strategies of Hart and Mas-Colell), and for boosting (including AdaBoost) are special cases of a general decision strategy based on the notion of potential. By analyzing this strategy we derive known performance bounds, as well as new bounds, as simple corollaries of a single general theorem. Besides offering a new and unified view of a large family of algorithms, we establish a connection between potential-based analyses in learning theory and their counterparts independently developed in game theory. By exploiting this connection, we show that certain learning problems are instances of more general game-theoretic problems. In particular, we describe a notion of generalized regret and show its applications in learning theory.
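
As a concrete illustration of the potential-based viewpoint (a sketch, not code from the paper), the exponential potential Φ(R) = (1/η) ln ∑_i e^{η R_i} yields Hedge/Weighted-Majority-style weights by normalizing its gradient; the function name and the value of η below are illustrative:

```python
import math

def exp_potential_weights(regrets, eta):
    """Weights proportional to the gradient of the exponential potential
    Phi(R) = (1/eta) * ln(sum_i exp(eta * R_i)): w_i ∝ exp(eta * R_i)."""
    m = max(regrets)  # shift by the max for numerical stability
    w = [math.exp(eta * (r - m)) for r in regrets]
    s = sum(w)
    return [x / s for x in w]

# Toy round: expert 0 has the largest cumulative regret, so it receives
# the largest share of the weight.
p = exp_potential_weights([2.0, 0.0, -1.0], eta=1.0)
```

A finite η smooths the "follow the expert with maximal regret" choice; letting η grow recovers the unsmoothed pick.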

### Citations

8564 | Elements of Information Theory
- Cover, Thomas
- 2006
Citation Context: ...Φ(R_t) ≤ ln N + (η²/2) X²_∞ M_t. To obtain a lower bound on ln Φ(R_t), consider any vector v_0 of convex coefficients. Then we use the well-known "log-sum inequality" (see Cover & Thomas, 1991, p. 29), which implies that, for any vectors u, v ∈ R^N of nonnegative numbers with ∑_{i=1}^N v_i = 1, ln ∑_{i=1}^N u_i ≥ ∑_{i=1}^N v_i ln u_i + H(v), where H(v) = −∑_{i=1}^N v_i ln v_i is the entropy of v. Therefore...
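
The log-sum inequality invoked in this passage, ln ∑_i u_i ≥ ∑_i v_i ln u_i + H(v) for any probability vector v and positive vector u, is easy to verify numerically; the vectors below are arbitrary illustrative choices:

```python
import math

def entropy(v):
    """H(v) = -sum_i v_i ln v_i for a probability vector v."""
    return -sum(vi * math.log(vi) for vi in v if vi > 0)

def log_sum_gap(u, v):
    """ln(sum_i u_i) minus the lower bound sum_i v_i ln u_i + H(v);
    the log-sum inequality says this is always >= 0."""
    lhs = math.log(sum(u))
    rhs = sum(vi * math.log(ui) for vi, ui in zip(v, u)) + entropy(v)
    return lhs - rhs

gap = log_sum_gap([0.5, 2.0, 3.0], [0.2, 0.3, 0.5])  # nonnegative
```

Equality holds exactly when v is u normalized to sum to one, i.e. v_i = u_i / ∑_j u_j.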

2309 | A decision-theoretic generalization of on-line learning and an application to boosting
- Freund, Schapire
- 1997
Citation Context: ...introducing randomization). In learning theory, algorithms based on the exponential potential have been intensively studied and applied to a variety of problems; see, e.g., (Cesa-Bianchi et al., 1997; Freund & Schapire, 1997; Littlestone & Warmuth, 1994; Vovk, 1990, 1998). If r_t ∈ [−1, 1]^N for all t, then the choice p = 2 ln N for the polynomial potential yields the bound max_{1≤i≤N} R_{i,t} ≤ √((2 ln N − 1) ∑_{s=1}^t ∑_{i=1}^N |r_{i,s}|^(2 ln N)...
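
For contrast with the exponential potential discussed in this snippet, the polynomial potential behind the quoted bound, Φ(u) = ∑_i max(u_i, 0)^p, gives weights proportional to its partial derivatives max(R_i, 0)^{p−1}. A sketch under that standard definition, with p defaulting to the 2 ln N choice mentioned in the quote (function name illustrative):

```python
import math

def poly_potential_weights(regrets, p=None):
    """Normalized gradient of the polynomial potential
    Phi(R) = sum_i max(R_i, 0)**p; experts whose regret is
    non-positive receive zero weight."""
    n = len(regrets)
    if p is None:
        p = 2 * math.log(n)  # the choice quoted above
    g = [max(r, 0.0) ** (p - 1) for r in regrets]
    s = sum(g)
    return [x / s for x in g] if s > 0 else [1.0 / n] * n

w = poly_potential_weights([1.0, 0.5, -0.2])
```

Unlike the exponential potential, this rule ignores experts that are currently no better than the forecaster (negative regret).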

669 | The weighted majority algorithm
- Littlestone, Warmuth
- 1994
Citation Context: ...). In learning theory, algorithms based on the exponential potential have been intensively studied and applied to a variety of problems; see, e.g., (Cesa-Bianchi et al., 1997; Freund & Schapire, 1997; Littlestone & Warmuth, 1994; Vovk, 1990, 1998). If r_t ∈ [−1, 1]^N for all t, then the choice p = 2 ln N for the polynomial potential yields the bound max_{1≤i≤N} R_{i,t} ≤ √((2 ln N − 1) ∑_{s=1}^t ∑_{i=1}^N |r_{i,s}|^(2 ln N)...

413 | Large margin classification using the perceptron algorithm
- Freund, Schapire
- 1999
Citation Context: ...somewhat stronger than ours was proven by Gentile (2001). For an arbitrary sequence (x_1, y_1), ..., (x_t, y_t) of labeled attribute vectors, let D_t = ∑_{s=1}^t max{0, γ − y_s x_s · v_0} be the total deviation (Freund & Schapire, 1999b; Gentile, 2001; Gentile & Warmuth, 1999) of v_0 ∈ R^N with respect to a given margin γ > 0. Each term in the sum defining D_t tells whether, and by how much, the linear threshold classifier based on wei...
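
The total deviation D_t = ∑_{s=1}^t max{0, γ − y_s x_s · v_0} defined in this passage is a direct hinge-style computation; a transcription with an illustrative toy sample:

```python
def total_deviation(samples, v0, gamma):
    """D_t = sum over (x, y) of max(0, gamma - y * <x, v0>): how far
    v0 falls short of classifying each example with margin gamma."""
    dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
    return sum(max(0.0, gamma - y * dot(x, v0)) for x, y in samples)

samples = [([1.0, 0.0], 1), ([0.0, 1.0], -1)]
d = total_deviation(samples, [0.5, -0.5], gamma=1.0)  # each term is 0.5
```

D_t = 0 exactly when v_0 classifies every example correctly with margin at least γ.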

315 | Non-negative Matrices and Markov Chains
- Seneta
- 1981
Citation Context: ...t) ∑_{i=1}^N ∇_i Φ(R_{t−1}) A_i(k, t). Then condition (11) is implied by Mp = p. As ∇Φ ≥ 0, M is nonnegative, and thus the eigenvector equation Mp = p has a positive solution by the Perron-Frobenius theorem (Seneta, 1981). Acknowledgments: Both authors acknowledge partial support of ESPRIT Working Group EP 27150, Neural and Computational Learning II (NeuroCOLT II). The work of the second author was also supported by D...
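
The Perron-Frobenius step quoted here (a nonnegative matrix M admitting a fixed point Mp = p) can be illustrated by power iteration; the 2×2 column-stochastic matrix below is a hypothetical example, not the paper's construction:

```python
def fixed_point(M, iters=200):
    """Power iteration for the eigenvector equation M p = p, where M is
    nonnegative with spectral radius 1 (e.g., column-stochastic);
    Perron-Frobenius guarantees a nonnegative solution."""
    n = len(M)
    p = [1.0 / n] * n
    for _ in range(iters):
        q = [sum(M[i][j] * p[j] for j in range(n)) for i in range(n)]
        s = sum(q)
        p = [x / s for x in q]
    return p

M = [[0.9, 0.2],
     [0.1, 0.8]]  # each column sums to 1
p = fixed_point(M)  # converges to the stationary vector [2/3, 1/3]
```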

314 | How to use expert advice
- Cesa-Bianchi, Freund, et al.
- 1997
Citation Context: ...ing this choice amounts to introducing randomization). In learning theory, algorithms based on the exponential potential have been intensively studied and applied to a variety of problems; see, e.g., (Cesa-Bianchi et al., 1997; Freund & Schapire, 1997; Littlestone & Warmuth, 1994; Vovk, 1990, 1998). If r_t ∈ [−1, 1]^N for all t, then the choice p = 2 ln N for the polynomial potential yields the...

286 | Principles of Neurodynamics
- Rosenblatt
- 1962
Citation Context: ...estone, and Schuurmans (2001), who used it to define and analyze a new family of algorithms for solving on-line binary classification problems. This family includes, as special cases, the Perceptron (Rosenblatt, 1962) and the zero-threshold Winnow algorithm (Littlestone, 1989). Finally, our abstract decision problem bears some similarities with Schapire's drifting game (Schapire, 2001). The rest of the paper is o...

260 | The relaxation method for finding common points of convex sets and its application to the solution of problems in convex programming
- Bregman
- 1967
Citation Context: ...(1997) and Kivinen and Warmuth (2001), leads to mistake bounds essentially equivalent to ours. This alternative analysis is based on the notion of Bregman divergences (Bregman, 1967). In this section we re-derive a (very general) form of these mistake bounds starting from our notion of potential. Concrete bounds for particular choices of the potential functions can be also deriv...

247 | Context-sensitive learning methods for text categorization - Cohen, Singer - 1999 |

246 | Aggregating strategies
- Vovk
- 1990
Citation Context: ...thms based on the exponential potential have been intensively studied and applied to a variety of problems; see, e.g., (Cesa-Bianchi et al., 1997; Freund & Schapire, 1997; Littlestone & Warmuth, 1994; Vovk, 1990, 1998). If r_t ∈ [−1, 1]^N for all t, then the choice p = 2 ln N for the polynomial potential yields the bound max_{1≤i≤N} R_{i,t} ≤ √((2 ln N − 1) ∑_{s=1}^t ∑_{i=1}^N |r_{i,s}|^(2 ln N)...

220 | A simple adaptive procedure leading to correlated equilibrium
- Hart, Mas-Colell
- 2000
Citation Context: ...practical application of the specialists framework. Example (Internal regret). Here we discuss in detail the special case of the problem of minimizing the so-called internal (or conditional) regret (Hart & Mas-Colell, 2000). Foster and Vohra (1999) survey this notion of regret and its relationship with the external regret (8). Minimization of the internal regret plays a key role in the construction of adaptive game-p...

153 | An analog of the minimax theorem for vector payoffs
- Blackwell
- 1956
Citation Context: ...ient descent approach to sequential decision problems is not new. A prominent example of a decision strategy of this type is the one used by Blackwell to prove his celebrated approachability theorem (Blackwell, 1956), generalizing to vector-valued payoffs von Neumann's minimax theorem. The application of Blackwell's strategy to sequential decision problems, and its generalization to arbitrary potentials, is due...

131 | Adaptive game playing using multiplicative weights
- Freund, Schapire
- 1999
Citation Context: ...somewhat stronger than ours was proven by Gentile (2001). For an arbitrary sequence (x_1, y_1), ..., (x_t, y_t) of labeled attribute vectors, let D_t = ∑_{s=1}^t max{0, γ − y_s x_s · v_0} be the total deviation (Freund & Schapire, 1999b; Gentile, 2001; Gentile & Warmuth, 1999) of v_0 ∈ R^N with respect to a given margin γ > 0. Each term in the sum defining D_t tells whether, and by how much, the linear threshold classifier based on wei...

131 | On convergence proofs on perceptrons
- Novikoff
- 1962
Citation Context: ...tive algorithm of Grove, Littlestone and Schuurmans (whose specific instances are the p-norm Perceptron (Gentile, 2001; Grove, Littlestone, & Schuurmans, 1997), the classical Perceptron (Block, 1962; Novikoff, 1962; Rosenblatt, 1962), and the zero-threshold Winnow algorithm (Littlestone, 1989)) is a special case of our general decision strategy. Then, we derive performance bounds as corollaries of Theorem 1. We...
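
The classical Perceptron cited in this passage (Rosenblatt, 1962; Novikoff, 1962) fits in a few lines: predict with sign(w · x) and add y·x after each mistake. A minimal sketch on an illustrative linearly separable toy set:

```python
def perceptron(samples, epochs=100):
    """Classical Perceptron: on each mistake (y * <w, x> <= 0),
    update w <- w + y * x; stop after a mistake-free pass."""
    dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
    w = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        mistakes = 0
        for x, y in samples:
            if y * dot(w, x) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:
            break
    return w

data = [([1.0, 1.0], 1), ([2.0, 0.5], 1),
        ([-1.0, -1.0], -1), ([-0.5, -2.0], -1)]
w = perceptron(data)
```

Novikoff's theorem bounds the number of mistakes by (X/γ)² when the data are separable with margin γ and all ‖x‖ ≤ X.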

112 | Regret in the on-line decision problem - Foster, Vohra - 1999 |

108 | Mistake bounds and logarithmic linear-threshold learning algorithms
- Littlestone
- 1989
Citation Context: ...alyze a new family of algorithms for solving on-line binary classification problems. This family includes, as special cases, the Perceptron (Rosenblatt, 1962) and the zero-threshold Winnow algorithm (Littlestone, 1989). Finally, our abstract decision problem bears some similarities with Schapire's drifting game (Schapire, 2001). The rest of the paper is organized as follows. In Section 2 a general result is derive...

103 | A game of prediction with expert advice - Vovk - 1995 |

96 | Consistency and cautious fictitious play
- Fudenberg, Levine
- 1995
Citation Context: ...Grove, Littlestone, and Schuurmans (1997), where it was used to define the p-norm Perceptron. The exponential potential is also reminiscent of the smooth fictitious play approach used in game theory (Fudenberg & Levine, 1995) (in fictitious play, the player chooses the pure strategy that is best given the past distribution of the adversary's plays; smoothing this choice amounts to introducing randomization). In learning...

95 | Approximation to Bayes risk in repeated plays, in Contributions to the Theory of Games, vol. III, Annals of Mathematics Studies - Hannan |

92 | Using and combining predictors that specialize
- Freund, Schapire, et al.
- 1997
Citation Context: ...tion permits us to consider a much wider family of prediction problems. Examples include variants of the learning with experts framework, such as "shifting experts" or the more general "specialists" (Freund et al., 1997). In the specialists framework, the activation function A_i(k, t) depends arbitrarily on the round index t but not on the actual predictor's guess k. This setup may be useful to model prediction scena...

86 | Calibrated learning and correlated equilibrium - Foster, Vohra - 1997 |

81 | General convergence results for linear discriminant updates
- Grove, Littlestone, et al.
- 2001
Citation Context: ...2). 4. The quasi-additive algorithm. In this section, we show that the quasi-additive algorithm of Grove, Littlestone and Schuurmans (whose specific instances are the p-norm Perceptron (Gentile, 2001; Grove, Littlestone, & Schuurmans, 1997), the classical Perceptron (Block, 1962; Novikoff, 1962; Rosenblatt, 1962), and the zero-threshold Winnow algorithm (Littlestone, 1989)) is a special case of our general decision strategy. Then, we d...

78 | The Perceptron: A model for brain functioning
- Block
- 1962
Citation Context: ...he quasi-additive algorithm of Grove, Littlestone and Schuurmans (whose specific instances are the p-norm Perceptron (Gentile, 2001; Grove, Littlestone, & Schuurmans, 1997), the classical Perceptron (Block, 1962; Novikoff, 1962; Rosenblatt, 1962), and the zero-threshold Winnow algorithm (Littlestone, 1989)) is a special case of our general decision strategy. Then, we derive performance bounds as corollaries...

74 | Sequential prediction of individual sequences under general loss functions - Haussler, Kivinen, et al. - 1998 |

71 | A General Class of Adaptive Strategies
- Hart, Mas-Colell
- 2001
Citation Context: ...ched by the drifting point when the decision maker uses a strategy satisfying condition (1). This result is inspired by, and partially builds on, Hart and Mas-Colell's analysis of their Λ-strategies (Hart & Mas-Colell, 2001) for playing iterated games and the analysis of quasi-additive algorithms for binary classification by Grove, Littlestone, and Schuurmans (1997). Theorem 1. Let Φ be a twice differentiable additive p...

71 | Relative loss bounds for multidimensional regression problems - Kivinen, Warmuth - 2001 |

63 | Competitive on-line statistics - Vovk |

62 | Adaptive and self-confident on-line learning algorithms - Auer, Cesa-Bianchi, et al. |

57 | Averaging expert predictions - Kivinen, Warmuth - 1999 |

57 | A second-order perceptron algorithm - Cesa-Bianchi, Conconi, et al. - 2005 |

40 | Analysis of two gradient-based algorithms for on-line regression - Cesa-Bianchi - 1999 |

36 | Linear hinge loss and average margin
- Gentile, Warmuth
- 1999
Citation Context: ...y Gentile (2001). For an arbitrary sequence (x_1, y_1), ..., (x_t, y_t) of labeled attribute vectors, let D_t = ∑_{s=1}^t max{0, γ − y_s x_s · v_0} be the total deviation (Freund & Schapire, 1999b; Gentile, 2001; Gentile & Warmuth, 1999) of v_0 ∈ R^N with respect to a given margin γ > 0. Each term in the sum defining D_t tells whether, and by how much, the linear threshold classifier based on weight vector v_0 missed to classify, to with...

32 | Conditional universal consistency - Fudenberg, Levine - 1999 |

23 | Drifting games
- Schapire
- 1999
Citation Context: ...ial cases, the Perceptron (Rosenblatt, 1962) and the zero-threshold Winnow algorithm (Littlestone, 1989). Finally, our abstract decision problem bears some similarities with Schapire's drifting game (Schapire, 2001). The rest of the paper is organized as follows. In Section 2 a general result is derived for the performance of sequential decision strategies satisfying condition (1), and the special cases of the...

21 | A wide range no-regret theorem - Lehrer - 2003 |

8 | Continuous and discrete-time nonlinear gradient descent: Relative loss bounds and convergence - Warmuth, Jagota - 1997 |

1 | The Robustness of the p-norm Algorithms. An extended abstract (co-authored with N. Littlestone) appeared
- Gentile
- 2001
Citation Context: ...nd Gentile (2002). 4. The quasi-additive algorithm. In this section, we show that the quasi-additive algorithm of Grove, Littlestone and Schuurmans (whose specific instances are the p-norm Perceptron (Gentile, 2001; Grove, Littlestone, & Schuurmans, 1997), the classical Perceptron (Block, 1962; Novikoff, 1962; Rosenblatt, 1962), and the zero-threshold Winnow algorithm (Littlestone, 1989)) is a special case of o...
