Improved second-order bounds for prediction with expert advice
In COLT, 2005
Cited by 46 (9 self)
This work studies external regret in sequential prediction games with both positive and negative payoffs. External regret measures the difference between the payoff obtained by the forecasting strategy and the payoff of the best action. In this setting, we derive new and sharper regret bounds for the well-known exponentially weighted average forecaster and for a new forecaster with a different multiplicative update rule. Our analysis has two main advantages: first, no preliminary knowledge about the payoff sequence is needed, not even its range; second, our bounds are expressed in terms of sums of squared payoffs, replacing larger first-order quantities appearing in previous bounds. In addition, our most refined bounds have the natural and desirable property of being stable under rescalings and general translations of the payoff sequence.
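For readers unfamiliar with the forecaster named in this abstract, the following is a minimal sketch of an exponentially weighted average forecaster and of external regret as described above. It assumes full information (every action's payoff is revealed each round) and a fixed, hand-picked learning rate `eta`. Both the function names and the fixed `eta` are illustrative assumptions; the paper's point is precisely that its tuning needs no prior knowledge of the payoffs, and that refined tuning is not reproduced here.

```python
import math


def exp_weights_forecaster(payoffs, eta):
    """Run an exponentially weighted average forecaster.

    payoffs: list of rounds, each a list with one payoff per action
             (payoffs may be positive or negative).
    eta:     fixed learning rate (an assumption of this sketch).
    Returns (expected payoff of the forecaster, payoff of the best action).
    """
    n_actions = len(payoffs[0])
    cumulative = [0.0] * n_actions  # cumulative payoff of each action
    forecaster_total = 0.0
    for round_payoffs in payoffs:
        # Weights are proportional to exp(eta * cumulative payoff);
        # subtracting the maximum avoids overflow without changing the ratios.
        top = max(cumulative)
        weights = [math.exp(eta * (c - top)) for c in cumulative]
        total = sum(weights)
        probs = [w / total for w in weights]
        # Expected payoff of playing an action drawn from probs.
        forecaster_total += sum(p * x for p, x in zip(probs, round_payoffs))
        cumulative = [c + x for c, x in zip(cumulative, round_payoffs)]
    return forecaster_total, max(cumulative)


def external_regret(payoffs, eta):
    """Payoff of the best single action minus the forecaster's payoff."""
    forecaster_total, best = exp_weights_forecaster(payoffs, eta)
    return best - forecaster_total
```

With `eta = 0` the forecaster plays uniformly at random and never adapts; a positive `eta` shifts probability toward actions that have paid well so far, which is what keeps the regret small on sequences where one action is consistently better.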
Regret minimization under partial monitoring
Mathematics of Operations Research, 2004
Cited by 33 (7 self)
We consider repeated games in which the player, instead of observing the action chosen by the opponent in each game round, receives a feedback generated by the combined choice of the two players. We study Hannan consistent players for these games; that is, randomized playing strategies whose per-round regret vanishes with probability one as the number n of game rounds goes to infinity. We prove a general lower bound of $\Omega(n^{-1/3})$ on the convergence rate of the regret, and exhibit a specific strategy that attains this rate on any game for which a Hannan consistent player exists.
Global Nash convergence of Foster and Young's regret testing
Games and Economic Behavior, 2007
Cited by 25 (0 self)
We construct an uncoupled randomized strategy of repeated play such that, if every player plays according to it, mixed action profiles converge almost surely to a Nash equilibrium of the stage game. The strategy requires very little in terms of information about the game, as players' actions are based only on their own past payoffs. Moreover, in a variant of the procedure, players need not know that there are other players in the game and that payoffs are determined through other players' actions. The procedure works for finite generic games and is based on appropriate modifications of a simple stochastic learning rule introduced by Foster and Young [12].
Keywords: Regret testing; Regret-based learning; Random search; Stochastic dynamics; Uncoupled dynamics; Global convergence to
The communication complexity of uncoupled Nash equilibrium procedures
Games and Economic Behavior, 2006
Cited by 21 (1 self)
We study the question of how long it takes players to reach a Nash equilibrium in uncoupled setups, where each player initially knows only his own payoff function. We derive lower bounds on the communication complexity of reaching a Nash equilibrium, i.e., on the number of bits that need to be transmitted, and thus also on the required number of steps. Specifically, we show lower bounds that are exponential in the number of players in each one of the following cases: (1) reaching a pure Nash equilibrium; (2) reaching a pure Nash equilibrium in a Bayesian setting; and (3) reaching a mixed Nash equilibrium. We then show that, in contrast, the communication complexity of reaching a correlated equilibrium is polynomial in the number of players.
A Parameter-free Hedging Algorithm
Cited by 11 (2 self)
We study the problem of decision-theoretic online learning (DTOL). Motivated by practical applications, we focus on DTOL when the number of actions is very large. Previous algorithms for learning in this framework have a tunable learning rate parameter, and a barrier to using online learning in practical applications is that it is not understood how to set this parameter optimally, particularly when the number of actions is large. In this paper, we offer a clean solution by proposing a novel and completely parameter-free algorithm for DTOL. We introduce a new notion of regret, which is more natural for applications with a large number of actions. We show that our algorithm achieves good performance with respect to this new notion of regret; in addition, it also achieves performance close to that of the best bounds achieved by previous algorithms with optimally tuned parameters, according to previous notions of regret.
Exponential Weight Algorithm in Continuous Time
2006
Cited by 3 (1 self)
The exponential weight algorithm was introduced in the framework of discrete-time online problems. Given an observed process $\{X_m\}_{m=1,2,\ldots}$, the input at stage $m + 1$ is an exponential function of the sum $S_m = \sum_{\ell=1}^{m} X_\ell$. We define the analogous algorithm for a continuous-time process $X_t$ and prove similar properties in terms of external or internal consistency. We then deduce results for discrete time from their counterparts in continuous time. Finally, we compare to another continuous-time approximation of a discrete-time exponential algorithm based on the average sum $S_m/m$.
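As a reading aid, the discrete-time rule described in this abstract is commonly written as below. The coordinate index $i$ (one per action), the probability vector $p_{m+1}$, and the rate parameter $\eta > 0$ are notational assumptions of this sketch and do not appear in the abstract.

```latex
S_m = \sum_{\ell=1}^{m} X_\ell,
\qquad
p_{m+1}(i) = \frac{\exp\bigl(\eta\, S_m(i)\bigr)}{\sum_{j} \exp\bigl(\eta\, S_m(j)\bigr)}
```

A continuous-time analog would presumably replace the sum $S_m$ by an integral of $X_t$ over time, but the abstract does not state the exact construction, so that reading is only a guess.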
Recognition Tasks are Imitation Games
Cited by 1 (1 self)
There is a need for more formal specification of recognition tasks. Currently, it is common to use labeled training samples to illustrate the task to be performed. The mathematical theory of games may provide more formal and complete definitions for recognition tasks.