Results 1–10 of 10
Online learning: Beyond regret
, 2011
Abstract

Cited by 18 (5 self)
We study online learnability of a wide class of problems, extending the results of [25] to general notions of performance measure well beyond external regret. Our framework simultaneously captures such well-known notions as internal and general Φ-regret, learning with non-additive global cost functions, Blackwell’s approachability, calibration of forecasters, adaptive regret, and more. We show that learnability in all these situations is due to control of the same three quantities: a martingale convergence term, a term describing the ability to perform well if the future is known, and a generalization of sequential Rademacher complexity, studied in [25]. Since we directly study the complexity of the problem instead of focusing on efficient algorithms, we are able to improve and extend many known results which had previously been derived via algorithmic constructions.
Blackwell approachability and low-regret learning are equivalent
In Proceedings of the 25th Annual Conference on Learning Theory (COLT ’12)
, 2012
Abstract

Cited by 14 (3 self)
We consider the celebrated Blackwell Approachability Theorem for two-player games with vector payoffs. We show that Blackwell’s result is equivalent, via efficient reductions, to the existence of “no-regret” algorithms for Online Linear Optimization. Indeed, we show that any algorithm for one such problem can be efficiently converted into an algorithm for the other. We provide a useful application of this reduction: the first efficient algorithm for calibrated forecasting.
Unified Algorithms for Online Learning and Competitive Analysis
In 25th Annual Conference on Learning Theory
, 2012
Abstract

Cited by 6 (0 self)
Online learning and competitive analysis are two widely studied frameworks for online decision-making settings. Despite the frequent similarity of the problems they study, there are significant differences in their assumptions, goals, and techniques, hindering a unified analysis and a richer interplay between the two. In this paper, we provide several contributions in this direction. We provide a single unified algorithm which, by parameter tuning, interpolates between the optimal regret for learning from experts (in online learning) and the optimal competitive ratio for the metrical task systems (MTS) problem (in competitive analysis), improving on the results of Blum and Burch (1997). The algorithm also allows us to obtain new regret bounds against “drifting” experts, which may be of independent interest. Moreover, our approach allows us to go beyond experts/MTS, obtaining similar unifying results for structured action sets and “combinatorial experts” whenever the setting has a certain matroid structure.
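The experts side of this interpolation is the classical multiplicative-weights (Hedge) algorithm. A minimal sketch, with an illustrative loss sequence and a fixed learning rate chosen for the example (not the paper's unified algorithm):

```python
import math

def hedge(losses, eta=0.5):
    """Multiplicative weights (Hedge) over n experts.

    losses: a list of rounds, each a list of per-expert losses in [0, 1].
    Returns (algorithm's total expected loss, best expert's total loss).
    """
    n = len(losses[0])
    weights = [1.0] * n
    alg_loss = 0.0
    for round_losses in losses:
        total = sum(weights)
        probs = [w / total for w in weights]  # play an expert with these probabilities
        alg_loss += sum(p * l for p, l in zip(probs, round_losses))
        # exponentially down-weight each expert in proportion to its loss
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, round_losses)]
    best = min(sum(r[i] for r in losses) for i in range(n))
    return alg_loss, best

# Two experts; expert 0 is consistently better, so the weight concentrates on it.
losses = [[0.1, 0.9], [0.2, 0.8], [0.0, 1.0]] * 10
alg, best = hedge(losses)
```

The external regret alg − best stays bounded by ln(n)/η + ηT/8 regardless of which expert turns out best, which is the "optimal regret for learning from experts" end of the trade-off.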
Online (budgeted) social choice
In Proceedings of AAAI 2014, 1456–1462
, 2014
Abstract

Cited by 4 (0 self)
We consider a classic social choice problem in an online setting. In each round, a decision maker observes a single agent's preferences over a set of m candidates, and must choose whether to irrevocably add a candidate to a selection set of limited cardinality k. Each agent's (positional) score depends on the candidates in the set when he arrives, and the decision maker's goal is to maximize the average (over all agents) score. We prove that no algorithm (even randomized) can achieve an approximation factor better than O(log log m / log m). In contrast, if the agents arrive in random order, we present a (1 − 1/e − o(1))-approximate algorithm, matching a lower bound for the offline problem. We show that improved performance is possible for natural input distributions or scoring rules. Finally, if the algorithm is permitted to revoke decisions at a fixed cost, we apply regret-minimization techniques to achieve approximation 1 − 1/e − o(1) even for arbitrary inputs.
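One common way to instantiate the positional scores in this model (an illustrative convention, not necessarily the paper's exact one) credits each agent with the best score-vector entry among the selected candidates under that agent's ranking:

```python
def agent_score(selection, ranking, score_vector):
    """Positional score of one agent for a selection set.

    ranking: the agent's preference order over candidates, best first.
    score_vector: non-increasing scores by rank position (e.g. Borda).
    The agent is credited with the best score among selected candidates.
    """
    position = {cand: j for j, cand in enumerate(ranking)}
    return max((score_vector[position[c]] for c in selection), default=0.0)

# Borda-style scores over m = 4 candidates.
scores = [3.0, 2.0, 1.0, 0.0]
ranking = ["a", "b", "c", "d"]
print(agent_score({"b", "d"}, ranking, scores))  # "b" is the agent's best selected candidate: 2.0
```

Under this convention the decision maker's objective is the average of `agent_score` over all agents, each evaluated against the selection set at the moment that agent arrives.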
Blackwell Approachability and No-Regret Learning are Equivalent
Abstract

Cited by 4 (2 self)
We consider the celebrated Blackwell Approachability Theorem for two-player games with vector payoffs. Blackwell himself previously showed that the theorem implies the existence of a “no-regret” algorithm for a simple online learning problem. We show that this relationship is in fact much stronger: Blackwell’s result is equivalent, in a very strong sense, to the problem of regret minimization for Online Linear Optimization. We show that any algorithm for one such problem can be efficiently converted into an algorithm for the other. We provide one novel application of this reduction: the first efficient algorithm for calibrated forecasting.
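Online Linear Optimization, the target of the reduction, can be instantiated by online gradient descent over an L2 ball. The step-size schedule and the toy loss sequence below are illustrative assumptions, not the paper's construction:

```python
import math

def ogd_linear(loss_vectors, radius=1.0):
    """Online gradient descent for Online Linear Optimization on an L2 ball.

    At round t the learner plays x_t, then observes a loss vector f_t and
    suffers <f_t, x_t>.  Step sizes eta_t = radius / sqrt(t).
    Returns the learner's cumulative loss.
    """
    d = len(loss_vectors[0])
    x = [0.0] * d
    total = 0.0
    for t, f in enumerate(loss_vectors, start=1):
        total += sum(fi * xi for fi, xi in zip(f, x))
        eta = radius / math.sqrt(t)
        x = [xi - eta * fi for xi, fi in zip(x, f)]  # gradient step
        norm = math.sqrt(sum(xi * xi for xi in x))
        if norm > radius:  # project back onto the ball
            x = [xi * radius / norm for xi in x]
    return total

# Alternating losses; the best fixed point in hindsight has loss -radius * ||sum_t f_t||.
fs = [[1.0, 0.0], [-1.0, 0.0]] * 20
alg = ogd_linear(fs)
cum = [sum(f[i] for f in fs) for i in range(2)]
best = -math.sqrt(sum(c * c for c in cum))
```

The regret alg − best grows like O(√T); it is regret guarantees of this form that the equivalence converts into approachability guarantees, and vice versa.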
Regret Minimization and Job Scheduling
Abstract

Cited by 2 (0 self)
Regret minimization has proven to be a very powerful tool in both computational learning theory and online algorithms. Regret minimization algorithms can guarantee, for a single decision maker, near-optimal behavior under fairly adversarial assumptions. I will discuss recent extensions of the classical regret minimization model, which make it possible to handle many different settings related to job scheduling and to guarantee near-optimal online behavior.

1 Regret Minimization. Consider a single decision maker attempting to optimize its performance in the face of an uncertain environment. This simple online setting has attracted attention from multiple disciplines, including operations research, game theory, and computer science. In computer science, computational learning theory and online algorithms both focus on this task from different perspectives. I will concentrate only on a certain facet of this general issue of decision making, and consider settings related to regret minimization, where the performance of the online decision …
Learning with Global Cost in Stochastic Environments
Abstract

Cited by 2 (0 self)
We consider an online learning setting where at each time step the decision maker has to choose how to distribute the future loss between k alternatives, and then observes the loss of each alternative, where the losses are assumed to come from a joint distribution. Motivated by load balancing and job scheduling, we consider a global cost function (over the losses incurred by each alternative), rather than a summation of the instantaneous losses, as is done traditionally in online learning. Specifically, we consider the global cost functions: (1) the makespan (the maximum over the alternatives) and (2) the L_d norm (over the alternatives) for d > 1. We design algorithms that guarantee logarithmic regret for this setting, where the regret is measured with respect to the best static decision (one that selects the same distribution over alternatives at every time step). We also show that the least loaded machine, a natural algorithm for minimizing the makespan, has a regret of the order of √T. We complement our theoretical findings with supporting experimental results.
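The makespan objective and the gap between the least-loaded heuristic and the best static distribution can be made concrete with a small sketch. The loss sequence is illustrative, and the closed form for the best static makespan (which equalizes p_i·L_i across alternatives and assumes every total L_i is positive) is a standard observation, not taken from the paper:

```python
def least_loaded(losses):
    """Greedy 'least loaded machine': each round, put all weight on the
    currently least-loaded alternative, which then absorbs its observed loss."""
    k = len(losses[0])
    loads = [0.0] * k
    for r in losses:
        i = loads.index(min(loads))
        loads[i] += r[i]
    return max(loads)  # makespan of the final loads

def best_static_makespan(losses):
    """Best fixed distribution p minimizes max_i p_i * L_i, where L_i is
    alternative i's total loss; the optimum equalizes the products, giving
    makespan 1 / sum_i (1 / L_i) when every L_i > 0."""
    k = len(losses[0])
    totals = [sum(r[i] for r in losses) for i in range(k)]
    return 1.0 / sum(1.0 / t for t in totals)

losses = [[1.0, 2.0]] * 10  # alternative 0 is cheaper every round
gap = least_loaded(losses) - best_static_makespan(losses)  # positive: least-loaded is suboptimal
```

Even on this i.i.d.-style sequence the greedy rule pays a constant-factor premium over the best static split, consistent with its √T regret versus the logarithmic regret of the paper's algorithms.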
Online Learning: Beyond Regret
In 24th Annual Conference on Learning Theory
Abstract
We study online learnability of a wide class of problems, extending the results of Rakhlin et al. (2010a) to general notions of performance measure well beyond external regret. Our framework simultaneously captures such well-known notions as internal and general Φ-regret, learning with non-additive global cost functions, Blackwell’s approachability, calibration of forecasters, and more. We show that learnability in all these situations is due to control of the same three quantities: a martingale convergence term, a term describing the ability to perform well if the future is known, and a generalization of sequential Rademacher complexity, studied in Rakhlin et al. (2010a). Since we directly study the complexity of the problem instead of focusing on efficient algorithms, we are able to improve and extend many known results which had previously been derived via algorithmic constructions.
Approachability in unknown games: Online learning meets multi-objective optimization
Abstract
In the standard setting of approachability there are two players and a target set. The players play a repeated vector-valued game where one of them wants to have the average vector-valued payoff converge to the target set, which the other player tries to exclude. We revisit the classical setting and consider the setting where the player has a preference relation between target sets: she wishes to approach the smallest (“best”) set possible given the observed average payoffs in hindsight. Moreover, as opposed to previous works on approachability, and in the spirit of online learning, we do not assume that there is a known game structure with actions for two players. Rather, the player receives an arbitrary vector-valued reward vector at every round. We show that it is impossible, in general, to approach the best target set in hindsight. We further propose a concrete strategy that approaches a non-trivial relaxation of the best-in-hindsight set given the actual rewards. Our approach does not require projection onto a target set and amounts to switching between scalar regret-minimization algorithms that are performed in episodes.