Results 1 - 10 of 10
Online learning: Beyond regret
- 24th Annual Conference on Learning Theory
, 2011
Abstract - Cited by 18 (5 self)
We study online learnability of a wide class of problems, extending the results of [25] to general notions of performance measure well beyond external regret. Our framework simultaneously captures such well-known notions as internal and general Φ-regret, learning with non-additive global cost functions, Blackwell’s approachability, calibration of forecasters, adaptive regret, and more. We show that learnability in all these situations is due to control of the same three quantities: a martingale convergence term, a term describing the ability to perform well if the future is known, and a generalization of sequential Rademacher complexity, studied in [25]. Since we directly study the complexity of the problem instead of focusing on efficient algorithms, we are able to improve and extend many known results which have previously been derived via an algorithmic construction.
Blackwell approachability and low-regret learning are equivalent
- In Proceedings of the 25th Annual Conference on Learning Theory (COLT ’12)
, 2012
Abstract - Cited by 14 (3 self)
We consider the celebrated Blackwell Approachability Theorem for two-player games with vector payoffs. We show that Blackwell’s result is equivalent, via efficient reductions, to the existence of “no-regret” algorithms for Online Linear Optimization. Indeed, we show that any algorithm for one such problem can be efficiently converted into an algorithm for the other. We provide a useful application of this reduction: the first efficient algorithm for calibrated forecasting.
Unified Algorithms for Online Learning and Competitive Analysis
- 25th Annual Conference on Learning Theory
, 2012
Abstract - Cited by 6 (0 self)
Online learning and competitive analysis are two widely studied frameworks for online decision-making settings. Despite the frequent similarity of the problems they study, there are significant differences in their assumptions, goals, and techniques, hindering a unified analysis and richer interplay between the two. In this paper, we provide several contributions in this direction. We provide a single unified algorithm which, by parameter tuning, interpolates between optimal regret for learning from experts (in online learning) and optimal competitive ratio for the metrical task systems (MTS) problem (in competitive analysis), improving on the results of Blum and Burch (1997). The algorithm also allows us to obtain new regret bounds against “drifting” experts, which may be of independent interest. Moreover, our approach allows us to go beyond experts/MTS, obtaining similar unifying results for structured action sets and “combinatorial experts”, whenever the setting has a certain matroid structure.
Online (budgeted) social choice
- In Proceedings of AAAI-2014, 1456–1462
, 2014
Abstract - Cited by 4 (0 self)
We consider a classic social choice problem in an online setting. In each round, a decision maker observes a single agent's preferences over a set of m candidates, and must choose whether to irrevocably add a candidate to a selection set of limited cardinality k. Each agent's (positional) score depends on the candidates in the set when the agent arrives, and the decision maker's goal is to maximize the average score over all agents. We prove that no algorithm (even randomized) can achieve an approximation factor better than O(log log m / log m). In contrast, if the agents arrive in random order, we present a (1 − 1/e − o(1))-approximate algorithm, matching a lower bound for the offline problem. We show that improved performance is possible for natural input distributions or scoring rules. Finally, if the algorithm is permitted to revoke decisions at a fixed cost, we apply regret-minimization techniques to achieve a 1 − 1/e − o(1) approximation even for arbitrary inputs.
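To make the scoring model concrete, here is a minimal sketch under one common convention, assumed for illustration since the abstract does not fix the positional rule: Borda weights, with an agent's score for a selection set taken as the weight of their best-ranked selected candidate. All function names here are illustrative, not from the paper.

```python
def borda_weights(m):
    """Borda positional weights for m candidates: the top-ranked
    candidate is worth m-1 points, the bottom-ranked is worth 0."""
    return [m - 1 - r for r in range(m)]

def agent_score(ranking, selected):
    """Score of a selection set for one agent: the best Borda weight
    among the selected candidates (an assumed convention; the paper's
    setting allows other positional rules).

    ranking: the agent's candidates in preference order, best first.
    selected: the set of chosen candidates."""
    w = borda_weights(len(ranking))
    return max(w[pos] for pos, c in enumerate(ranking) if c in selected)

def average_score(rankings, selected):
    """The decision maker's objective: average agent score."""
    return sum(agent_score(r, selected) for r in rankings) / len(rankings)
```

For example, with two agents ranking candidates as a > b > c and b > a > c, the singleton set {a} scores (2 + 1) / 2 = 1.5, while {a, b} gives every agent their favorite and scores 2.0.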
Blackwell Approachability and No-Regret Learning are Equivalent
Abstract - Cited by 4 (2 self)
We consider the celebrated Blackwell Approachability Theorem for two-player games with vector payoffs. Blackwell himself previously showed that the theorem implies the existence of a “no-regret” algorithm for a simple online learning problem. We show that this relationship is in fact much stronger: Blackwell’s result is equivalent, in a very strong sense, to the problem of regret minimization for Online Linear Optimization. We show that any algorithm for one such problem can be efficiently converted into an algorithm for the other. We provide one novel application of this reduction: the first efficient algorithm for calibrated forecasting.
Regret Minimization and Job Scheduling
Abstract - Cited by 2 (0 self)
Regret minimization has proven to be a very powerful tool in both computational learning theory and online algorithms. Regret minimization algorithms can guarantee, for a single decision maker, near-optimal behavior under fairly adversarial assumptions. I will discuss recent extensions of the classical regret minimization model, which make it possible to handle many different settings related to job scheduling and to guarantee near-optimal online behavior.
1 Regret Minimization
Consider a single decision maker attempting to optimize its performance in the face of an uncertain environment. This simple online setting has attracted attention from multiple disciplines, including operations research, game theory, and computer science. In computer science, computational learning theory and online algorithms both focus on this task from different perspectives. I will concentrate only on a certain facet of this general issue of decision making, and consider settings related to regret minimization, where the performance of the online decision
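The single-decision-maker setting described above is often illustrated with the experts problem, for which multiplicative weights (Hedge) is the classical regret minimization algorithm. The sketch below is a minimal illustration of that standard technique, not the paper's scheduling extensions; the function name and the fixed learning rate `eta` are illustrative choices.

```python
import math

def hedge(loss_rounds, eta=0.5):
    """Multiplicative-weights (Hedge) over k experts.

    loss_rounds: list of per-round loss vectors, one entry of length k
    per round, with losses in [0, 1]. Returns the algorithm's expected
    cumulative loss and the cumulative loss of the best single expert;
    their difference is the (external) regret.
    """
    k = len(loss_rounds[0])
    weights = [1.0] * k
    alg_loss = 0.0
    cum = [0.0] * k  # cumulative loss of each expert
    for losses in loss_rounds:
        total = sum(weights)
        probs = [w / total for w in weights]
        # expected loss of sampling an expert from the current distribution
        alg_loss += sum(p * l for p, l in zip(probs, losses))
        # exponentially down-weight experts in proportion to their loss
        for i, l in enumerate(losses):
            cum[i] += l
            weights[i] *= math.exp(-eta * l)
    return alg_loss, min(cum)
```

If expert 0 always incurs loss 0 and expert 1 always incurs loss 1, the algorithm's weight on expert 1 decays geometrically, so its cumulative loss stays bounded by a constant while the horizon grows: near-optimal behavior against the best fixed expert.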
Learning with Global Cost in Stochastic Environments
Abstract - Cited by 2 (0 self)
We consider an online learning setting where at each time step the decision maker has to choose how to distribute the future loss between k alternatives, and then observes the loss of each alternative, where the losses are assumed to come from a joint distribution. Motivated by load balancing and job scheduling, we consider a global cost function (over the losses incurred by each alternative), rather than a summation of the instantaneous losses as is done traditionally in online learning. Specifically, we consider the global cost functions: (1) the makespan (the maximum over the alternatives) and (2) the L_d norm (over the alternatives) for d > 1. We design algorithms that guarantee logarithmic regret for this setting, where the regret is measured with respect to the best static decision (one that selects the same distribution over alternatives at every time step). We also show that the least loaded machine, a natural algorithm for minimizing the makespan, has a regret of the order of √T. We complement our theoretical findings with supporting experimental results.
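The two global cost functions and the least-loaded-machine heuristic mentioned in the abstract can be sketched as follows. This is only an illustration of the definitions: the function names, the greedy tie-breaking, and the loss encoding are assumptions, not the paper's algorithms or analysis.

```python
def makespan(loads):
    """Global cost (1): the maximum load over the alternatives."""
    return max(loads)

def ld_norm(loads, d):
    """Global cost (2): the L_d norm over the alternatives, for d > 1."""
    return sum(x ** d for x in loads) ** (1.0 / d)

def least_loaded(loss_rounds):
    """Least-loaded-machine heuristic: each round, assign the step to
    the alternative with the smallest accumulated load (ties broken by
    lowest index, an arbitrary choice), then add that alternative's
    observed loss. Returns the final per-alternative loads."""
    k = len(loss_rounds[0])
    loads = [0.0] * k
    for losses in loss_rounds:
        i = min(range(k), key=lambda j: loads[j])
        loads[i] += losses[i]
    return loads
```

For instance, with per-round losses [1, 2] over two alternatives for four rounds, the greedy rule alternates between the machines, and the final makespan is the larger of the two accumulated loads.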
Approachability in unknown games: Online learning meets multi-objective optimization
Abstract
In the standard setting of approachability there are two players and a target set. The players play a repeated vector-valued game in which one of them wants the average vector-valued payoff to converge to the target set, which the other player tries to exclude. We revisit the classical setting and consider the setting in which the player has a preference relation between target sets: she wishes to approach the smallest (“best”) set possible given the observed average payoffs in hindsight. Moreover, as opposed to previous works on approachability, and in the spirit of online learning, we do not assume that there is a known game structure with actions for two players. Rather, the player receives an arbitrary vector-valued reward at every round. We show that it is impossible, in general, to approach the best target set in hindsight. We further propose a concrete strategy that approaches a non-trivial relaxation of the best set in hindsight given the actual rewards. Our approach does not require projection onto a target set and amounts to switching between scalar regret minimization algorithms that are run in episodes.