Results

**21 - 27** of **27**

### Label optimal regret bounds for online local learning

, 2015


Abstract
We resolve an open question from Christiano (2014b), posed in COLT'14, regarding the optimal dependency of the regret achievable for online local learning on the size of the label set. In this framework, the algorithm is shown a pair of items at each step, chosen from a set of n items. The learner then predicts a label for each item from a label set of size L and receives a real-valued payoff. This is a natural framework which captures many interesting scenarios such as online gambling and online max cut. Christiano (2014a) designed an efficient online learning algorithm for this problem achieving a regret of O(√(nL³T)), where T is the number of rounds. Information theoretically, one can achieve a regret of O(√(nT log L)). One of the main open questions left in this framework concerns closing the above gap. In this work, we provide a complete answer to the question above via two main results. First, we show, via a tighter analysis, that the semi-definite programming based algorithm of Christiano (2014a) in fact achieves a regret of O(√(nLT)). Second, we show a matching computational lower bound: a polynomial time algorithm for online local learning with lower regret would imply a polynomial time algorithm for the planted clique problem, which is widely believed to be hard. We prove a similar hardness result under a related conjecture concerning planted dense subgraphs that we put forth. Unlike planted clique, the planted dense subgraph problem does not have any known quasi-polynomial time algorithms. Computational lower bounds for online learning are relatively rare, and we hope that the ideas developed in this work will lead to lower bounds for other online learning scenarios as well.
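The interaction model described above can be made concrete with a short simulation. The sketch below, a hypothetical helper under assumed names (`online_local_learning`, `payoff_oracle`), plays the protocol with a naive fixed random labeling; it is only an illustration of the framework, not the SDP-based algorithm of Christiano (2014a) or the tighter analysis of this paper.

```python
import random

def online_local_learning(n, L, T, payoff_oracle, seed=0):
    """Simulate the online local learning protocol: each round the
    adversary reveals a pair of items chosen from n items, the learner
    predicts a label for each from a label set of size L, and then
    observes a real-valued payoff.

    payoff_oracle(t) returns ((u, v), M), where M[a][b] in [0, 1] is
    the payoff for labeling item u with a and item v with b.  The
    learner here is a trivial baseline that commits to one random
    labeling up front.
    """
    rng = random.Random(seed)
    labeling = [rng.randrange(L) for _ in range(n)]  # fixed random labels
    total = 0.0
    for t in range(T):
        (u, v), M = payoff_oracle(t)
        a, b = labeling[u], labeling[v]  # learner's predictions for the pair
        total += M[a][b]                 # payoff revealed after predicting
    return total
```

A regret bound such as O(√(nLT)) compares the cumulative payoff of a (much smarter) learner against the best fixed labeling in hindsight.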

### Effective Network Management via System-Wide Coordination and Optimization

, 2010


Abstract

As networked systems grow and traffic patterns evolve, management applications are increasing in complexity and functionality. To address the requirements of these management applications, equipment vendors and administrators today depend on incremental solutions that increase the complexity of network elements and deployment costs for operators. Despite this increased complexity and cost, the incremental nature of these solutions still leaves a significant gap between the policy objectives of system administrators and today's mechanisms. These challenges arise in several application contexts in different networking domains: ISPs, enterprise settings, and data centers. Much of this disconnect arises from the narrow device-centric view of current solutions. Such piecemeal solutions are inefficient: network elements duplicate tasks and some locations become overloaded. Worse still, administrators struggle to retrofit their high-level goals within device-centric configurations.

### The Power of Uncertainty: Algorithmic Mechanism Design in Settings of Incomplete Information

, 2011


Abstract

The field of algorithmic mechanism design is concerned with the design of computationally efficient algorithms for use when inputs are provided by rational agents, who may misreport their private values in order to strategically manipulate the algorithm for their own benefit. We revisit classic problems in this field by considering settings of incomplete information, where the players' private values are drawn from publicly-known distributions. Such Bayesian models of partial information are common in economics, but have been largely unexplored by the computer science community. In the first part of this thesis we show that, for a very broad class of single-parameter problems, any computationally efficient algorithm can be converted without loss into a mechanism that is truthful in the Bayesian sense of partial information. That is, we exhibit a transformation that generates mechanisms for which it is in each agent's best (expected) interest to refrain from strategic manipulation. The problem of constructing mechanisms for use by rational agents therefore reduces to the design of approximation algorithms without consideration of game-theoretic issues.

### A Learning Perspective on Selfish Behavior in Games

, 2009


Abstract

Computer systems increasingly involve the interaction of multiple self-interested agents. The designers of these systems have objectives they wish to optimize, but by allowing selfish agents to interact in the system, they lose the ability to directly control behavior. What is lost by this lack of centralized control? What are the likely outcomes of selfish behavior? In this work, we consider learning dynamics as a tool for better classifying and understanding outcomes of selfish behavior in games. In particular, when such learning algorithms exist and are efficient, we propose “regret-minimization” as a criterion for self-interested behavior and study the system-wide effects in broad classes of games when players achieve this criterion. In addition, we present a general transformation from offline approximation algorithms for linear optimization problems to online algorithms that achieve low regret.
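The "regret-minimization" criterion above is usually met by standard no-regret dynamics. As one concrete instance (not the thesis's offline-to-online transformation, which is not reproduced here), the following sketch implements the classic multiplicative-weights (Hedge) learner; `hedge` and `loss_fn` are illustrative names chosen for this example.

```python
import math

def hedge(n_actions, T, loss_fn, eta=None):
    """Multiplicative-weights (Hedge) learner: a standard way for a
    self-interested player to satisfy the regret-minimization
    criterion.  loss_fn(t) returns a list of losses in [0, 1], one per
    action.  Returns the learner's total expected loss; its regret
    against the best fixed action is O(sqrt(T log n_actions)).
    """
    if eta is None:
        eta = math.sqrt(2.0 * math.log(n_actions) / max(T, 1))
    w = [1.0] * n_actions
    total = 0.0
    for t in range(T):
        s = sum(w)
        p = [wi / s for wi in w]                 # play the mixed strategy p
        losses = loss_fn(t)                      # adversary reveals losses
        total += sum(pi * li for pi, li in zip(p, losses))
        w = [wi * math.exp(-eta * li) for wi, li in zip(w, losses)]
    return total
```

When every player in a game runs such a learner, the empirical play converges to the set of coarse correlated equilibria, which is what makes regret a natural criterion for analyzing system-wide outcomes of selfish behavior.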

### Multi-Armed Bandits in Metric Spaces

, 2008


Abstract
In a multi-armed bandit problem, an online algorithm chooses from a set of strategies in a sequence of n trials so as to maximize the total payoff of the chosen strategies. While the performance of bandit algorithms with a small finite strategy set is quite well understood, bandit problems with large strategy sets are still a topic of very active investigation, motivated by practical applications such as online auctions and web advertisement. The goal of such research is to identify broad and natural classes of strategy sets and payoff functions which enable the design of efficient solutions. In this work we study a very general setting for the multi-armed bandit problem in which the strategies form a metric space, and the payoff function satisfies a Lipschitz condition with respect to the metric. We refer to this problem as the Lipschitz MAB problem. We present a complete solution for the multi-armed bandit problem in this setting. That is, for every metric space (L, X) we define an isometry invariant MaxMinCOV(X) which bounds from below the performance of Lipschitz MAB algorithms for X, and we present an algorithm which comes arbitrarily close to meeting this bound. Furthermore, our technique gives even better results for benign payoff functions.
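A simple baseline for the Lipschitz MAB problem on the metric space [0, 1] is uniform discretization: split the interval into K cells and run a finite-armed algorithm such as UCB1 over the cell centers. The sketch below (hypothetical names `uniform_discretization_bandit`, `payoff`) shows this baseline only; it is not the near-optimal algorithm of the paper, which adapts to the metric structure.

```python
import math

def uniform_discretization_bandit(payoff, T):
    """Naive Lipschitz-bandit baseline on the strategy space [0, 1]:
    discretize into K ~ T^(1/3) arms (a standard choice balancing
    discretization error against per-arm regret) and run UCB1 over
    the cell centers.  payoff(x) returns a reward in [0, 1] for
    playing strategy x.  Returns the total reward collected.
    """
    K = max(1, int(round(T ** (1.0 / 3.0))))
    arms = [(i + 0.5) / K for i in range(K)]    # centers of the K cells
    counts = [0] * K
    means = [0.0] * K
    total = 0.0
    for t in range(T):
        if t < K:
            i = t                                # play each arm once first
        else:
            i = max(range(K), key=lambda j: means[j]
                    + math.sqrt(2.0 * math.log(t + 1) / counts[j]))
        r = payoff(arms[i])
        counts[i] += 1
        means[i] += (r - means[i]) / counts[i]   # running average reward
        total += r
    return total
```

The Lipschitz condition guarantees that the best cell center is within L/(2K) of the true optimum, which is what makes the discretization sound; zooming-style algorithms improve on this by refining the grid only where payoffs look promising.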

### 1. Proof of Lemma 1


Abstract

tained by the greedy algorithm). Then ‖g_t‖ ≤ β · max_{A⊆E} |f_t(A)| − f_t(∅), where β = 1 if f_t is non-decreasing, and β = 3 otherwise.

Proof. Since t is fixed, we will drop the subscript in this proof. Essential for the proof is that g ∈ P_f; in fact, it lies in the base polytope (Fujishige, 2005). This means that

g · χ_A ≤ f(A)   (1)

for all A ⊆ E. Assume first that f is nonnegative and nondecreasing. Then Equation (1) immediately leads to a bound on ‖g‖, by bounding the ℓ₂ norm by the ℓ₁ norm:

‖g‖₂ ≤ ‖g‖₁ = g · χ_E ≤ f(E).   (2)

This proves the lemma for nondecreasing functions. For arbitrary submodular functions, we use the construction of g in slightly more detail, but the basic arguments are the same. For ease of notation, let γ = max_{A⊆E} |f(A)|. We first recall how g was constructed, given x ≥ 0. We denote the components of x by x_i, 1 ≤ i ≤ m. We find a permutation π such that x_{π(1)} ≥ x_{π(2)} ≥ … ≥ x_{π(m)}. This ordering induces a maximal chain of sets ∅ = A₀ ⊂ A₁ ⊂ … ⊂ A_m, with A₀ = ∅ and A_i = A_{i−1} ∪ {e_{π(i)}}. Setting

g_{π(i)} = f(A_i) − f(A_{i−1})   (3)

yields g, with g · χ_{A_i} = f(A_i) − f(∅). Let g⁺ = max{g, 0} be the element-wise maximum.
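The greedy construction in Equation (3) is mechanical enough to state as code. The sketch below (an illustrative helper named `greedy_subgradient`, with f called on Python frozensets over indices 0..m−1) sorts the coordinates of x, walks the induced chain A₀ ⊂ A₁ ⊂ … ⊂ A_m, and records the marginal gains.

```python
def greedy_subgradient(x, f):
    """Greedy construction from the proof above: given x >= 0 with
    components x_1..x_m, find a permutation pi with
    x_{pi(1)} >= ... >= x_{pi(m)}, form the chain
    A_0 = {} subset A_1 subset ... subset A_m with
    A_i = A_{i-1} + {e_{pi(i)}}, and set
    g_{pi(i)} = f(A_i) - f(A_{i-1}).

    For submodular f the resulting g lies in the base polytope, so in
    particular g . chi_{A_i} = f(A_i) - f({}).
    """
    m = len(x)
    order = sorted(range(m), key=lambda i: -x[i])  # permutation pi
    g = [0.0] * m
    A = frozenset()
    prev = f(A)                                    # f(A_0) = f(empty set)
    for i in order:
        A = A | {i}                                # A_i = A_{i-1} + {e_{pi(i)}}
        cur = f(A)
        g[i] = cur - prev                          # marginal gain, Eq. (3)
        prev = cur
    return g
```

Summing the entries telescopes to f(E) − f(∅), which is exactly the identity used to pass from Equation (1) to the norm bound in Equation (2).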