Results 1-10 of 186
Adaptive Subgradient Methods for Online Learning and Stochastic Optimization
2010
Cited by 287 (3 self)
Stochastic subgradient methods are widely used, well analyzed, and constitute effective tools for optimization and online learning. Stochastic gradient methods’ popularity and appeal are largely due to their simplicity, as they largely follow predetermined procedural schemes. However, most common subgradient approaches are oblivious to the characteristics of the data being observed. We present a new family of subgradient methods that dynamically incorporate knowledge of the geometry of the data observed in earlier iterations to perform more informative gradient-based learning. The adaptation, in essence, allows us to find needles in haystacks in the form of very predictive but rarely seen features. Our paradigm stems from recent advances in stochastic optimization and online learning which employ proximal functions to control the gradient steps of the algorithm. We describe and analyze an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and results in regret guarantees that are provably as good as the best proximal function that can be chosen in hindsight. In a companion paper, we validate experimentally our theoretical analysis and show that the adaptive subgradient approach outperforms state-of-the-art, but non-adaptive, subgradient algorithms.
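To make the idea of adapting step sizes to observed gradients concrete, here is a minimal sketch of a diagonal, per-coordinate adaptive update of the kind commonly called AdaGrad. The abstract does not state an update rule, so the specific form below (each coordinate's step is divided by the root of its accumulated squared gradients), the names `adagrad` and `quad_grad`, and the parameters `lr` and `eps` are assumptions made for illustration only.

```python
import math


def adagrad(grad, x0, lr=0.5, steps=500, eps=1e-8):
    """Run a diagonal adaptive-subgradient loop (illustrative sketch).

    grad: function mapping a list of floats to its (sub)gradient list.
    x0:   starting point.
    """
    x = list(x0)
    sq_sum = [0.0] * len(x)  # running sum of squared gradients per coordinate
    for _ in range(steps):
        g = grad(x)
        for i, gi in enumerate(g):
            sq_sum[i] += gi * gi
            # Coordinates with small accumulated gradients keep larger steps.
            x[i] -= lr * gi / (math.sqrt(sq_sum[i]) + eps)
    return x


def quad_grad(x):
    """Gradient of the toy objective 50*(x0 - 1)^2 + 0.5*(x1 + 2)^2."""
    return [100.0 * (x[0] - 1.0), 1.0 * (x[1] + 2.0)]


result = adagrad(quad_grad, [0.0, 0.0])
```

In this toy problem the two coordinates have curvatures that differ by a factor of 100, yet a single `lr` works for both because each coordinate is rescaled by its own gradient history.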
The multiplicative weights update method: a meta algorithm and applications
2005
Cited by 146 (14 self)
Algorithms in varied fields use the idea of maintaining a distribution over a certain set and use the multiplicative update rule to iteratively change these weights. Their analyses are usually very similar and rely on an exponential potential function. We present a simple meta algorithm that unifies these disparate algorithms and derives them as simple instantiations of the meta algorithm.
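The multiplicative update rule described above can be sketched as follows. This is one common instantiation (weights scaled by `1 - eta * loss` with losses in [0, 1]); the function name `multiplicative_weights`, the parameter `eta`, and the toy loss sequence are assumptions for illustration rather than details taken from the abstract.

```python
def multiplicative_weights(loss_rounds, eta=0.5):
    """Maintain a distribution over options with multiplicative updates.

    loss_rounds: non-empty list of rounds; each round is a list of losses
                 in [0, 1], one per option.
    Returns the final distribution and the algorithm's total expected loss.
    """
    n = len(loss_rounds[0])
    weights = [1.0] * n
    total_expected_loss = 0.0
    for losses in loss_rounds:
        total = sum(weights)
        probs = [w / total for w in weights]
        # Expected loss when an option is sampled from the current distribution.
        total_expected_loss += sum(p * l for p, l in zip(probs, losses))
        # Penalize each option in proportion to the loss it just incurred.
        weights = [w * (1.0 - eta * l) for w, l in zip(weights, losses)]
    total = sum(weights)
    return [w / total for w in weights], total_expected_loss


# Toy run: option 0 never loses, option 1 always loses.
final_probs, alg_loss = multiplicative_weights([[0.0, 1.0]] * 20)
```

In the toy run the distribution concentrates on the option that never incurs loss, and the total expected loss stays bounded even as the number of rounds grows.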
Nearly tight bounds for the continuum-armed bandit problem
Advances in Neural Information Processing Systems 17, 2005
Cited by 121 (7 self)
In the multi-armed bandit problem, an online algorithm must choose from a set of strategies in a sequence of n trials so as to minimize the total cost of the chosen strategies. While nearly tight upper and lower bounds are known in the case when the strategy set is finite, much less is known when there is an infinite strategy set. Here we consider the case when the set of strategies is a subset of R^d, and the cost functions are continuous. In the d = 1 case, we improve on the best-known upper and lower bounds, closing the gap to a sublogarithmic factor. We also consider the case where d > 1 and the cost functions are convex, adapting a recent online convex optimization algorithm of Zinkevich to the sparser feedback model of the multi-armed bandit problem.
Dual Averaging for Distributed Optimization: Convergence Analysis and Network Scaling
IEEE Transactions on Automatic Control, 2010
Cited by 93 (12 self)
The goal of decentralized optimization over a network is to optimize a global objective formed by a sum of local (possibly nonsmooth) convex functions using only local computation and communication. It arises in various application domains, including distributed tracking and localization, multi-agent coordination, estimation in sensor networks, and large-scale machine learning. We develop and analyze distributed algorithms based on dual subgradient averaging, and we provide sharp bounds on their convergence rates as a function of the network size and topology. Our analysis allows us to clearly separate the convergence of the optimization algorithm itself and the effects of communication dependent on the network structure. We show that the number of iterations required by our algorithm scales inversely in the spectral gap of the network and confirm this prediction’s sharpness both by theoretical lower bounds and simulations for various networks. Our approach includes the cases of deterministic optimization and communication as well as problems with stochastic optimization and/or communication.
Approximation algorithms and online mechanisms for item pricing
In Proceedings of the 7th ACM Conference on Electronic Commerce, 2006
Cited by 78 (11 self)
We present approximation and online algorithms for problems of pricing a collection of items for sale so as to maximize the seller’s revenue in an unlimited supply setting. Our first result is an O(k)-approximation algorithm for pricing items to single-minded bidders who each want at most k items. This improves over work of Briest and Krysta (2006) who achieve an O(k^2) bound. For the case k = 2, where we obtain a 4-approximation, this can be viewed as the following graph vertex pricing problem: given a (multi) graph G with valuations $w_{ij}$ on the edges, find prices $p_i \ge 0$ for the vertices to maximize $\sum_{(i,j):\, w_{ij} \ge p_i + p_j} (p_i + p_j)$. We also improve the approximation of Guruswami et al. (2005) for the “highway problem” in which all desired subsets are intervals on a line, from O(log m + log n) to O(log n), where m is the number of bidders and n is the number of items. Our approximation algorithms can
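As a small illustration of the graph vertex pricing objective stated above, the sketch below only evaluates the seller's revenue for a given price vector: an edge (a bidder wanting two items) pays the sum of its endpoint prices when that sum does not exceed the edge valuation. It does not implement the approximation algorithm. The function name `vertex_pricing_revenue`, the edge-list representation, and the toy instance are assumptions for illustration.

```python
def vertex_pricing_revenue(edges, prices):
    """Evaluate the vertex pricing objective for fixed prices.

    edges:  list of (i, j, w) tuples, where w is the valuation of the
            bidder who wants both items i and j.
    prices: dict mapping each vertex (item) to a non-negative price.
    """
    revenue = 0.0
    for i, j, w in edges:
        bundle_price = prices[i] + prices[j]
        # The bidder buys only if the bundle is affordable.
        if w >= bundle_price:
            revenue += bundle_price
    return revenue


# Toy instance: the second bidder is priced out.
toy_edges = [("a", "b", 3.0), ("b", "c", 1.0)]
toy_prices = {"a": 1.0, "b": 1.0, "c": 1.0}
toy_revenue = vertex_pricing_revenue(toy_edges, toy_prices)
```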
Differential privacy under continual observation
In STOC, 2010
Cited by 63 (2 self)
Differential privacy is a recent notion of privacy tailored to privacy-preserving data analysis [10]. Up to this point, research on differentially private data analysis has focused on the setting of a trusted curator holding a large, static, data set; thus every computation is a “one-shot” object: there is no point in computing something twice, since the result will be unchanged, up to any randomness introduced for privacy. However, many applications of data analysis involve repeated computations, either because the entire goal is one of monitoring, e.g., of traffic conditions, search trends, or incidence of influenza, or because the goal is some kind of adaptive optimization, e.g., placement of data to minimize access costs. In these cases, the algorithm must permit continual observation of the system’s state. We therefore initiate a study of differential privacy under continual observation. We identify the problem of maintaining a counter in a privacy-preserving manner and show its wide applicability to many different problems.
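The abstract does not say how the private counter is built. As a hedged illustration only, the sketch below uses tree-based aggregation, a widely used construction for releasing running counts: noisy sums are kept for dyadic blocks of the stream, and each prefix count is assembled from a logarithmic number of them. Everything here (the names `laplace` and `noisy_prefix_counts`, the batch rather than streaming formulation, and the noise scale of `levels / epsilon`) is an assumption and should not be read as the paper's mechanism.

```python
import random


def laplace(scale, rng):
    """Sample Laplace(0, scale) noise; a scale of 0 means no noise."""
    if scale == 0:
        return 0.0
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_prefix_counts(bits, epsilon, rng=None):
    """Return noisy running counts of a 0/1 stream (batch sketch)."""
    rng = rng or random.Random()
    n = len(bits)
    size = 1
    while size < n:
        size *= 2
    levels = size.bit_length()  # number of tree levels, leaves to root
    # Each stream element touches one node per level, so noise is scaled
    # by the number of levels.
    scale = levels / epsilon
    exact = list(bits) + [0] * (size - n)
    tree = []  # tree[level][k]: noisy sum of block k of length 2**level
    for _ in range(levels):
        tree.append([s + laplace(scale, rng) for s in exact])
        exact = [exact[2 * k] + exact[2 * k + 1] for k in range(len(exact) // 2)]
    counts = []
    for t in range(1, n + 1):
        total, pos = 0.0, 0
        # Cover [0, t) with the largest dyadic blocks that fit.
        for level in reversed(range(levels)):
            block = 1 << level
            if t - pos >= block:
                total += tree[level][pos // block]
                pos += block
        counts.append(total)
    return counts


# With epsilon = infinity the noise scale is 0 and the counts are exact.
exact_counts = noisy_prefix_counts([1, 0, 1, 1, 0], float("inf"))
```

A streaming version would create each block sum as soon as its last element arrives; the batch form is used here only to keep the sketch short.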
Routing without regret: On convergence to Nash equilibria of regret-minimizing algorithms in routing games
In PODC, 2006
Cited by 59 (6 self)
There has been substantial work developing simple, efficient no-regret algorithms for a wide class of repeated decision-making problems including online routing. These are adaptive strategies an individual can use that give strong guarantees on performance even in adversarially-changing environments. There has also been substantial work on analyzing properties of Nash equilibria in routing games. In this paper, we consider the question: if each player in a routing game uses a no-regret strategy, will behavior converge to a Nash equilibrium? In general games the answer to this question is known to be no in a strong sense, but routing games have substantially more structure. In this paper we show that in the Wardrop setting of multicommodity flow and infinitesimal agents, behavior will approach Nash equilibrium (formally, on most days, the cost of the flow will be close to the cost of the cheapest paths possible given that flow) at a rate that depends polynomially on the players’ regret bounds and the maximum slope of any latency function. We also show that price-of-anarchy results may be applied to these approximate equilibria, and also consider the finite-size (non-infinitesimal) load-balancing model of Azar [2].
Near-Optimal Online Auctions
In Proceedings of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms, 2005
Cited by 51 (10 self)
We consider the online auction problem proposed by Bar-Yossef, Hildrum, and Wu [4] in which an auctioneer is selling identical items to bidders arriving one at a time. We give an auction that achieves a constant factor of the optimal profit less an O(h) additive loss term, where h is the value of the highest bid. Furthermore, this auction does not require foreknowledge of the range of bidders’ valuations. On both counts, this answers open questions from [4, 5]. We further improve on the results from [5] for the online posted-price problem by reducing their additive loss term from O(h log h log log h) to O(h log log h). Finally, we define the notion of an (offline) attribute auction for modeling the problem of auctioning items to consumers who are not a priori indistinguishable. We apply our online auction solution to achieve good bounds for the attribute auction problem with 1-dimensional attributes.
A new understanding of prediction markets via no-regret learning
In ACM EC, 2010
Cited by 47 (11 self)
We explore the striking mathematical connections that exist between market scoring rules, cost function based prediction markets, and no-regret learning. We first show that any cost function based prediction market can be interpreted as an algorithm for the commonly studied problem of learning from expert advice by equating the set of outcomes on which bets are placed in the market with the set of experts in the learning setting, and equating trades made in the market with losses observed by the learning algorithm. If the loss of the market organizer is bounded, this bound can be used to derive an O(√T) regret bound for the corresponding learning algorithm. We then show that the class of markets with convex cost functions exactly corresponds to the class of Follow the Regularized Leader learning algorithms, with the choice of a cost function in the market corresponding to the choice of a regularizer in the learning problem. Finally, we show an equivalence between market scoring rules and prediction markets with convex cost functions. This implies both that any market scoring rule can be implemented as a cost function based market maker, and that market scoring rules can be interpreted naturally as Follow the Regularized Leader algorithms. These connections provide new insight into how it is that commonly studied markets, such as the Logarithmic Market Scoring Rule, can aggregate opinions into accurate estimates of the likelihood of future events.
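To illustrate what a cost function based market maker looks like, here is a minimal sketch of the Logarithmic Market Scoring Rule in its usual cost-function form, C(q) = b log Σ_i exp(q_i / b), where q_i is the number of outstanding shares on outcome i. The function names and the use of `b` for the liquidity parameter are assumptions for illustration; the abstract itself gives no formulas.

```python
import math


def lmsr_cost(q, b=1.0):
    """LMSR cost function, computed with a stable log-sum-exp."""
    m = max(x / b for x in q)
    return b * (m + math.log(sum(math.exp(x / b - m) for x in q)))


def lmsr_prices(q, b=1.0):
    """Instantaneous prices: the gradient of the cost function."""
    m = max(x / b for x in q)
    exps = [math.exp(x / b - m) for x in q]
    total = sum(exps)
    return [e / total for e in exps]


def trade_cost(q, delta, b=1.0):
    """Amount a trader pays to move outstanding shares from q to q + delta."""
    new_q = [x + d for x, d in zip(q, delta)]
    return lmsr_cost(new_q, b) - lmsr_cost(q, b), new_q


start_prices = lmsr_prices([0.0, 0.0])
paid, shares = trade_cost([0.0, 0.0], [1.0, 0.0])
end_prices = lmsr_prices(shares)
```

The prices form a probability distribution with the same softmax shape as an exponential-weights distribution over experts, which gives a concrete feel for the market and learning correspondence the abstract describes.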
Combinatorial Bandits
Cited by 46 (7 self)
We study sequential prediction problems in which, at each time instance, the forecaster chooses a binary vector from a certain fixed set S ⊆ {0, 1}^d and suffers a loss that is the sum of the losses of those vector components that are equal to one. The goal of the forecaster is to achieve that, in the long run, the accumulated loss is not much larger than that of the best possible vector in the class. We consider the “bandit” setting in which the forecaster has only access to the losses of the chosen vectors. We introduce a new general forecaster achieving a regret bound that, for a variety of concrete choices of S, is of order √(nd ln |S|), where n is the time horizon. This is not improvable in general and is better than previously known bounds. We also point out that computationally efficient implementations for various interesting choices of S exist.