Results 1–10 of 10
Beyond the regret minimization barrier: an optimal algorithm for stochastic strongly-convex optimization
In Proceedings of the 24th Annual Conference on Learning Theory, volume 19 of JMLR Workshop and Conference Proceedings, 2011
Abstract

Cited by 58 (3 self)
We give a novel algorithm for stochastic strongly-convex optimization in the gradient oracle model which returns an O(1/T)-approximate solution after T gradient updates. This rate of convergence is optimal in the gradient oracle model. This improves upon the previously known best rate of O(log(T)/T), which was obtained by applying an online strongly-convex optimization algorithm with regret O(log(T)) to the batch setting. We complement this result by proving that any algorithm has expected regret of Ω(log(T)) in the online stochastic strongly-convex optimization setting. This lower bound holds even in the full-information setting, which reveals more information to the algorithm than just gradients. This shows that any online-to-batch conversion is inherently suboptimal for stochastic strongly-convex optimization. This is the first formal evidence that online convex optimization is strictly more difficult than batch stochastic convex optimization.
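The flavor of the setting can be illustrated with a toy sketch. This is not the paper's algorithm: the 1/(λt) step size, the quadratic objective f(x) = E[(λ/2)(x − z)²], and the function name below are all illustrative assumptions; with this particular step size the iterate reduces to a running sample mean, so its squared distance to the optimum shrinks at the O(1/T) rate the abstract discusses.

```python
import random

def sgd_strongly_convex(T, lam=2.0, seed=0):
    """Plain SGD on a lam-strongly-convex quadratic with noisy gradients
    (a hedged sketch, not the paper's algorithm)."""
    rng = random.Random(seed)
    x = 5.0                       # arbitrary starting point
    for t in range(1, T + 1):
        z = rng.gauss(0.0, 1.0)   # noisy sample; the optimum is E[z] = 0
        grad = lam * (x - z)      # stochastic gradient of (lam/2)*(x - z)^2
        x -= grad / (lam * t)     # standard 1/(lam * t) step size
    return x                      # distance to the optimum 0 shrinks with T
```

With this step size the update is x ← x − (x − z)/t, i.e. x after T steps is exactly the sample mean of the T noisy observations, so the error concentrates around 0 as T grows.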
A Stochastic View of Optimal Regret through Minimax Duality
Abstract

Cited by 47 (21 self)
We study the regret of optimal strategies for online convex optimization games. Using von Neumann’s minimax theorem, we show that the optimal regret in this adversarial setting is closely related to the behavior of the empirical minimization algorithm in a stochastic process setting: it is equal to the maximum, over joint distributions of the adversary’s action sequence, of the difference between a sum of minimal expected losses and the minimal empirical loss. We show that the optimal regret has a natural geometric interpretation, since it can be viewed as the gap in Jensen’s inequality for a concave functional (the minimizer over the player’s actions of expected loss) defined on a set of probability distributions. We use this expression to obtain upper and lower bounds on the regret of an optimal strategy for a variety of online learning problems. Our method provides upper bounds without the need to construct a learning algorithm; the lower bounds provide explicit optimal strategies for the adversary.
The Last-Step Minimax Algorithm
Pages 279–290 of: Proc. 11th International Conference on Algorithmic Learning Theory, 2000
Abstract

Cited by 17 (3 self)
We consider online density estimation with a parameterized density from an exponential family. In each trial t the learner predicts a parameter θ_t. Then it receives an instance x_t chosen by the adversary and incurs loss −ln p(x_t | θ_t), which is the negative log-likelihood of x_t w.r.t. the predicted density of the learner. The performance of the learner is measured by the regret, defined as the total loss of the learner minus the total loss of the best parameter chosen offline. We develop an algorithm called the Last-step Minimax Algorithm that predicts with the minimax optimal parameter assuming that the current trial is the last one. For one-dimensional exponential families, we give an explicit form of the prediction of the Last-step Minimax Algorithm and show that its regret is O(ln T), where T is the number of trials. In particular, for Bernoulli density estimation the Last-step Minimax Algorithm is slightly better than the standard Laplace estimator. This work was done while...
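The Laplace-estimator baseline mentioned at the end of the abstract is easy to sketch. The helper below is illustrative (not from the paper): after seeing k ones in t trials it predicts p = (k+1)/(t+2), accumulates log loss, and subtracts the loss of the best fixed Bernoulli parameter in hindsight (the empirical frequency), giving the regret, which grows like O(ln T).

```python
import math

def laplace_regret(xs):
    """Regret of the Laplace (add-one) estimator on a binary sequence xs,
    measured against the best fixed Bernoulli parameter in hindsight.
    A hedged sketch of the baseline, not the Last-step Minimax Algorithm."""
    k, loss = 0, 0.0
    for t, x in enumerate(xs):
        p = (k + 1) / (t + 2)                  # Laplace estimate
        loss += -math.log(p if x == 1 else 1 - p)
        k += x
    n, T = sum(xs), len(xs)
    q = n / T                                  # best fixed parameter offline
    best = -(n * math.log(q) + (T - n) * math.log(1 - q)) if 0 < n < T else 0.0
    return loss - best
```

For the all-ones sequence of length T the per-trial losses telescope, so the regret is exactly ln(T + 1).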
Representation and Learning in Computational Game Theory
, 2003
Abstract
Game theory has emerged as the key tool for understanding and designing complex multi-agent environments such as the Internet, systems of autonomous agents, and electronic communities or economies. To support these relatively recent uses of game theory, as well as for more ambitious modeling in some of the traditional application areas, there is a growing need for a computational theory that is largely absent from classical game theory research. Such a computational theory needs to provide a rich and flexible collection of models and representations for complex game-theoretic problems; powerful and efficient algorithms for manipulating and learning these models; and a deep understanding of the algorithmic and resource issues arising in all aspects of game theory. The overarching goal of the proposed work is to “scale up” the applicability of game theory, in much the same way that Bayesian networks and associated advances made complex, high-dimensional probabilistic modeling possible in a wide set of applications in computer science and beyond. Two of the most important topics that have materialized to date, and the primary emphases of the current proposal, are the representation and efficient manipulation of large and complex games, and new approaches to learning in game-theoretic settings. On the topic of representation, the proposal includes the development of methods to model structured interaction in large-population games; the intersection of social network theory and game theory; new representations in repeated games; and representational issues for a
Minimax Time Series Prediction
Abstract
We consider an adversarial formulation of the problem of predicting a time series with square loss. The aim is to predict an arbitrary sequence of vectors almost as well as the best smooth comparator sequence in retrospect. Our approach allows natural measures of smoothness such as the squared norm of increments. More generally, we consider a linear time series model and penalize the comparator sequence through the energy of the implied driving noise terms. We derive the minimax strategy for all problems of this type and show that it can be implemented efficiently. The optimal predictions are linear in the previous observations. We obtain an explicit expression for the regret in terms of the parameters defining the problem. For typical, simple definitions of smoothness, the computation of the optimal predictions involves only sparse matrices. In the case of norm-constrained data, where the smoothness is defined in terms of the squared norm of the comparator's increments, we show that the regret grows as T/√λ_T, where T is the length of the game and λ_T is an increasing limit on comparator smoothness.
Abstract
, 2006
Abstract
We develop a new collaborative filtering (CF) method that combines both previously known users' preferences, i.e. standard CF, as well as product/user attributes, i.e. classical function approximation, to predict a given user's interest in a particular product. Our method is a generalized low-rank matrix completion problem, where we learn a function whose inputs are pairs of vectors, the standard low-rank matrix completion problem being a special case where the inputs to the function are the row and column indices of the matrix. We solve this generalized matrix completion problem using tensor product kernels, for which we also formally generalize standard kernel properties. Benchmark experiments on movie ratings show the advantages of our generalized matrix completion method over the standard matrix completion one with no information about movies or people, as well as over standard multi-task or single-task learning methods.
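The tensor product kernel idea can be sketched in a few lines: a kernel over (user, product) pairs built by multiplying a user kernel with a product kernel. The linear kernels and function names below are illustrative assumptions, not the paper's actual choices; the construction is valid because a pointwise product of positive semidefinite kernels is itself positive semidefinite.

```python
def linear_kernel(a, b):
    """Plain inner product between two feature vectors."""
    return sum(x * y for x, y in zip(a, b))

def product_kernel(user1, prod1, user2, prod2):
    """Tensor product kernel over (user, product) pairs: the product of a
    user kernel and a product kernel (a hedged sketch with linear kernels)."""
    return linear_kernel(user1, user2) * linear_kernel(prod1, prod2)
```

Note that the kernel vanishes whenever either the users or the products are orthogonal, so similarity must hold on both coordinates for a pair to influence a prediction.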
The price of bandit information for online optimization
Abstract
In the online linear optimization problem, a learner must choose, in each round, a decision from a set D ⊂ R^n in order to minimize an (unknown and changing) linear cost function. We present sharp rates of convergence (with respect to additive regret) for both the full information setting (where the cost function is revealed at the end of each round) and the bandit setting (where only the scalar cost incurred is revealed). In particular, this paper is concerned with the price of bandit information: how much worse the regret is in the bandit case as compared to the full information case. For the full information case, the upper bound on the regret is O*(√(nT)), where n is the ambient dimension and T is the time horizon. For the bandit case, we present an algorithm which achieves O*(n^{3/2}√T) regret; all previous (nontrivial) bounds here were O(poly(n) T^{2/3}) or worse. It is striking that the convergence rate for the bandit setting is only a factor of n worse than in the full information case, in stark contrast to the K-arm bandit setting, where the gap in the dependence on K is exponential (√(TK) vs. √(T log K)). We also present lower bounds showing that this gap is at least √n, which we conjecture to be the correct order. The bandit algorithm we present can be implemented efficiently in special cases of particular interest, such as path planning and Markov Decision Problems.
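The full-information baseline can be sketched with online projected gradient descent over the unit ball, a standard strategy with O(√T)-type regret against linear costs. This is a hedged illustration of the setting, not the paper's bandit algorithm; the step size, decision set, and function name are assumptions.

```python
import math

def ogd_linear(costs, eta):
    """Online projected gradient descent on the unit ball in R^n against a
    sequence of linear cost vectors. Returns the total cost incurred.
    A hedged sketch of the full-information baseline, not the bandit method."""
    n = len(costs[0])
    x = [0.0] * n
    total = 0.0
    for c in costs:
        total += sum(xi * ci for xi, ci in zip(x, c))   # pay <x, c> this round
        x = [xi - eta * ci for xi, ci in zip(x, c)]     # gradient step
        norm = math.sqrt(sum(xi * xi for xi in x))
        if norm > 1.0:                                  # project onto unit ball
            x = [xi / norm for xi in x]
    return total
```

For instance, against the constant cost vector (1, 0) repeated 5 times with eta = 0.5, the iterate slides to (−1, 0) and the total cost is −3.5, versus −5 for the best fixed point in hindsight, so the regret stays bounded.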
Online estimation with the multivariate Gaussian distribution
Abstract
We consider online density estimation with the multivariate Gaussian distribution. In each of a sequence of trials, the learner must posit a mean µ and covariance Σ; the learner then receives an instance x and incurs loss equal to the negative log-likelihood of x under the Gaussian density parameterized by (µ, Σ). We prove bounds on the regret for the follow-the-leader strategy, which amounts to choosing the sample mean and covariance of the previously seen data.
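The follow-the-leader strategy described in the abstract is simple to sketch in one dimension: predict with the sample mean and variance of the data seen so far, then pay the Gaussian negative log-likelihood of the next instance. The default (mu0, var0) used before two points have arrived is an assumption made here to keep the sketch well defined; the paper's handling of the initial trials may differ.

```python
import math

def ftl_gaussian(xs, mu0=0.0, var0=1.0):
    """Cumulative log loss of 1-D follow-the-leader Gaussian prediction
    (a hedged sketch; the paper treats the multivariate case)."""
    total = 0.0
    seen = []
    for x in xs:
        if len(seen) >= 2:
            mu = sum(seen) / len(seen)                       # sample mean
            var = sum((s - mu) ** 2 for s in seen) / len(seen)  # sample variance
        else:
            mu, var = mu0, var0   # assumed default before enough data is seen
        # negative log-likelihood of x under N(mu, var)
        total += 0.5 * math.log(2 * math.pi * var) + (x - mu) ** 2 / (2 * var)
        seen.append(x)
    return total
```

Note the sketch assumes the observed data are not all identical, so the sample variance stays positive.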
Minimax Regret Classifier for Imprecise Class Distributions
Abstract
The design of a minimum risk classifier based on data usually stems from the stationarity assumption that the conditions during training and test are the same: the misclassification costs assumed during training must be in agreement with real costs, and the same statistical process must have generated both training and test data. Unfortunately, in real-world applications, these assumptions may not hold. This paper deals with the problem of training a classifier when prior probabilities cannot be reliably induced from training data. Some strategies based on optimizing the worst possible case (conventional minimax) have been proposed previously in the literature, but they may achieve robust classification at the expense of severe performance degradation. In this paper we propose a minimax regret (minimax deviation) approach, which seeks to minimize the maximum deviation from the performance of the optimal risk classifier. A neural-based minimax regret classifier for general multiclass decision problems is presented. Experimental results show its robustness and its advantages relative to other approaches.