Logarithmic regret algorithms for online convex optimization
In 19th COLT, 2006
Cited by 65 (19 self)
Abstract. In an online convex optimization problem, a decision-maker makes a sequence of decisions, i.e., chooses a sequence of points in Euclidean space, from a fixed feasible set. After each point is chosen, it encounters a sequence of (possibly unrelated) convex cost functions. Zinkevich [Zin03] introduced this framework, which models many natural repeated decision-making problems and generalizes many existing problems such as Prediction from Expert Advice and Cover’s Universal Portfolios. Zinkevich showed that a simple online gradient descent algorithm achieves additive regret O(√T), for an arbitrary sequence of T convex cost functions (of bounded gradients), with respect to the best single decision in hindsight. In this paper, we give algorithms that achieve regret O(log(T)) for an arbitrary sequence of strictly convex functions (with bounded first and second derivatives). This mirrors what has been done for the special cases of prediction from expert advice by Kivinen and Warmuth [KW99], and Universal Portfolios by Cover [Cov91]. We propose several algorithms achieving logarithmic regret, which, besides being more general, are also much more efficient to implement. The main new ideas give rise to an efficient algorithm based on the Newton method for optimization, a new tool in the field. Our analysis shows a surprising connection to the follow-the-leader method, and builds on the recent work of Agarwal and Hazan [AH05]. We also analyze other algorithms, which tie together several different previous approaches including follow-the-leader, exponential weighting, Cover’s algorithm and gradient descent.
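The online gradient descent setting described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The feasible set (the unit ball), the quadratic cost f_t(x) = 0.5 * ||x - z_t||^2, the random targets z_t, and all function names are assumptions chosen for illustration. The sketch compares a 1/√t step size with a 1/t step size, the latter being suited to this particular cost because it is strongly convex with parameter 1.

```python
import math
import random


def project_to_unit_ball(x):
    """Euclidean projection onto the assumed feasible set {x : ||x|| <= 1}."""
    norm = math.sqrt(sum(v * v for v in x))
    return list(x) if norm <= 1.0 else [v / norm for v in x]


def loss(x, z):
    """Illustrative cost f_t(x) = 0.5 * ||x - z_t||^2 (an assumption, not from the paper)."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(x, z))


def online_gradient_descent(targets, step_size):
    """Commit to x_t, observe f_t, then take one projected gradient step."""
    x = [0.0] * len(targets[0])
    total = 0.0
    for t, z in enumerate(targets, start=1):
        total += loss(x, z)                   # cost is revealed only after x is chosen
        grad = [a - b for a, b in zip(x, z)]  # gradient of f_t at the played point
        eta = step_size(t)
        x = project_to_unit_ball([a - eta * g for a, g in zip(x, grad)])
    return total


def best_fixed_loss(targets):
    """Total cost of the best single decision in hindsight.

    For this quadratic cost the minimizer is the projected mean of the targets.
    """
    n = len(targets)
    mean = [sum(col) / n for col in zip(*targets)]
    best = project_to_unit_ball(mean)
    return sum(loss(best, z) for z in targets)


def regrets(T, dim=3, seed=0):
    """Return (regret with eta_t = 1/sqrt(t), regret with eta_t = 1/t)."""
    rng = random.Random(seed)
    targets = [
        project_to_unit_ball([rng.uniform(-1, 1) for _ in range(dim)])
        for _ in range(T)
    ]
    base = best_fixed_loss(targets)
    sqrt_rule = online_gradient_descent(targets, lambda t: 1.0 / math.sqrt(t)) - base
    harmonic_rule = online_gradient_descent(targets, lambda t: 1.0 / t) - base
    return sqrt_rule, harmonic_rule
```

Calling regrets(T) for increasing T gives a rough empirical comparison of the two step-size rules. In this toy example the 1/t rule reduces to playing the running mean of the past targets, which is exactly the follow-the-leader decision for this cost.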
Efficient algorithms for online game playing and universal portfolio management
2005
Cited by 5 (4 self)
We introduce a new algorithm and a new analysis technique that is applicable to a variety of online optimization scenarios, including regret minimization for Lipschitz regret functions, universal portfolio management, online convex optimization and online utility maximization. In addition to being more efficient and deterministic, our algorithm applies to a more general setting (e.g., when the payoff function is unknown). For the general online game playing setting it is the first to attain logarithmic regret, as opposed to previous algorithms attaining polynomial regret. The algorithm extends a natural online method studied in the 1950s, called “follow the leader”, thus answering in the affirmative a conjecture about universal portfolios made by Cover and Ordentlich and independently by Kalai and Vempala. The techniques also lead to a derandomization of an algorithm by Hannan, and Kalai and Vempala. Our analysis shows a surprising connection between interior point methods and online optimization by using the follow-the-leader method.

