Results 1–10 of 179
A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting
, 1996
Logarithmic regret algorithms for online convex optimization
 In 19th COLT
, 2006
Abstract

Cited by 166 (31 self)
Abstract. In an online convex optimization problem a decision-maker makes a sequence of decisions, i.e., chooses a sequence of points in Euclidean space, from a fixed feasible set. After each point is chosen, it encounters a sequence of (possibly unrelated) convex cost functions. Zinkevich [Zin03] introduced this framework, which models many natural repeated decision-making problems and generalizes many existing problems such as Prediction from Expert Advice and Cover’s Universal Portfolios. Zinkevich showed that a simple online gradient descent algorithm achieves additive regret O(√T), for an arbitrary sequence of T convex cost functions (of bounded gradients), with respect to the best single decision in hindsight. In this paper, we give algorithms that achieve regret O(log(T)) for an arbitrary sequence of strictly convex functions (with bounded first and second derivatives). This mirrors what has been done for the special cases of prediction from expert advice by Kivinen and Warmuth [KW99], and Universal Portfolios by Cover [Cov91]. We propose several algorithms achieving logarithmic regret, which besides being more general are also much more efficient to implement. The main new ideas give rise to an efficient algorithm based on the Newton method for optimization, a new tool in the field. Our analysis shows a surprising connection to the follow-the-leader method, and builds on the recent work of Agarwal and Hazan [AH05]. We also analyze other algorithms, which tie together several different previous approaches including follow-the-leader, exponential weighting, Cover’s algorithm and gradient descent.
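The online gradient descent guarantee cited above can be sketched concretely; the ball-shaped feasible set, the gradient bound G, and the per-round gradient-oracle interface are illustrative assumptions of this sketch, not details from the paper:

```python
import math

def online_gradient_descent(grad_fns, x0, radius=1.0, G=1.0):
    # Zinkevich-style online gradient descent over a Euclidean ball
    # (an illustrative feasible set). Each round: step against the
    # observed gradient with eta_t = D / (G * sqrt(t)), then project
    # back onto the ball; this step-size schedule is what yields the
    # O(sqrt(T)) regret bound mentioned in the abstract.
    x = list(x0)
    D = 2.0 * radius  # diameter of the feasible set
    history = []
    for t, grad in enumerate(grad_fns, start=1):
        history.append(list(x))
        eta = D / (G * math.sqrt(t))
        g = grad(x)
        x = [xi - eta * gi for xi, gi in zip(x, g)]
        norm = math.sqrt(sum(xi * xi for xi in x))
        if norm > radius:  # Euclidean projection onto the ball
            x = [xi * radius / norm for xi in x]
    return history
```

The returned history is the sequence of played points, whose cumulative cost the regret bound compares against the best fixed point in hindsight.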
Efficient Algorithms for Online Decision Problems
 J. Comput. Syst. Sci
, 2003
Abstract

Cited by 162 (3 self)
In an online decision problem, one makes a sequence of decisions without knowledge of the future. Tools from learning such as Weighted Majority and its many variants [13, 18, 4] demonstrate that online algorithms can perform nearly as well as the best single decision chosen in hindsight, even when there are exponentially many possible decisions. However, the naive application of these algorithms is inefficient for such large problems. For some problems with nice structure, specialized efficient solutions have been developed [10, 16, 17, 6, 3].
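The Weighted Majority scheme the abstract mentions can be sketched for binary predictions; the halving factor beta and the {0, 1} prediction interface are illustrative choices:

```python
def weighted_majority(predictions, outcomes, beta=0.5):
    # Deterministic Weighted Majority sketch: predict the weighted
    # majority vote of the experts, then multiply the weight of every
    # expert that erred by beta. The number of mistakes stays within a
    # constant factor of the best single expert's mistake count.
    n = len(predictions[0])
    w = [1.0] * n
    mistakes = 0
    for preds, y in zip(predictions, outcomes):  # preds[i], y in {0, 1}
        vote_1 = sum(wi for wi, p in zip(w, preds) if p == 1)
        guess = 1 if vote_1 >= sum(w) / 2 else 0
        mistakes += (guess != y)
        w = [wi * beta if p != y else wi for wi, p in zip(w, preds)]
    return mistakes, w
```

For exponentially many structured decisions this naive loop is exactly the inefficiency the abstract points out: it touches every expert's weight each round.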
Relative Loss Bounds for Online Density Estimation with the Exponential Family of Distributions
 MACHINE LEARNING
, 2000
Abstract

Cited by 136 (14 self)
We consider online density estimation with a parameterized density from the exponential family. The online algorithm receives one example at a time and maintains a parameter that is essentially an average of the past examples. After receiving an example the algorithm incurs a loss, which is the negative log-likelihood of the example with respect to the past parameter of the algorithm. An off-line algorithm can choose the best parameter based on all the examples. We prove bounds on the additional total loss of the online algorithm over the total loss of the best off-line parameter. These relative loss bounds hold for an arbitrary sequence of examples. The goal is to design algorithms with the best possible relative loss bounds. We use a Bregman divergence to derive and analyze each algorithm. These divergences are relative entropies between two exponential distributions. We also use our methods to prove relative loss bounds for linear regression.
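A minimal instance of the "parameter is an average of past examples" idea, using a unit-variance Gaussian (one member of the exponential family) as an illustrative choice:

```python
import math

def online_gaussian_nll(stream):
    # Before seeing example x_t we predict with the running mean of
    # x_1..x_{t-1} and pay its negative log-likelihood under a
    # unit-variance Gaussian; then the mean is updated to include x_t.
    # The Gaussian is an illustrative exponential-family member, not
    # the only case the paper covers.
    mu, count, total_loss = 0.0, 0, 0.0
    for x in stream:
        total_loss += 0.5 * (x - mu) ** 2 + 0.5 * math.log(2 * math.pi)
        count += 1
        mu += (x - mu) / count  # running average of the past examples
    return mu, total_loss
```

The relative loss bounds in the paper compare this accumulated total_loss against the loss of the single best parameter chosen off-line.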
Regret in the Online Decision Problem
, 1999
Abstract

Cited by 116 (2 self)
At each point in time a decision maker must choose a decision. The payoff in a period from the decision chosen depends on the decision as well as the state of the world that obtains at that time. The difficulty is that the decision must be made in advance of any knowledge, even probabilistic, about which state of the world will obtain. A range of problems from a variety of disciplines can be framed in this way. In this
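The regret being studied can be computed directly for a finished play sequence; the payoff-matrix interface below is an illustrative choice:

```python
def external_regret(payoffs, chosen):
    # payoffs[t][a]: payoff of action a in period t (the "state of the
    # world" is folded into the payoff row); chosen[t]: action played.
    # External regret = best fixed action's total payoff in hindsight
    # minus the total payoff actually realized.
    n_actions = len(payoffs[0])
    realized = sum(payoffs[t][chosen[t]] for t in range(len(payoffs)))
    best_fixed = max(sum(row[a] for row in payoffs)
                     for a in range(n_actions))
    return best_fixed - realized
```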
Using and combining predictors that specialize
 In 29th STOC
, 1997
Abstract

Cited by 108 (14 self)
Abstract. We study online learning algorithms that predict by combining the predictions of several subordinate prediction algorithms, sometimes called “experts.” These simple algorithms belong to the multiplicative weights family of algorithms. The performance of these algorithms degrades only logarithmically with the number of experts, making them particularly useful in applications where the number of experts is very large. However, in applications such as text categorization, it is often natural for some of the experts to abstain from making predictions on some of the instances. We show how to transform algorithms that assume that all experts are always awake to algorithms that do not require this assumption. We also show how to derive corresponding loss bounds. Our method is very general, and can be applied to a large family of online learning algorithms. We also give applications to various prediction models including decision graphs and “switching” experts.
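One round of a sleeping-experts update in the spirit the abstract describes might look as follows; the exact rescaling rule and the learning rate eta are assumptions of this sketch, not necessarily the paper's algorithm:

```python
import math

def specialists_round(weights, awake, losses, eta=0.5):
    # Sleeping-experts sketch: asleep experts keep their weights
    # untouched; awake experts take a multiplicative exp(-eta * loss)
    # update, rescaled so the total awake mass is unchanged. The
    # prediction mixture p is formed over awake experts only.
    awake_mass = sum(weights[i] for i in awake)
    p = {i: weights[i] / awake_mass for i in awake}
    new = {i: weights[i] * math.exp(-eta * losses[i]) for i in awake}
    scale = awake_mass / sum(new.values())  # preserve awake mass
    for i in awake:
        weights[i] = new[i] * scale
    return p, weights
```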
Universal Portfolios with Side Information
 IEEE Transactions on Information Theory
, 1996
Abstract

Cited by 103 (4 self)
We present a sequential investment algorithm, the µ-weighted universal portfolio with side information, which achieves, to first order in the exponent, the same wealth as the best side-information dependent investment strategy (the best state-constant rebalanced portfolio) determined in hindsight from observed market and side-information outcomes. This is an individual sequence result which shows that the difference between the exponential growth rates of wealth of the best state-constant rebalanced portfolio and the universal portfolio with side information is uniformly less than (d/(2n)) log(n + 1) + (k/n) log 2 for every stock market and side-information sequence and for all time n. Here d = k(m − 1) is the number of degrees of freedom in the state-constant rebalanced portfolio with k states of side information and m stocks. The proof of this result establishes a close connection between universal investment and universal data compression. Keywords: Universal investment, univ...
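The benchmark in this bound, a constant rebalanced portfolio, has a wealth that is easy to compute from a sequence of price relatives; this sketch uses a hypothetical two-argument interface:

```python
def crp_wealth(relatives, b):
    # Wealth of a constant rebalanced portfolio b over a sequence of
    # price relatives (relatives[t][i] = period-t price ratio of stock
    # i). Each period the holdings are rebalanced back to proportions
    # b, so wealth multiplies by the inner product b . x_t.
    wealth = 1.0
    for x in relatives:
        wealth *= sum(bi * xi for bi, xi in zip(b, x))
    return wealth
```

On the alternating market in the test below, the uniform CRP grows while either single stock merely breaks even, which is the effect universal portfolios aim to capture without hindsight.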
The multiplicative weights update method: a meta algorithm and applications
, 2005
Abstract

Cited by 100 (12 self)
Algorithms in varied fields use the idea of maintaining a distribution over a certain set and use the multiplicative update rule to iteratively change these weights. Their analyses are usually very similar and rely on an exponential potential function. We present a simple meta algorithm that unifies these disparate algorithms and derives them as simple instantiations of the meta algorithm.
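The shared multiplicative update rule can be written as a Hedge-style loop; the fixed learning rate eta and losses in [0, 1] are illustrative assumptions:

```python
import math

def hedge(loss_rounds, n_experts, eta=0.5):
    # Multiplicative weights sketch: keep one weight per element,
    # play the normalized weight distribution, then multiply each
    # weight by exp(-eta * loss). The exponential potential sum(w)
    # is what the abstract's unified analysis tracks.
    w = [1.0] * n_experts
    total_algo_loss = 0.0
    for losses in loss_rounds:      # losses[i] in [0, 1]
        s = sum(w)
        p = [wi / s for wi in w]    # distribution over the set
        total_algo_loss += sum(pi * li for pi, li in zip(p, losses))
        w = [wi * math.exp(-eta * li) for wi, li in zip(w, losses)]
    return total_algo_loss, w
```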
Online portfolio selection using multiplicative updates
 Mathematical Finance
, 1998
Abstract

Cited by 86 (10 self)
We present an online investment algorithm which achieves almost the same wealth as the best constant-rebalanced portfolio determined in hindsight from the actual market outcomes. The algorithm employs a multiplicative update rule derived using a framework introduced by Kivinen and Warmuth [20]. Our algorithm is very simple to implement and requires only constant storage and computing time per stock in each trading period. We tested the performance of our algorithm on real stock data from the New York Stock Exchange accumulated during a 22-year period. On this data, our algorithm clearly outperforms the best single stock as well as Cover's universal portfolio selection algorithm. We also present results for the situation in which the ...
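The multiplicative update the abstract describes can be sketched per trading period; the exponentiated-gradient form and the learning rate eta here are assumptions of this sketch rather than the paper's exact rule:

```python
import math

def eg_update(b, x, eta=0.05):
    # One period of a multiplicative portfolio update: each weight is
    # multiplied by exp(eta * x_i / (b . x)) and the result is
    # renormalized, so the work per stock per period is constant.
    # b: current portfolio proportions; x: this period's price relatives.
    dot = sum(bi * xi for bi, xi in zip(b, x))
    new = [bi * math.exp(eta * xi / dot) for bi, xi in zip(b, x)]
    s = sum(new)
    return [ni / s for ni in new]
```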
Asymptotic calibration
, 1998
Abstract

Cited by 85 (4 self)
Can we forecast the probability of an arbitrary sequence of events happening so that the stated probability of an event happening is close to its empirical probability? We can view this prediction problem as a game played against Nature, where at the beginning of the game Nature picks a data sequence and the forecaster picks a forecasting algorithm. If the forecaster is not allowed to randomise, then Nature wins; there will always be data for which the forecaster does poorly. This paper shows that, if the forecaster can randomise, the forecaster wins in the sense that the forecasted probabilities and the empirical probabilities can be made arbitrarily close to each other.
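Whether stated probabilities are close to empirical frequencies can be checked on data by binning the forecasts; the bin count and interface here are illustrative:

```python
def calibration_gaps(forecasts, outcomes, bins=10):
    # Empirical calibration check: bucket the forecast probabilities,
    # then compare each bucket's average forecast with the empirical
    # frequency of the event inside that bucket. A well-calibrated
    # forecaster makes every gap small.
    sums = [[0.0, 0.0, 0] for _ in range(bins)]  # fsum, osum, count
    for p, y in zip(forecasts, outcomes):
        b = min(int(p * bins), bins - 1)
        sums[b][0] += p
        sums[b][1] += y
        sums[b][2] += 1
    return [abs(f / n - o / n) for f, o, n in sums if n > 0]
```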