Results 1–10 of 17
Sequential procedures for aggregating arbitrary estimators of a conditional mean, 2005
Cited by 32 (1 self)
In this paper we describe and analyze a sequential procedure for aggregating linear combinations of a finite family of regression estimates, with particular attention to linear combinations having coefficients in the generalized simplex. The procedure is based on exponential weighting, and has a computationally tractable approximation. Analysis of the procedure is based in part on techniques from the sequential prediction of nonrandom sequences. Here these techniques are applied in a stochastic setting to obtain cumulative loss bounds for the aggregation procedure. From the cumulative loss bounds we derive an oracle inequality for the aggregate estimator for an unbounded response having a suitable moment generating function. The inequality shows that the risk of the aggregate estimator is less than the risk of the best candidate linear combination in the generalized simplex, plus a complexity term that depends on the size of the coefficient set. The inequality readily yields convergence rates for aggregation over the unit simplex that are within logarithmic factors of known minimax bounds. Some preliminary results on model selection are also presented.
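As a rough illustration of the exponential-weighting idea over the unit simplex (a minimal sketch, not the paper's procedure: the generalized simplex, the data-driven choice of temperature, and the tractable approximation are all omitted, and the function name and the value of eta are our own):

```python
import numpy as np

def exponential_weighting_aggregate(preds, y, eta=1.0):
    """Sequentially aggregate a finite family of estimators by exponential
    weighting under squared loss.

    preds : (n, K) array; preds[t, k] is estimator k's prediction at time t
    y     : (n,) array of observed responses
    eta   : temperature of the weighting (illustrative value)

    Returns the sequence of aggregate predictions.
    """
    n, K = preds.shape
    cum_loss = np.zeros(K)            # cumulative squared loss of each estimator
    out = np.empty(n)
    for t in range(n):
        w = np.exp(-eta * (cum_loss - cum_loss.min()))  # shift for stability
        w /= w.sum()                  # weights lie in the unit simplex
        out[t] = w @ preds[t]         # convex combination of the estimators
        cum_loss += (preds[t] - y[t]) ** 2
    return out
```

The weight of an estimator decays exponentially in its cumulative loss, so the aggregate concentrates on the best candidate as data accumulates.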
Strategies for Sequential Prediction of Stationary Time Series, 2001
Cited by 19 (7 self)
We present simple procedures for the prediction of a real-valued sequence. The algorithms are based on a combination of several simple predictors. We show that if the sequence is a realization of a bounded stationary and ergodic random process, then the average of squared errors converges, almost surely, to that of the optimum, given by the Bayes predictor. We offer an analogous result for the prediction of stationary Gaussian processes. The work of the second author was supported by DGES grant PB96-0300.
Universal switching linear least squares prediction
IEEE Trans. Signal Process., 2007
Universal piecewise linear prediction via context trees
IEEE Transactions on Signal Processing, 2006 (accepted)
Cited by 6 (4 self)
Abstract—This paper considers the problem of piecewise linear prediction from a competitive algorithm approach. In prior work, prediction algorithms have been developed that are “universal” with respect to the class of all linear predictors, such that they perform nearly as well, in terms of total squared prediction error, as the best linear predictor that is able to observe the entire sequence in advance. In this paper, we introduce the use of a “context tree” to compete against a doubly exponential number of piecewise linear (affine) models. We use the context tree to achieve the total squared prediction error performance of the best piecewise linear model that can choose both its partitioning of the regressor space and its real-valued prediction parameters within each region of the partition, based on observing the entire sequence in advance, uniformly, for every bounded individual sequence. This performance is achieved with a prediction algorithm whose complexity is only linear in the depth of the context tree per prediction. Upper bounds on the regret with respect to the best piecewise linear predictor are given for both the scalar and higher-order case, and lower bounds on the regret are given for the scalar case. An explicit algorithmic description and examples demonstrating the performance of the algorithm are given.
Index Terms—Context tree, piecewise linear, prediction, universal.
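A toy illustration of the context-tree idea: a binary tree recursively partitions the past-observation range, every node on the current observation's root-to-leaf path holds a scalar linear model, and the path models are mixed by exponential weighting. This is a simplified stand-in, not the paper's algorithm (which uses exact least squares per node and a recursive weighting achieving per-prediction complexity linear in the depth); all identifiers and parameter values are illustrative.

```python
import numpy as np

class Node:
    def __init__(self):
        self.w = 0.0        # scalar linear coefficient, fit online
        self.loss = 0.0     # cumulative squared loss of this node's predictor

def path_nodes(tree, x_prev, A, depth):
    """Nodes along the root-to-leaf path whose intervals contain x_prev."""
    lo, hi, key = -A, A, ""
    out = [tree.setdefault("", Node())]
    for _ in range(depth):
        mid = (lo + hi) / 2
        if x_prev < mid:
            key, hi = key + "0", mid
        else:
            key, lo = key + "1", mid
        out.append(tree.setdefault(key, Node()))
    return out

def ct_predict(x, depth=3, A=1.0, eta=0.5, lr=0.1):
    """Simplified context-tree piecewise linear predictor for x[t] from x[t-1]."""
    tree, preds = {}, []
    for t in range(1, len(x)):
        nodes = path_nodes(tree, x[t - 1], A, depth)
        losses = np.array([n.loss for n in nodes])
        w = np.exp(-eta * (losses - losses.min()))
        w /= w.sum()                                   # mix the path models
        node_preds = np.array([n.w * x[t - 1] for n in nodes])
        preds.append(float(w @ node_preds))
        for n, p in zip(nodes, node_preds):
            n.loss += (x[t] - p) ** 2
            n.w += lr * (x[t] - p) * x[t - 1]          # online gradient step
    return preds
```

Coarse nodes near the root see many samples and learn fast; deep nodes model finer pieces of the partition, and the weighting adaptively favors whichever granularity predicts best.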
Universal Context Tree Least Squares Prediction
Cited by 3 (2 self)
Abstract—We investigate the problem of sequential prediction of individual sequences using a competitive algorithm approach. We have previously developed prediction algorithms that are universal with respect to the class of all linear predictors, such that the prediction algorithm competes against a continuous class of prediction algorithms, under the square error loss. In this paper, we introduce the use of a “context tree” to compete against a doubly exponential number of piecewise linear models. We use the context tree to achieve the performance of the best piecewise linear model that can choose its partition of the real line and real-valued prediction parameters, based on observing the entire sequence in advance, for the square error loss, uniformly, for any individual sequence. This performance is achieved with a prediction algorithm whose complexity is only linear in the depth of the context tree.
Universal piecewise linear least squares prediction
in Proceedings of ISIT, 2004
Cited by 2 (2 self)
Abstract—We consider the problem of sequential prediction of real-valued sequences using piecewise linear models under the square-error loss function. In this context, we demonstrate a sequential algorithm for prediction whose accumulated squared error for every bounded sequence is asymptotically as small as that of the best fixed predictor for that sequence taken from the class of piecewise linear predictors. We also show that this predictor is optimal in certain settings in a particular min-max sense. This approach can also be applied to the class of piecewise constant predictors, for which a similar universal sequential algorithm can be derived with corresponding min-max optimality.

Summary: In this paper, we consider the problem of predicting a sequence x^n = {x[t]}_{t=1}^n as well as the best piecewise linear predictor out of a large, continuous class of piecewise linear predictors. The real-valued sequence x^n is assumed to be bounded, i.e., |x[t]| ≤ A for some A < ∞, for all t. Rather than assuming a statistical ensemble of sequences and attempting to achieve optimal performance according to some statistical criterion, our goal is to predict any sequence x^n as well as the best predictor out of a large class of predictors. We first consider the class of fixed scalar piecewise linear predictors as our competition class. For a scalar piecewise linear predictor, the past observation space x[t-1] ∈ [-A, A] is parsed into K disjoint regions R_j where ∪_{j=1}^K R_j = [-A, A]. At each time t, the competing predictor forms its prediction as x̂_{w_j}[t] = w_j x[t-1], with w_j ∈ R, when x[t-1] ∈ R_j. We assume that the number of regions and the region boundaries are known. Here, we seek to minimize the regret sup_{x^n} [ Σ_{t=1}^n (x[t] − x̂_q[t])² − inf_w Σ_{t=1}^n (x[t] − x̂_w[t])² ].
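The scalar competitor class described above is easy to sketch: with a known partition, predict in each region with the least squares coefficient fitted to the pairs seen in that region so far. This follow-the-leader rule, under our own naming, is a sketch rather than the paper's exact universal algorithm:

```python
import numpy as np

def piecewise_predict(x, boundaries):
    """Sequentially predict x[t] from x[t-1] with a known partition.

    boundaries : sorted interior boundaries splitting the observation
        range into K = len(boundaries) + 1 regions R_j.
    """
    K = len(boundaries) + 1
    sxx = np.zeros(K)        # per-region sum of x[t-1]^2
    sxy = np.zeros(K)        # per-region sum of x[t-1] * x[t]
    preds = []
    for t in range(1, len(x)):
        j = int(np.searchsorted(boundaries, x[t - 1]))  # region of x[t-1]
        w = sxy[j] / sxx[j] if sxx[j] > 0 else 0.0      # region LS coefficient
        preds.append(w * x[t - 1])                      # x_hat[t] = w_j x[t-1]
        sxx[j] += x[t - 1] ** 2
        sxy[j] += x[t - 1] * x[t]
    return preds
```

On a sequence that really is generated by a per-region linear rule, each region's coefficient locks on after a single visit and the remaining squared error is zero.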
Competitive online linear FIR MMSE filtering
in Proc. ISIT, 2007
Cited by 1 (0 self)
Abstract—We consider the problem of causal estimation, i.e., filtering, of a real-valued signal corrupted by zero-mean, i.i.d., real-valued additive noise under the mean square error (MSE) criterion. We build a competitive online filtering algorithm whose normalized cumulative MSE, for every bounded underlying signal, is asymptotically as small as that of the best linear finite-duration impulse response (FIR) filter of order d. We do not assume any stochastic mechanism in generating the underlying signal, and assume only that the variance of the noise is known to the filter. The regret of our scheme is shown to decay in the order of O(log n/n), where n is the length of the signal. Moreover, we present a concentration of the average square error of our scheme to that of the best dth-order linear FIR filter. Our analysis combines tools from the problems of universal filtering and competitive online regression.
Universal Linear Least Squares Prediction
Abstract—We consider the problem of sequential linear prediction of real-valued sequences under the square-error loss function. For this problem, a prediction algorithm has been demonstrated [1]–[3] whose accumulated squared prediction error, for every bounded sequence, is asymptotically as small as that of the best fixed linear predictor for that sequence, taken from the class of all linear predictors of a given order. The redundancy, or excess prediction error above that of the best predictor for that sequence, is upper-bounded by a term of order ln(n), where n is the data length and the sequence is assumed to be bounded by some constant. In this correspondence, we provide an alternative proof of this result by connecting it with universal probability assignment. We then show that this predictor is optimal in a min-max sense, by deriving a corresponding lower bound, such that no sequential predictor can ever do better than a redundancy of order ln(n).
Index Terms—Min-max, prediction, sequential probability assignment, universal algorithms.
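The standard route to this kind of ln(n) redundancy is regularized recursive least squares: at each step, predict with the ridge solution over all past data, then fold in the new sample with rank-one updates. A minimal sketch under our own naming (the initial regularization delta is an illustrative choice, and the rank-one updates are written directly rather than via the matrix-inversion lemma):

```python
import numpy as np

def universal_linear_predict(x, p=2, delta=1.0):
    """Order-p sequential linear prediction via regularized recursive
    least squares: predict x[t] from (x[t-1], ..., x[t-p])."""
    n = len(x)
    R = delta * np.eye(p)     # regularized autocorrelation matrix
    r = np.zeros(p)           # cross-correlation vector
    preds = np.zeros(n - p)
    for t in range(p, n):
        u = np.asarray(x[t - p:t][::-1], dtype=float)  # newest sample first
        w = np.linalg.solve(R, r)                      # current coefficients
        preds[t - p] = w @ u
        R += np.outer(u, u)                            # rank-one updates
        r += x[t] * u
    return preds
```

On a sequence that satisfies an exact order-p linear recurrence (e.g. a sampled cosine with p = 2), the coefficients converge and the late-time squared error becomes negligible.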
Universal Context Tree pth-Order Least Squares Prediction
We examine the sequential prediction of individual sequences under the square error loss using a competitive algorithm framework. Previous work has described a first-order algorithm that competes against a doubly exponential number of piecewise linear models. Using context trees, this first-order algorithm achieves the performance of the best piecewise linear first-order model that can choose its prediction parameters by observing the entire sequence in advance, uniformly, for any individual sequence, with a complexity only linear in the depth of the context tree. In this paper, we extend these results to a sequential predictor of order p ≥ 1 that asymptotically performs as well as the best piecewise linear pth-order model.