Results 1–10 of 17
Sequential procedures for aggregating arbitrary estimators of a conditional mean, 2005
Cited by 34 (1 self)
In this paper we describe and analyze a sequential procedure for aggregating linear combinations of a finite family of regression estimates, with particular attention to linear combinations having coefficients in the generalized simplex. The procedure is based on exponential weighting, and has a computationally tractable approximation. Analysis of the procedure is based in part on techniques from the sequential prediction of nonrandom sequences. Here these techniques are applied in a stochastic setting to obtain cumulative loss bounds for the aggregation procedure. From the cumulative loss bounds we derive an oracle inequality for the aggregate estimator for an unbounded response having a suitable moment generating function. The inequality shows that the risk of the aggregate estimator is less than the risk of the best candidate linear combination in the generalized simplex, plus a complexity term that depends on the size of the coefficient set. The inequality readily yields convergence rates for aggregation over the unit simplex that are within logarithmic factors of known minimax bounds. Some preliminary results on model selection are also presented.
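The exponential-weighting idea at the core of this procedure can be sketched in its simplest form: aggregating a finite family of candidate predictors rather than the paper's generalized simplex of linear combinations. The learning rate `eta` and the stabilization trick are illustrative choices, not the paper's tuning:

```python
import numpy as np

def ew_aggregate(predictions, targets, eta=0.5):
    """Exponentially weighted aggregation of m candidate predictors.

    predictions: (n, m) array; predictions[t, j] is predictor j's forecast at time t.
    targets:     (n,) array of observed responses.
    Returns the sequence of aggregate predictions.
    """
    n, m = predictions.shape
    cum_loss = np.zeros(m)          # cumulative squared loss per predictor
    aggregate = np.empty(n)
    for t in range(n):
        # Weights proportional to exp(-eta * cumulative loss); subtracting
        # the minimum stabilizes the exponentials numerically.
        w = np.exp(-eta * (cum_loss - cum_loss.min()))
        w /= w.sum()
        aggregate[t] = w @ predictions[t]               # convex combination
        cum_loss += (predictions[t] - targets[t]) ** 2  # update after observing y_t
    return aggregate
```

Predictors with small cumulative loss dominate the mixture exponentially fast, which is what drives the oracle inequality's "best candidate plus complexity term" form.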
Strategies for Sequential Prediction of Stationary Time Series, 2001
Cited by 19 (7 self)
We present simple procedures for the prediction of a real-valued sequence. The algorithms are based on a combination of several simple predictors. We show that if the sequence is a realization of a bounded stationary and ergodic random process, then the average of squared errors converges, almost surely, to that of the optimum, given by the Bayes predictor. We offer an analogous result for the prediction of stationary Gaussian processes. The work of the second author was supported by DGES grant PB960300.
Universal switching linear least squares prediction, IEEE Trans. on Signal Processing, 2007
Cited by 14 (4 self)
In this paper, we consider sequential regression of individual sequences under the square-error loss. We focus on the class of switching linear predictors that can segment a given individual sequence into an arbitrary number of blocks, within each of which a fixed linear regressor is applied. Using a competitive algorithm framework, we construct sequential algorithms that are competitive with the best linear regression algorithms for any segmenting of the data, as well as the best partitioning of the data into any fixed number of segments, where both the segmenting of the data and the linear predictors within each segment can be tuned to the underlying individual sequence. The algorithms do not require knowledge of the data length or the number of piecewise linear segments used by the members of the competing class, yet can achieve the performance of the best member that can choose both the partitioning of the sequence and the best regressor within each segment. We use a transition diagram (F. M. J. Willems, 1996) to compete with an exponential number of algorithms in the class, using complexity that is linear in the data length. The regret with respect to the best member is O(ln(n)) per transition for not knowing the best transition times and O(ln(n)) for not knowing the best regressor within each segment, where n is the data length. We construct lower bounds on the performance of any sequential algorithm, demonstrating a form of min-max optimality under certain settings. We also consider the case where the members are restricted to choose the best algorithm in each segment from a finite collection of candidate algorithms. Performance on synthetic and real data is given, along with a Matlab implementation of the universal switching linear predictor. Index Terms—Piecewise continuous, prediction, transition diagram, universal.
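The transition-diagram machinery above is involved; a much simpler stand-in for competing with switching predictors is the fixed-share update of Herbster and Warmuth, sketched here with hypothetical parameters `eta` and `alpha`. This is not the paper's algorithm, but it illustrates how sharing a little weight mass each step lets the mixture track segment changes:

```python
import numpy as np

def fixed_share(predictions, targets, eta=0.5, alpha=0.05):
    """Fixed-share weighting over m experts under squared loss.

    alpha is the assumed per-step switching rate: after each exponential
    weight update, a fraction alpha of the mass is redistributed uniformly,
    so no expert's weight ever decays to zero and switches can be tracked.
    """
    n, m = predictions.shape
    w = np.full(m, 1.0 / m)
    out = np.empty(n)
    for t in range(n):
        out[t] = w @ predictions[t]                       # mixture prediction
        w = w * np.exp(-eta * (predictions[t] - targets[t]) ** 2)
        w /= w.sum()
        w = (1 - alpha) * w + alpha / m                   # share step
    return out
```

On a sequence whose best expert changes halfway through, the mixture re-adapts within a few steps instead of paying the full loss of either single expert.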
Universal piecewise linear prediction via context trees, IEEE Transactions on Signal Processing (accepted), 2006
Cited by 6 (4 self)
This paper considers the problem of piecewise linear prediction from a competitive algorithm approach. In prior work, prediction algorithms have been developed that are "universal" with respect to the class of all linear predictors, such that they perform nearly as well, in terms of total squared prediction error, as the best linear predictor that is able to observe the entire sequence in advance. In this paper, we introduce the use of a "context tree" to compete against a doubly exponential number of piecewise linear (affine) models. We use the context tree to achieve the total squared prediction error performance of the best piecewise linear model that can choose both its partitioning of the regressor space and its real-valued prediction parameters within each region of the partition, based on observing the entire sequence in advance, uniformly, for every bounded individual sequence. This performance is achieved with a prediction algorithm whose complexity is only linear in the depth of the context tree per prediction. Upper bounds on the regret with respect to the best piecewise linear predictor are given for both the scalar and higher-order cases, and lower bounds on the regret are given for the scalar case. An explicit algorithmic description and examples demonstrating the performance of the algorithm are given. Index Terms—Context tree, piecewise linear, prediction, universal.
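The "doubly exponential number of piecewise linear (affine) models" corresponds to the number of complete prunings of a full binary context tree, which is counted by a standard recurrence (this count is a general fact about binary trees, not taken from the paper):

```python
def num_prunings(depth):
    """Number of complete prunings of a full binary context tree of the
    given depth: each node is either kept as a leaf or both of its children
    are expanded, giving c(0) = 1 and c(d) = c(d-1)**2 + 1.  The sequence
    grows doubly exponentially in the depth."""
    c = 1
    for _ in range(depth):
        c = c * c + 1
    return c
```

Already at depth 5 there are 458,330 distinct piecewise models, which is why per-model enumeration is hopeless and a weighting scheme with complexity linear in the tree depth matters.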
Universal Context Tree Least Squares Prediction
Cited by 3 (2 self)
We investigate the problem of sequential prediction of individual sequences using a competitive algorithm approach. We have previously developed prediction algorithms that are universal with respect to the class of all linear predictors, such that the prediction algorithm competes against a continuous class of prediction algorithms under the square-error loss. In this paper, we introduce the use of a "context tree" to compete against a doubly exponential number of piecewise linear models. We use the context tree to achieve the performance of the best piecewise linear model that can choose its partition of the real line and real-valued prediction parameters, based on observing the entire sequence in advance, for the square-error loss, uniformly, for any individual sequence. This performance is achieved with a prediction algorithm whose complexity is only linear in the depth of the context tree.
Universal piecewise linear least squares prediction, in Proceedings of ISIT, 2004
Cited by 2 (2 self)
We consider the problem of sequential prediction of real-valued sequences using piecewise linear models under the square-error loss function. In this context, we demonstrate a sequential algorithm for prediction whose accumulated squared error for every bounded sequence is asymptotically as small as that of the best fixed predictor for that sequence taken from the class of piecewise linear predictors. We also show that this predictor is optimal in certain settings in a particular min-max sense. This approach can also be applied to the class of piecewise constant predictors, for which a similar universal sequential algorithm can be derived with corresponding min-max optimality.

I. Summary. In this paper, we consider the problem of predicting a sequence x^n = {x[t]}_{t=1}^n as well as the best piecewise linear predictor out of a large, continuous class of piecewise linear predictors. The real-valued sequence x^n is assumed to be bounded, i.e. |x[t]| ≤ A for some A < ∞, for all t. Rather than assuming a statistical ensemble of sequences and attempting to achieve optimal performance according to some statistical criterion, our goal is to predict any sequence x^n as well as the best predictor out of a large class of predictors. We first consider the class of fixed scalar piecewise linear predictors as our competition class. For a scalar piecewise linear predictor, the past observation space x[t−1] ∈ [−A, A] is parsed into K disjoint regions R_j where ∪_{j=1}^K R_j = [−A, A]. At each time t, the competing predictor forms its prediction as x̂_{w_j}[t] = w_j x[t−1], w_j ∈ ℝ, when x[t−1] ∈ R_j. We assume that the number of regions and the region boundaries are known. Here, we seek to minimize the following regret: sup_{x^n} [ Σ_{t=1}^n (x[t] − x̂_q[t])² − inf
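A minimal sketch of the competition class described above, assuming a known fixed partition and fitting each region's coefficient by regularized least squares updated online. The regularizer `delta` and the per-region running-sum update are illustrative choices, not the paper's algorithm:

```python
import numpy as np

def piecewise_linear_predict(x, boundaries, delta=1.0):
    """Sequential scalar piecewise linear prediction with a fixed partition.

    x:          bounded real-valued sequence.
    boundaries: sorted interior region boundaries in [-A, A].
    Each region j keeps its own ridge-regularized least-squares coefficient
    w_j = num_j / den_j; the prediction for x[t] is w_j * x[t-1], where j is
    the region containing x[t-1].
    """
    K = len(boundaries) + 1
    num = np.zeros(K)          # running sum of x[t-1] * x[t] per region
    den = np.full(K, delta)    # running sum of x[t-1]^2, plus regularizer
    preds = np.zeros(len(x))
    for t in range(1, len(x)):
        j = np.searchsorted(boundaries, x[t - 1])  # region of x[t-1]
        preds[t] = (num[j] / den[j]) * x[t - 1]    # predict before seeing x[t]
        num[j] += x[t - 1] * x[t]                  # then update with x[t]
        den[j] += x[t - 1] ** 2
    return preds
```

On a sequence generated by a two-slope piecewise rule, each region's coefficient converges to the slope active there, so the sequential predictions approach those of the best fixed piecewise linear predictor.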
Competitive online linear FIR MMSE filtering, in Proc. ISIT, 2007
Cited by 1 (0 self)
We consider the problem of causal estimation, i.e., filtering, of a real-valued signal corrupted by zero-mean, i.i.d., real-valued additive noise under the mean square error (MSE) criterion. We build a competitive online filtering algorithm whose normalized cumulative MSE, for every bounded underlying signal, is asymptotically as small as that of the best linear finite-duration impulse response (FIR) filter of order d. We do not assume any stochastic mechanism in generating the underlying signal, and assume only that the variance of the noise is known to the filter. The regret of our scheme is shown to decay on the order of O(log n/n), where n is the length of the signal. Moreover, we present a concentration of the average square error of our scheme to that of the best dth-order linear FIR filter. Our analysis combines tools from the problems of universal filtering and competitive online regression.
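A generic competitive online regression sketch in the spirit of the setup above: standard recursive least squares with an order-d FIR regressor. It ignores the paper's noise-variance side information and is not their filtering algorithm; `delta` is an assumed ridge regularizer:

```python
import numpy as np

def online_fir_rls(y, d=4, delta=1.0):
    """Online ridge-regularized least squares with an order-d FIR regressor:
    predict y[t] from the previous d samples via the standard RLS recursion
    (rank-one updates of the inverse regularized covariance)."""
    n = len(y)
    P = np.eye(d) / delta            # inverse regularized covariance
    w = np.zeros(d)                  # FIR coefficients
    preds = np.zeros(n)
    for t in range(d, n):
        u = y[t - d:t][::-1]         # regressor: most recent sample first
        preds[t] = w @ u             # predict before seeing y[t]
        Pu = P @ u
        k = Pu / (1.0 + u @ Pu)      # RLS gain vector
        w = w + k * (y[t] - preds[t])
        P = P - np.outer(k, Pu)      # rank-one inverse update
    return preds
```

A pure sinusoid satisfies an exact order-2 linear recursion, so an order-4 RLS predictor tracks it almost perfectly after a short adaptation period.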
A Lower Bound on the Performance of Sequential Prediction
We consider the problem of sequential linear prediction of real-valued sequences under the square-error loss function. For this problem, a prediction algorithm has been demonstrated [1][2] whose accumulated squared prediction error, for every bounded sequence, is asymptotically as small as that of the best fixed linear predictor for that sequence, taken from the class of all linear predictors of a given order p. The redundancy, or excess prediction error above that of the best predictor for that sequence, is upper bounded by A²p ln(n)/n, where n is the data length and the sequence is assumed to be bounded by some A. In this paper, we show that this predictor is optimal in a min-max sense, by deriving a corresponding lower bound, such that no sequential predictor can ever do better than a redundancy of A²p ln(n)/n.
Universal Piecewise Linear Regression of Individual Sequences: Lower Bound
We consider universal piecewise linear regression of real-valued bounded sequences under the squared loss function. In this setting, we present a lower bound on the regret of a universal sequential piecewise linear regressor compared to the best piecewise linear regressor that has access to the entire sequence in advance. This lower bound is tight with respect to the corresponding upper bounds, suggesting a min-max optimality of the sequential regressor, for every individual bounded sequence. Index Terms—Regression, piecewise linear, universal.