Results 1 – 10 of 187
Variable Length Markov Chains
 Annals of Statistics
, 1999
Abstract

Cited by 86 (5 self)
We study estimation in the class of stationary variable length Markov chains (VLMC) on a finite space. The processes in this class are still Markovian of higher order, but with memory of variable length yielding a much bigger and structurally richer class of models than ordinary higher order Markov chains. From a more algorithmic view, the VLMC model class has attracted interest in information theory and machine learning, but its statistical properties have not been explored very much. Provided that good estimation is available, the additional structural richness of the model class enhances predictive power by finding a better tradeoff between model bias and variance, and allows a better structural description which can be of specific interest. The latter is exemplified with some DNA data. A version of the tree-structured context algorithm, proposed by Rissanen (1983) in an information theoretical setup, is shown to have new good asymptotic properties for estimation in the class of VLMC's, even when the underlying model increases in dimensionality: consistent estimation of minimal state spaces and mixing properties of fitted models are given. We also propose a new bootstrap scheme based on fitted VLMC's. We show its validity for quite general stationary categorical time series and for a broad range of statistical procedures.
AMS 1991 subject classifications: primary 62M05; secondary 60J10, 62G09, 62M10, 94A15.
Key words and phrases: bootstrap, categorical time series, central limit theorem, context algorithm, data compression, finite-memory sources, FSMX model, Kullback-Leibler distance, model selection, tree model.
Short title: Variable Length Markov Chains. Research supported in part by the Swiss National Science Foundation. Part of the work has been done while visiting th...
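The variable-memory idea behind the context algorithm can be illustrated with a small sketch. This is a toy approximation, not Rissanen's actual tree-growing-and-pruning procedure: all function names are invented, and the `min_count` cutoff is only a crude stand-in for the algorithm's pruning criterion. Next-symbol counts are kept for every short suffix, and prediction uses the longest suffix that survived the cutoff.

```python
from collections import Counter

def fit_contexts(seq, max_depth=4, min_count=2):
    """Tally next-symbol counts for every suffix (context) of length
    1..max_depth; drop contexts seen fewer than min_count times -- a
    crude stand-in for the pruning step of the context algorithm."""
    table = {}
    for t in range(len(seq) - 1):          # t = position of the context's last symbol
        for length in range(1, max_depth + 1):
            if t - length + 1 < 0:
                break
            ctx = tuple(seq[t - length + 1:t + 1])
            table.setdefault(ctx, Counter())[seq[t + 1]] += 1
    return {c: n for c, n in table.items() if sum(n.values()) >= min_count}

def predict_next(table, history, max_depth=4):
    """Predict the next symbol from the longest suffix of `history`
    present in the fitted table -- the memory length varies by context."""
    for length in range(min(max_depth, len(history)), 0, -1):
        ctx = tuple(history[-length:])
        if ctx in table:
            return table[ctx].most_common(1)[0][0]
    return None
```

On a periodic string such as "abababab", the fitted table predicts "a" after the context "ab" and "b" after "ba", using whatever context length the data support.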
Convergence of a stochastic approximation version of the EM algorithm
, 1997
Abstract

Cited by 85 (8 self)
The Expectation Maximization (EM) algorithm is a powerful computational technique for locating maxima of functions...
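For contrast with the stochastic approximation variant studied in the paper, here is a minimal sketch of the classical EM iteration on a two-component Gaussian mixture with unit variances. The function name and the initialisation are illustrative, not from the paper; SAEM would replace the exact E-step expectation below with a simulated, averaged quantity.

```python
import math, random

def em_two_gaussians(xs, iters=200):
    """Classical EM for a two-component Gaussian mixture with unit
    variances: estimates the two means and the mixing proportion."""
    mu1, mu2, pi = min(xs), max(xs), 0.5       # crude initialisation
    for _ in range(iters):
        # E-step: responsibility of component 1 for each observation
        r = []
        for x in xs:
            p1 = pi * math.exp(-0.5 * (x - mu1) ** 2)
            p2 = (1.0 - pi) * math.exp(-0.5 * (x - mu2) ** 2)
            r.append(p1 / (p1 + p2))
        # M-step: responsibility-weighted means and mixing proportion
        s = sum(r)
        mu1 = sum(ri * x for ri, x in zip(r, xs)) / s
        mu2 = sum((1.0 - ri) * x for ri, x in zip(r, xs)) / (len(xs) - s)
        pi = s / len(xs)
    return mu1, mu2, pi
```

On well-separated data the iteration recovers the component means; the stochastic approximation version trades this deterministic E-step for simulation when the expectation is intractable.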
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
 Machine Learning Journal (2008) 71:89–129
, 2008
Basic Properties of Strong Mixing Conditions. A Survey and Some Open Questions
 PROBABILITY SURVEYS
, 2005
Abstract

Cited by 45 (0 self)
This is an update of, and a supplement to, the author’s earlier survey paper [18] on basic properties of strong mixing conditions. That paper appeared in 1986 in a book containing survey papers on various types of dependence conditions and the limit theory under them. The survey here will include part (but not all) of the material in [18], and will also describe some relevant material that was not in that paper, especially some new discoveries and developments that have occurred since that paper was published. (Much of the new material described here involves “interlaced” strong mixing conditions, in which the index sets are not restricted to “past” and “future.”) At various places in this survey, open problems will be posed. There is a large literature on basic properties of strong mixing conditions. A survey such as this cannot do full justice to it. Here are a few references on important topics not covered in this survey. For the approximation of mixing sequences by martingale differences, see e.g. the book by Hall and Heyde [80]. For the direct approximation of mixing random variables by independent ones,
Non-Stationarities in Financial Time Series, the Long Range Dependence and the IGARCH Effects
 Review of Economics and Statistics
, 2002
Abstract

Cited by 42 (5 self)
In this paper we give the theoretical basis of a possible explanation for two stylized facts observed in long log-return series: the long range dependence (LRD) in volatility and the integrated GARCH (IGARCH). Both these effects can be theoretically explained if one assumes that the data is nonstationary.
Higher-Order Improvements of a Computationally Attractive k-Step Bootstrap for Extremum Estimators
 Econometrica
, 2002
"... COWLES FOUNDATION DISCUSSION PAPER NO. 1230 ..."
Nonparametric time series prediction through adaptive model selection
 Machine Learning
, 2000
Abstract

Cited by 29 (0 self)
Abstract. We consider the problem of one-step-ahead prediction for time series generated by an underlying stationary stochastic process obeying the condition of absolute regularity, which describes the mixing nature of the process. We make use of recent results from the theory of empirical processes, and adapt the uniform convergence framework of Vapnik and Chervonenkis to the problem of time series prediction, obtaining finite sample bounds. Furthermore, by allowing both the model complexity and memory size to be adaptively determined by the data, we derive nonparametric rates of convergence through an extension of the method of structural risk minimization suggested by Vapnik. All our results are derived for general L_p error measures, and apply to both exponentially and algebraically mixing processes.
Change of structure in financial time series, long range dependence and the GARCH model
, 1999
Abstract

Cited by 27 (0 self)
Functionals of a two-parameter integrated periodogram have been used for a long time for detecting changes in the spectral distribution of a stationary sequence. The bases for these results are functional central limit theorems for the integrated periodogram having as limit a Gaussian field. In the case of GARCH(p, q) processes a statistic closely related to the integrated periodogram can be used for the purpose of change detection in the model. We derive a central limit theorem for this statistic under the hypothesis of a GARCH(p, q) sequence with a finite 4th moment. When applied to real-life time series our method gives clear evidence of the fast pace of change in the data. One of the straightforward conclusions of our study is the infeasibility of modeling long return series with one GARCH model. The parameters of the model must be updated and we propose a method to detect when the update is needed. Our study supports the hypothesis of global nonstationarity of the return time ser...
Estimating Functions for Discretely Sampled Diffusion-Type Models. Chapter of the Handbook of Financial Econometrics, Aït-Sahalia and Hansen eds. http://home.uchicago.edu/~lhansen/handbook.htm
 in Festschrift for Lucien Le Cam: Research Papers in Probability and Statistics
, 2004
"... Estimating functions provide a general framework for finding estimators and studying their properties in many different kinds of statistical models, including stochastic process models. An estimating function is a function of the data as well as of the parameter to be estimated. An estimator is obta ..."
Abstract

Cited by 26 (9 self)
Estimating functions provide a general framework for finding estimators and studying their properties in many different kinds of statistical models, including stochastic process models. An estimating function is a function of the data as well as of the parameter to be estimated. An estimator is obtained by equating the estimating function to zero and solving the resulting
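The "equate the estimating function to zero and solve" step can be made concrete with a toy example that is hypothetical, not from the chapter: a moment-type estimating function for the rate of an exponential sample, solved numerically by bisection.

```python
def solve_estimating_equation(g, lo, hi, tol=1e-10):
    """Root-find g(theta) = 0 by bisection, assuming a sign change on
    [lo, hi] -- the 'equate to zero and solve' step in miniature."""
    glo = g(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        gm = g(mid)
        if abs(gm) < tol:
            return mid
        if (glo < 0.0) == (gm < 0.0):   # same sign as at lo: move lo up
            lo, glo = mid, gm
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical moment-type estimating function for an Exponential(rate)
# sample: G_n(rate) = sum_i (1/rate - x_i), whose root is 1/mean(x).
xs = [0.5, 1.0, 1.5, 2.0]
g = lambda rate: sum(1.0 / rate - x for x in xs)
rate_hat = solve_estimating_equation(g, 0.01, 100.0)   # -> 1/1.25 = 0.8
```

The same pattern covers score functions and martingale estimating functions for diffusions: only the function `g` changes, not the solving step.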
Memory-Universal Prediction of Stationary Random Processes
 IEEE Trans. Inform. Theory
, 1998
Abstract

Cited by 26 (1 self)
We consider the problem of one-step-ahead prediction of a real-valued, stationary, strongly mixing random process {X_i}. The best mean-square predictor of X_0 is its conditional mean given the entire infinite past {X_i, i <= -1}. Given a sequence of observations X_1, X_2, ..., X_N, we propose estimators for the conditional mean based on sequences of parametric models of increasing memory and of increasing dimension, for example, neural networks and Legendre polynomials. The proposed estimators select both the model memory and the model dimension, in a data-driven fashion, by minimizing certain complexity regularized least squares criteria. When the underlying predictor function has a finite memory, we establish that the proposed estimators are memory-universal: the proposed estimators, which do not know the true memory, deliver the same statistical performance (rates of integrated mean-squared error) as that delivered by estimators that know the true memory. Furthermore, when the underlying predictor function does not have a finite memory, we establish that the estimator based on Legendre polynomials is consistent.