## Variable Length Markov Chains (1999)


Venue: Annals of Statistics

Citations: 94 (5 self)

### BibTeX

@ARTICLE{Buhlmann99variablelength,
  author = {Peter Buhlmann and Abraham J. Wyner},
  title = {Variable Length Markov Chains},
  journal = {Annals of Statistics},
  year = {1999},
  volume = {27},
  pages = {480--513}
}

### Abstract

We study estimation in the class of stationary variable length Markov chains (VLMC) on a finite space. The processes in this class are still Markovian of higher order, but with memory of variable length yielding a much bigger and structurally richer class of models than ordinary higher order Markov chains. From a more algorithmic view, the VLMC model class has attracted interest in information theory and machine learning but statistical properties have not been explored very much. Provided that good estimation is available, an additional structural richness of the model class enhances predictive power by finding a better trade-off between model bias and variance and allows better structural description which can be of specific interest. The latter is exemplified with some DNA data. A version of the tree-structured context algorithm, proposed by Rissanen (1983) in an information theoretical set-up, is shown to have new good asymptotic properties for estimation in the class of VLMC's, even when the underlying model increases in dimensionality: consistent estimation of minimal state spaces and mixing properties of fitted models are given. We also propose a new bootstrap scheme based on fitted VLMC's. We show its validity for quite general stationary categorical time series and for a broad range of statistical procedures.

AMS 1991 subject classifications: Primary 62M05; secondary 60J10, 62G09, 62M10, 94A15.

Key words and phrases: Bootstrap, categorical time series, central limit theorem, context algorithm, data compression, finite-memory sources, FSMX model, Kullback-Leibler distance, model selection, tree model.

Short title: Variable Length Markov Chain.

Research supported in part by the Swiss National Science Foundation. Part of the work has been done while visiting th...
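To make the idea of variable-length memory concrete, here is a minimal Python sketch of a context-tree fit in the spirit of the context algorithm mentioned in the abstract. It is an illustration under stated assumptions, not the estimator analysed in the paper: the function names `fit_contexts` and `predict`, the parameters `max_depth` and `cutoff`, and the exact pruning statistic (count-weighted Kullback-Leibler divergence between a context and its one-symbol-shorter suffix) are all choices made here for illustration.

```python
from collections import Counter, defaultdict
from math import log


def fit_contexts(seq, max_depth=3, cutoff=2.0):
    """Toy context-tree fit (illustrative, not the paper's exact procedure).

    A context is a tuple of past symbols, oldest first. A context is kept
    if it changes the next-symbol distribution enough relative to its
    parent (the same context with the oldest symbol dropped), or if a
    longer context extending it was kept.
    """
    counts = defaultdict(Counter)  # context -> counts of the symbol that follows
    for i in range(len(seq)):
        for d in range(max_depth + 1):
            if i - d < 0:
                break
            counts[tuple(seq[i - d:i])][seq[i]] += 1

    alphabet = set(seq)

    def gain(ctx):
        # N(ctx) * KL( P(. | ctx) || P(. | parent) )
        child, parent = counts[ctx], counts[ctx[1:]]
        n_c, n_p = sum(child.values()), sum(parent.values())
        return sum(c * log((c / n_c) / (parent[s] / n_p))
                   for s, c in child.items())

    kept = {()}  # the empty context (marginal distribution) is always kept
    # Longest contexts first, so children are decided before their parents.
    for ctx in sorted(counts, key=len, reverse=True):
        if not ctx:
            continue
        has_kept_child = any((s,) + ctx in kept for s in alphabet)
        if has_kept_child or gain(ctx) > cutoff:
            kept.add(ctx)
    return {ctx: counts[ctx] for ctx in kept}


def predict(model, history, max_depth=3):
    """Next-symbol probabilities from the longest kept suffix of history."""
    for d in range(min(max_depth, len(history)), -1, -1):
        ctx = tuple(history[len(history) - d:])
        if ctx in model:
            c = model[ctx]
            total = sum(c.values())
            return {s: n / total for s, n in c.items()}


# Example: a strictly alternating sequence needs only one symbol of memory.
seq = list("ab" * 20)
model = fit_contexts(seq)
print(sorted(model))
print(predict(model, list("abba")))
```

For the alternating sequence, contexts longer than one symbol add no predictive information, so the sketch prunes them and the fitted memory length is shorter than `max_depth`. On data where some histories need longer memory than others, the kept contexts would have differing lengths, which is the structural feature that distinguishes a VLMC from a full higher order Markov chain.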