## Probability estimation for PPM (1995)

Venue: In Proceedings NZCSRSC'95. Available from http://www.cs.waikato.ac.nz/wjt

### BibTeX

@INPROCEEDINGS{Teahan95probabilityestimation,

author = {W. J. Teahan},

title = {Probability estimation for PPM},

booktitle = {In Proceedings NZCSRSC'95. Available from http://www.cs.waikato.ac.nz/wjt},

year = {1995},

pages = {papers/NZCSRSC.ps.gz}

}

### Abstract

The state of the art in lossless text compression is the PPM data compression scheme. Two approaches to the problem of selecting the context models used in the scheme are described. One uses an a priori upper bound on the lengths of the contexts, while the other method is unbounded. Several techniques that improve the probability estimation are described, including four new methods: partial update exclusions for the unbounded approach, deterministic scaling, recency scaling and multiple probability estimators. Each of these methods improves the performance for both the bounded and unbounded approaches. In addition, further savings are possible by combining the two approaches.

1 Introduction

The state of the art in lossless text compression is the PPM data compression scheme [1, 4]. PPM, or prediction by partial matching, is an adaptive statistical modeling technique based on blending together different length context models to predict the next character in the input sequence. The sche...
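To make the idea of blending context models of different lengths concrete, here is a minimal Python sketch of a bounded-order PPM probability estimator. It is an illustration under stated assumptions, not the paper's method: it assumes the commonly used escape "method C" (escape probability d/(n+d), where n is the total count and d the number of distinct symbols in a context) together with full exclusions, and it implements none of the four new techniques listed in the abstract. The class name `SimplePPM` and its parameters are hypothetical.

```python
from collections import defaultdict


class SimplePPM:
    """Toy bounded-order PPM estimator (assumed: escape method C, exclusions)."""

    def __init__(self, max_order=2, alphabet_size=256):
        self.max_order = max_order          # a priori bound on context length
        self.alphabet_size = alphabet_size  # size of the order -1 alphabet
        # counts[context][symbol] = times `symbol` followed `context`
        self.counts = defaultdict(lambda: defaultdict(int))

    def probability(self, history, symbol):
        """Estimate P(symbol | history), escaping from long to short contexts."""
        prob = 1.0
        excluded = set()  # symbols already ruled out by longer contexts
        for order in range(min(self.max_order, len(history)), -1, -1):
            context = history[len(history) - order:]
            seen = self.counts.get(context)
            if not seen:
                continue  # context never observed: drop to a shorter one
            live = {s: c for s, c in seen.items() if s not in excluded}
            if not live:
                continue
            total = sum(live.values())
            distinct = len(live)
            if symbol in live:
                return prob * live[symbol] / (total + distinct)
            # Symbol not predicted here: pay the escape probability.
            prob *= distinct / (total + distinct)
            excluded.update(live)
        # Order -1: uniform over symbols not yet excluded.
        return prob / (self.alphabet_size - len(excluded))

    def update(self, history, symbol):
        """Increment the count of `symbol` in every context up to max_order."""
        for order in range(min(self.max_order, len(history)) + 1):
            context = history[len(history) - order:]
            self.counts[context][symbol] += 1


# Train adaptively on a short string, then query the model.
text = "abracadabra"
model = SimplePPM(max_order=2)
for i, ch in enumerate(text):
    model.update(text[:i], ch)

p_c = model.probability("abra", "c")  # "ra" was followed only by "c"
p_z = model.probability("abra", "z")  # novel symbol: escapes to order -1
print(p_c, p_z)
```

In this sketch `max_order` plays the role of the a priori upper bound on context length; an unbounded scheme would instead search for the longest matching context, which needs a different data structure than the dictionary used here.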

