## Asymptotic Properties Of Data Compression And Suffix Trees (1993)

Venue: IEEE Trans. Inform. Theory

Citations: 40 (11 self)

### BibTeX

@ARTICLE{Szpankowski93asymptoticproperties,

author = {Wojciech Szpankowski},

title = {Asymptotic Properties Of Data Compression And Suffix Trees},

journal = {IEEE Trans. Inform. Theory},

year = {1993},

volume = {39},

pages = {1647--1659}

}

### Abstract

Recently, Wyner and Ziv have proved that the typical length of a repeated subword found within the first $n$ positions of a stationary ergodic sequence is $(1/h)\log n$ in probability, where $h$ is the entropy of the alphabet. This finding was used to obtain several insights into certain universal data compression schemes, most notably the Lempel-Ziv data compression algorithm. Wyner and Ziv have also conjectured that their result can be extended to a stronger almost sure convergence. In this paper, we settle this conjecture in the negative in the so-called right domain asymptotic, that is, during a dynamic phase of expanding the database. We prove, under an additional assumption involving mixing conditions, that the length of a typical repeated subword oscillates almost surely (a.s.) between $(1/h_1)\log n$ and $(1/h_2)\log n$, where $0 < h_2 < h \le h_1 < \infty$. We also show that the length of the $n$th block in the Lempel-Ziv parsing algorithm reveals a similar behavior. We relate our findings to...
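As a rough illustration of the quantity discussed in the abstract, the following Python sketch measures the longest prefix of the text starting at position n that also begins somewhere in the first n positions, and prints it next to (1/h) log n. This is only one plausible formalization of the "repeated subword" length and may differ from the paper's exact definition. The sketch also assumes a memoryless binary source, which is far more restrictive than the stationary ergodic and mixing setting of the paper. The names `binary_entropy`, `longest_match_length`, and `demo` are hypothetical and chosen for this example only.

```python
import math
import random


def binary_entropy(p):
    """Entropy in bits of a memoryless binary source with P(1) = p (0 < p < 1)."""
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def longest_match_length(x, n):
    """Length of the longest prefix of x[n:] that also starts at some i < n.

    Matches are allowed to run past position n (overlap is permitted).
    This is a naive quadratic-time scan; a suffix tree would be faster.
    """
    best = 0
    for i in range(n):
        k = 0
        while n + k < len(x) and x[i + k] == x[n + k]:
            k += 1
        best = max(best, k)
    return best


def demo(p=0.3, sizes=(100, 500, 2000), lookahead=200, seed=0):
    """Print the measured match length next to (1/h) log2 n for a few n.

    Assumption: an i.i.d. binary source with P(1) = p, used for illustration only.
    """
    rng = random.Random(seed)
    total = max(sizes) + lookahead
    x = [1 if rng.random() < p else 0 for _ in range(total)]
    h = binary_entropy(p)
    for n in sizes:
        measured = longest_match_length(x, n)
        predicted = math.log2(n) / h
        print(f"n={n:5d}  match length={measured:3d}  (1/h) log2 n={predicted:.2f}")


demo()
```

A single run shows only one sample path, so individual values can sit noticeably above or below (1/h) log n. That is in keeping with the abstract's point that the convergence holds in probability rather than almost surely.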