In this paper we describe the Burrows-Wheeler Transform (BWT), a completely new approach to data compression that is the basis of some of the best compressors available today. Although it is easy to understand intuitively why the BWT helps compression, the analysis of BWT-based algorithms requires a careful study of every algorithmic component. We describe two algorithms that use the BWT, and we show that their compression ratio can be bounded in terms of the k-th order empirical entropy of the input string for any k ≥ 0. Intuitively, this means that these algorithms are able to exploit all the regularity present in the input string.
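To make the two central notions of the abstract concrete, the following is a minimal sketch, not the algorithms analyzed in the paper. It assumes the common textbook formulation of the BWT (sort all rotations of the input plus an end-of-string sentinel, keep the last column) and the usual definition of the k-th order empirical entropy, H_k(s) = (1/|s|) Σ_w |w_s| H_0(w_s), where w_s collects the symbols that follow each length-k context w in s. The function names `bwt`, `entropy0` and `entropy_k` and the `$` sentinel are illustrative choices, and the paper's own conventions may differ in detail.

```python
from collections import Counter, defaultdict
from math import log2


def bwt(text: str, sentinel: str = "$") -> str:
    """Naive BWT: sort all rotations of text + sentinel, return the last column.

    Quadratic-space illustration only; practical tools use suffix sorting.
    The sentinel is an assumption of this sketch and must not occur in text.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def entropy0(s: str) -> float:
    """Zeroth-order empirical entropy of s, in bits per symbol."""
    n = len(s)
    if n == 0:
        return 0.0
    return -sum(c / n * log2(c / n) for c in Counter(s).values())


def entropy_k(s: str, k: int) -> float:
    """k-th order empirical entropy of s, in bits per symbol."""
    if k == 0:
        return entropy0(s)
    # Group every symbol by the k symbols that precede it.
    followers = defaultdict(list)
    for i in range(k, len(s)):
        followers[s[i - k:i]].append(s[i])
    return sum(len(f) * entropy0("".join(f)) for f in followers.values()) / len(s)


if __name__ == "__main__":
    print(bwt("banana"))
    print(entropy0("abababab"), entropy_k("abababab", 1))
```

For a string such as `abababab`, the zeroth-order entropy is 1 bit per symbol, while the first-order entropy is 0 because each symbol is fully determined by its predecessor. This is the kind of regularity that a bound in terms of H_k captures.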