A Block-sorting Lossless
by Michael Burrows, David J. Wheeler
Digital SRC Research Report
Abstract:
We describe a block-sorting, lossless data compression algorithm, and our implementation of that algorithm. We compare the performance of our implementation with widely available data compressors running on the same hardware. The algorithm works by applying a reversible transformation to a block of input text. The transformation does not itself compress the data, but re-orders it to make it easy to compress with simple algorithms such as move-to-front encoding. Our algorithm achieves speed comparable to algorithms based on the techniques of Lempel and Ziv, but obtains compression close to the best statistical modelling techniques. The size of the input block must be large (a few kilobytes) to achieve good compression.
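The two stages named in the abstract, a reversible block transformation followed by move-to-front encoding, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `bwt`, `inverse_bwt`, and `move_to_front` are hypothetical, the transform is computed by naively sorting all rotations of the block (far slower than a practical compressor would allow), and the final entropy-coding stage is left out.

```python
def bwt(block: bytes) -> tuple[bytes, int]:
    """Naive block-sorting transform (illustrative assumption, not the paper's code).

    Sorts every rotation of the block and returns the last column of the
    sorted rotation matrix plus the row index of the original block.
    """
    n = len(block)
    if n == 0:
        return b"", 0
    doubled = block + block  # rotation i is doubled[i:i + n]
    order = sorted(range(n), key=lambda i: doubled[i:i + n])
    last = bytes(doubled[i + n - 1] for i in order)
    return last, order.index(0)


def inverse_bwt(last: bytes, index: int) -> bytes:
    """Recover the original block from the last column and the row index."""
    n = len(last)
    # A stable sort of positions by byte value links each row's first
    # character to the position of the same occurrence in the last column.
    order = sorted(range(n), key=lambda i: last[i])
    out = bytearray()
    row = index
    for _ in range(n):
        row = order[row]
        out.append(last[row])
    return bytes(out)


def move_to_front(data: bytes) -> list[int]:
    """Replace each byte by its position in a recency list, then move it to the front."""
    table = list(range(256))
    out = []
    for b in data:
        pos = table.index(b)
        out.append(pos)
        table.pop(pos)
        table.insert(0, b)
    return out


transformed, index = bwt(b"banana")        # (b"nnbaaa", 3)
codes = move_to_front(transformed)         # [110, 0, 99, 99, 0, 0]
restored = inverse_bwt(transformed, index) # b"banana"
```

The transform groups equal bytes together (`nnbaaa`), so the move-to-front output contains runs of zeros and other small values. The transform itself does not shrink the data; a simple coder applied afterwards is what would exploit that skew.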
Citations
801 | A Universal Algorithm for Sequential Data Compression – Lempel, Ziv - 1977
538 | Modeling for Text Compression – Bell, Witten, et al. - 1989
516 | Compression of individual sequences via variable-rate coding – Ziv, Lempel - 1978
452 | Suffix Arrays: A New Method for On-line String Searches – Manber, Myers - 1990
429 | A space-economical suffix tree construction algorithm – McCreight - 1976
333 | A technique for high-performance data compression – Welch - 1984
11 | Arithmetic coding and statistical modelling – Nelson - 1991
10 | A locally adaptive data compression algorithm – Bentley, Sleator, et al. - 1986
8 | The Calgary/Canterbury text compression corpus. Anonymous ftp from ftp.cpsc.ucalgary.ca: /pub/text.compression.corpus/text.compression.corpus.tar.Z – Witten, Bell
4 | Connecting words with definitions – Hector - 1992
3 | Compress, Version 4.2.3. Posted to the Internet newsgroup comp.sources.reviewed – Jannesen, et al. - 1992
2 | Gzip, Version 1.2.4. Anonymous ftp from prep.ai.mit.edu: /pub/gnu/gzip-1.2.4.tar.gz – Gailly, et al.