Block Sorting Text Compression  Final Report
, 1996
Cited by 20 (3 self)
A recent development in text compression is a "block sorting" algorithm which permutes the input text according to a special sort procedure and then processes the permuted text with Move-to-Front and a final statistical compressor. The technique combines good speed with excellent compression performance. This report investigates the block sorting compression algorithm, in particular trying to understand its operation and limitations. Various approaches are investigated in an attempt to improve the compression with block sorting, most of which involve a hierarchy of coding models to allow fast adaptation to local contexts. The best technique involves a new "structured" coding model, especially designed for compressing data with skewed symbol distributions. Block sorting compression is found to be related to work by Shannon in 1951 on the prediction of English text. The work confirms block sorting as a good text compression technique, with a compression approaching that of the currently be...
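The pipeline named in this abstract (a sorting permutation followed by Move-to-Front) can be illustrated with a minimal Python sketch. It is a naive illustration, not the report's implementation: the function names `bwt` and `move_to_front` and the use of a `"\0"` end-of-block sentinel are assumptions made here, the rotation sort is far slower than a practical implementation, and the final statistical coder is omitted.

```python
def bwt(text, sentinel="\0"):
    """Naive block-sorting permutation: sort all rotations, keep the last column.

    The sentinel (an assumption of this sketch) marks the end of the block
    and must not occur in the text.
    """
    assert sentinel not in text
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def move_to_front(data):
    """Replace each symbol by its position in a recency list, then move it to the front."""
    recency = sorted(set(data))  # initial list: the symbols in sorted order
    ranks = []
    for ch in data:
        rank = recency.index(ch)
        ranks.append(rank)
        recency.insert(0, recency.pop(rank))
    return ranks


if __name__ == "__main__":
    permuted = bwt("banana")
    print(repr(permuted))
    print(move_to_front(permuted))
```

Runs of identical symbols in the permuted text become runs of zeros after Move-to-Front, which is what gives the final statistical coder a skewed distribution to work with.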
Block Sorting and Compression
 Proceedings of the IEEE Data Compression Conference
, 1997
Cited by 14 (1 self)
The Block Sorting Lossless Data Compression Algorithm (BSLDCA) described by Burrows and Wheeler [3] has received considerable attention. It achieves compression rates as good as those of context-based methods, such as PPM, but at execution speeds closer to Ziv-Lempel techniques [5]. This paper describes the Lexical Permutation Sorting Algorithm (LPSA) and its theoretical basis, and delineates its relationship to BSLDCA. In particular we describe how BSLDCA can be reduced to LPSA and show how LPSA could give better results than BSLDCA when transmitting permutations. We also introduce a new technique, Inversion Frequencies, and show that it does as well as Move-to-Front (MTF) coding when there is locality of reference in the data. 1 Introduction. Burrows and Wheeler [3] introduced a new algorithm, which they call the Block Sorting Lossless Data Compression Algorithm (BSLDCA). When applied to text or image data their algorithm achieves better compression rates than Ziv-Lempel techniques with compa...
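The abstract names Inversion Frequencies without defining it. The sketch below follows one common formulation, stated here as an assumption rather than as the paper's exact definition: symbols are handled in alphabetical order, and for each occurrence of a symbol the coder emits how many larger symbols appeared since the previous occurrence of that symbol (or since the start of the block). The function names and the choice to transmit per-symbol counts alongside the gaps are hypothetical.

```python
def inversion_frequencies(data):
    """Encode data as (alphabet, per-symbol counts, gaps).

    For each symbol in sorted order, each gap is the number of larger
    symbols seen since that symbol's previous occurrence. Smaller symbols
    are skipped because the decoder has already placed them.
    """
    alphabet = sorted(set(data))
    counts = [data.count(sym) for sym in alphabet]
    gaps = []
    for sym in alphabet:
        run = 0
        for ch in data:
            if ch == sym:
                gaps.append(run)
                run = 0
            elif ch > sym:
                run += 1
    return alphabet, counts, gaps


def decode_inversion_frequencies(alphabet, counts, gaps):
    """Rebuild the sequence by filling still-empty slots, smallest symbol first."""
    out = [None] * sum(counts)
    gap_iter = iter(gaps)
    for sym, count in zip(alphabet, counts):
        free = [i for i, v in enumerate(out) if v is None]
        pos = -1
        for _ in range(count):
            pos += next(gap_iter) + 1  # skip the larger symbols, then place sym
            out[free[pos]] = sym
    return "".join(out)


if __name__ == "__main__":
    encoded = inversion_frequencies("banana")
    print(encoded)
    print(decode_inversion_frequencies(*encoded))
```

In this sketch the gaps for the largest symbol are always zero, so a real coder could omit them. When equal symbols cluster together, as they do after block sorting, most gaps are small.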
Symbol Ranking Text Compression
, 1996
Cited by 1 (0 self)
In his work on the information content of English text in 1951, Shannon described a method of recoding the input text, a technique which has apparently lain dormant for the ensuing 45 years. Whereas traditional compressors exploit symbol frequencies and symbol contexts, Shannon's method adds the concept of "symbol ranking", as in "the next symbol is the one 3rd most likely in the present context". This report describes an implementation of his method and shows that it forms the basis of a good text compressor [footnote 1]. The recent "acb" compressor of Buynovsky is shown to belong to the general class of symbol ranking compressors. Keywords: text compression, Shannon, symbol ranking. [Footnote 1: This report has been submitted as a paper to the Journal of Universal Computer Science. It is available by anonymous ftp from ftp.cs.auckland.ac.nz /out/peterf/TechRep132] 1. Introduction. In 1951 C.E. Shannon published his classic paper on the information content of English text, establishing the well-known bo...
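The idea of recoding each symbol as its rank among the candidates for the current context can be shown with a toy Python sketch. This is not the implementation described in the report: the fixed-order context, the frequency-based ranking with alphabetical tie-breaking, and the names `symbol_ranks` and `text_from_ranks` are all assumptions made for illustration.

```python
from collections import defaultdict


def _ranking(alphabet, seen):
    # Most frequent symbol in this context first; ties broken alphabetically.
    return sorted(alphabet, key=lambda s: (-seen[s], s))


def symbol_ranks(text, order=2):
    """Recode each symbol as its rank among the predictions for its context."""
    alphabet = sorted(set(text))
    counts = defaultdict(lambda: defaultdict(int))
    ranks = []
    for i, ch in enumerate(text):
        seen = counts[text[max(0, i - order):i]]  # statistics for the preceding symbols
        ranks.append(_ranking(alphabet, seen).index(ch))
        seen[ch] += 1
    return alphabet, ranks


def text_from_ranks(alphabet, ranks, order=2):
    """Invert symbol_ranks by rebuilding the same statistics while decoding."""
    counts = defaultdict(lambda: defaultdict(int))
    out = []
    for rank in ranks:
        seen = counts["".join(out[max(0, len(out) - order):])]
        ch = _ranking(alphabet, seen)[rank]
        out.append(ch)
        seen[ch] += 1
    return "".join(out)


if __name__ == "__main__":
    alphabet, ranks = symbol_ranks("abababab", order=1)
    print(ranks)
    print(text_from_ranks(alphabet, ranks, order=1))
```

On predictable text most ranks are zero, so the rank stream is much more skewed than the original symbols and is cheaper to encode with a statistical coder.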
unknown title
A recent development in text compression is a “block sorting” algorithm which permutes the input text according to a special sort procedure and then processes the permuted text with Move-to-Front and a final statistical compressor. The technique is fast, with a compression performance ranking it among the best of the known compressors. This paper describes work on the block sorting algorithm, especially establishing its relation to other compressors and attempting to improve its compression performance. It is already known to be a form of statistical compressor with unbounded contexts; we show that the contexts are so completely restructured by the sorting that many standard file compression techniques are no longer appropriate. Various approaches are investigated in an attempt to improve the compression, most of which involve a hierarchy of coding models to allow fast adaptation to local contexts. The better coding techniques include one derived from Shannon's 1951 work on establishing the entropy of English text, while the best employs a novel model especially designed for skewed probability distributions. The work in this paper confirms block sorting as a viable text compression technique, with a compression approaching that of the currently best compressors while being much faster than many other compressors of comparable performance.
Burrows Wheeler Compression
, 2002
Author’s Note. The material of this chapter, while quoting extensively from other work, is in part a summary of my own experience and thoughts in working with the Burrows-Wheeler compression algorithm. Some of it is accordingly rather less formal in style than might otherwise be the case, as I give more personal opinions on various aspects. Where my own work already appears in the public domain it is cited in the normal way, but unpublished material simply refers to “the author”. 1 Introduction. Block Sorting compression, or “Burrows-Wheeler compression”, is a relatively new algorithm of good compression and speed, first presented by Burrows and Wheeler in 1994 [1], although Wheeler had discovered the basic algorithm some 10 years earlier. In contrast to most other compression algorithms it treats the incoming text as a block, or sequence of blocks, with transformations on each block.
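The block-at-a-time structure described here can be sketched as follows: the input is cut into fixed-size blocks, each block is permuted independently, and each permuted block is inverted on its own. This is a minimal illustration under stated assumptions, not the chapter's code: the `"\0"` sentinel, the helper names, and the tiny block size are all hypothetical, and the later coding stages are left out.

```python
def bwt(block, sentinel="\0"):
    """Naive forward transform of one block: last column of the sorted rotations."""
    assert sentinel not in block
    s = block + sentinel
    return "".join(rot[-1] for rot in sorted(s[i:] + s[:i] for i in range(len(s))))


def inverse_bwt(last, sentinel="\0"):
    """Recover one block from its last column.

    A stable sort of the last column gives the first column; equal symbols
    keep the same relative order in both columns, which links each row to
    the row that starts one position later in the original block.
    """
    order = sorted(range(len(last)), key=lambda i: last[i])
    row = last.index(sentinel)  # the row holding the unrotated block
    out = []
    for _ in range(len(last) - 1):
        row = order[row]
        out.append(last[row])
    return "".join(out)


def transform_blocks(text, block_size):
    """Split the text into blocks and transform each one independently."""
    return [bwt(text[i:i + block_size]) for i in range(0, len(text), block_size)]


def restore_blocks(blocks):
    return "".join(inverse_bwt(b) for b in blocks)


if __name__ == "__main__":
    blocks = transform_blocks("banana bandana", block_size=6)
    print([repr(b) for b in blocks])
    print(restore_blocks(blocks))
```

Because every block is self-contained, a decoder can start reconstructing output as soon as one complete block has arrived. Larger blocks give the sort more context to group together, at the cost of more memory and sorting time.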