Results 1 - 2 of 2
A Bijective String Sorting Transform, 2009
"... Given a string of characters, the BurrowsWheeler Transform rearranges the characters in it so as to produce another string of the same length which is more amenable to compression techniques such as move to front, runlength encoding, and entropy encoders. We present a variant of the transform whic ..."
Abstract

Cited by 3 (0 self)
Given a string of characters, the Burrows-Wheeler Transform rearranges the characters in it so as to produce another string of the same length which is more amenable to compression techniques such as move-to-front, run-length encoding, and entropy encoders. We present a variant of the transform which gives rise to similar or better compression value, but, unlike the original, the transform we present is bijective, in that the inverse transformation exists for all strings. Our experiments indicate that using our variant gives rise to a better compression ratio than the original transform. We also show that both the transform and its inverse can be computed in linear time using linear storage.
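The transform this abstract builds on can be sketched in a few lines. The sketch below is the original Burrows-Wheeler construction with an appended sentinel character, not the bijective variant the paper proposes; the function names and the '$' sentinel are our own assumptions for illustration.

```python
def bwt(s: str) -> str:
    """Last column of the sorted matrix of all rotations of s + '$'.

    The sentinel '$' (assumed lexicographically smallest and absent
    from s) makes this simple version invertible.
    """
    s = s + "$"
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)

def inverse_bwt(last: str) -> str:
    """Invert by repeatedly prepending the last column and re-sorting."""
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    # The original string is the row ending in the sentinel.
    row = next(r for r in table if r.endswith("$"))
    return row[:-1]

print(bwt("banana"))                # -> "annb$aa" (equal letters clustered)
print(inverse_bwt(bwt("banana")))   # -> "banana"
```

Note this naive version takes quadratic time and space; the paper's point is that both directions can in fact be done in linear time and linear storage.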
Symbol-Based Modeling and Coding of Block-Markov Sources, submitted for publication in IEEE Transactions on Information Theory, Sep.
"... Industrystandard lossless compression algorithms (such as LZW) are usually implemented so that they work on bytes as symbols. Experiments indicate that data for which bytes are not the natural choice of symbols compress poorly using these implementations, while algorithms working on a bit level per ..."
Abstract

Cited by 1 (1 self)
Industry-standard lossless compression algorithms (such as LZW) are usually implemented so that they work on bytes as symbols. Experiments indicate that data for which bytes are not the natural choice of symbols compress poorly using these implementations, while algorithms working on a bit level perform reasonably on byte-based data in addition to having computational advantages resulting from operating on a small alphabet. In this paper, we offer an information-theoretic explanation of these experimental results by assessing the redundancy (which is approximated by the divergence rate of two source distributions) of a bit-based model when applied to a byte-based source. More specifically, we study the problem of approximating a block Markov source with higher-order Markov sources, and show that the divergence rate between a block Markov source and the best-matching higher-order Markov model for that source converges to zero exponentially fast as the memory of the model increases. This result is applied to obtain bounds on the redundancy of certain symbol-based universal codes when they are used for byte-aligned sources.
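The setting of this abstract can be illustrated empirically: draw bits from a toy block source (i.i.d. 2-bit blocks with a skewed distribution), then measure the per-bit empirical entropy of the best-matching k-th order bit-level Markov model and watch it fall toward the source's true rate as the model memory k grows. The toy source and all function names below are our own assumptions, not the paper's construction.

```python
import random
from collections import defaultdict
from math import log2

def sample_block_source(n_blocks: int, seed: int = 0) -> str:
    """Toy block source: i.i.d. 2-bit blocks, heavily skewed toward '00'."""
    rng = random.Random(seed)
    blocks = ["00", "01", "10", "11"]
    probs = [0.7, 0.1, 0.1, 0.1]
    return "".join(rng.choices(blocks, probs, k=n_blocks))

def markov_bits_per_symbol(bits: str, k: int) -> float:
    """Empirical entropy rate (bits/symbol) of a k-th order Markov model.

    This is the per-bit code length the best-matching k-th order model
    achieves on the sample: sum over contexts of P(ctx) * H(bit | ctx).
    """
    counts = defaultdict(lambda: [0, 0])
    for i in range(k, len(bits)):
        counts[bits[i - k:i]][int(bits[i])] += 1
    total = len(bits) - k
    h = 0.0
    for c0, c1 in counts.values():
        n = c0 + c1
        for c in (c0, c1):
            if c:
                h -= (c / total) * log2(c / n)
    return h

bits = sample_block_source(200_000)
for k in range(5):
    # Redundancy relative to the block source's true rate shrinks with k.
    print(k, round(markov_bits_per_symbol(bits, k), 4))
```

Because the blocks are not symmetric, consecutive bits are dependent within a block, so even k = 1 already codes better than a memoryless model; the paper's result is that the residual divergence decays exponentially in k.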