The Context-Tree Weighting Method: Basic Properties
IEEE Trans. Inform. Theory, 1995
Cited by 159 (12 self)
We describe a sequential universal data compression procedure for binary tree sources that performs the "double mixture." Using a context tree, this method weights in an efficient recursive way the coding distributions corresponding to all bounded memory tree sources, and achieves a desirable coding distribution for tree sources with an unknown model and unknown parameters. Computational and storage complexity of the proposed procedure are both linear in the source sequence length. We derive a natural upper bound on the cumulative redundancy of our method for individual sequences. The three terms in this bound can be identified as coding, parameter, and model redundancy. The bound holds for all source sequence lengths, not only for asymptotically large lengths. The analysis that leads to this bound is based on standard techniques and turns out to be extremely simple. Our upper bound on the redundancy shows that the proposed context-tree weighting procedure is optimal in the sense that it achieves the Rissanen (1984) lower bound.
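The recursive weighting mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual formulation in which each context-tree node mixes its own Krichevsky-Trofimov (KT) estimate with the product of its children's weighted probabilities, with weight 1/2 each. The function names `kt_probability` and `ctw_probability`, the all-zero default past, and the batch (non-sequential) evaluation are illustrative assumptions; this sketch visits up to 2^depth nodes and does not reproduce the linear-complexity procedure of the paper.

```python
from collections import defaultdict
from fractions import Fraction


def kt_probability(zeros, ones):
    """KT block probability of any binary sequence with the given counts."""
    p = Fraction(1)
    a = b = 0
    for _ in range(zeros):
        p *= Fraction(2 * a + 1, 2 * (a + b + 1))
        a += 1
    for _ in range(ones):
        p *= Fraction(2 * b + 1, 2 * (a + b + 1))
        b += 1
    return p


def ctw_probability(bits, depth, past=None):
    """Weighted probability at the root of a context tree of the given depth.

    `past` holds the symbols preceding `bits` (assumed all zeros if omitted).
    """
    past = [0] * depth if past is None else list(past)
    if len(past) < depth:
        raise ValueError("need at least `depth` past symbols")
    history = past + list(bits)

    # Count zeros/ones seen after every context of length 0..depth.
    counts = defaultdict(lambda: [0, 0])
    for t in range(len(past), len(history)):
        for d in range(depth + 1):
            # Context = the d most recent symbols, most recent first.
            ctx = tuple(history[t - 1 - i] for i in range(d))
            counts[ctx][history[t]] += 1

    def weighted(ctx):
        zeros, ones = counts.get(ctx, (0, 0))
        p_e = kt_probability(zeros, ones)
        if len(ctx) == depth:  # leaf: no further splitting
            return p_e
        children = weighted(ctx + (0,)) * weighted(ctx + (1,))
        return Fraction(1, 2) * p_e + Fraction(1, 2) * children

    return weighted(())
```

Exact fractions are used so that short examples can be checked by hand; for instance, the weighted probabilities of all binary sequences of a fixed length sum to one.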
The consistency of the BIC Markov order estimator.
Cited by 55 (3 self)
The Bayesian Information Criterion (BIC) estimates the order of a Markov chain (with finite alphabet $A$) from observation of a sample path $x_1, x_2, \ldots, x_n$, as that value $k = \hat{k}$ that minimizes the sum of the negative logarithm of the $k$th order maximum likelihood and the penalty term $\frac{|A|^k (|A| - 1)}{2} \log n$. We show that $\hat{k}$ equals the correct order of the chain, eventually almost surely as $n \to \infty$, thereby strengthening earlier consistency results that assumed an a priori bound on the order. A key tool is a strong ratio-typicality result for Markov sample paths. We also show that the Bayesian estimator or minimum description length estimator, of which the BIC estimator is an approximation, fails to be consistent for the uniformly distributed i.i.d. process. AMS 1991 subject classification: Primary 62F12, 62M05; Secondary 62F13, 60J10. Key words and phrases: Bayesian Information Criterion, order estimation, ratio-typicality, Markov chains. 1 Supported in part by a joint N...
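The criterion described in this abstract can be sketched in a few lines: for each candidate order k, add the negative log maximum likelihood to the penalty |A|^k (|A| - 1) / 2 · log n and keep the minimizer. The function name `bic_markov_order`, the `max_order` cap, and the choice to condition on the first k symbols are assumptions made for illustration only; the paper's result concerns the estimator without an a priori bound on the order.

```python
import math
from collections import Counter


def bic_markov_order(seq, alphabet_size, max_order):
    """Return the order k in 0..max_order minimizing the BIC score.

    `max_order` is a practical cap assumed for this sketch.
    """
    n = len(seq)
    best_k, best_score = 0, math.inf
    for k in range(max_order + 1):
        transitions = Counter()  # (context, next symbol) -> count
        contexts = Counter()     # context -> count
        for t in range(k, n):
            ctx = tuple(seq[t - k:t])
            transitions[(ctx, seq[t])] += 1
            contexts[ctx] += 1
        # Negative log of the k-th order maximum likelihood.
        neg_log_ml = -sum(
            m * math.log(m / contexts[ctx])
            for (ctx, _), m in transitions.items()
        )
        penalty = alphabet_size ** k * (alphabet_size - 1) / 2 * math.log(n)
        score = neg_log_ml + penalty
        if score < best_score:  # strict: ties keep the smaller order
            best_k, best_score = k, score
    return best_k
```

On a strictly alternating binary sequence the first-order model fits perfectly, so the penalty term decides against higher orders and the sketch returns 1.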
Low Complexity Sequential Lossless Coding for Piecewise Stationary Memoryless Sources
IEEE Transactions on Information Theory, 1999
Cited by 25 (2 self)
Three strongly sequential, lossless compression schemes, one with linearly growing per-letter computational complexity, and two with fixed per-letter complexity, are presented and analyzed for memoryless sources with abruptly changing statistics. The first method, which improves on Willems’ weighting approach, asymptotically achieves a lower bound on the redundancy, and hence is optimal. The second scheme achieves redundancy of $O(\log N / N)$ when the transitions in the statistics are large, and $O(\log \log N / \log N)$ otherwise. The third approach always achieves redundancy of $O(\sqrt{\log N / N})$. Obviously, the two fixed-complexity approaches can be easily combined to achieve the better redundancy between the two. Simulation results support the analytical bounds derived for all the coding schemes. Index Terms: Change detection, ideal code length, minimum description length, piecewise-stationary memoryless source, redundancy, segmentation, sequential coding, source block code, strongly sequential coding, transition path, universal coding, weighting.
Sequential prediction and ranking in universal context modeling and data compression
IEEE Trans. Inform. Theory, 1997
Cited by 7 (4 self)
Prediction is one of the oldest and most successful tools in the data compression practitioner's toolbox. It is particularly useful in situations where the data (e.g., a digital image) originates from a natural physical process (e.g., sensed light), and the data samples (e.g., real numbers) represent a continuously varying physical magnitude (e.g., brightness). In these cases, the value of the next sample can often be accurately
Tree Source Identification with the Burrows-Wheeler Transform
2000
Cited by 2 (2 self)
We study the identification of a tree source model from a given sequence produced by the source. The Burrows-Wheeler transform (BWT) is a reversible block-sorting sequence transformation with O(N) complexity, which rearranges symbols according to the lexicographical order of their contexts. For a tree source, symbols at the BWT output are sorted according to the states in the tree, so the BWT output is similar to piecewise i.i.d. In this work, the i.i.d. segments in the BWT output are estimated by merging similarly distributed blocks. Then, the segment estimates are used to identify states in the tree source, and to estimate the conditional probabilities. The probability of merging blocks incorrectly is bounded, and shown to decay exponentially as the block length is increased. Simulation results show that in practice the decay is much faster than the bound. Lastly, we show that the block merging algorithm lends itself to linear complexity universal coding for tree source inputs, with redundancy comparable to Merhav’s MDL bound.
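As a rough illustration of how the BWT groups symbols by context, the following sketch sorts all rotations of the input and reads off the last column. The function names `bwt` and `inverse_bwt` and the `$` end-of-string sentinel are assumptions for this example. The textbook construction shown here sorts by the symbols that follow each position and is far slower than the O(N) complexity cited in the abstract; it is not the block-merging algorithm of the paper.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler transform via sorted rotations."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # Each output symbol is the one preceding its sorted context.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, sentinel="$"):
    """Invert the transform by repeatedly prepending and re-sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The original string is the row that ends with the sentinel.
    row = next(r for r in table if r.endswith(sentinel))
    return row[:-1]
```

In the output, symbols that share the same context end up adjacent, which is the property that makes the transformed sequence look piecewise i.i.d. for a tree source.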
On the Cost of Worst-Case Coding Length Constraints
IEEE Trans. Information Theory, 2000
Cited by 2 (0 self)
It is shown that for any uniquely decipherable code, with a small cost in the expected coding length we can add constraints on the worst-case coding length. Moreover, this cost is related to the Fibonacci numbers. Keywords: data compression, Fibonacci numbers, Huffman codes, source coding, uniquely decipherable, universal coding. 1 Introduction A fundamental tradeoff in lossless source coding is that we can compress some of the inputs only if we expand some of the others. This is reasonable because our primary goal is to minimize the expected output coding length. However, in some cases we would not like to expand the data. The trivial code, wherein the output is equal to the input, never expands the coding length, but it never compresses either. A reasonable objective is to compress well, while expanding very little in the worst case. The tradeoff between the expected coding length and the worst-case coding expansion has received research attention. In [1] an algorithm for finding a cod...
On the Estimation and Model Costs in Lossless Universal Image Data Compression by Context Weighting
in Proceedings Symposium on Image Analysis, Swedish Society for Automated Image Analysis, 1996
Cited by 1 (1 self)
In this paper we consider some aspects of image data compression by lossless techniques. We refer to [1] for a review of the present state of the art regarding compression techniques for still images. The problem that we discuss in this paper is the effect of estimation and model costs on the compression (ratio) performance and how different preprocessing methods affect estimation and model cost and compression performance. Towards this end we use the context tree weighting (CTW) method [2] for analysis and actual simulations. The purpose of this investigation is to gain an understanding of how to apply the CTW algorithm to still image compression.
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 38, NO. 2, MARCH 1992
IEEE Trans. Inform. Theory, 1992
The Lawrence algorithm is a universal binary variable-to-fixed length source coding algorithm. Here, a modified version of this algorithm is introduced and its asymptotic performance is investigated. For $M$ (the segment set cardinality) large enough, it is shown that the rate $R(\theta)$ as a function of the source parameter $\theta$ satisfies $R(\theta) \sim h(\theta) \left( 1 + \frac{\log \log M}{2 \log M} \right)$ for $0 < \theta < 1$. Here $h(\cdot)$ is the binary entropy function. In addition to this, it is proven that no codes exist that have a better asymptotic performance, thereby establishing the asymptotic optimality of our modified Lawrence code. The asymptotic bounds show that universal variable-to-fixed length codes can have a significantly lower redundancy than universal fixed-to-variable length codes with the same number of codewords.
Lossless Compression for Sources with Two-Sided Geometric Distributions
IEEE TRANS. INFORM. THEORY. AVAILABLE AS
, 1998
Lossless compression is studied for a countably infinite alphabet source with an off-centered, two-sided geometric (TSG) distribution, which is a commonly used statistical model for image prediction residuals. In the first part of this paper, we demonstrate that arithmetic coding based on a simple strategy of model adaptation essentially attains the theoretical lower bound to the universal coding redundancy associated with this model. In the second part, we focus on more practical codes for the TSG that operate on a symbol-by-symbol basis. Specifically, we present a complete characterization of minimum expected-length prefix codes for TSG sources. The family of optimal codes introduced here is an extension of the Golomb codes, which are optimal for one-sided geometric distributions of nonnegative integers. As in the one-sided case, the resulting optimum Huffman tree has a structure that enables simple calculation of the codeword of every given source symbol. Our characterization avoi...