Results 1-10 of 10
Universal compression of memoryless sources over unknown alphabets
 IEEE TRANSACTIONS ON INFORMATION THEORY
, 2004
Cited by 34 (10 self)
It has long been known that the compression redundancy of independent and identically distributed (i.i.d.) strings increases to infinity as the alphabet size grows. It is also apparent that any string can be described by separately conveying its symbols, and its pattern, the order in which the symbols appear. Concentrating on the latter, we show that the patterns of i.i.d. strings over all, including infinite and even unknown, alphabets, can be compressed with diminishing redundancy, both in block and sequentially, and that the compression can be performed in linear time. To establish these results, we show that the number of patterns is the Bell number, that the number of patterns with a given number of symbols is the Stirling number of the second kind, and that the redundancy of patterns can be bounded using results of Hardy and Ramanujan on the number of integer partitions. The results also imply an asymptotically optimal solution for the Good-Turing probability-estimation problem.
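As an illustration of the counting claims in this abstract, the following Python sketch computes the pattern of a string and checks by brute force, for small lengths, that the number of patterns equals the Bell number and that the number of patterns with m distinct symbols equals the Stirling number of the second kind. The function names (pattern, stirling2, bell, patterns_of_length) are illustrative choices and do not come from the paper; the sketch says nothing about the paper's compression algorithms.

```python
from itertools import product


def pattern(s):
    """Replace each symbol by the (1-based) order in which it first appeared."""
    index = {}
    # len(index) + 1 is evaluated before setdefault inserts a new symbol.
    return tuple(index.setdefault(sym, len(index) + 1) for sym in s)


def stirling2(n, m):
    """Stirling number of the second kind via the standard recurrence."""
    if n == m:
        return 1
    if m == 0 or m > n:
        return 0
    return m * stirling2(n - 1, m) + stirling2(n - 1, m - 1)


def bell(n):
    """Bell number as the sum of Stirling numbers of the second kind."""
    return sum(stirling2(n, m) for m in range(n + 1))


def patterns_of_length(n):
    """All distinct patterns of length-n strings (an alphabet of size n suffices)."""
    return {pattern(s) for s in product(range(n), repeat=n)}


if __name__ == "__main__":
    print(pattern("abracadabra"))
    for n in range(1, 6):
        pats = patterns_of_length(n)
        assert len(pats) == bell(n)
        for m in range(1, n + 1):
            # A pattern with m distinct symbols has maximum index m.
            assert sum(1 for p in pats if max(p) == m) == stirling2(n, m)
    print("pattern counts match Bell and Stirling numbers for n = 1..5")
```

The brute-force enumeration grows as n^n, so it is only meant as a sanity check for very small n.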
Limit results on pattern entropy
 IEEE Trans. Inf. Theory
, 2006
Cited by 17 (3 self)
We determine the entropy rate of patterns of certain random processes, bound the speed at which the per-symbol pattern entropy converges to this rate, and show that patterns satisfy an asymptotic equipartition property. To derive some of these results we upper bound the probability that the nth variable in a random process differs from all preceding ones.
Universal lossless compression with unknown alphabets - The average case
, 2006
Cited by 13 (4 self)
Universal compression of patterns of sequences generated by independently identically distributed (i.i.d.) sources with unknown, possibly large, alphabets is investigated. A pattern is a sequence of indices that contains all consecutive indices in increasing order of first occurrence. If the alphabet of a source that generated a sequence is unknown, the inevitable cost of coding the unknown alphabet symbols can be exploited to create the pattern of the sequence. This pattern can in turn be compressed by itself. It is shown that if the alphabet size $k$ is essentially small, then the average minimax and maximin redundancies as well as the redundancy of every code for almost every source, when compressing a pattern, consist of at least $0.5 \log(n/k^3)$ bits per each unknown probability parameter, and if all alphabet letters are likely to occur, there exist codes whose redundancy is at most $0.5 \log(n/k^2)$ bits per each unknown probability parameter, where $n$ is the length of the data sequences. Otherwise, if the alphabet is large, these redundancies are essentially at least $O(n^{-2/3})$ bits per symbol, and there exist codes that achieve redundancy of essentially $O(n^{-1/2})$ bits per symbol. Two suboptimal low-complexity sequential algorithms for compression of patterns are presented and their description lengths
On the entropy rate of pattern processes
 Proceedings of the 2005 Data Compression Conference, Snowbird
, 2005
Cited by 7 (0 self)
We study the entropy rate of pattern sequences of stochastic processes, and its relationship to the entropy rate of the original process. We give a complete characterization of this relationship for i.i.d. processes over arbitrary alphabets, stationary ergodic processes over discrete alphabets, and a broad family of stationary ergodic processes over uncountable alphabets. For cases where the entropy rate of the pattern process is infinite, we characterize the possible growth rate of the block entropy.
Universal compression of Markov and related sources over arbitrary alphabets
 IEEE TRANSACTIONS ON INFORMATION THEORY
, 2006
Cited by 3 (2 self)
Recent work has considered encoding a string by separately conveying its symbols and its pattern, the order in which the symbols appear. It was shown that the patterns of i.i.d. strings can be losslessly compressed with diminishing per-symbol redundancy. In this paper the pattern redundancy of distributions with memory is considered. Close lower and upper bounds are established on the pattern redundancy of strings generated by Hidden Markov Models with a small number of states, showing in particular that their per-symbol pattern redundancy diminishes with increasing string length. The upper bounds are obtained by analyzing the growth rate of the number of multidimensional integer partitions, and the lower bounds, using Hayman’s Theorem.
A better Good-Turing estimator for sequence probabilities
 in Proc. IEEE Int. Symp. Inf. Theory
Adaptive Coding and Prediction of Sources With Large and Infinite Alphabets
Cited by 2 (1 self)
The problem of predicting a sequence $x_1, x_2, \ldots$ generated by a discrete source with unknown statistics is considered. Each letter $x_{t+1}$ is predicted using the information on the word $x_1 x_2 \cdots x_t$ only. This problem is of great importance for data compression, because of its use to estimate probability distributions for PPM algorithms and other adaptive codes. On the other hand, such prediction is a classical problem which has received much attention. Its history can be traced back to Laplace. We address the problem where the sequence is generated by an independent and identically distributed (i.i.d.) source with some large (or even infinite) alphabet and suggest a class of new methods of prediction. Index Terms: Adaptive coding, Laplace problem of succession, lossless data compression, prediction of random processes, Shannon entropy, source coding.
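For context, the classical Laplace rule of succession mentioned in the index terms can be sketched as a sequential predictor and its ideal code length. This is only the textbook baseline for a finite alphabet of known size, not the class of methods proposed in the paper; the function names and the alphabet_size parameter are assumptions made for illustration.

```python
import math
from collections import Counter


def laplace_probability(counts, t, sym, alphabet_size):
    """Laplace estimate of the next symbol after t observations:
    (occurrences of sym so far + 1) / (t + alphabet_size)."""
    return (counts[sym] + 1) / (t + alphabet_size)


def laplace_code_length(seq, alphabet_size):
    """Ideal code length in bits when each letter is coded with the
    Laplace estimate computed from the preceding letters only."""
    counts = Counter()
    bits = 0.0
    for t, sym in enumerate(seq):
        bits -= math.log2(laplace_probability(counts, t, sym, alphabet_size))
        counts[sym] += 1
    return bits


if __name__ == "__main__":
    seq = "abracadabra"
    # The same sequence costs more bits as the assumed alphabet grows.
    for k in (5, 26, 256):
        print(k, round(laplace_code_length(seq, k), 3))
```

Because every denominator grows with alphabet_size, the code length of a fixed sequence increases with the assumed alphabet size, which is one way to see why large and infinite alphabets need different predictors.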
Limit Results on Pattern Entropy
We determine the entropy rate of patterns of i.i.d. strings and show that they satisfy an asymptotic equipartition property.
Abstract While deciphering the Enigma Code during World
problem of estimating a probability distribution from a sample of data. They derived a surprising and unintuitive formula that has since been used in a variety of applications and studied by a number of researchers. Borrowing an information-theoretic and machine-learning framework, we define the attenuation of a probability estimator as the largest possible ratio between the per-symbol probability assigned to an arbitrarily long sequence by any distribution, and the corresponding probability assigned by the estimator. We show that some common estimators have infinite attenuation and that the attenuation of the Good-Turing estimator is low, yet larger than one. We then derive an estimator whose attenuation is one, namely, as the length of any sequence increases, the per-symbol probability assigned by the estimator is as high as possible. Interestingly, some of the proofs use celebrated results by Hardy and Ramanujan on the number of partitions of an integer. To better understand the behavior of the estimator, we study the probability it assigns to several simple sequences. We show that for some sequences this probability agrees with our intuition, while for others it is rather unexpected.
Sequence Probability Estimation for Large Alphabets
, 704
We consider the problem of estimating the probability of an observed string drawn i.i.d. from an unknown distribution. The key feature of our study is that the length of the observed string is assumed to be of the same order as the size of the underlying alphabet. In this setting, many letters are unseen and the empirical distribution tends to overestimate the probability of the observed letters. To overcome this problem, the traditional approach to probability estimation is to use the classical Good-Turing estimator. We introduce a natural scaling model and use it to show that the Good-Turing sequence probability estimator is not consistent. We then introduce a novel sequence probability estimator that is indeed consistent under the natural scaling model.
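To make the overestimation remark concrete, here is a minimal Python sketch of the simplest textbook form of the Good-Turing idea: the total probability of unseen letters is estimated by the fraction of the sample made up of letters seen exactly once. The empirical distribution, by contrast, assigns all probability mass to seen letters. The function name is hypothetical, and this is neither the sequence probability estimator analyzed in the paper nor its proposed replacement.

```python
from collections import Counter


def good_turing_missing_mass(sample):
    """Good-Turing estimate of the total probability of unseen symbols:
    (number of symbols observed exactly once) / (sample length)."""
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    counts = Counter(sample)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / n


if __name__ == "__main__":
    sample = "abracadabra"
    # 'c' and 'd' each occur once among 11 letters.
    print(good_turing_missing_mass(sample))
```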