Results 1-9 of 9
Universal compression of memoryless sources over unknown alphabets
IEEE Transactions on Information Theory, 2004
Abstract

Cited by 32 (10 self)
It has long been known that the compression redundancy of independent and identically distributed (i.i.d.) strings increases to infinity as the alphabet size grows. It is also apparent that any string can be described by separately conveying its symbols and its pattern, the order in which the symbols appear. Concentrating on the latter, we show that the patterns of i.i.d. strings over all alphabets, including infinite and even unknown ones, can be compressed with diminishing redundancy, both in block and sequentially, and that the compression can be performed in linear time. To establish these results, we show that the number of patterns is the Bell number, that the number of patterns with a given number of symbols is the Stirling number of the second kind, and that the redundancy of patterns can be bounded using results of Hardy and Ramanujan on the number of integer partitions. The results also imply an asymptotically optimal solution for the Good-Turing probability-estimation problem.
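The pattern and Bell-number machinery this abstract describes can be illustrated concretely. The sketch below is my own illustration, not the authors' code (function names are mine): it computes a string's pattern and checks that, over a sufficiently large alphabet, the number of distinct length-n patterns equals the Bell number B_n.

```python
from itertools import product

def pattern(s):
    """Replace each symbol by the order of its first appearance,
    e.g. 'abracadabra' -> (1, 2, 3, 1, 4, 1, 5, 1, 2, 3, 1)."""
    order = {}
    return tuple(order.setdefault(c, len(order) + 1) for c in s)

def bell(n):
    """Bell number B_n, computed row by row via the Bell triangle."""
    row = [1]
    for _ in range(n - 1):
        new = [row[-1]]          # each row starts with the previous row's last entry
        for x in row:
            new.append(new[-1] + x)
        row = new
    return row[-1]

# All strings of length 4 over a 4-symbol alphabet realize every
# length-4 pattern, and there are exactly B_4 = 15 of them.
patterns = {pattern(s) for s in product(range(4), repeat=4)}
assert len(patterns) == bell(4) == 15
```

Computing a pattern is a single left-to-right pass, consistent with the abstract's linear-time claim.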
Limit results on pattern entropy
IEEE Transactions on Information Theory, 2006
Abstract

Cited by 15 (3 self)
We determine the entropy rate of patterns of certain random processes, bound the speed at which the per-symbol pattern entropy converges to this rate, and show that patterns satisfy an asymptotic equipartition property. To derive some of these results we upper-bound the probability that the nth variable in a random process differs from all preceding ones.
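For an i.i.d. source with marginal p, the probability that the nth variable differs from all preceding ones is the sum over x of p(x)(1 - p(x))^(n-1). The sketch below is my own illustration (the geometric marginal is chosen for convenience, not taken from the paper); it evaluates this quantity by truncating the sum.

```python
def prob_nth_is_new(n, p=0.5, terms=200):
    """P(X_n differs from X_1, ..., X_{n-1}) for an i.i.d. Geometric(p)
    source on {1, 2, ...}: sum_x p(x) * (1 - p(x))**(n - 1),
    truncated after `terms` atoms (the remaining tail is negligible here)."""
    total = 0.0
    for x in range(1, terms + 1):
        px = p * (1 - p) ** (x - 1)          # geometric point mass at x
        total += px * (1 - px) ** (n - 1)
    return total

# The first symbol is always "new", and novelty decays as n grows.
assert abs(prob_nth_is_new(1) - 1.0) < 1e-9
assert prob_nth_is_new(10) < prob_nth_is_new(2)
```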
A lower bound on compression of unknown alphabets
Theoretical Computer Science, 2005
Abstract

Cited by 10 (3 self)
Many applications call for universal compression of strings over large, possibly infinite, alphabets. However, it has long been known that the resulting redundancy is infinite even for i.i.d. distributions. It was recently shown that the redundancy of the strings' patterns, which abstract the values of the symbols, retaining only their relative precedence, is sublinear in the block length n, hence the per-symbol redundancy diminishes to zero. In this paper we show that pattern redundancy is at least (1.5 log_2 e) n^{1/3} bits. To do so, we construct a generating function whose coefficients lower-bound the redundancy, and use Hayman's saddle-point approximation technique to determine the coefficients' asymptotic behavior.
Adaptive Coding and Prediction of Sources With Large and Infinite Alphabets
Abstract

Cited by 1 (1 self)
The problem of predicting a sequence x_1, x_2, ... generated by a discrete source with unknown statistics is considered. Each letter x_{t+1} is predicted using only the information in the word x_1 x_2 ... x_t. This problem is of great importance for data compression, because of its use in estimating probability distributions for PPM algorithms and other adaptive codes. On the other hand, such prediction is a classical problem which has received much attention; its history can be traced back to Laplace. We address the case where the sequence is generated by an independent and identically distributed (i.i.d.) source with a large (or even infinite) alphabet and suggest a class of new methods of prediction. Index Terms—Adaptive coding, Laplace problem of succession, lossless data compression, prediction of random processes, Shannon entropy, source coding.
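The classical baseline for this prediction problem, for a known finite alphabet, is the Laplace (add-one) rule of succession alluded to above; the paper's contribution is methods beyond it for large and infinite alphabets. A minimal sketch of that baseline (my own illustration, not the paper's method):

```python
def laplace_predict(history, alphabet):
    """Laplace's rule of succession: after observing `history`,
    P(next = a) = (count of a in history + 1) / (len(history) + |alphabet|)."""
    t, A = len(history), len(alphabet)
    return {a: (history.count(a) + 1) / (t + A) for a in alphabet}

probs = laplace_predict("aab", "ab")   # 'a' seen twice, 'b' once
assert abs(probs["a"] - 3 / 5) < 1e-12
assert abs(sum(probs.values()) - 1.0) < 1e-12   # a proper distribution
```

Note that every symbol, seen or not, receives positive probability, which is what keeps the code length of adaptive codes built on such estimates finite.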
Minimax Redundancy for Large Alphabets
Abstract

Cited by 1 (0 self)
We study the minimax redundancy of universal coding for large alphabets over memoryless sources and present two main results. First, we complete studies initiated by Orlitsky and Santhanam [12], deriving precise asymptotics of the minimax redundancy for all ranges of alphabet sizes. Second, we consider the minimax redundancy of a source model in which some symbol probabilities are fixed. The latter model leads to an interesting binomial-sum asymptotics with superexponential growth functions. Our findings can be used to numerically approximate the minimax redundancy for various ranges of the sequence length and the alphabet size. These results are obtained by analytic techniques such as tree-like generating functions and the saddle-point method.
Universal Codes for Finite Sequences of Integers Drawn from a Monotone Distribution
IEEE Transactions on Information Theory, 2002
Abstract
We offer two noiseless codes for blocks of integers X = (X_1, ..., X_n). We provide explicit bounds on the relative redundancy that are valid for any distribution F in the class of memoryless sources with a possibly infinite alphabet whose marginal distribution is monotone. Specifically, we show that the expected code length L(X) of our first universal code is dominated by a linear function of the entropy of X. Further, we present a second universal code that is efficient in that its length is bounded by nH_F + o(nH_F), where H_F is the entropy of F, which is allowed to vary with n. Since these bounds hold for any n and any monotone F, we are able to show that our codes are strongly minimax with respect to relative redundancy (as defined by Elias). Key Phrases: Universal noiseless coding of integers, Elias codes, Wyner's inequality, relative redundancy, strongly minimax.
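As a concrete instance of an integer code in the Elias family this abstract builds on, here is the standard Elias gamma code, sketched by me as a textbook illustration; it is not necessarily either of the paper's two codes.

```python
def gamma_encode(x):
    """Elias gamma code for an integer x >= 1: a unary run of
    len(bin(x)) - 1 zeros, then the binary form of x (starting with '1')."""
    b = bin(x)[2:]
    return "0" * (len(b) - 1) + b

def gamma_decode(bits):
    """Inverse of gamma_encode for a single codeword: count the leading
    zeros, then read that many more bits after the first '1'."""
    zeros = 0
    while bits[zeros] == "0":
        zeros += 1
    return int(bits[zeros:2 * zeros + 1], 2)

assert gamma_encode(9) == "0001001"
assert all(gamma_decode(gamma_encode(x)) == x for x in range(1, 100))
```

The codeword for x has length 2*floor(log2 x) + 1 bits, so the per-integer cost grows only logarithmically, the kind of behavior exploited by monotone-distribution codes.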
Adaptive compression against a countable alphabet
Abstract
This paper sheds light on universal coding with respect to classes of memoryless sources over a countable alphabet defined by an envelope function with finite and non-decreasing hazard rate. We prove that the auto-censuring (AC) code introduced by Bontemps (2011) is adaptive with respect to the collection of such classes. The analysis builds on the tight characterization of universal redundancy rates in terms of metric entropy by Haussler and Opper (1997) and on a careful analysis of the performance of the AC-coding algorithm. The latter relies on non-asymptotic bounds for maxima of samples from discrete distributions with finite and non-decreasing hazard rate.
About Adaptive Coding on Countable Alphabets
, 2012
Abstract
This paper sheds light on universal coding with respect to classes of memoryless sources over a countable alphabet defined by an envelope function with finite and non-decreasing hazard rate. We prove that the auto-censuring (AC) code introduced by Bontemps (2011) is adaptive with respect to the collection of such classes. The analysis builds on the tight characterization of universal redundancy rates in terms of metric entropy by Haussler and Opper (1997) and on a careful analysis of the performance of the AC-coding algorithm. The latter relies on non-asymptotic bounds for maxima of samples from discrete distributions with finite and non-decreasing hazard rate.
A Bernstein-von Mises Theorem for
 Electronic Journal of Statistics, ISSN 1935-7524
Abstract
We investigate the asymptotic normality of the posterior distribution in the discrete setting, when the model dimension increases with the sample size. We consider a probability mass function θ_0 on N \ {0} and a sequence of truncation levels (k_n)_n satisfying k_n^3 ≤ n inf_{i≤k_n} θ_0(i). Let θ̂_n denote the maximum likelihood estimate of (θ_0(i))_{i≤k_n} and let Δ_n(θ_0) denote the k_n-dimensional vector whose ith coordinate is √n (θ̂_n(i) − θ_0(i)) for 1 ≤ i ≤ k_n. We check that, under mild conditions on θ_0 and on the sequence of prior probabilities on the k_n-dimensional simplices, after centering and rescaling, the variation distance between the posterior distribution recentered around θ̂_n and rescaled by √n and the k_n-dimensional Gaussian distribution N(Δ_n(θ_0), I^{-1}(θ_0)) converges in probability to 0. This theorem can be used to prove the asymptotic normality of Bayesian estimators of Shannon and Rényi entropies. The proofs are based on concentration inequalities for centered and non-centered chi-square (Pearson) statistics. The latter allow us to establish posterior