Results 11–20 of 36
The precise minimax redundancy
 In Proceedings of IEEE Symposium on Information Theory
, 2002
Robustly Minimax Codes for Universal Data Compression, The 21st
 Symposium on Information Theory and Its Applications
, 1998
Abstract

Cited by 5 (1 self)
Abstract — We introduce a notion of ‘relative redundancy’ for universal data compression and propose a universal code which asymptotically achieves the minimax value of the relative redundancy. The relative redundancy is a hybrid of redundancy and coding regret (pointwise redundancy), where a class of information sources and a class of codes are assumed. The minimax code for relative redundancy is an extension of the modified Jeffreys mixture, which was introduced by Takeuchi and Barron and is minimax for regret.
Generalized Shannon Code Minimizes the Maximal Redundancy
 in Proceedings of the Latin American Theoretical Informatics (LATIN) 2002
, 2001
Abstract

Cited by 5 (1 self)
Source coding, also known as data compression, is an area of information theory that deals with the design and performance evaluation of optimal codes for data compression. In 1952 Huffman constructed his optimal code, which minimizes the average code length among all prefix codes for known sources. In fact, Huffman codes minimize the average redundancy, defined as the difference between the code length and the entropy of the source. Interestingly enough, no optimal code is known for other popular optimization criteria, such as the maximal redundancy, defined as the maximum of the pointwise redundancy over all source sequences. We first prove that a generalized Shannon code minimizes the maximal redundancy among all prefix codes, and present an efficient implementation of the optimal code. Then we compute precisely its redundancy for memoryless sources. Finally, we study universal codes for unknown source distributions. We adopt the minimax approach and search for the best code for the worst source. We establish that such redundancy is the sum of a maximum-likelihood term and the redundancy of the generalized Shannon code computed for the maximum-likelihood distribution. This replaces Shtarkov's bound by an exact formula. We also compute precisely the maximal minimax redundancy for a class of memoryless sources. The main findings of this paper are established by techniques that belong to the toolkit of the "analytic analysis of algorithms," such as the theory of distribution of sequences modulo 1 and Fourier series. These methods have already found applications in other problems of information theory, and they constitute the so-called analytic information theory.
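The distinction between average and maximal (pointwise) redundancy in this abstract can be illustrated with a short sketch. This is not the paper's generalized Shannon code, only the plain Shannon code lengths L(x) = ⌈−log₂ p(x)⌉ for a small hypothetical source, with both redundancy measures computed directly from their definitions:

```python
import math

def shannon_lengths(p):
    """Shannon code lengths L(x) = ceil(-log2 p(x)); they satisfy Kraft's inequality."""
    return [math.ceil(-math.log2(q)) for q in p]

def avg_redundancy(p, L):
    """Average redundancy: E[L] - H(p), the quantity Huffman codes minimize."""
    H = -sum(q * math.log2(q) for q in p)
    return sum(q * l for q, l in zip(p, L)) - H

def max_redundancy(p, L):
    """Maximal (pointwise) redundancy: max over x of L(x) + log2 p(x)."""
    return max(l + math.log2(q) for q, l in zip(p, L))

p = [0.5, 0.3, 0.2]          # hypothetical memoryless source
L = shannon_lengths(p)       # [1, 2, 3]
print(avg_redundancy(p, L), max_redundancy(p, L))
```

For the plain Shannon code the maximal redundancy always falls in [0, 1); the paper's point is that a shifted ("generalized") variant of these lengths is exactly optimal for this criterion.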
Properties of Jeffreys mixture for Markov sources
 Proc. of the Fourth Workshop on Information-Based Induction Sciences (IBIS2001)
, 2001
Abstract

Cited by 4 (2 self)
Abstract: We discuss the properties of Jeffreys mixture for the general FSMX model (a certain class of Markov sources [11]). First, we show that a modified Jeffreys mixture asymptotically achieves the minimax coding regret [7], where we put no restriction on the data sequences at all. This is an extension of the results in [13, 15]. Then, we give an approximation formula for the prediction probability of Jeffreys mixture for FSMX models (a review of the results in [10, 19]). By this formula, it is revealed that the prediction probability of Jeffreys mixture for the first-order Markov chain with alphabet {0, 1} is not of the form (k + α)/(n + β) (n is the data size, k is the number of occurrences of ‘1’). Moreover, we evaluate by simulation the regret of our approximation formula for the first-order Markov chain and show that the prediction strategy using our approximation formula gives smaller coding regret than the one using the Laplace estimator.
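The comparison against the Laplace estimator can be made concrete in the simpler i.i.d. binary case (not the paper's FSMX setting). A sequential rule of the form (k + α)/(n + 2α) gives the Laplace estimator at α = 1 and the Krichevsky–Trofimov rule (the i.i.d. Jeffreys mixture) at α = 1/2; the coding regret is the code length minus the maximum-likelihood code length:

```python
import math

def cumulative_logloss(seq, alpha):
    """Code length in bits from the sequential estimator
    P(next = 1 | k ones in n symbols) = (k + alpha) / (n + 2*alpha).
    alpha = 1.0 is the Laplace rule; alpha = 0.5 is the KT / Jeffreys-mixture rule."""
    k = n = 0
    total = 0.0
    for b in seq:
        p1 = (k + alpha) / (n + 2 * alpha)
        total += -math.log2(p1 if b == 1 else 1 - p1)
        k += b
        n += 1
    return total

def regret(seq, alpha):
    """Pointwise coding regret vs. the i.i.d. maximum-likelihood code length."""
    n, k = len(seq), sum(seq)
    if k in (0, n):
        ml = 0.0
    else:
        q = k / n
        ml = -(k * math.log2(q) + (n - k) * math.log2(1 - q))
    return cumulative_logloss(seq, alpha) - ml

seq = [1, 0, 0, 1, 1, 1, 0, 1, 1, 1]   # hypothetical data sequence
print(regret(seq, 1.0), regret(seq, 0.5))  # Laplace vs. KT regret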
About Adaptive Coding on Countable Alphabets
, 2012
Abstract

Cited by 4 (1 self)
This paper sheds light on universal coding with respect to classes of memoryless sources over a countable alphabet defined by an envelope function with finite and nondecreasing hazard rate. We prove that the auto-censuring (AC) code introduced by Bontemps (2011) is adaptive with respect to the collection of such classes. The analysis builds on the tight characterization of universal redundancy rate in terms of metric entropy by Haussler and Opper (1997) and on a careful analysis of the performance of the AC-coding algorithm. The latter relies on non-asymptotic bounds for maxima of samples from discrete distributions with finite and nondecreasing hazard rate.
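The "finite and nondecreasing hazard rate" condition on the envelope is easy to check numerically. A minimal sketch (not from the paper) for a discrete pmf p(k): the hazard rate is h(k) = p(k) / P(X ≥ k), and a geometric envelope makes it constant, hence trivially nondecreasing:

```python
def hazard_rates(p):
    """Discrete hazard rate h(k) = p(k) / P(X >= k) for a pmf given as a list."""
    tail = sum(p)
    rates = []
    for q in p:
        rates.append(q / tail)  # mass at k divided by remaining tail mass
        tail -= q
    return rates

# Geometric(1/2) envelope truncated at 10 points; fold the remaining
# tail into the last point so the pmf sums to 1 exactly.
p = [0.5 ** (k + 1) for k in range(10)]
p[-1] *= 2
print([round(h, 3) for h in hazard_rates(p)])
```

Here every rate is 1/2 except the last truncated point, so the sequence is nondecreasing, which is the regime in which the paper's bounds on sample maxima apply.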
How to achieve minimax expected Kullback–Leibler Distance from an unknown finite distribution
 In "Algorithmic Learning Theory", Proceedings of the 13th International Conference ALT 2002
, 1991
Abstract

Cited by 3 (3 self)
Abstract. We consider a problem that is related to the "Universal Encoding Problem" from information theory. The basic goal is to find rules that map "partial information" about a distribution X over an m-letter alphabet into a guess X̂ for X such that the Kullback–Leibler divergence between X and X̂ is as small as possible. The cost associated with a rule is the maximal expected Kullback–Leibler divergence between X and X̂. First, we show that the cost associated with the well-known add-one rule equals ln(1 + (m − 1)/(n + 1)), thereby extending a result of Forster and Warmuth [3, 2] to m ≥ 3. Second, we derive an absolute (as opposed to asymptotic) lower bound on the smallest possible cost. Technically, this is done by determining (almost exactly) the Bayes error of the add-one rule with a uniform prior (where the asymptotics for n → ∞ was known before). Third, we hint at tools from approximation theory and support the conjecture that there exists a rule whose cost asymptotically matches the theoretical barrier from the lower bound.
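The add-one rule and its stated worst-case cost are simple enough to write down directly. A minimal sketch (the counts and sample sizes are hypothetical; the cost formula is the one quoted in the abstract):

```python
import math

def add_one_rule(counts):
    """Add-one (Laplace) rule: guess X_hat_i = (k_i + 1) / (n + m)
    from counts k_i of each of the m letters in a sample of size n."""
    n, m = sum(counts), len(counts)
    return [(k + 1) / (n + m) for k in counts]

def add_one_cost(m, n):
    """Worst-case expected KL divergence of the add-one rule,
    ln(1 + (m - 1) / (n + 1)), as stated in the abstract."""
    return math.log(1 + (m - 1) / (n + 1))

guess = add_one_rule([3, 1, 0])       # m = 3 letters, n = 4 observations
print(guess, add_one_cost(3, 4))
```

Note that the cost formula vanishes as n grows for fixed m, matching the intuition that partial information becomes sufficient with enough samples.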
THE MDL PRINCIPLE, PENALIZED LIKELIHOODS, AND STATISTICAL RISK
Abstract

Cited by 3 (2 self)
ABSTRACT. We determine, for both countable and uncountable collections of functions, information-theoretic conditions on a penalty pen(f) such that the optimizer f̂ of the penalized log-likelihood criterion log 1/likelihood(f) + pen(f) has statistical risk not more than the index of resolvability corresponding to the accuracy of the optimizer of the expected value of the criterion. If F is the linear span of a dictionary of functions, traditional description-length penalties are based on the number of nonzero terms of candidate fits (the ℓ0 norm of the coefficients), as we review. We specialize our general conclusions to show that the ℓ1 norm of the coefficients times a suitable multiplier λ is also an information-theoretically valid penalty.
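The criterion log 1/likelihood(f) + pen(f) with an ℓ1 penalty can be sketched for a toy Gaussian regression model. This is only an illustration of evaluating the criterion for candidate coefficient vectors, not the paper's risk analysis; the data, λ, and noise level are all made up:

```python
import math

def penalized_criterion(y, X, beta, lam, sigma=1.0):
    """log(1/likelihood) + lam * ||beta||_1 for a Gaussian linear model
    y_i ~ N(x_i . beta, sigma^2): the l1-penalized log-likelihood criterion."""
    n = len(y)
    resid = [yi - sum(b * xij for b, xij in zip(beta, xi))
             for yi, xi in zip(y, X)]
    neg_loglik = (n / 2) * math.log(2 * math.pi * sigma ** 2) \
                 + sum(r * r for r in resid) / (2 * sigma ** 2)
    return neg_loglik + lam * sum(abs(b) for b in beta)

# Toy data: y is roughly 2 * (first feature); the second feature is irrelevant.
X = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
y = [2.1, 3.9, 6.0]
print(penalized_criterion(y, X, [2.0, 0.0], lam=1.0))  # sparse fit
print(penalized_criterion(y, X, [2.0, 0.5], lam=1.0))  # wasteful dense fit
```

The dense fit pays the extra ℓ1 charge without improving the likelihood, so the criterion prefers the sparse one, which is the behavior the resolvability bound quantifies.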
Average redundancy for known sources: ubiquitous trees in source coding
 Proceedings, Fifth Colloquium on Mathematics and Computer Science (Blaubeuren, 2008), Discrete Math. Theor. Comput. Sci. Proc. AI
, 2008
Abstract

Cited by 3 (0 self)
Analytic information theory aims at studying problems of information theory using analytic techniques of computer science and combinatorics. Following Hadamard's precept, these problems are tackled by complex-analysis methods such as generating functions, the Mellin transform, Fourier series, the saddle point method, analytic poissonization and depoissonization, and singularity analysis. This approach lies at the crossroads of computer science and information theory. In this survey we concentrate on one facet of information theory (i.e., source coding, better known as data compression), namely the redundancy rate problem. The redundancy rate problem determines by how much the actual code length exceeds the optimal code length. We further restrict our interest to the average redundancy for known sources, that is, when the statistics of the information sources are known. We present precise analyses of three types of lossless data compression schemes, namely fixed-to-variable (FV) length codes, variable-to-fixed (VF) length codes, and variable-to-variable (VV) length codes. In particular, we investigate the average redundancy of Huffman, Tunstall, and Khodak codes. These codes have succinct representations as trees, either coding or parsing trees, and we analyze here some of their parameters (e.g., the average path from the root to a leaf).
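Of the three code families mentioned, the Tunstall (variable-to-fixed) parsing tree is the easiest to sketch: starting from the single-letter words, repeatedly split the most probable leaf until the dictionary reaches the desired size. A minimal version for a hypothetical binary memoryless source:

```python
import heapq
import itertools

def tunstall_dictionary(probs, num_words):
    """Tunstall (variable-to-fixed) parsing dictionary: grow a complete
    parsing tree by repeatedly expanding the most probable leaf until
    there are num_words leaves. Returns (word, probability) pairs."""
    counter = itertools.count()  # tie-breaker so the heap never compares words
    heap = [(-p, next(counter), (i,)) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    while len(heap) < num_words:
        negp, _, word = heapq.heappop(heap)      # most probable leaf
        for i, p in enumerate(probs):            # split it into m children
            heapq.heappush(heap, (negp * p, next(counter), word + (i,)))
    return sorted((word, -negp) for negp, _, word in heap)

for word, p in tunstall_dictionary([0.7, 0.3], num_words=4):
    print(word, round(p, 3))
```

Because each expansion replaces one leaf by all its children, the leaves always form a complete parsing tree, so the word probabilities sum to 1; the average redundancy analyses the survey describes concern exactly how the resulting word lengths behave.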
Nonsubjective priors via predictive relative entropy regret
 E. George, Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA 19104-6340, USA (email: edgeorge@wharton.upenn.edu); F. Liang, Institute of Statistics and Decision Sciences, Duke University
, 2006
Abstract

Cited by 2 (1 self)
We explore the construction of nonsubjective prior distributions in Bayesian statistics via a posterior predictive relative entropy regret criterion. We carry out a minimax analysis based on a derived asymptotic predictive loss function and show that this approach to prior construction has a number of attractive features. The approach here differs from previous work that uses either prior or posterior relative entropy regret in that we consider predictive performance in relation to alternative nondegenerate prior distributions. The theory is illustrated with an analysis of some specific examples.