Results 1–10 of 29
A general framework for codes involving redundancy minimization
 IEEE Transactions on Information Theory
, 2006
Cited by 9 (6 self)
Abstract — A framework with two scalar parameters is introduced for various problems of finding a prefix code minimizing a coding penalty function. The framework involves a two-parameter class encompassing problems previously proposed by Huffman [1], Campbell [2], Nath [3], and Drmota and Szpankowski [4]. It sheds light on the relationships among these problems. In particular, Nath’s problem can be seen as bridging that of Huffman with that of Drmota and Szpankowski. This leads to a linear-time algorithm for the last of these with a solution that solves a range of Nath subproblems. We find simple bounds and linear-time Huffman-like optimization algorithms for all nontrivial problems within the class.
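For reference, the classical Huffman construction that these generalized penalties extend can be sketched in a few lines of Python. This is a standard textbook implementation, not code from the paper:

```python
import heapq

def huffman_code_lengths(probs):
    """Optimal prefix-code lengths for a list of probabilities (Huffman)."""
    if len(probs) == 1:
        return [1]
    # Heap entries: (subtree weight, tie-breaker, symbol indices in subtree)
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1  # every symbol under the merge gains one bit
        heapq.heappush(heap, (w1 + w2, tie, s1 + s2))
        tie += 1
    return lengths

# For a dyadic source the code lengths equal the self-informations:
# huffman_code_lengths([0.5, 0.25, 0.125, 0.125]) -> [1, 2, 3, 3]
```

The generalized penalties surveyed in the paper replace the linear expected-length objective that this merge rule minimizes, which is why Huffman-like (merge-based) algorithms still apply within the two-parameter class.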
The minimum average code for finite memoryless monotone sources
 in Proc., IEEE Information Theory Workshop
, 2002
Cited by 5 (0 self)
Abstract — The problem of selecting a code for finite monotone sources with x symbols is considered. The selection criterion is based on minimizing the average redundancy (called the Minave criterion) instead of its maximum (i.e., the Minimax criterion). The average probability distribution, whose associated Huffman code has the minimum average redundancy, is derived. The entropy of the average distribution (i.e., …
Precise Asymptotic Analysis of the Tunstall Code
Cited by 5 (2 self)
Abstract — We study the Tunstall code using machinery from the analysis-of-algorithms literature. In particular, we propose an algebraic characterization of the Tunstall code which, together with tools like the Mellin transform and the Tauberian theorems, leads to new results on the variance and a central limit theorem for dictionary phrase lengths. This analysis also provides a new argument for obtaining asymptotic results about the mean dictionary phrase length and average redundancy rates.
Minimum Expected Length of Fixed-to-Variable Lossless Compression of Memoryless Sources
Cited by 4 (3 self)
Abstract — Conventional wisdom states that the minimum expected length for fixed-to-variable length encoding of an n-block memoryless source with entropy H grows as nH + O(1). However, this performance is obtained under the constraint that the code assigned to the whole n-block is a prefix code. Dropping this unnecessary constraint, we show that the minimum expected length grows as nH − (1/2) log n + O(1) unless the source is equiprobable.
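As a back-of-the-envelope check of the gap between the two growth rates, assuming H = 0.8 bits per symbol and n = 1024 (illustrative values, not from the paper; the O(1) terms are omitted):

```python
import math

def prefix_code_length(n, H):
    # Conventional prefix-code growth: nH + O(1), constant omitted
    return n * H

def one_shot_length(n, H):
    # Without the prefix constraint: nH - (1/2) log2(n) + O(1)
    return n * H - 0.5 * math.log2(n)

n, H = 1024, 0.8
savings = prefix_code_length(n, H) - one_shot_length(n, H)
# savings = 0.5 * log2(1024) = 5.0 bits on an 819.2-bit block
```

The saving is only logarithmic in the block length, which is why it is invisible in the leading nH term yet real once the prefix constraint is dropped.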
Tunstall Code, Khodak Variations, and Random Walks
, 2008
Cited by 4 (2 self)
A variable-to-fixed length encoder partitions the source string into variable-length phrases that belong to a given and fixed dictionary. Tunstall, and independently Khodak, designed variable-to-fixed length codes for memoryless sources that are optimal under certain constraints. In this paper, we study the Tunstall and Khodak codes using analytic information theory, i.e., the machinery from the analysis-of-algorithms literature. After proposing an algebraic characterization of the Tunstall and Khodak codes, we present new results on the variance and a central limit theorem for dictionary phrase lengths. This analysis also provides a new argument for obtaining asymptotic results about the mean dictionary phrase length and average redundancy rates.
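The Tunstall dictionary construction the abstract refers to can be sketched compactly: repeatedly expand the most probable phrase into all its one-symbol extensions until the dictionary is full. A minimal Python sketch (names and interface are mine, not from the paper):

```python
import heapq

def tunstall_dictionary(symbol_probs, max_size):
    """Build a Tunstall parsing dictionary of at most max_size phrases
    for a memoryless source given as {symbol: probability}."""
    # Start with the single-symbol phrases; repeatedly split the most
    # probable phrase into its |alphabet| extensions while the result
    # still fits. Negated probabilities turn the min-heap into a max-heap.
    heap = [(-p, s) for s, p in symbol_probs.items()]
    heapq.heapify(heap)
    k = len(symbol_probs)
    while len(heap) + k - 1 <= max_size:
        neg_p, phrase = heapq.heappop(heap)
        for s, p in symbol_probs.items():
            heapq.heappush(heap, (neg_p * p, phrase + s))
    return sorted((phrase, -neg_p) for neg_p, phrase in heap)
```

For a binary source with p(a) = 0.7, p(b) = 0.3 and max_size = 4 this yields the complete, prefix-free phrase set {aaa, aab, ab, b}, whose probabilities sum to 1; each phrase is then assigned a fixed-length index.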
Optimal prefix codes for infinite alphabets with nonlinear costs
 IEEE Trans. Inf. Theory
, 2008
Cited by 4 (3 self)
Abstract — Let P = {p(i)} be a measure of strictly positive probabilities on the set of nonnegative integers. Although the countable number of inputs prevents usage of the Huffman algorithm, there are nontrivial P for which known methods find a source code that is optimal in the sense of minimizing expected codeword length. For some applications, however, a source code should instead minimize one of a family of nonlinear objective functions, β-exponential means, those of the form log_a Σ_i p(i) a^{n(i)}, where n(i) is the length of the ith codeword and a is a positive constant. Applications of such minimizations include a novel problem of maximizing the chance of message receipt in single-shot communications (a < 1) and a previously known problem of minimizing the chance of buffer overflow in a queueing system (a > 1). This paper introduces methods for finding codes optimal for such exponential means. One method applies to geometric distributions, while another applies to distributions with lighter tails. The latter algorithm is applied to Poisson distributions and both are extended to alphabetic codes, as well as to minimizing maximum pointwise redundancy. The aforementioned application of minimizing the chance of buffer overflow is also considered. Index Terms — Communication networks, generalized entropies, generalized means, Golomb codes, Huffman algorithm, …
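The β-exponential mean objective in this abstract is easy to state concretely. A small illustrative evaluation in Python (the function name is mine, not the paper's):

```python
import math

def exponential_mean_length(probs, lengths, a):
    """beta-exponential mean of codeword lengths:
    log_a( sum_i p(i) * a**n(i) ).
    As a -> 1 this approaches the ordinary expected length
    minimized by Huffman coding."""
    return math.log(sum(p * a**n for p, n in zip(probs, lengths)), a)

# Dyadic source with Huffman lengths [1, 2, 3, 3]:
# at a = 2 the exponential mean is log2(0.5*2 + 0.25*4 + 0.125*8*2) = 2.0,
# versus an ordinary expected length of 1.75.
```

For a > 1 long codewords are penalized more heavily than under the linear objective (the buffer-overflow setting), while for a < 1 short codewords are favored (the single-shot receipt setting).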
Tight bounds on minimum maximum pointwise redundancy
 In Proceedings of the International Symposium on Information Theory
, 2008
Cited by 2 (0 self)
Abstract — This paper presents new lower and upper bounds for the optimal compression of binary prefix codes in terms of the most probable input symbol, where compression efficiency is determined by the nonlinear codeword length objective of minimizing maximum pointwise redundancy. This objective relates to both universal modeling and Shannon coding, and these bounds are tight throughout the interval. The upper bounds also apply to a related objective, that of dth exponential redundancy.
On Universal Coding of Unordered Data
Cited by 2 (1 self)
Abstract — There are several applications in information transfer and storage where the order of source letters is irrelevant at the destination. For these source–destination pairs, multiset communication rather than the more difficult task of sequence communication may be performed. In this work, we study universal multiset communication. For classes of countable-alphabet sources that meet Kieffer’s condition for sequence communication, we present a scheme that universally achieves a rate of n + o(n) bits per multiset letter for multiset communication. We also define redundancy measures that are normalized by the logarithm of the multiset size rather than per multiset letter, and show that these redundancy measures cannot be driven to zero for the class of finite-alphabet memoryless multisets. This further implies that finite-alphabet memoryless multisets cannot be encoded universally with vanishing fractional redundancy.
Average Redundancy for Known Sources: Ubiquitous Trees in Source Coding
, 2008
Cited by 2 (0 self)
Analytic information theory aims at studying problems of information theory using analytic techniques of computer science and combinatorics. Following Hadamard’s precept, these problems are tackled by complex analysis methods such as generating functions, the Mellin transform, Fourier series, the saddle point method, analytic poissonization and depoissonization, and singularity analysis. This approach lies at the crossroads of computer science and information theory. In this survey we concentrate on one facet of information theory (i.e., source coding, better known as data compression), namely the redundancy rate problem. The redundancy rate problem determines by how much the actual code length exceeds the optimal code length. We further restrict our interest to the average redundancy for known sources, that is, when the statistics of the information sources are known. We present precise analyses of three types of lossless data compression schemes, namely fixed-to-variable (FV) length codes, variable-to-fixed (VF) length codes, and variable-to-variable (VV) length codes. In particular, we investigate the average redundancy of Huffman, Tunstall, and Khodak codes. These codes have succinct representations as trees, either as coding or parsing trees, and we analyze here some of their parameters (e.g., the average path from the root to a leaf).