Results 1–10 of 26
Quantization
 IEEE TRANS. INFORM. THEORY
, 1998
Abstract

Cited by 639 (11 self)
The history of the theory and practice of quantization dates to 1948, although similar ideas had appeared in the literature as long ago as 1898. The fundamental role of quantization in modulation and analog-to-digital conversion was first recognized during the early development of pulse-code modulation systems, especially in the 1948 paper of Oliver, Pierce, and Shannon. Also in 1948, Bennett published the first high-resolution analysis of quantization and an exact analysis of quantization noise for Gaussian processes, and Shannon published the beginnings of rate-distortion theory, which would provide a theory for quantization as analog-to-digital conversion and as data compression. Beginning with these three papers of fifty years ago, we trace the history of quantization from its origins through this decade, and we survey the fundamentals of the theory and many of the popular and promising techniques for quantization.
Clustering with Bregman Divergences
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2005
Abstract

Cited by 310 (52 self)
A wide variety of distortion functions are used for clustering, e.g., squared Euclidean distance, Mahalanobis distance and relative entropy. In this paper, we propose and analyze parametric hard and soft clustering algorithms based on a large class of distortion functions known as Bregman divergences. The proposed algorithms unify centroid-based parametric clustering approaches, such as classical k-means and information-theoretic clustering, which arise by special choices of the Bregman divergence. The algorithms maintain the simplicity and scalability of the classical k-means algorithm, while generalizing the basic idea to a very large class of clustering loss functions. There are two main contributions in this paper. First, we pose the hard clustering problem in terms of minimizing the loss in Bregman information, a quantity motivated by rate-distortion theory, and present an algorithm to minimize this loss. Secondly, we show an explicit bijection between Bregman divergences and exponential families. The bijection enables the development of an alternative interpretation of an efficient EM scheme for learning models involving mixtures of exponential distributions. This leads to a simple soft clustering algorithm for all Bregman divergences.
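The hard-clustering scheme this abstract describes can be sketched in a few lines. The key fact exploited below (a central result of the paper) is that for any Bregman divergence the loss-minimizing cluster representative is the arithmetic mean, so only the assignment step depends on the divergence. Function names and parameter values here are illustrative, not the paper's.

```python
import random

def bregman_hard_cluster(points, k, divergence, iters=20, seed=0):
    """Hard clustering under a Bregman divergence (a sketch, not the
    authors' code). Assignment uses the chosen divergence; the update
    step is always the arithmetic mean of the assigned points."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: divergence(p, centroids[i]))
            clusters[j].append(p)
        for i, c in enumerate(clusters):
            if c:  # keep the old centroid if a cluster empties out
                dim = len(c[0])
                centroids[i] = [sum(p[d] for p in c) / len(c) for d in range(dim)]
    return centroids

def sq_euclidean(x, y):
    # Bregman divergence generated by phi(x) = ||x||^2: recovers k-means.
    return sum((a - b) ** 2 for a, b in zip(x, y))
```

With `sq_euclidean` this is exactly Lloyd-style k-means; swapping in another Bregman divergence (e.g., relative entropy on probability vectors) changes only the assignment step.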
Lossy Source Coding
 IEEE Trans. Inform. Theory
, 1998
Abstract

Cited by 71 (1 self)
Lossy coding of speech, high-quality audio, still images, and video is commonplace today. However, in 1948, few lossy compression systems were in service. Shannon introduced and developed the theory of source coding with a fidelity criterion, also called rate-distortion theory. For the first 25 years of its existence, rate-distortion theory had relatively little impact on the methods and systems actually used to compress real sources. Today, however, rate-distortion theoretic concepts are an important component of many lossy compression techniques and standards. We chronicle the development of rate-distortion theory and provide an overview of its influence on the practice of lossy source coding. Index Terms—Data compression, image coding, speech coding, rate-distortion theory, signal coding, source coding with a fidelity criterion, video coding.
Segmentation of multivariate mixed data via lossy coding and compression
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2007
Abstract

Cited by 68 (13 self)
In this paper, based on ideas from lossy data coding and compression, we present a simple but effective technique for segmenting multivariate mixed data that are drawn from a mixture of Gaussian distributions, which are allowed to be almost degenerate. The goal is to find the optimal segmentation that minimizes the overall coding length of the segmented data, subject to a given distortion. By analyzing the coding length/rate of mixed data, we formally establish some strong connections of data segmentation to many fundamental concepts in lossy data compression and rate-distortion theory. We show that a deterministic segmentation is approximately the (asymptotically) optimal solution for compressing mixed data. We propose a very simple and effective algorithm that depends on a single parameter, the allowable distortion. At any given distortion, the algorithm automatically determines the corresponding number and dimension of the groups and does not involve any parameter estimation. Simulation results reveal intriguing phase-transition-like behaviors of the number of segments when changing the level of distortion or the amount of outliers. Finally, we demonstrate how this technique can be readily applied to segment real imagery and bioinformatic data. Index Terms—Multivariate mixed data, data segmentation, data clustering, rate-distortion, lossy coding, lossy compression, image segmentation, microarray data clustering.
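The coding-length criterion at the heart of this method can be written down directly. The sketch below assumes the rate formula has the form R(W) = ((m + n)/2) log2 det(I + n/(ε² m) W Wᵀ) for m points in Rⁿ coded up to distortion ε², which matches the form given in the paper; it is specialized here to n = 2 so the determinant is closed-form. Function and variable names are ours.

```python
import math

def coding_length_2d(points, eps):
    """Approximate number of bits to code m points in R^2 up to mean
    squared distortion eps^2, using the assumed rate formula
    (m+n)/2 * log2 det(I + n/(eps^2 m) W W^T), with n = 2."""
    m, n = len(points), 2
    c = n / (eps ** 2 * m)
    # Gram matrix S = W W^T for W the 2 x m data matrix.
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    det = (1 + c * sxx) * (1 + c * syy) - (c * sxy) ** 2
    return 0.5 * (m + n) * math.log2(det)
```

A segmentation algorithm of the kind the abstract describes then compares the bits needed to code candidate groups separately (plus membership bits) against coding them together, merging only when that shortens the total description.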
Competitive Learning Algorithms for Robust Vector Quantization
, 1998
Abstract

Cited by 19 (0 self)
The efficient representation and encoding of signals with limited resources, e.g., finite storage capacity and restricted transmission bandwidth, is a fundamental problem in technical as well as biological information processing systems. Typically, under realistic circumstances, the encoding and communication of messages has to deal with different sources of noise and disturbances. In this paper, we propose a unifying approach to data compression by robust vector quantization, which explicitly deals with channel noise, bandwidth limitations, and random elimination of prototypes. The resulting algorithm is able to limit the detrimental effect of noise in a very general communication scenario. In addition, the presented model allows us to derive a novel competitive neural networks algorithm, which covers topology preserving feature maps, the so-called neural-gas algorithm, and the maximum entropy softmax rule as special cases. Furthermore, continuation methods based on these noise models impr...
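A minimal sketch of the kind of soft competitive learning this line of work generalizes: each sample updates every prototype in proportion to a softmax (maximum-entropy) assignment, with an inverse temperature beta controlling the softness; beta → ∞ recovers winner-take-all vector quantization. This is a one-dimensional illustration with made-up parameter values, not the paper's algorithm.

```python
import math
import random

def soft_vq(data, prototypes, beta=5.0, lr=0.1, epochs=20, seed=0):
    """Online soft vector quantization under the maximum-entropy softmax
    rule: each sample pulls every prototype toward it, weighted by a
    Boltzmann assignment exp(-beta * d_i) / sum_j exp(-beta * d_j)."""
    rng = random.Random(seed)
    protos = list(prototypes)
    order = list(range(len(data)))
    for _ in range(epochs):
        rng.shuffle(order)
        for idx in order:
            x = data[idx]
            d = [(x - w) ** 2 for w in protos]
            dmin = min(d)  # shift exponents for numerical stability
            weights = [math.exp(-beta * (di - dmin)) for di in d]
            z = sum(weights)
            for i, w in enumerate(weights):
                protos[i] += lr * (w / z) * (x - protos[i])
    return protos
```

Annealing beta from small to large values is the continuation strategy alluded to in the abstract: soft assignments early on avoid poor local minima, then harden into an ordinary quantizer.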
Additive successive refinement
 IEEE Trans. Inf. Theory
, 1983
Abstract

Cited by 10 (3 self)
Rate-distortion bounds for scalable coding, and conditions under which they coincide with non-scalable bounds, have been extensively studied. These bounds have been derived for the general tree-structured refinement scheme, where reproduction at each layer is an arbitrarily complex function of all encoding indexes up to that layer. However, in most practical applications (e.g., speech coding), “additive” refinement structures such as the multistage vector quantizer are preferred due to memory limitations. We derive an achievable region for the additive successive refinement problem, and show via a converse result that the rate-distortion bound of additive refinement is above that of tree-structured refinement. Necessary and sufficient conditions are given for two-alphabet sources under the condition () for some letter. For the special cases of square-error and absolute-error distortion measures, and subcritical distortion (where the Shannon lower bound (SLB) is tight), we show that successive refinement without rate loss is possible not only in the tree-structured sense, but also in the additive-coding sense. We also provide examples which are successively refinable without rate loss for all distortion values, but the optimal refinement is not additive. Index Terms—Additive refinement, multistage vector quantization (MSVQ), rate-distortion, scalable source coding, successive refinement.
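The "additive" refinement structure the abstract contrasts with tree-structured refinement can be illustrated by a two-stage (multistage) scalar quantizer: stage 2 quantizes the stage-1 residual, and the refined reproduction is simply the sum of the two codewords, so the refinement layer never re-describes the coarse layer. Codebooks below are made up for illustration.

```python
def quantize(x, codebook):
    """Nearest-neighbor encoding: return the codeword closest to x."""
    return min(codebook, key=lambda c: (x - c) ** 2)

def two_stage_quantize(x, coarse_cb, refine_cb):
    """Additive (multistage) refinement: the second stage codes only
    the residual, and the decoder adds the two codewords."""
    c1 = quantize(x, coarse_cb)
    c2 = quantize(x - c1, refine_cb)
    return c1, c1 + c2  # (coarse reproduction, refined reproduction)

coarse = [-3.0, 0.0, 3.0]             # low-rate first layer
refine = [-1.0, -0.5, 0.0, 0.5, 1.0]  # codes the residual only
```

Because 0 is in the residual codebook, the refined reproduction is never worse than the coarse one; the paper's point is that constraining refinement to this additive form can cost rate relative to an unconstrained tree-structured refiner.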
An Identity of Chernoff Bounds with an Interpretation in Statistical Physics and Applications in Information Theory
, 2007
Abstract

Cited by 9 (8 self)
An identity between two versions of the Chernoff bound on the probability of a certain large deviations event is established. This identity has an interpretation in statistical physics, namely, an isothermal equilibrium of a composite system that consists of multiple subsystems of particles. Several information-theoretic application examples, where the analysis of this large deviations probability naturally arises, are then described from the viewpoint of this statistical mechanical interpretation. This results in several relationships between information theory and statistical physics which, we hope, the reader will find insightful.
Multiple description quantization by deterministic annealing
 IEEE Trans. Inform. Theory
, 2003
Abstract

Cited by 7 (2 self)
and convergence theorems,” IEEE Trans. Inform. Theory, vol. IT-24, pp.
An Information Theoretic Analysis of Maximum Likelihood Mixture Estimation for Exponential Families
 In Proc. 21st Int. Conf. Machine Learning
, 2004
Abstract

Cited by 7 (4 self)
An important task in unsupervised learning is maximum likelihood mixture estimation (MLME) for exponential families. In this paper, we prove a mathematical equivalence between this MLME problem and the rate-distortion problem for Bregman divergences. We also present new theoretical results in rate-distortion theory for Bregman divergences. Further, an analysis of the problems as a tradeoff between compression and preservation of information is presented that yields the information bottleneck method as an interesting special case.
Estimation of the ratedistortion function
 2007. [Online]. Available: http://arxiv.org/abs/cs/0702018v1
Abstract

Cited by 3 (0 self)
Motivated by questions in lossy data compression and by theoretical considerations, this paper examines the problem of estimating the rate-distortion function of an unknown (not necessarily discrete-valued) source from empirical data. The main focus is the behavior of the so-called “plug-in” estimator, which is simply the rate-distortion function of the empirical distribution of the observed data. Sufficient conditions are given for its consistency, and examples are provided to demonstrate that in certain cases it fails to converge to the true rate-distortion function. The analysis of the performance of the plug-in estimator is somewhat surprisingly intricate, even for stationary memoryless sources; the underlying mathematical problem is closely related to the classical problem of establishing the consistency of maximum likelihood estimators. General consistency results are given for the plug-in estimator applied to a broad class of sources, including all stationary and ergodic ones. A more general class of estimation problems is also considered, arising in the context of lossy data compression when the allowed class of coding distributions is restricted; analogous results are developed for the plug-in estimator in that case. Finally, consistency theorems are formulated for modified (e.g., penalized) versions of the plug-in estimator, and for estimating the optimal reproduction distribution.
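For a finite-alphabet source the plug-in estimator is concrete: form the empirical distribution of the sample, then evaluate its rate-distortion function, for instance with the classical Blahut-Arimoto algorithm. The sketch below does this for Hamming distortion; the slope parameter s < 0 traces out (distortion, rate) points on the estimated curve. This is a generic illustration of the plug-in idea with names of our choosing, not the paper's analysis.

```python
import math
from collections import Counter

def plugin_rd_point(samples, alphabet, s, iters=200):
    """Blahut-Arimoto run on the empirical distribution of `samples`,
    with Hamming distortion. The slope parameter s < 0 selects one
    (distortion, rate) point; rate is returned in bits."""
    cnt = Counter(samples)
    n, k = len(samples), len(alphabet)
    p = [cnt[a] / n for a in alphabet]  # empirical (plug-in) source law
    d = [[0.0 if a == b else 1.0 for b in alphabet] for a in alphabet]
    q = [1.0 / k] * k                   # reproduction distribution
    for _ in range(iters):
        # p(y|x) proportional to q(y) * exp(s * d(x, y))
        cond = []
        for i in range(k):
            w = [q[j] * math.exp(s * d[i][j]) for j in range(k)]
            z = sum(w)
            cond.append([wi / z for wi in w])
        q = [sum(p[i] * cond[i][j] for i in range(k)) for j in range(k)]
    D = sum(p[i] * cond[i][j] * d[i][j] for i in range(k) for j in range(k))
    R = sum(p[i] * cond[i][j] * math.log2(cond[i][j] / q[j])
            for i in range(k) for j in range(k) if cond[i][j] > 0)
    return D, R
```

For a balanced binary sample, a very steep slope should return a point near (D, R) = (0, 1 bit), matching R(D) = 1 - h(D) at D = 0 for the binary symmetric source.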