Divergence measures based on the Shannon entropy
- IEEE Transactions on Information Theory
, 1991
Cited by 298 (0 self)
Abstract: A new class of information-theoretic divergence measures based on the Shannon entropy is introduced. Unlike the well-known Kullback divergences, the new measures do not require the condition of absolute continuity to be satisfied by the probability distributions involved. More importantly, their close relationship with the variational distance and the probability of misclassification error are established in terms of bounds. These bounds are crucial in many applications of divergence measures. The new measures are also well characterized by the properties of nonnegativity, finiteness, semiboundedness, and boundedness. Index Terms: divergence, dissimilarity measure, discrimination information, entropy, probability of error bounds.
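The contrast drawn in this abstract can be illustrated with a small sketch. The abstract does not name a specific measure; the example below assumes the Jensen-Shannon divergence, the best-known member of this entropy-based class, and applies it to discrete distributions. The function names and sample distributions are hypothetical, chosen only for illustration.

```python
import math

def entropy(p):
    """Shannon entropy in bits; zero-probability terms contribute 0."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: H((p+q)/2) - (H(p) + H(q)) / 2."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(m) - (entropy(p) + entropy(q)) / 2

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits.

    Returns infinity when q is 0 somewhere p is positive, i.e. when
    absolute continuity of p with respect to q fails.
    """
    total = 0.0
    for a, b in zip(p, q):
        if a > 0:
            if b == 0:
                return math.inf
            total += a * math.log2(a / b)
    return total

# Illustrative distributions with only partially overlapping support.
p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]

js_pq = js_divergence(p, q)               # finite
kl_pq = kl_divergence(p, q)               # infinite
js_disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])  # the base-2 upper bound
```

In this sketch the Kullback-Leibler value is infinite because the supports differ, while the Jensen-Shannon value stays finite and never exceeds 1 bit.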
Streaming and sublinear approximation of entropy and information distances
- In ACM-SIAM Symposium on Discrete Algorithms
, 2006
Cited by 33 (9 self)
In most algorithmic applications that compare two distributions, information-theoretic distances are more natural than standard ℓp norms. In this paper we design streaming and sublinear-time property testing algorithms for entropy and various information-theoretic distances. Batu et al. posed the problem of property testing with respect to the Jensen-Shannon distance. We present optimal algorithms for estimating bounded, symmetric f-divergences (including the Jensen-Shannon divergence and the Hellinger distance) between distributions in various property testing frameworks. Along the way, we close a (log n)/H gap between the upper and lower bounds for estimating entropy H, yielding an optimal algorithm over all values of the entropy. In a data stream setting (sublinear space), we give the first algorithm for estimating the entropy of a distribution. Our algorithm runs in polylogarithmic space and yields an asymptotic constant-factor approximation scheme. An integral part of the algorithm is an interesting use of an F0 (the number of distinct elements in a set) estimation algorithm; we also provide other results along the space/time/approximation tradeoff curve. Our results have interesting structural implications that connect sublinear-time and space-constrained algorithms. The mediating model is the random order streaming model, which assumes the input is a random permutation of a multiset and was first considered by Munro and Paterson in 1980. We show that any property testing algorithm in the combined oracle model for calculating permutation-invariant functions can be simulated in the random order model in a single pass. This addresses a question raised by Feigenbaum et al. regarding the relationship between property testing and stream algorithms. Further, we give a polylog-space PTAS for estimating the entropy of a one-pass random order stream. This bound cannot be achieved in the combined oracle (generalized property testing) model.
Estimating divergence functionals and the likelihood ratio by penalized convex risk minimization
- In Advances in Neural Information Processing Systems (NIPS)
, 2007
"... by convex risk minimization ..."
Probability of error, equivocation and the Chernoff bound
- IEEE Transactions on Information Theory
, 1970
Cited by 19 (0 self)
Abstract: Relationships between the probability of error, the equivocation, and the Chernoff bound are examined for the two-hypothesis decision problem. The effect of rejections on these bounds is derived. Finally, the results are extended to the case of any finite number of hypotheses.
Symmetrizing the Kullback-Leibler Distance
- IEEE Transactions on Information Theory
, 2000
Cited by 16 (0 self)
We define a new distance measure, the resistor-average distance, between two probability distributions that is closely related to the Kullback-Leibler distance. While the Kullback-Leibler distance is asymmetric in the two distributions, the resistor-average distance is not. It arises from geometric considerations similar to those used to derive the Chernoff distance. Determining its relation to well-known distance measures reveals a new way to depict how commonly used distance measures relate to each other. 1 Introduction. The Kullback-Leibler distance [15, 16] is perhaps the most frequently used information-theoretic "distance" measure from a viewpoint of theory. If $p_0$, $p_1$ are two probability densities, the Kullback-Leibler distance is defined to be $D(p_1 \| p_0) = \int p_1(x) \log \frac{p_1(x)}{p_0(x)} \, dx$ (1). In this paper, $\log(\cdot)$ has base two. The Kullback-Leibler distance is but one example of the Ali-Silvey class of information-theoretic distance measures [1], which are defined to ...
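A minimal sketch of the quantities named in this abstract, under stated assumptions: the integral in (1) is replaced by a sum over a finite alphabet, and the resistor-average distance is taken to combine the two directed Kullback-Leibler distances the way parallel resistors combine, 1/R = 1/D(p1||p0) + 1/D(p0||p1). That combination rule is not stated in the excerpt above and is supplied here as an assumption; all function and variable names are hypothetical.

```python
import math

def kl_bits(p1, p0):
    """Discrete analogue of equation (1): sum of p1 * log2(p1 / p0)."""
    total = 0.0
    for a, b in zip(p1, p0):
        if a > 0:
            if b == 0:
                return math.inf
            total += a * math.log2(a / b)
    return total

def resistor_average(p0, p1):
    """Assumed definition: harmonic-style combination of both directions."""
    d01 = kl_bits(p0, p1)
    d10 = kl_bits(p1, p0)
    if d01 == 0 or d10 == 0:
        return 0.0  # identical distributions
    inverse = 1.0 / d01 + 1.0 / d10  # 1/inf evaluates to 0.0
    return math.inf if inverse == 0 else 1.0 / inverse

# Illustrative two-outcome distributions.
p0 = [0.5, 0.5]
p1 = [0.9, 0.1]

d10 = kl_bits(p1, p0)  # D(p1 || p0)
d01 = kl_bits(p0, p1)  # D(p0 || p1), differs from d10
r = resistor_average(p0, p1)
```

Swapping the arguments of `kl_bits` changes the result, whereas `resistor_average` gives the same value in either order and, under the assumed definition, never exceeds the smaller of the two directed distances.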
Supervised Learning of Quantizer Codebooks by Information Loss Minimization
, 2007
Cited by 12 (0 self)
This paper proposes a technique for jointly quantizing continuous features and the posterior distributions of their class labels based on minimizing empirical information loss, such that the index K of the quantizer region to which a given feature X is assigned approximates a sufficient statistic for its class label Y. We derive an alternating minimization procedure for simultaneously learning codebooks in the Euclidean feature space and in the simplex of posterior class distributions. The resulting quantizer can be used to encode unlabeled points outside the training set and to predict their posterior class distributions, and has an elegant interpretation in terms of lossless source coding. The proposed method is extensively validated on synthetic and real datasets, and is applied to two diverse problems: learning discriminative visual vocabularies for bag-of-features image classification, and image segmentation.
Large Deviations of Divergence Measures on Partitions
, 2000
Cited by 11 (3 self)
We discuss Chernoff-type large deviation results for the total variation, the I-divergence errors, and the χ²-divergence errors on partitions. In contrast to the total variation and the I-divergence, the χ²-divergence has an unconventional large deviation rate. Applications to Bahadur efficiencies of goodness-of-fit tests based on these divergence measures for multivariate observations are given.
Toward a Theory of Information Processing
- IEEE Trans. Signal Processing
, 2002
Cited by 10 (5 self)
Information processing theory endeavors to quantify how well signals encode information and how well systems, by acting on signals, process information. We use information-theoretic distance measures, the Kullback-Leibler distance in particular, to quantify how well signals represent information. The ratio of distances between a system's output and input quantifies the system's information processing properties.
Target tracking using a joint acoustic video system
- Department of Electrical and Computer Engineering, University of Maryland, College
, 2007
Cited by 10 (3 self)
Abstract: In this paper, a multitarget tracking system for collocated video and acoustic sensors is presented. We formulate the tracking problem using a particle filter based on a state-space approach. We first discuss the acoustic state-space formulation whose observations use a sliding window of direction-of-arrival estimates. We then present the video state space that tracks a target’s position on the image plane based on online adaptive appearance models. For the joint operation of the filter, we combine the state vectors of the individual modalities and also introduce a time-delay variable to handle the acoustic-video data synchronization issue caused by acoustic propagation delays. A novel particle filter proposal strategy for joint state-space tracking is introduced, which places the random support of the joint filter where the final posterior is likely to lie. By using the Kullback-Leibler divergence measure, it is shown that the joint operation of the filter decreases the worst-case divergence of the individual modalities. The resulting joint tracking filter is quite robust against video and acoustic occlusions due to our proposal strategy. Computer simulations are presented with synthetic and field data to demonstrate the filter’s performance. Index Terms: acoustic tracking, multimodal data fusion, particle filtering, visual tracking.
Sided and symmetrized Bregman centroids
- IEEE Transactions on Information Theory
, 2009
Cited by 8 (4 self)
Abstract: In this paper, we generalize the notions of centroids (and barycenters) to the broad class of information-theoretic distortion measures called Bregman divergences. Bregman divergences form a rich and versatile family of distances that unifies quadratic Euclidean distances with various well-known statistical entropic measures. Since Bregman divergences other than the squared Euclidean distance are asymmetric, we consider the left-sided and right-sided centroids and the symmetrized centroids as minimizers of average Bregman distortions. We prove that all three centroids are unique and give closed-form solutions for the sided centroids, which are generalized means. Furthermore, we design a provably fast and efficient, arbitrarily close approximation algorithm for the symmetrized centroid based on its exact geometric characterization. The geometric approximation algorithm only requires walking along a geodesic linking the two left/right-sided centroids. We report on our implementation for computing entropic centers of image histogram clusters and entropic centers of multivariate normal distributions, which are useful operations for processing multimedia information and retrieval. These experiments illustrate that our generic methods compare favorably with former limited ad hoc methods. Index Terms: Bregman divergence, Bregman information, Bregman power divergence, Burbea–Rao divergence, centroid,
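The closed-form sided centroids mentioned in this abstract can be sketched for one concrete case. The example assumes the generator F(x) = sum(x log x - x) on positive vectors, whose Bregman divergence is the extended Kullback-Leibler divergence, and assumes that "right-sided" means the centroid occupies the second argument of the divergence and "left-sided" the first. Under those assumptions the right-sided centroid is the arithmetic mean and the left-sided centroid is the coordinate-wise geometric mean. The names and sample points are hypothetical.

```python
import math

def ext_kl(p, q):
    """Extended KL divergence, the Bregman divergence of sum(x log x - x)."""
    return sum(a * math.log(a / b) - a + b for a, b in zip(p, q))

def right_centroid(points):
    """Minimizer over c of the average ext_kl(p_i, c): arithmetic mean."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def left_centroid(points):
    """Minimizer over c of the average ext_kl(c, p_i): geometric mean."""
    n = len(points)
    return [math.exp(sum(math.log(x) for x in col) / n) for col in zip(*points)]

def avg_divergence(points, c, side):
    """Average distortion with the centroid in the chosen argument."""
    if side == "right":
        return sum(ext_kl(p, c) for p in points) / len(points)
    return sum(ext_kl(c, p) for p in points) / len(points)

# Illustrative positive vectors (here, two-bin histograms).
points = [[0.2, 0.8], [0.5, 0.5], [0.9, 0.1]]
c_right = right_centroid(points)
c_left = left_centroid(points)
```

The two centroids differ because the divergence is asymmetric. The symmetrized centroid discussed in the abstract has no such closed form and is not covered by this sketch.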

