Results 1–10 of 33
Nonparametric Divergence Estimation with Applications to Machine Learning on Distributions
Abstract

Cited by 24 (12 self)
Low-dimensional embedding, manifold learning, clustering, classification, and anomaly detection are among the most important problems in machine learning. The existing methods usually consider the case when each instance has a fixed, finite-dimensional feature representation. Here we consider a different setting. We assume that each instance corresponds to a continuous probability distribution. These distributions are unknown, but we are given some i.i.d. samples from each distribution. Our goal is to estimate the distances between these distributions and use these distances to perform low-dimensional embedding, clustering/classification, or anomaly detection for the distributions. We present estimation algorithms, describe how to apply them for machine learning tasks on distributions, and show empirical results on synthetic data, real-world images, and astronomical data sets.
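The pipeline this abstract describes — estimate divergences directly from the i.i.d. samples, then treat them as pairwise distances for clustering or embedding — can be sketched with a simple k-NN Kullback–Leibler estimator. This is only an illustration in the spirit of the paper, not its exact estimator: the function name `knn_kl` and the four toy sample sets are assumptions of the sketch.

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_kl(x, y, k=3):
    """k-NN estimate of KL(P || Q) from samples x ~ P and y ~ Q (density-estimation-free)."""
    n, d = x.shape
    m = y.shape[0]
    rho = cKDTree(x).query(x, k + 1)[0][:, -1]  # k-th NN within x (index 0 is the point itself)
    nu = cKDTree(y).query(x, k)[0][:, -1]       # k-th NN of each x point inside y
    return d * np.mean(np.log(nu / rho)) + np.log(m / (n - 1))

rng = np.random.default_rng(0)
# four "instances", each an i.i.d. sample from an unknown distribution
groups = [rng.normal(0.0, 1.0, (500, 1)), rng.normal(0.0, 1.0, (500, 1)),
          rng.normal(4.0, 1.0, (500, 1)), rng.normal(4.0, 1.0, (500, 1))]

# symmetrized divergence matrix, usable as input to embedding or clustering
D = np.zeros((4, 4))
for i in range(4):
    for j in range(i + 1, 4):
        D[i, j] = D[j, i] = knn_kl(groups[i], groups[j]) + knn_kl(groups[j], groups[i])

within = max(D[0, 1], D[2, 3])                    # pairs drawn from the same distribution
between = min(D[0, 2], D[0, 3], D[1, 2], D[1, 3])  # pairs from different distributions
```

Any distance-based method (spectral clustering, MDS, nearest-neighbor classification) can then consume `D` directly.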
Copula-based kernel dependency measures
In ICML, 2012
Abstract

Cited by 12 (1 self)
The paper presents a new copula-based method for measuring dependence between random variables. Our approach extends the Maximum Mean Discrepancy to the copula of the joint distribution. We prove that this approach has several advantageous properties. Similarly to Shannon mutual information, the proposed dependence measure is invariant to any strictly increasing transformation of the marginal variables. This is important in many applications, for example in feature selection. The estimator is consistent, robust to outliers, and uses rank statistics only. We derive upper bounds on the convergence rate and propose independence tests as well. We illustrate the theoretical contributions through a series of experiments in feature selection and low-dimensional embedding of distributions.
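The core construction — rank-transform each marginal onto (0,1) to obtain the empirical copula, then compare it with the independence copula via MMD — can be sketched as follows. This is a minimal illustration with an arbitrary Gaussian kernel bandwidth, not the paper's exact estimator; `ranks01` and the toy data are assumptions of the sketch.

```python
import numpy as np

def ranks01(x):
    """Map each marginal to (0,1) by its empirical CDF (the empirical copula transform)."""
    n = x.shape[0]
    return (np.argsort(np.argsort(x, axis=0), axis=0) + 1) / (n + 1)

def mmd2(a, b, sigma=0.2):
    """Biased (V-statistic) squared MMD with a Gaussian kernel."""
    def gram(u, v):
        d2 = ((u[:, None, :] - v[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return gram(a, a).mean() + gram(b, b).mean() - 2 * gram(a, b).mean()

rng = np.random.default_rng(1)
x = rng.normal(size=(300, 1))
dep = np.hstack([x, x + 0.1 * rng.normal(size=(300, 1))])  # strongly dependent pair
indep = rng.normal(size=(300, 2))                          # independent pair

# compare each empirical copula against the independence copula (uniform on the unit square)
u_ref = rng.uniform(size=(300, 2))
score_dep = mmd2(ranks01(dep), u_ref)
score_indep = mmd2(ranks01(indep), u_ref)
```

Because only ranks enter the statistic, applying any strictly increasing transformation to a marginal (e.g. `np.exp`) leaves the score unchanged, which is the invariance property the abstract highlights.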
On the Estimation of α-Divergences
Abstract

Cited by 8 (4 self)
We propose new nonparametric, consistent Rényi-α and Tsallis-α divergence estimators for continuous distributions. Given two independent and identically distributed samples, a “naïve” approach would be to simply estimate the underlying densities and plug the estimated densities into the corresponding formulas. Our proposed estimators, in contrast, avoid density estimation completely, estimating the divergences directly using only simple k-nearest-neighbor statistics. We are nonetheless able to prove that the estimators are consistent under certain conditions. We also describe how to apply these estimators to mutual information and demonstrate their efficiency via numerical experiments.
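The direct, density-estimation-free idea can be sketched for the Rényi-α case: the ratio of k-NN ball volumes at each sample point estimates q/p, and averaging a power of that ratio estimates the integral ∫ p^α q^(1−α). The Gamma-function bias-correction constant below is the form used in this line of work, but treat it and the function name as assumptions of this sketch rather than the paper's verbatim estimator.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import gamma

def renyi_divergence(x, y, alpha=0.5, k=5):
    """Direct k-NN estimate of the Rényi-alpha divergence D_alpha(p || q)."""
    n, d = x.shape
    m = y.shape[0]
    rho = cKDTree(x).query(x, k + 1)[0][:, -1]    # k-th NN within x, skipping the point itself
    nu = cKDTree(y).query(x, k)[0][:, -1]         # k-th NN of each x point in y
    ratio = ((n - 1) * rho ** d) / (m * nu ** d)  # pointwise estimate of q(x)/p(x)
    # assumed bias-correction constant (valid for suitable k and alpha)
    b = gamma(k) ** 2 / (gamma(k - alpha + 1) * gamma(k + alpha - 1))
    return np.log(b * np.mean(ratio ** (1 - alpha))) / (alpha - 1)

rng = np.random.default_rng(1)
p1, p2 = rng.normal(0, 1, (1000, 1)), rng.normal(0, 1, (1000, 1))
q = rng.normal(3, 1, (1000, 1))
d_near = renyi_divergence(p1, p2)  # two samples from the same distribution
d_far = renyi_divergence(p1, q)    # samples from well-separated distributions
```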
Nonparametric Estimation of Conditional Information and Divergences
Abstract

Cited by 7 (1 self)
In this paper we propose new nonparametric estimators for a family of conditional mutual information and divergence measures. Our estimators are easy to compute; they use only simple k-nearest-neighbor statistics. We prove that the proposed conditional information and divergence estimators are consistent under certain conditions, and demonstrate their consistency and applicability through numerical experiments on both simulated and real data.
Nonparametric estimation of Rényi divergence and friends
In ICML, 2014
Abstract

Cited by 6 (4 self)
We consider nonparametric estimation of L2, Rényi-α, and Tsallis-α divergences between continuous distributions. Our approach is to construct estimators for particular integral functionals of two densities and translate them into divergence estimators. For the integral functionals, our estimators are based on corrections of a preliminary plug-in estimator. We show that these estimators achieve the parametric convergence rate of n^(-1/2) when the smoothness s of each density is at least d/4, where d is the dimension. We also derive minimax lower bounds for this problem, which confirm that s > d/4 is necessary to achieve the n^(-1/2) rate of convergence. We validate our theoretical guarantees with a number of simulations.
Generalized exponential concentration inequality for Rényi divergence estimation
In International Conference on Machine Learning, 2014
Abstract

Cited by 6 (2 self)
Estimating divergences in a consistent way is of great importance in many machine learning tasks. Although this is a fundamental problem in nonparametric statistics, to the best of our knowledge no finite-sample exponential-inequality convergence bound has been derived for any divergence estimator. The main contribution of our work is to provide such a bound for an estimator of the Rényi-α divergence for a smooth Hölder class of densities on the d-dimensional unit cube [0,1]^d. We also illustrate our theoretical results with a numerical experiment.
k-nearest neighbor estimation of entropies with confidence, 2011
Abstract

Cited by 5 (0 self)
We analyze a k-nearest neighbor (k-NN) class of plug-in estimators for estimating Shannon entropy and Rényi entropy. Based on the statistical properties of k-NN balls, we derive explicit rates for the bias and variance of these plug-in estimators in terms of the sample size, the dimension of the samples, and the underlying probability distribution. In addition, we establish a central limit theorem for the plug-in estimator that allows us to specify confidence intervals on the entropy functionals. As an application, we use our theory in anomaly detection problems to specify thresholds for achieving desired false alarm rates.
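A standard k-NN entropy estimator of the kind analyzed here is the Kozachenko–Leonenko construction: the distance to a point's k-th nearest neighbor yields a local density estimate, and a digamma correction removes the leading bias. This sketch is a generic version of that estimator, not the specific plug-in class of the paper.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gamma

def knn_entropy(x, k=3):
    """Kozachenko-Leonenko style k-NN estimate of Shannon entropy (in nats)."""
    n, d = x.shape
    eps = cKDTree(x).query(x, k + 1)[0][:, -1]   # k-th NN distance, skipping the point itself
    # log volume of the unit ball in R^d
    log_cd = (d / 2) * np.log(np.pi) - np.log(gamma(d / 2 + 1))
    return digamma(n) - digamma(k) + log_cd + d * np.mean(np.log(eps))

rng = np.random.default_rng(2)
h_gauss = knn_entropy(rng.normal(size=(3000, 1)))
# for reference, the true entropy of N(0,1) is 0.5 * log(2*pi*e), about 1.419 nats
```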
Exponential concentration for mutual information estimation with application to forests
In Advances in Neural Information Processing Systems, 2012
Abstract

Cited by 5 (2 self)
We prove a new exponential concentration inequality for a plug-in estimator of the Shannon mutual information. Previous results on mutual information estimation only bounded the expected error. The advantage of having the exponential inequality is that, combined with the union bound, we can guarantee accurate estimates of the mutual information for many pairs of random variables simultaneously. As an application, we show how to use such a result to optimally estimate the density function and graph of a distribution which is Markov to a forest graph.
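The plug-in idea itself is simple: estimate the joint density, derive the marginals from it, and plug them into the mutual information formula I = Σ p(x,y) log[p(x,y)/(p(x)p(y))]. The histogram version below is a minimal sketch of that recipe (the paper analyzes a kernel density plug-in, not histograms; `plugin_mi` and the bin count are assumptions of the sketch).

```python
import numpy as np

def plugin_mi(x, y, bins=12):
    """Plug-in estimate of Shannon mutual information from a 2-D histogram (in nats)."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()                       # joint density estimate on the grid
    px = pxy.sum(axis=1, keepdims=True)         # marginal of x
    py = pxy.sum(axis=0, keepdims=True)         # marginal of y
    mask = pxy > 0                              # 0 * log 0 is treated as 0
    return float((pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])).sum())

rng = np.random.default_rng(3)
x = rng.normal(size=5000)
mi_dep = plugin_mi(x, x + 0.3 * rng.normal(size=5000))  # strongly dependent pair
mi_indep = plugin_mi(x, rng.normal(size=5000))          # independent pair
```

With an exponential concentration bound for such an estimator, a union bound over all candidate edges makes the forest-structure application in the abstract possible: every pairwise mutual information is accurate simultaneously with high probability.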
Copulas in Machine Learning
Abstract

Cited by 4 (0 self)
Despite the overlapping goals of multivariate modeling and dependence identification, until recently the fields of machine learning in general, and probabilistic graphical models in particular, have largely ignored the framework of copulas. At the same time, the complementary strengths of the two fields suggest that a synergy would be highly fruitful. The purpose of this paper is to survey recent copula-based constructions in the field of machine learning, so as to provide a stepping stone for those interested in further exploring this emerging symbiotic line of research.
Estimation of nonlinear functionals of densities with confidence, 2012
Abstract

Cited by 4 (0 self)
This paper introduces a class of k-nearest neighbor (k-NN) estimators called bipartite plug-in (BPI) estimators for estimating integrals of nonlinear functions of a probability density, such as Shannon entropy and Rényi entropy. The density is assumed to be smooth, to have bounded support, and to be uniformly bounded from below on this set. Unlike previous k-NN estimators of nonlinear density functionals, the proposed estimator uses data-splitting and boundary correction to achieve lower mean square error. Specifically, we assume that T i.i.d. samples X_i ∈ R^d from the density are split into two pieces of cardinality M and N respectively, with the M samples used for computing a k-nearest-neighbor density estimate and the remaining N samples used for empirical estimation of the integral of the density functional. By studying the statistical properties of k-NN balls, explicit rates for the bias and variance of the BPI estimator are derived in terms of the sample size, the dimension of the samples, and the underlying probability distribution. Based on these results, it is possible to specify the optimal choice of the tuning parameters M/T and k for maximizing the rate of decrease of the mean square error (MSE). The resulting optimized BPI estimator converges faster and achieves lower mean squared error than previous k-NN entropy estimators. In addition, a central limit theorem is established for the BPI estimator that allows us to specify tight asymptotic confidence intervals.
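The data-splitting structure can be sketched for the Shannon entropy functional f(p) = -log p: build a k-NN density estimate from the first M samples, then average f of that estimate over the remaining N samples. This sketch omits the paper's boundary correction entirely, and `est_part`/`eval_part` and the toy 1-D Gaussian are assumptions of the illustration.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(4)
k, M, N = 10, 1000, 1000
sample = rng.normal(size=(M + N, 1))
est_part, eval_part = sample[:M], sample[M:]   # the split: density piece vs. evaluation piece

# k-NN density estimate built from the first piece, evaluated on the second
d = sample.shape[1]
eps = cKDTree(est_part).query(eval_part, k)[0][:, -1]  # k-th NN distance into est_part
c_d = 2.0                                              # volume of the unit ball in R^1
p_hat = k / (M * c_d * eps ** d)

# empirical average of the functional f(p) = -log p gives a Shannon entropy estimate
h_split = float(np.mean(-np.log(p_hat)))
# for reference, the true entropy of N(0,1) is about 1.419 nats
```

Because the evaluation points are independent of the density estimate, the two sources of error decouple, which is what makes the bias/variance analysis and the M/T, k tuning in the abstract tractable.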