Results 1–4 of 4
Consistent Nonparametric Tests of Independence
, 2009
Abstract

Cited by 6 (3 self)
Three simple and explicit procedures for testing the independence of two multidimensional random variables are described. Two of the associated test statistics (L1, log-likelihood) are defined when the empirical distribution of the variables is restricted to finite partitions. A third test statistic is defined as a kernel-based independence measure. Two kinds of tests are provided. Distribution-free strong consistent tests are derived on the basis of large deviation bounds on the test statistics: these tests make almost surely no Type I or Type II error after a random sample size. Asymptotically α-level tests are obtained from the limiting distribution of the test statistics. For the latter tests, the Type I error converges to a fixed nonzero value α, and the Type II error drops to zero, for increasing sample size. All tests reject the null hypothesis of independence if the test statistics become large. The performance of the tests is evaluated experimentally on benchmark data.
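The partition-based L1 statistic described above can be sketched as follows. This is an illustrative one-dimensional version only (the paper treats multidimensional variables and chooses its partitions and rejection thresholds more carefully); the equal-width binning and the function name are assumptions for the example.

```python
import numpy as np

def l1_independence_statistic(x, y, bins=4):
    """L1 independence statistic on a finite partition: the sum over
    partition cells of |joint empirical probability - product of the
    empirical marginals|. Large values suggest dependence."""
    # One possible partition: equal-width bins over each variable's range.
    xi = np.digitize(x, np.linspace(x.min(), x.max(), bins + 1)[1:-1])
    yi = np.digitize(y, np.linspace(y.min(), y.max(), bins + 1)[1:-1])
    n = len(x)
    joint = np.zeros((bins, bins))
    for a, b in zip(xi, yi):
        joint[a, b] += 1.0 / n          # joint empirical distribution
    px = joint.sum(axis=1)              # empirical marginal of x
    py = joint.sum(axis=0)              # empirical marginal of y
    return np.abs(joint - np.outer(px, py)).sum()

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
s_ind = l1_independence_statistic(x, rng.normal(size=2000))          # small
s_dep = l1_independence_statistic(x, x + 0.1 * rng.normal(size=2000))  # large
```

Both test variants in the abstract then reduce to comparing such a statistic against a threshold: a large-deviation bound for the strong consistent test, or a limiting-distribution quantile for the asymptotic α-level test.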
On Integral Probability Metrics, φ-Divergences and Binary Classification
, 2009
Abstract

Cited by 3 (2 self)
A class of distance measures on probabilities — the integral probability metrics (IPMs) — is addressed: these include the Wasserstein distance, Dudley metric, and Maximum Mean Discrepancy. IPMs have thus far mostly been used in more abstract settings, for instance as theoretical tools in mass transportation problems, and in metrizing the weak topology on the set of all Borel probability measures defined on a metric space. Practical applications of IPMs are less common, with some exceptions in the kernel machines literature. The present work contributes a number of novel properties of IPMs, which should contribute to making IPMs more widely used in practice, for instance in areas where φ-divergences are currently popular. First, to understand the relation between IPMs and φ-divergences, the necessary and sufficient conditions under which these classes intersect are derived: the total variation distance is shown to be the only nontrivial φ-divergence that is also an IPM. This shows that IPMs are essentially different from φ-divergences. Second, empirical estimates of several IPMs from finite i.i.d. samples are obtained, and their consistency and convergence rates are analyzed. These estimators are shown to be easily computable, with better rates of convergence than estimators of φ-divergences. Third, a novel interpretation is provided for IPMs by relating them to binary classification, where it is shown that the IPM between class-conditional distributions is the negative of the optimal risk associated with a binary classifier. In addition, the smoothness of an appropriate binary classifier is proved to be inversely related to the distance between the class-conditional distributions, measured in terms of an IPM.
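Of the IPMs named in this abstract, the Maximum Mean Discrepancy is the one with the simplest plug-in estimator from finite i.i.d. samples. The sketch below shows the standard biased (V-statistic) estimate with a Gaussian kernel; the bandwidth choice and function names are assumptions for the example, and the paper itself analyzes estimators of several IPMs, not only this one.

```python
import numpy as np

def mmd_biased(X, Y, sigma=1.0):
    """Biased empirical estimate of the Maximum Mean Discrepancy (an IPM)
    between samples X (m x d) and Y (n x d), using a Gaussian kernel
    with bandwidth sigma."""
    def gram(A, B):
        # Pairwise squared Euclidean distances, then the Gaussian kernel.
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq / (2.0 * sigma ** 2))
    val = gram(X, X).mean() + gram(Y, Y).mean() - 2.0 * gram(X, Y).mean()
    return np.sqrt(max(val, 0.0))  # clip tiny negative rounding error

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
Y_same = rng.normal(size=(300, 2))        # same distribution: small MMD
Y_shift = rng.normal(size=(300, 2)) + 1.0  # shifted distribution: larger MMD
```

The estimate is computable in closed form from the kernel Gram matrices, which is the "easily computable" property the abstract contrasts with φ-divergence estimators.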
DISCUSSION OF: BROWNIAN DISTANCE COVARIANCE
Abstract
A dependence statistic, the Brownian Distance Covariance, has been proposed for use in dependence measurement and independence testing: we refer to this contribution henceforth as SR [we also note the earlier work on this topic of Székely, Rizzo and Bakirov (2007)]. Some advantages of the authors' approach are that the random variables X and Y being tested may have arbitrary dimension …
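The sample statistic under discussion can be written down directly: double-center the pairwise Euclidean distance matrices of the two samples and average their elementwise product, following Székely, Rizzo and Bakirov (2007). A minimal sketch (function names are assumptions for the example):

```python
import numpy as np

def distance_covariance(X, Y):
    """Sample distance covariance between samples X (n x p) and Y (n x q).
    In the population limit it is zero if and only if X and Y are
    independent, for arbitrary dimensions p and q."""
    def centered_dist(Z):
        # Pairwise Euclidean distance matrix, double-centered:
        # subtract row and column means, add back the grand mean.
        D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
        return D - D.mean(axis=0) - D.mean(axis=1)[:, None] + D.mean()
    A, B = centered_dist(X), centered_dist(Y)
    return np.sqrt(max((A * B).mean(), 0.0))

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 1))
y_ind = rng.normal(size=(500, 1))              # independent of x
y_dep = x + 0.2 * rng.normal(size=(500, 1))    # strongly dependent on x
```

The arbitrary-dimension property the discussion highlights is visible in the code: only pairwise distances enter, so X and Y may live in spaces of different dimension.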
Max Planck Institutes Tübingen
, 2009
Abstract
In this paper, we present two classes of Bayesian approaches to the two-sample problem. Our first class of methods extends the Bayesian t-test to include all parametric models in the exponential family and their conjugate priors. Our second class of methods uses Dirichlet process mixtures (DPM) of such conjugate-exponential distributions as flexible nonparametric priors over the unknown distributions.
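The first class of methods can be illustrated with the simplest conjugate-exponential instance: a Gaussian likelihood with known variance and a conjugate Gaussian prior on the mean, where the marginal likelihoods, and hence the Bayes factor for "two means" versus "one shared mean", are available in closed form. This is a minimal sketch under those assumptions, not the paper's general exponential-family or DPM machinery; all names and hyperparameter values are illustrative.

```python
import numpy as np

def log_marginal(z, sigma=1.0, tau=1.0):
    """Log marginal likelihood of data z under N(mu, sigma^2) with the
    conjugate prior mu ~ N(0, tau^2); sigma is assumed known.
    Standard Gaussian conjugacy gives this closed form."""
    z = np.asarray(z, dtype=float)
    n = len(z)
    s2, t2 = sigma ** 2, tau ** 2
    return (-0.5 * n * np.log(2 * np.pi * s2)
            - 0.5 * np.log(1 + n * t2 / s2)
            - z.dot(z) / (2 * s2)
            + (z.sum() ** 2) * t2 / (2 * s2 * (s2 + n * t2)))

def log_bayes_factor(x, y, sigma=1.0, tau=1.0):
    """log Bayes factor of H1 (separate means for x and y) against
    H0 (one shared mean for the pooled sample)."""
    return (log_marginal(x, sigma, tau) + log_marginal(y, sigma, tau)
            - log_marginal(np.concatenate([x, y]), sigma, tau))

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 200)
y_diff = rng.normal(2.0, 1.0, 200)  # different mean: log BF strongly positive
```

Replacing the Gaussian with any exponential-family model and its conjugate prior keeps the marginal likelihoods in closed form, which is what makes the paper's first class of methods tractable.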