Results 1 
8 of
8
Measuring statistical dependence with HilbertSchmidt norms
 PROCEEDINGS ALGORITHMIC LEARNING THEORY
, 2005
"... We propose an independence criterion based on the eigenspectrum of covariance operators in reproducing kernel Hilbert spaces (RKHSs), consisting of an empirical estimate of the HilbertSchmidt norm of the crosscovariance operator (we term this a HilbertSchmidt Independence Criterion, or HSIC). Th ..."
Abstract

Cited by 158 (44 self)
 Add to MetaCart
We propose an independence criterion based on the eigenspectrum of covariance operators in reproducing kernel Hilbert spaces (RKHSs), consisting of an empirical estimate of the HilbertSchmidt norm of the crosscovariance operator (we term this a HilbertSchmidt Independence Criterion, or HSIC). This approach has several advantages, compared with previous kernelbased independence criteria. First, the empirical estimate is simpler than any other kernel dependence test, and requires no userdefined regularisation. Second, there is a clearly defined population quantity which the empirical estimate approaches in the large sample limit, with exponential convergence guaranteed between the two: this ensures that independence tests based on HSIC do not suffer from slow learning rates. Finally, we show in the context of independent component analysis (ICA) that the performance of HSIC is competitive with that of previously published kernelbased criteria, and of other recently published ICA methods.
A Review of Kernel Methods in Machine Learning
, 2006
"... We review recent methods for learning with positive definite kernels. All these methods formulate learning and estimation problems as linear tasks in a reproducing kernel Hilbert space (RKHS) associated with a kernel. We cover a wide range of methods, ranging from simple classifiers to sophisticate ..."
Abstract

Cited by 95 (4 self)
 Add to MetaCart
We review recent methods for learning with positive definite kernels. All these methods formulate learning and estimation problems as linear tasks in a reproducing kernel Hilbert space (RKHS) associated with a kernel. We cover a wide range of methods, ranging from simple classifiers to sophisticated methods for estimation with structured data.
Kernel methods for measuring independence
 Journal of Machine Learning Research
, 2005
"... We introduce two new functionals, the constrained covariance and the kernel mutual information, to measure the degree of independence of random variables. These quantities are both based on the covariance between functions of the random variables in reproducing kernel Hilbert spaces (RKHSs). We prov ..."
Abstract

Cited by 58 (19 self)
 Add to MetaCart
We introduce two new functionals, the constrained covariance and the kernel mutual information, to measure the degree of independence of random variables. These quantities are both based on the covariance between functions of the random variables in reproducing kernel Hilbert spaces (RKHSs). We prove that when the RKHSs are universal, both functionals are zero if and only if the random variables are pairwise independent. We also show that the kernel mutual information is an upper bound near independence on the Parzen window estimate of the mutual information. Analogous results apply for two correlationbased dependence functionals introduced earlier: we show the kernel canonical correlation and the kernel generalised variance to be independence measures for universal kernels, and prove the latter to be an upper bound on the mutual information near independence. The performance of the kernel dependence functionals in measuring independence is verified in the context of independent component analysis.
Statistical consistency of kernel canonical correlation analysis
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2007
"... While kernel canonical correlation analysis (CCA) has been applied in many contexts, the convergence of finite sample estimates of the associated functions to their population counterparts has not yet been established. This paper gives a mathematical proof of the statistical convergence of kernel CC ..."
Abstract

Cited by 27 (12 self)
 Add to MetaCart
While kernel canonical correlation analysis (CCA) has been applied in many contexts, the convergence of finite sample estimates of the associated functions to their population counterparts has not yet been established. This paper gives a mathematical proof of the statistical convergence of kernel CCA, providing a theoretical justification for the method. The proof uses covariance operators defined on reproducing kernel Hilbert spaces, and analyzes the convergence of their empirical estimates of finite rank to their population counterparts, which can have infinite rank. The result also gives a sufficient condition for convergence on the regularization coefficient involved in kernel CCA: this should decrease as n −1/3, where n is the number of data.
Article Measures of Causality in Complex Datasets with Application to Financial Data
, 2014
"... entropy ..."
(Show Context)
Abstract Towards scalable and data efficient learning of Markov boundaries
"... We propose algorithms for learning Markov boundaries from data without having to learn a Bayesian network first. We study their correctness, scalability and data efficiency. The last two properties are important because we aim to apply the algorithms to identify the minimal set of features that is n ..."
Abstract
 Add to MetaCart
(Show Context)
We propose algorithms for learning Markov boundaries from data without having to learn a Bayesian network first. We study their correctness, scalability and data efficiency. The last two properties are important because we aim to apply the algorithms to identify the minimal set of features that is needed for probabilistic classification in databases with thousands of features but few instances, e.g. gene expression databases. We evaluate the algorithms on synthetic and real databases, including one with 139,351 features.