Results 1-10 of 78
Regression by dependence minimization and its application to causal inference in additive noise models
2009
Identifiability of causal graphs using functional models
In UAI, 2011
Cited by 22 (7 self)
This work addresses the following question: Under what assumptions on the data generating process can one infer the causal graph from the joint distribution? The approach taken by conditional independence-based causal discovery methods is based on two assumptions: the Markov condition and faithfulness. It has been shown that under these assumptions the causal graph can be identified up to Markov equivalence (some arrows remain undirected) using methods like the PC algorithm. In this work we propose an alternative by defining Identifiable Functional Model Classes (IFMOCs). As our main theorem we prove that if the data generating process belongs to an IFMOC, one can identify the complete causal graph. To the best of our knowledge this is the first identifiability result of this kind that is not limited to linear functional relationships. We discuss how the IFMOC assumption and the Markov and faithfulness assumptions relate to each other and explain why we believe that the IFMOC assumption can be tested more easily on given data. We further provide a practical algorithm that recovers the causal graph from finitely many data; experiments on simulated data support the theoretical findings.
DirectLiNGAM: A direct method for learning a linear non-Gaussian structural equation model
J. of Machine Learning Research
Identifiability of Gaussian structural equation models with same error variances. Available at arXiv:1205.2536
2012
Cited by 13 (3 self)
We consider structural equation models in which variables can be written as a function of their parents and noise terms, which are assumed to be jointly independent. Corresponding to each structural equation model, there is a directed acyclic graph describing the relationships between the variables. In Gaussian structural equation models with linear functions, the graph can be identified from the joint distribution only up to Markov equivalence classes, assuming faithfulness. In this work, we prove full identifiability if all noise variables have the same variances: the directed acyclic graph can be recovered from the joint Gaussian distribution. Our result has direct implications for causal inference: if the data follow a Gaussian structural equation model with equal error variances and assuming that all variables are observed, the causal structure can be inferred from observational data only. We propose a statistical method and an algorithm that exploit our theoretical findings.
The hidden life of latent variables: Bayesian learning with mixed graph models
2008
Cited by 13 (4 self)
Directed acyclic graphs (DAGs) have been widely used as a representation of conditional independence in machine learning and statistics. Moreover, hidden or latent variables are often an important component of graphical models. However, DAG models suffer from an important limitation: the family of DAGs is not closed under marginalization of hidden variables. This means that in general we cannot use a DAG to represent the independencies over a subset of variables in a larger DAG. Directed mixed graphs (DMGs) are a representation that includes DAGs as a special case, and overcomes this limitation. This paper introduces algorithms for performing Bayesian inference in Gaussian and probit DMG models. An important requirement for inference is the characterization of the distribution over parameters of the models. We introduce a new distribution for covariance matrices of Gaussian DMGs. We discuss and illustrate how several Bayesian machine learning tasks can benefit from the principle presented here: the power to model dependencies that are generated from hidden variables, but without necessarily modelling such variables explicitly.
Nonlinear directed acyclic structure learning with weakly additive noise models
2009
Dependence minimizing regression with model selection for nonlinear causal inference under non-Gaussian noise
Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (AAAI2010), 2010
Cited by 12 (8 self)
The discovery of nonlinear causal relationships under additive non-Gaussian noise models has attracted considerable attention recently because of the high flexibility of these models. In this paper, we propose a novel causal inference algorithm called least-squares independence regression (LSIR). LSIR learns the additive noise model through minimization of an estimator of the squared-loss mutual information between inputs and residuals. A notable advantage of LSIR over existing approaches is that tuning parameters such as the kernel width and the regularization parameter can be naturally optimized by cross-validation, allowing us to avoid overfitting in a data-dependent fashion. Through experiments with real-world datasets, we show that LSIR compares favorably with the state-of-the-art causal inference method.
Causal Inference on Discrete Data using Additive Noise Models
Cited by 12 (4 self)
Inferring the causal structure of a set of random variables from a finite sample of the joint distribution is an important problem in science. The case of two random variables is particularly challenging since no (conditional) independences can be exploited. Recent methods that are based on additive noise models suggest the following principle: Whenever the joint distribution P(X,Y) admits such a model in one direction, e.g. Y = f(X) + N, N ⊥ X, but does not admit the reversed model X = g(Y) + Ñ, Ñ ⊥ Y, one infers the former direction to be causal (i.e. X → Y). Up to now these approaches only deal with continuous variables. In many situations, however, the variables of interest are discrete or even have only finitely many states. In this work we extend the notion of additive noise models to these cases. We prove that it almost never occurs that additive noise models can be fit in both directions. We further propose an efficient algorithm that is able to perform this way of causal inference on finite samples of discrete variables. We show that the algorithm works both on synthetic and real data sets.
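The principle stated in this abstract can be illustrated with a minimal sketch on synthetic integer data. This is not the algorithm proposed in the paper: the regression rule (taking the most frequent effect value for each cause value), the use of empirical mutual information as the dependence score, and all function names (`mutual_information`, `anm_residual_dependence`, `infer_direction`) are assumptions made here for illustration only.

```python
import numpy as np


def mutual_information(a, b):
    """Empirical mutual information (in nats) between two integer arrays."""
    _, a_idx = np.unique(a, return_inverse=True)
    _, b_idx = np.unique(b, return_inverse=True)
    joint = np.zeros((a_idx.max() + 1, b_idx.max() + 1))
    np.add.at(joint, (a_idx, b_idx), 1.0)  # contingency table of counts
    joint /= joint.sum()
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float((joint[mask] * np.log(joint[mask] / (pa @ pb)[mask])).sum())


def anm_residual_dependence(cause, effect):
    """Fit effect = f(cause) + N and return the dependence between N and cause.

    Assumption: f(x) is taken to be the most frequent effect value observed
    for each cause value x (a crude stand-in for a proper regression step).
    """
    f = {}
    for x in np.unique(cause):
        values, counts = np.unique(effect[cause == x], return_counts=True)
        f[x] = values[np.argmax(counts)]
    residual = effect - np.array([f[x] for x in cause])
    return mutual_information(residual, cause)


def infer_direction(x, y):
    """Prefer the direction whose residuals depend less on the input."""
    forward = anm_residual_dependence(x, y)   # model Y = f(X) + N
    backward = anm_residual_dependence(y, x)  # model X = g(Y) + N~
    label = "X -> Y" if forward < backward else "Y -> X"
    return label, forward, backward


# Synthetic data generated as Y = X^2 + N with N independent of X.
rng = np.random.default_rng(0)
x = rng.integers(-3, 4, size=5000)
noise = rng.choice([-1, 0, 1], size=5000, p=[0.2, 0.6, 0.2])
y = x ** 2 + noise

direction, forward_score, backward_score = infer_direction(x, y)
print(direction, forward_score, backward_score)
```

In the forward direction the residuals are the independent noise itself, so the score stays near zero; in the backward direction the residuals still carry information about Y, so the score is clearly larger. A practical method would replace the raw score comparison with a statistical independence test.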
Distinguishing between cause and effect
2008
Cited by 10 (8 self)
We describe eight data sets that together formed the CauseEffectPairs task in the Causality Challenge #2: PotLuck competition. Each set consists of a sample of a pair of statistically dependent random variables. One variable is known to cause the other one, but this information was hidden from the participants; the task was to identify which of the two variables was the cause and which one the effect, based upon the observed sample. The data sets were chosen such that we expect common agreement on the ground truth. Even though part of the statistical dependences may also be due to hidden common causes, common sense tells us that there is a significant cause-effect relation between the two variables in each pair. We also present baseline results using three different causal inference methods.
Independent Component Analysis: Recent Advances
Cited by 9 (0 self)
Independent component analysis is a probabilistic method for learning a linear transform of a random vector. The goal is to find components which are maximally independent and non-Gaussian (non-normal). Its fundamental difference to classical multivariate statistical methods is in the assumption of non-Gaussianity, which enables the identification of original, underlying components, in contrast to classical methods. The basic theory of ICA was mainly developed in the 1990s and summarized, for example, in our monograph in 2001. Here, we provide an overview of some recent developments in the theory since the year 2000. The main topics are: analysis of causal relations, testing independent components, analysing multiple data sets (three-way data), modelling dependencies between the components, and improved methods for estimating the basic model. Key words: independent component analysis, blind source separation, non-Gaussianity, causal analysis.