Results 1–10 of 37
Nonlinear causal discovery with additive noise models
Cited by 35 (16 self)
The discovery of causal relationships between a set of observed variables is a fundamental problem in science. For continuous-valued data linear acyclic causal models with additive noise are often used because these models are well understood and there are well-known methods to fit them to data. In reality, of course, many causal relationships are more or less nonlinear, raising some doubts as to the applicability and usefulness of purely linear methods. In this contribution we show that in fact the basic linear framework can be generalized to nonlinear models. In this extended framework, nonlinearities in the data-generating process are in fact a blessing rather than a curse, as they typically provide information on the underlying causal system and allow more aspects of the true data-generating mechanisms to be identified. In addition to theoretical results we show simulations and some simple real data experiments illustrating the identification power provided by nonlinearities.
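As a rough illustration of the model class this abstract describes, the generalization from linear to nonlinear additive noise models can be written as follows. The notation is assumed for illustration and is not quoted from the paper.

```latex
% Assumed notation: pa(i) = parents of x_i in the causal DAG, n_i = noise terms.
\[
  x_i = f_i\bigl(x_{\mathrm{pa}(i)}\bigr) + n_i, \qquad
  n_1, \dots, n_d \ \text{jointly independent}.
\]
% Two-variable case, with x the cause of y:
\[
  y = f(x) + n, \qquad n \perp\!\!\!\perp x .
\]
% The linear framework is the special case f(x) = b\,x.
```

In the two-variable case, the direction becomes identifiable when a model of this form fits in one direction (residuals independent of the input) but not in the other, which is the sense in which nonlinearity can help.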
Temporal Causal Modeling with Graphical Granger Methods
In Proceedings of the 13th Int. Conference on Knowledge Discovery and Data Mining, 66–75: Association for Computing Machinery, 2007
Cited by 22 (3 self)
The need for mining causality, beyond mere statistical correlations, for real world problems has been recognized widely. Many of these applications naturally involve temporal data, which raises the challenge of how best to leverage the temporal information for causal modeling. Recently graphical modeling with the concept of “Granger causality”, based on the intuition that a cause helps predict its effects in the future, has gained attention in many domains involving time series data analysis. With the surge of interest in model selection methodologies for regression, such as the Lasso, as practical alternatives to solving structural learning of graphical models, the question arises whether and how to combine these two notions into a practically viable approach for temporal causal modeling. In this paper, we examine a host of related
Learning Latent Tree Graphical Models
J. of Machine Learning Research, 2011
Cited by 19 (6 self)
We study the problem of learning a latent tree graphical model where samples are available only from a subset of variables. We propose two consistent and computationally efficient algorithms for learning minimal latent trees, that is, trees without any redundant hidden nodes. Unlike many existing methods, the observed nodes (or variables) are not constrained to be leaf nodes. Our algorithms can be applied to both discrete and Gaussian random variables and our learned models are such that all the observed and latent variables have the same domain (state space). Our first algorithm, recursive grouping, builds the latent tree recursively by identifying sibling groups using so-called information distances. One of the main contributions of this work is our second algorithm, which we refer to as CLGrouping. CLGrouping starts with a preprocessing procedure in which a tree over the observed variables is constructed. This global step groups the observed nodes that are likely to be close to each other in the true latent tree, thereby guiding subsequent recursive grouping (or equivalent procedures such as neighbor-joining) on much smaller subsets of variables. This results in more accurate and efficient learning of latent trees. We also present regularized versions of our algorithms that learn latent tree approximations of arbitrary distributions. We compare
The hidden life of latent variables: Bayesian learning with mixed graph models
2008
Cited by 7 (3 self)
Directed acyclic graphs (DAGs) have been widely used as a representation of conditional independence in machine learning and statistics. Moreover, hidden or latent variables are often an important component of graphical models. However, DAG models suffer from an important limitation: the family of DAGs is not closed under marginalization of hidden variables. This means that in general we cannot use a DAG to represent the independencies over a subset of variables in a larger DAG. Directed mixed graphs (DMGs) are a representation that includes DAGs as a special case, and overcomes this limitation. This paper introduces algorithms for performing Bayesian inference in Gaussian and probit DMG models. An important requirement for inference is the characterization of the distribution over parameters of the models. We introduce a new distribution for covariance matrices of Gaussian DMGs. We discuss and illustrate how several Bayesian machine learning tasks can benefit from the principle presented here: the power to model dependencies that are generated from hidden variables, but without necessarily modelling such variables explicitly.
Estimation of causal effects using linear non-Gaussian causal models with hidden variables
Bayesian learning of measurement and structural models
23rd International Conference on Machine Learning, 2006
Cited by 5 (3 self)
We present a Bayesian search algorithm for learning the structure of latent variable models of continuous variables. We stress the importance of applying search operators designed especially for the parametric family used in our models. This is performed by searching for subsets of the observed variables whose covariance matrix can be represented as a sum of a matrix of low rank and a diagonal matrix of residuals. The resulting search procedure is relatively efficient, since the main search operator has a branch factor that grows linearly with the number of variables. The resulting models are often simpler and give a better fit than models based on generalizations of factor analysis or those derived from standard hill-climbing methods.
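The "low rank plus diagonal" covariance structure mentioned in this abstract can be made concrete with the standard factor-analytic decomposition. This is a sketch in assumed notation (the symbols are ours and the paper's parameterization may differ).

```latex
% Assumed notation: p observed variables, k < p latent factors,
% \Lambda = loading matrix, \Psi = diagonal residual covariance.
\[
  \Sigma = \Lambda \Lambda^{\top} + \Psi, \qquad
  \Lambda \in \mathbb{R}^{p \times k}, \quad
  \Psi = \operatorname{diag}(\psi_1, \dots, \psi_p).
\]
% One-factor example (k = 1): off-diagonal entries are
% \sigma_{ij} = \lambda_i \lambda_j for i \neq j, so any four such
% variables satisfy
\[
  \sigma_{12}\,\sigma_{34} = \sigma_{13}\,\sigma_{24} = \sigma_{14}\,\sigma_{23}
  = \lambda_1 \lambda_2 \lambda_3 \lambda_4 .
\]
```

Constraints of this kind are one way a subset of observed variables can be checked for a shared low-rank structure.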
Search for additive nonlinear time series causal models
JMLR, 2008
Cited by 3 (1 self)
Pointwise consistent, feasible procedures for estimating contemporaneous linear causal structure from time series data have been developed using multiple conditional independence tests, but no such procedures are available for nonlinear systems. We describe a feasible procedure for learning a class of nonlinear time series structures, which we call additive nonlinear time series. We show that for data generated from stationary models of this type, two classes of conditional independence relations among time series variables and their lags can be tested efficiently and consistently using tests based on additive model regression. Combining results of statistical tests for these two classes of conditional independence relations and the temporal structure of time series data, a new consistent model specification procedure is able to extract relatively detailed causal information. We investigate the finite sample behavior of the procedure through simulation, and illustrate the application of this method through analysis of the possible causal connections among four ocean indices. Several variants of the procedure are also discussed.
Joint estimation of linear non-Gaussian acyclic models
Neurocomputing, 2012
"... for latent factors ..."
Identifying confounders using additive noise models
2009
Cited by 2 (2 self)
We propose a method for inferring the existence of a latent common cause (“confounder”) of two observed random variables. The method assumes that the two effects of the confounder are (possibly nonlinear) functions of the confounder plus independent, additive noise. We discuss under which conditions the model is identifiable (up to an arbitrary reparameterization of the confounder) from the joint distribution of the effects. We state and prove a theoretical result that provides evidence for the conjecture that the model is generically identifiable under suitable technical conditions. In addition, we propose a practical method to estimate the confounder from a finite i.i.d. sample of the effects and illustrate that the method works well on both simulated and real-world data.
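The model assumed in this abstract, and the reason identifiability can only hold up to a reparameterization of the confounder, can be sketched as follows. The symbols are assumed for illustration and are not quoted from the paper.

```latex
% Assumed notation: T = latent confounder, X and Y = observed effects,
% u and v = (possibly nonlinear) functions, N_X and N_Y = noise terms.
\[
  X = u(T) + N_X, \qquad Y = v(T) + N_Y, \qquad
  T,\ N_X,\ N_Y \ \text{jointly independent}.
\]
% Reparameterization: for any invertible h, setting
\[
  T' = h(T), \qquad u' = u \circ h^{-1}, \qquad v' = v \circ h^{-1}
\]
% gives u'(T') = u(T) and v'(T') = v(T), hence the same joint
% distribution of (X, Y).
```

Because every such choice of h yields the same observable distribution, the scale and labeling of the confounder cannot be recovered from the effects alone.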
Automated search for causal relations: Theory and practice
Heuristics, Probability and Causality: A Tribute to Judea Pearl, 2010
Cited by 2 (0 self)
The rapid spread of interest in the last two decades in principled methods of search or estimation of causal relations has been driven in part by technological developments, especially the changing nature of modern data collection and storage techniques, and the increases in the speed and storage capacities of computers. Statistics books from 30 years