Results 1  10
of
44
A theory of causal learning in children: Causal maps and Bayes nets
 PSYCHOLOGICAL REVIEW
, 2004
"... The authors outline a cognitive and computational account of causal learning in children. They propose that children use specialized cognitive systems that allow them to recover an accurate “causal map ” of the world: an abstract, coherent, learned representation of the causal relations among events ..."
Abstract

Cited by 160 (33 self)
 Add to MetaCart
The authors outline a cognitive and computational account of causal learning in children. They propose that children use specialized cognitive systems that allow them to recover an accurate “causal map ” of the world: an abstract, coherent, learned representation of the causal relations among events. This kind of knowledge can be perspicuously understood in terms of the formalism of directed graphical causal models, or Bayes nets. Children’s causal learning and inference may involve computations similar to those for learning causal Bayes nets and for predicting with them. Experimental results suggest that 2to 4yearold children construct new causal maps and that their learning is consistent with the Bayes net formalism.
Learning Bayesian Networks from Data: An InformationTheory Based Approach
"... This paper provides algorithms that use an informationtheoretic analysis to learn Bayesian network structures from data. Based on our threephase learning framework, we develop efficient algorithms that can effectively learn Bayesian networks, requiring only polynomial numbers of conditional indepe ..."
Abstract

Cited by 93 (5 self)
 Add to MetaCart
This paper provides algorithms that use an informationtheoretic analysis to learn Bayesian network structures from data. Based on our threephase learning framework, we develop efficient algorithms that can effectively learn Bayesian networks, requiring only polynomial numbers of conditional independence (CI) tests in typical cases. We provide precise conditions that specify when these algorithms are guaranteed to be correct as well as empirical evidence (from real world applications and simulation tests) that demonstrates that these systems work efficiently and reliably in practice.
A Bayesian Approach to Causal Discovery
, 1997
"... We examine the Bayesian approach to the discovery of directed acyclic causal models and compare it to the constraintbased approach. Both approaches rely on the Causal Markov assumption, but the two differ significantly in theory and practice. An important difference between the approaches is that t ..."
Abstract

Cited by 80 (1 self)
 Add to MetaCart
We examine the Bayesian approach to the discovery of directed acyclic causal models and compare it to the constraintbased approach. Both approaches rely on the Causal Markov assumption, but the two differ significantly in theory and practice. An important difference between the approaches is that the constraintbased approach uses categorical information about conditionalindependence constraints in the domain, whereas the Bayesian approach weighs the degree to which such constraints hold. As a result, the Bayesian approach has three distinct advantages over its constraintbased counterpart. One, conclusions derived from the Bayesian approach are not susceptible to incorrect categorical decisions about independence facts that can occur with data sets of finite size. Two, using the Bayesian approach, finer distinctions among model structuresboth quantitative and qualitativecan be made. Three, information from several models can be combined to make better inferences and to better ...
ANCESTRAL GRAPH MARKOV MODELS
, 2002
"... This paper introduces a class of graphical independence models that is closed under marginalization and conditioning but that contains all DAG independence models. This class of graphs, called maximal ancestral graphs, has two attractive features: there is at most one edge between each pair of verti ..."
Abstract

Cited by 75 (17 self)
 Add to MetaCart
This paper introduces a class of graphical independence models that is closed under marginalization and conditioning but that contains all DAG independence models. This class of graphs, called maximal ancestral graphs, has two attractive features: there is at most one edge between each pair of vertices; every missing edge corresponds to an independence relation. These features lead to a simple parameterization of the corresponding set of distributions in the Gaussian case.
Causal Inference from Graphical Models
, 2001
"... Introduction The introduction of Bayesian networks (Pearl 1986b) and associated local computation algorithms (Lauritzen and Spiegelhalter 1988, Shenoy and Shafer 1990, Jensen, Lauritzen and Olesen 1990) has initiated a renewed interest for understanding causal concepts in connection with modelling ..."
Abstract

Cited by 56 (4 self)
 Add to MetaCart
Introduction The introduction of Bayesian networks (Pearl 1986b) and associated local computation algorithms (Lauritzen and Spiegelhalter 1988, Shenoy and Shafer 1990, Jensen, Lauritzen and Olesen 1990) has initiated a renewed interest for understanding causal concepts in connection with modelling complex stochastic systems. It has become clear that graphical models, in particular those based upon directed acyclic graphs, have natural causal interpretations and thus form a base for a language in which causal concepts can be discussed and analysed in precise terms. As a consequence there has been an explosion of writings, not primarily within mainstream statistical literature, concerned with the exploitation of this language to clarify and extend causal concepts. Among these we mention in particular books by Spirtes, Glymour and Scheines (1993), Shafer (1996), and Pearl (2000) as well as the collection of papers in Glymour and Cooper (1999). Very briefly, but fundamentally,
Estimating highdimensional directed acyclic graphs with the PCalgorithm
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2007
"... We consider the PCalgorithm (Spirtes et al., 2000) for estimating the skeleton and equivalence class of a very highdimensional directed acyclic graph (DAG) with corresponding Gaussian distribution. The PCalgorithm is computationally feasible and often very fast for sparse problems with many nodes ..."
Abstract

Cited by 48 (4 self)
 Add to MetaCart
We consider the PCalgorithm (Spirtes et al., 2000) for estimating the skeleton and equivalence class of a very highdimensional directed acyclic graph (DAG) with corresponding Gaussian distribution. The PCalgorithm is computationally feasible and often very fast for sparse problems with many nodes (variables), and it has the attractive property to automatically achieve high computational efficiency as a function of sparseness of the true underlying DAG. We prove uniform consistency of the algorithm for very highdimensional, sparse DAGs where the number of nodes is allowed to quickly grow with sample size n, as fast as O(n a) for any 0 < a < ∞. The sparseness assumption is rather minimal requiring only that the neighborhoods in the DAG are of lower order than sample size n. We also demonstrate the PCalgorithm for simulated data.
A simple constraintbased algorithm for efficiently mining observational databases for causal relationships
 Data Mining and Knowledge Discovery
, 1997
"... Abstract. This paper presents a simple, efficient computerbased method for discovering causal relationships from databases that contain observational data. Observational data is passively observed, as contrasted with experimental data. Most of the databases available for data mining are observation ..."
Abstract

Cited by 29 (2 self)
 Add to MetaCart
Abstract. This paper presents a simple, efficient computerbased method for discovering causal relationships from databases that contain observational data. Observational data is passively observed, as contrasted with experimental data. Most of the databases available for data mining are observational. There is great potential for mining such databases to discover causal relationships. We illustrate how observational data can constrain the causal relationships among measured variables, sometimes to the point that we can conclude that one variable is causing another variable. The presentation here is based on a constraintbased approach to causal discovery. A primary purpose of this paper is to present the constraintbased causal discovery method in the simplest possible fashion in order to (1) readily convey the basic ideas that underlie more complex constraintbased causal discovery techniques, and (2) permit interested readers to rapidly program and apply the method to their own databases, as a start toward using more elaborate causal discovery algorithms.
Causal Inference in the Presence of Latent Variables and Selection Bias
 In Proceedings of Eleventh Conference on Uncertainty in Artificial Intelligence
"... This paper uses Bayesian network models for that investigation. Bayesian networks, or directed acyclic graph (DAG) models have proved very useful in representing both causal and statistical hypotheses. The nodes of the graph represent vertices, directed edges represent direct influences, and the top ..."
Abstract

Cited by 28 (4 self)
 Add to MetaCart
This paper uses Bayesian network models for that investigation. Bayesian networks, or directed acyclic graph (DAG) models have proved very useful in representing both causal and statistical hypotheses. The nodes of the graph represent vertices, directed edges represent direct influences, and the topology of the graph encodes statistical constraints. We will consider features of such models that can be determined from data under assumptions that are related to those routinely applied in experimental situations:
On Chain Graph Models For Description Of Conditional Independence Structures
 Ann. Statist
, 1998
"... This paper deals with chain graphs (CGs) which allow both directed and undirected edges. This class of graphs, introduced by Lauritzen and Wermuth [15], generalizes both UGs and DAGs. To establish the semantics of CGs one should associate an independency model to every CG. Some steps were already ma ..."
Abstract

Cited by 19 (3 self)
 Add to MetaCart
This paper deals with chain graphs (CGs) which allow both directed and undirected edges. This class of graphs, introduced by Lauritzen and Wermuth [15], generalizes both UGs and DAGs. To establish the semantics of CGs one should associate an independency model to every CG. Some steps were already made. Lauritzen and Wermuth [16] intended to use CGs to describe independency models for strictly positive probability distributions and introduced the concept of the chain Markov property which is analogous to the concept of causal input list for DAGs. Lauritzen and Frydenberg [17, 9] generalized the concept of moral graph and introduced a moralization criterion for reading independency statements from a CG. Frydenberg [9] characterized CGs with the same Markov ON CHAIN GRAPH MODELS 3 property (that is producing the same CGmodel) and Andersson, Madigan and Perlman [3] used special CGs to represent uniquely classes of Markov equivalent DAGs. Whittaker [31] in his book gave several examples of the use of CGs, and other recent works also deal with them [6, 20, 23, 30], the most comprehensive account is provided by the book [19]. Several results proved here were already presented (without proof) in our previous conference contribution [5]. An alternative approach to the generalization of UGs and DAGs was started by Cox and Wermuth [7] who introduced a wider class of jointresponse chain graphs which allow also 'dashed' directed and undirected edges in addition to the classic 'solid' directed and undirected edges treated in this paper. Andersson, Madigan and Perlman [1] introduced an alternative Markov property to give an interpretation to those jointresponse CGs which combine dashed directed edges with solid undirected edges (of course, another independency model is associated...
Causal inference using the algorithmic Markov condition
, 2008
"... Inferring the causal structure that links n observables is usually basedupon detecting statistical dependences and choosing simple graphs that make the joint measure Markovian. Here we argue why causal inference is also possible when only single observations are present. We develop a theory how to g ..."
Abstract

Cited by 12 (12 self)
 Add to MetaCart
Inferring the causal structure that links n observables is usually basedupon detecting statistical dependences and choosing simple graphs that make the joint measure Markovian. Here we argue why causal inference is also possible when only single observations are present. We develop a theory how to generate causal graphs explaining similarities between single objects. To this end, we replace the notion of conditional stochastic independence in the causal Markov condition with the vanishing of conditional algorithmic mutual information anddescribe the corresponding causal inference rules. We explain why a consistent reformulation of causal inference in terms of algorithmic complexity implies a new inference principle that takes into account also the complexity of conditional probability densities, making it possible to select among Markov equivalent causal graphs. This insight provides a theoretical foundation of a heuristic principle proposed in earlier work. We also discuss how to replace Kolmogorov complexity with decidable complexity criteria. This can be seen as an algorithmic analog of replacing the empirically undecidable question of statistical independence with practical independence tests that are based on implicit or explicit assumptions on the underlying distribution. email: