Learning Bayesian networks: The combination of knowledge and statistical data
Machine Learning, 1995
Cited by 913 (38 self)
We describe scoring metrics for learning Bayesian networks from a combination of user knowledge and statistical data. We identify two important properties of metrics, which we call event equivalence and parameter modularity. These properties have been mostly ignored, but when combined, greatly simplify the encoding of a user’s prior knowledge. In particular, a user can express his knowledge, for the most part, as a single prior Bayesian network for the domain.
A Guide to the Literature on Learning Probabilistic Networks From Data
1996
Cited by 172 (0 self)
This literature review discusses different methods under the general rubric of learning Bayesian networks from data, and includes some overlapping work on more general probabilistic networks. Connections are drawn between the statistical, neural network, and uncertainty communities, and between the different methodological communities, such as Bayesian, description length, and classical statistics. Basic concepts for learning and Bayesian networks are introduced and methods are then reviewed. Methods are discussed for learning parameters of a probabilistic network, for learning the structure, and for learning hidden variables. The presentation avoids formal definitions and theorems, as these are plentiful in the literature, and instead illustrates key concepts with simplified examples.
Keywords: Bayesian networks, graphical models, hidden variables, learning, learning structure, probabilistic networks, knowledge discovery.
I. Introduction
Probabilistic networks or probabilistic gra...
Scalable Techniques for Mining Causal Structures
Data Mining and Knowledge Discovery, 1998
Cited by 88 (1 self)
Mining for association rules in market basket data has proved a fruitful area of research. Measures such as conditional probability (confidence) and correlation have been used to infer rules of the form "the existence of item A implies the existence of item B." However, such rules indicate only a statistical relationship between A and B. They do not specify the nature of the relationship: whether the presence of A causes the presence of B, or the converse, or some other attribute or phenomenon causes both to appear together. In applications, knowing such causal relationships is extremely useful for enhancing understanding and effecting change. While distinguishing causality from correlation is a truly difficult problem, recent work in statistics and Bayesian learning provides some avenues of attack. In these fields, the goal has generally been to learn complete causal models, which are essentially impossible to learn in large-scale data mining applications with a large number of variab...
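The point that rule measures capture only a statistical relationship can be illustrated with a minimal Python sketch. The baskets below are hypothetical toy data, not data from the paper, and lift is used here as an assumed stand-in for the "correlation" measure the abstract mentions. The helper names `support`, `confidence`, and `lift` are illustrative only.

```python
# Hypothetical toy market baskets (an assumption for illustration).
BASKETS = [
    {"A", "B"},
    {"A", "B"},
    {"A", "B", "C"},
    {"B"},
    {"C"},
    {"A", "C"},
]


def support(items, baskets):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Estimate of P(consequent | antecedent) for single items."""
    joint = support({antecedent, consequent}, baskets)
    return joint / support({antecedent}, baskets)


def lift(a, b, baskets):
    """P(a, b) / (P(a) * P(b)); equals 1.0 when a and b are independent."""
    return support({a, b}, baskets) / (support({a}, baskets) * support({b}, baskets))


if __name__ == "__main__":
    print("confidence A -> B:", confidence("A", "B", BASKETS))
    print("confidence B -> A:", confidence("B", "A", BASKETS))
    print("lift(A, B):", lift("A", "B", BASKETS))
```

Lift is symmetric in its two arguments, so it cannot by itself say whether A causes B, B causes A, or a third factor drives both, which is the gap the abstract describes.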
A Bayesian approach to learning causal networks
In Uncertainty in AI: Proceedings of the Eleventh Conference, 1995
Cited by 58 (11 self)
Whereas acausal Bayesian networks represent probabilistic independence, causal Bayesian networks represent causal relationships. In this paper, we examine Bayesian methods for learning both types of networks. Bayesian methods for learning acausal networks are fairly well developed. These methods often employ assumptions to facilitate the construction of priors, including the assumptions of parameter independence, parameter modularity, and likelihood equivalence. We show that although these assumptions also can be appropriate for learning causal networks, we need additional assumptions in order to learn causal networks. We introduce two sufficient assumptions, called mechanism independence and component independence. We show that these new assumptions, when combined with parameter independence, parameter modularity, and likelihood equivalence, allow us to apply methods for learning acausal networks to learn causal networks.
Axioms of Causal Relevance
Artificial Intelligence, 1996
Cited by 54 (15 self)
This paper develops axioms and formal semantics for statements of the form "X is causally irrelevant to Y in context Z," which we interpret to mean "Changing X will not affect Y if we hold Z constant." The axiomization of causal irrelevance is contrasted with the axiomization of informational irrelevance, as in "Learning X will not alter our belief in Y, once we know Z." Two versions of causal irrelevance are analyzed, probabilistic and deterministic. We show that, unless stability is assumed, the probabilistic definition yields a very loose structure that is governed by just two trivial axioms. Under the stability assumption, probabilistic causal irrelevance is isomorphic to path interception in cyclic graphs. Under the deterministic definition, causal irrelevance complies with all of the axioms of path interception in cyclic graphs, with the exception of transitivity. We compare our formalism to that of [Lewis, 1973], and offer a graphical method of proving theorems abou...
An Axiomatic Characterization of Causal Counterfactuals
1998
Cited by 47 (19 self)
This paper studies the causal interpretation of counterfactual sentences using a modifiable structural equation model. It is shown that two properties of counterfactuals, namely, composition and effectiveness, are sound and complete relative to this interpretation, when recursive (i.e., feedbackless) models are considered. Composition and effectiveness also hold in Lewis's closest-world semantics, which implies that for recursive models the causal interpretation imposes no restrictions beyond those embodied in Lewis's framework. A third property, called reversibility, holds in nonrecursive causal models but not in Lewis's closest-world semantics, which implies that Lewis's axioms do not capture some properties of systems with feedback. Causal inferences based on counterfactual analysis are exemplified and compared to those based on graphical models.
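As a rough illustration of composition and effectiveness, the following Python sketch evaluates a small recursive structural equation model under interventions and checks both properties by brute force. The three-variable Boolean model, the variable names, and the helper functions are all assumptions made for this example and do not come from the paper. Effectiveness is read here as "forcing X = x and W = w yields X = x and W = w", and composition as "if forcing X = x yields W = w, then additionally forcing W = w leaves Y unchanged".

```python
from itertools import product


def solve(equations, order, u, do=None):
    """Evaluate a recursive model for exogenous values `u`.

    `equations[var]` computes `var` from the values assigned so far;
    entries in `do` replace the corresponding equations (an intervention).
    """
    do = do or {}
    values = dict(u)
    for var in order:
        values[var] = do[var] if var in do else equations[var](values)
    return values


# Hypothetical model: X := U1, W := X and U2, Y := W or X (bits as 0/1).
EQUATIONS = {
    "X": lambda v: v["U1"],
    "W": lambda v: v["X"] & v["U2"],
    "Y": lambda v: v["W"] | v["X"],
}
ORDER = ["X", "W", "Y"]  # a causal ordering exists because the model is recursive


def effectiveness_holds():
    """Forcing X = x and W = w must yield exactly those values."""
    for u1, u2, x, w in product((0, 1), repeat=4):
        out = solve(EQUATIONS, ORDER, {"U1": u1, "U2": u2}, do={"X": x, "W": w})
        if out["X"] != x or out["W"] != w:
            return False
    return True


def composition_holds():
    """Forcing W to the value it already takes under do(X = x) must not change Y."""
    for u1, u2, x in product((0, 1), repeat=3):
        u = {"U1": u1, "U2": u2}
        after_x = solve(EQUATIONS, ORDER, u, do={"X": x})
        after_xw = solve(EQUATIONS, ORDER, u, do={"X": x, "W": after_x["W"]})
        if after_xw["Y"] != after_x["Y"]:
            return False
    return True


if __name__ == "__main__":
    print("effectiveness:", effectiveness_holds())
    print("composition:", composition_holds())
```

This only checks one toy model exhaustively; it is not a proof of the soundness and completeness result stated in the abstract.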
Learning Causal Networks from Data: A survey and a new algorithm for recovering possibilistic causal networks
1997
Cited by 19 (5 self)
Introduction
Reasoning in terms of cause and effect is a strategy that arises in many tasks. For example, diagnosis is usually defined as the task of finding the causes (illnesses) from the observed effects (symptoms). Similarly, prediction can be understood as the description of a future plausible situation where observed effects will be in accordance with the known causal structure of the phenomenon being studied. Causal models are a summary of the knowledge about a phenomenon expressed in terms of causation. Many areas of the applied sciences (econometrics, biomedics, engineering, etc.) have used such a term to refer to models that yield explanations, allow for prediction and facilitate planning and decision making. Causal reasoning can be viewed as inference guided by a causation theory. That kind of inference can be further specialised into induc
(This work has been partially supported by the Spanish Comission Interministerial de Ciencia y Tecnologia Project CICYTTIC96 0878.)
Identifying Independencies in Causal Graphs with Feedback
In Uncertainty in Artificial Intelligence: Proceedings of the Twelfth Conference, 1996
Cited by 19 (0 self)
We show that the d-separation criterion constitutes a valid test for conditional independence relationships that are induced by feedback systems involving discrete variables.
1 INTRODUCTION
It is well known that the d-separation test is sound and complete relative to the independencies assumed in the construction of Bayesian networks [Verma and Pearl, 1988, Geiger et al., 1990]. In other words, any d-separation condition in the network corresponds to a genuine independence condition in the underlying probability distribution and, conversely, every d-connection corresponds to a dependency in at least one distribution compatible with the network. The situation with feedback systems is more complicated, primarily because the probability distributions associated with such systems do not lend themselves to a simple product decomposition. The joint distribution of feedback systems cannot be written as a product of the conditional distributions of each child variable, given its parents. Rath...
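For readers unfamiliar with the d-separation test, here is a minimal Python sketch for the ordinary acyclic case only. It uses the standard ancestral moral graph construction, which is an assumed choice for illustration and not necessarily the procedure used in the paper. It does not attempt the feedback setting the paper addresses. The function name `d_separated` and the `parents` dictionary format are hypothetical.

```python
from collections import deque


def d_separated(parents, xs, ys, zs):
    """Return True if xs and ys are d-separated given zs in a DAG.

    `parents` maps each node to the set of its parents; nodes without
    parents may be omitted from the mapping.
    """
    xs, ys, zs = set(xs), set(ys), set(zs)

    # Step 1: restrict to xs, ys, zs and all of their ancestors.
    keep, stack = set(), list(xs | ys | zs)
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents.get(node, ()))

    # Step 2: moralize, i.e. link each child to its parents and
    # link every pair of parents that share a child.
    adj = {node: set() for node in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        for i, p in enumerate(ps):
            for q in ps[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)

    # Step 3: delete zs and test undirected reachability from xs to ys.
    seen = set(xs - zs)
    queue = deque(seen)
    while queue:
        node = queue.popleft()
        if node in ys:
            return False
        for neighbour in adj[node] - zs - seen:
            seen.add(neighbour)
            queue.append(neighbour)
    return True


if __name__ == "__main__":
    collider = {"C": {"A", "B"}}          # A -> C <- B
    chain = {"B": {"A"}, "C": {"B"}}      # A -> B -> C
    print(d_separated(collider, {"A"}, {"B"}, set()))   # marginally separated
    print(d_separated(collider, {"A"}, {"B"}, {"C"}))   # conditioning on the collider
    print(d_separated(chain, {"A"}, {"C"}, {"B"}))      # conditioning on the mediator
```

The collider and chain graphs in the usage block are the two textbook cases: conditioning on a common effect connects its causes, while conditioning on a mediator blocks the path.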
On the Definition of Actual Cause
1998
Cited by 3 (1 self)
This report is based on lecture notes written for CS 262C, Spring 1998, and is organized as follows. Following a review of the SL framework (Section 2), Section 3 provides a comparison to other approaches to causation and suggests an explanation of why the notion of actual cause has encountered difficulties in those approaches. Section 3 defines "actual cause" and illustrates, through examples, how the "probability that event X = x actually caused event