Results 1 - 9 of 9
Learning Bayesian Networks from Data: An Information-Theory Based Approach
, 2001
Cited by 126 (4 self)
This paper provides algorithms that use an information-theoretic analysis to learn Bayesian network structures from data. Based on our three-phase learning framework, we develop efficient algorithms that can effectively learn Bayesian networks, requiring only polynomial numbers of conditional independence (CI) tests in typical cases. We provide precise conditions that specify when these algorithms are guaranteed to be correct as well as empirical evidence (from real-world applications and simulation tests) that demonstrates that these systems work efficiently and reliably in practice.
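As an illustration of what a single information-theoretic CI test might look like, the sketch below estimates conditional mutual information I(X;Y|Z) from discrete samples and compares it with a small cutoff. This is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the tuple-of-rows data layout, and the threshold value 0.01 are all hypothetical choices made for this example.

```python
from collections import Counter
from math import log


def conditional_mutual_information(samples, x, y, z=()):
    """Empirical I(X;Y|Z) in nats.

    samples: list of tuples of discrete values (one tuple per observation).
    x, y:    column indices of the two variables being tested.
    z:       tuple of column indices forming the conditioning set (may be empty).
    """
    n = len(samples)
    c_xyz, c_xz, c_yz, c_z = Counter(), Counter(), Counter(), Counter()
    for row in samples:
        zv = tuple(row[i] for i in z)
        c_xyz[(row[x], row[y], zv)] += 1
        c_xz[(row[x], zv)] += 1
        c_yz[(row[y], zv)] += 1
        c_z[zv] += 1
    total = 0.0
    for (xv, yv, zv), c in c_xyz.items():
        # p(x,y,z) * log( p(x,y,z) p(z) / (p(x,z) p(y,z)) ); the 1/n factors cancel
        total += (c / n) * log(c * c_z[zv] / (c_xz[(xv, zv)] * c_yz[(yv, zv)]))
    return total


def is_conditionally_independent(samples, x, y, z=(), threshold=0.01):
    """Treat X and Y as independent given Z when the estimate is below the cutoff.

    The threshold is an assumed value for illustration only.
    """
    return conditional_mutual_information(samples, x, y, z) < threshold
```

For example, with `samples = [(0, 0, 0), (1, 1, 1)]` the first two columns are marginally dependent, but become independent once the third column is conditioned on, so `is_conditionally_independent(samples, 0, 1, (2,))` holds while `is_conditionally_independent(samples, 0, 1)` does not.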
Learning Bayesian Networks from Data: An Efficient Approach Based on Information Theory
, 1997
Cited by 49 (0 self)
This paper addresses the problem of learning Bayesian network structures from data by using an information-theoretic dependency analysis approach. Based on our three-phase construction mechanism, two efficient algorithms have been developed. One of our algorithms deals with a special case where the node ordering is given; it requires only $O(N^2)$ CI tests and is correct given that the underlying model is DAG-Faithful [Spirtes et al., 1996]. The other algorithm deals with the general case and requires $O(N^4)$ conditional independence (CI) tests. It is correct given that the underlying model is monotone DAG-Faithful (see Section 4.4). A system based on these algorithms has been developed and distributed through the Internet. The empirical results show that our approach is efficient and reliable. 1 Introduction The Bayesian network is a powerful knowledge representation and reasoning tool under conditions of uncertainty. A Bayesian network is a directed acyclic graph ...
Using Path Diagrams as a Structural Equation Modelling Tool
, 1997
Cited by 36 (8 self)
In this paper, we will show how path diagrams can be used to solve a number of important problems in structural equation modelling. There are a number of problems associated with structural equation modelling. These problems include:
Statistical Themes and Lessons for Data Mining
, 1997
Cited by 36 (3 self)
Data mining is on the interface of Computer Science and Statistics, utilizing advances in both disciplines to make progress in extracting information from large databases. It is an emerging field that has attracted much attention in a very short period of time. This article highlights some statistical themes and lessons that are directly relevant to data mining and attempts to identify opportunities where close cooperation between the statistical and computational communities might reasonably provide synergy for further progress in data analysis.
The hidden life of latent variables: Bayesian learning with mixed graph models
, 2008
Cited by 13 (4 self)
Directed acyclic graphs (DAGs) have been widely used as a representation of conditional independence in machine learning and statistics. Moreover, hidden or latent variables are often an important component of graphical models. However, DAG models suffer from an important limitation: the family of DAGs is not closed under marginalization of hidden variables. This means that in general we cannot use a DAG to represent the independencies over a subset of variables in a larger DAG. Directed mixed graphs (DMGs) are a representation that includes DAGs as a special case, and overcomes this limitation. This paper introduces algorithms for performing Bayesian inference in Gaussian and probit DMG models. An important requirement for inference is the characterization of the distribution over parameters of the models. We introduce a new distribution for covariance matrices of Gaussian DMGs. We discuss and illustrate how several Bayesian machine learning tasks can benefit from the principle presented here: the power to model dependencies that are generated from hidden variables, but without necessarily modelling such variables explicitly.
Causal Inference and Reasoning in Causally Insufficient Systems
, 2006
Cited by 11 (2 self)
The big question that motivates this dissertation is the following: under what conditions and to what extent can passive observations inform us of the structure of causal connections among a set of variables and of the potential outcome of an active intervention on some of the variables? The particular concern here revolves around the common kind of situations where the variables of interest, though measurable themselves, may suffer from confounding due to unobserved common causes. Relying on a graphical representation of causally insufficient systems called maximal ancestral graphs, and two well-known principles widely discussed in the literature, the causal Markov and Faithfulness conditions, we show that the FCI algorithm, a sound inference procedure in the literature for inferring features of the unknown causal structure from facts of probabilistic independence and dependence, is, with some extra sound inference rules, also complete in the sense that any feature of the causal structure left undecided by the inference procedure is indeed underdetermined by facts of probabilistic independence and dependence. In addition, we consider the issue of quantitative reasoning about effects of local interventions with the FCI-learnable features of the unknown causal structure. We improve and generalize two important pieces of work in the literature about identifying intervention effects. We also provide some preliminary study of the testability of the
A transformational characterization of Markov equivalence for directed acyclic graphs with latent variables
In: Proc. of the 21st Conference on Uncertainty in Artificial Intelligence (UAI)
, 2005
Cited by 3 (1 self)
The conditional independence relations present in a data set usually admit multiple causal explanations, typically represented by directed graphs, which are Markov equivalent in that they entail the same conditional independence relations among the observed variables. Markov equivalence between directed acyclic graphs (DAGs) has been characterized in various ways, each of which has been found useful for certain purposes. In particular, Chickering’s transformational characterization is useful in deriving properties shared by Markov equivalent DAGs, and, with certain generalization, is needed to justify a search procedure over Markov equivalence classes, known as the GES algorithm. Markov equivalence between DAGs with latent variables has also been characterized, in the spirit of Verma and Pearl (1990), via maximal ancestral graphs (MAGs). The latter can represent the observable conditional independence relations as well as some causal features of DAG models with latent variables. However, no characterization of Markov equivalent MAGs is yet available that is analogous to the transformational characterization for Markov equivalent DAGs. The main contribution of the current paper is to establish such a characterization for directed MAGs, which we expect will have similar uses as Chickering’s characterization does for DAGs.
Statistical Themes and Lessons for Data Mining (© 1997 Kluwer Academic Publishers. Manufactured in The Netherlands.)
, 1996
Abstract. Data mining is on the interface of Computer Science and Statistics, utilizing advances in both disciplines to make progress in extracting information from large databases. It is an emerging field that has attracted much attention in a very short period of time. This article highlights some statistical themes and lessons that are directly relevant to data mining and attempts to identify opportunities where close cooperation between the statistical and computational communities might reasonably provide synergy for further progress in data analysis.
THE GEOMETRY OF HIDDEN TREE MARKOV MODELS FOR BINARY DATA
Abstract. In this paper we investigate the geometry of a discrete Bayesian network whose graph is a tree all of whose variables are binary and the only observed variables are those labeling its leaves. We obtain a full geometric description of these models which is given by polynomial equations and inequalities. Our analysis is based on combinatorial results generalizing the notion of cumulants so that they apply to the models under analysis. The geometric structure we obtain links to the notion of a tree metric considered in phylogenetic analysis and to some interesting determinantal formulas involving the hyperdeterminant of 2 × 2 × 2 tables.
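For readers unfamiliar with the hyperdeterminant of a 2 × 2 × 2 table, the sketch below evaluates Cayley's classical quartic expression for it. The abstract only names the quantity, so the formula is supplied here as background rather than taken from the paper; the function name and the nested-list layout `a[i][j][k]` are assumptions made for this example.

```python
def hyperdeterminant_2x2x2(a):
    """Cayley's hyperdeterminant of a 2x2x2 table given as nested lists a[i][j][k]."""
    a000, a001 = a[0][0][0], a[0][0][1]
    a010, a011 = a[0][1][0], a[0][1][1]
    a100, a101 = a[1][0][0], a[1][0][1]
    a110, a111 = a[1][1][0], a[1][1][1]

    # Four squared products of "antipodal" entries
    squares = (a000 * a111) ** 2 + (a001 * a110) ** 2 \
        + (a010 * a101) ** 2 + (a100 * a011) ** 2

    # Six pairwise products of the antipodal pairs, weighted by -2
    pairs = (a000 * a001 * a110 * a111
             + a000 * a010 * a101 * a111
             + a000 * a100 * a011 * a111
             + a001 * a010 * a101 * a110
             + a001 * a100 * a011 * a110
             + a010 * a100 * a011 * a101)

    # Two remaining quartic terms, weighted by +4
    quartics = a000 * a011 * a101 * a110 + a001 * a010 * a100 * a111

    return squares - 2 * pairs + 4 * quartics
```

A table with all entries equal (a rank-one table) gives a hyperdeterminant of zero, while a table whose only nonzero entries are `a[0][0][0] = a[1][1][1] = 1` gives one.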