Results 1–10 of 40
A Bayesian method for the induction of probabilistic networks from data
 Machine Learning
, 1992
"... Abstract. This paper presents a Bayesian method for constructing probabilistic networks from databases. In particular, we focus on constructing Bayesian belief networks. Potential applications include computerassisted hypothesis testing, automated scientific discovery, and automated construction of ..."
Abstract

Cited by 1081 (27 self)
 Add to MetaCart
Abstract. This paper presents a Bayesian method for constructing probabilistic networks from databases. In particular, we focus on constructing Bayesian belief networks. Potential applications include computer-assisted hypothesis testing, automated scientific discovery, and automated construction of probabilistic expert systems. We extend the basic method to handle missing data and hidden (latent) variables. We show how to perform probabilistic inference by averaging over the inferences of multiple belief networks. Results are presented of a preliminary evaluation of an algorithm for constructing a belief network from a database of cases. Finally, we relate the methods in this paper to previous work, and we discuss open problems.
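The Bayesian scoring metric at the heart of this approach can be made concrete. Below is a minimal, illustrative sketch of the Cooper-Herskovits marginal likelihood for a single node and a candidate parent set, assuming fully observed discrete data and uniform parameter priors; the function name and data layout are assumptions of this sketch, not the paper's notation.

```python
from math import lgamma
from collections import defaultdict

def ch_log_score(data, child, parents, r):
    """Cooper-Herskovits log marginal likelihood for one node.

    data    : list of dicts mapping variable name -> discrete value
    child   : name of the node being scored
    parents : tuple of parent variable names
    r       : number of distinct values the child can take

    Computes log of  prod_j (r-1)! / (N_j + r - 1)! * prod_k N_jk!
    via lgamma, where j ranges over observed parent configurations.
    """
    counts = defaultdict(lambda: defaultdict(int))  # parent config -> child value -> count
    for case in data:
        pa = tuple(case[p] for p in parents)
        counts[pa][case[child]] += 1

    score = 0.0
    for child_counts in counts.values():
        n_j = sum(child_counts.values())
        score += lgamma(r) - lgamma(n_j + r)  # (r-1)! / (N_j + r - 1)!
        score += sum(lgamma(n + 1) for n in child_counts.values())  # prod_k N_jk!
    return score
```

Structure learning then searches for the parent sets maximizing the sum of these per-node scores, for example greedily under a node ordering as in the paper's K2 algorithm.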
Word-Sense Disambiguation Using Decomposable Models
 In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics
, 1994
"... Most probabilistic classifiers used for wordsense disambiguation have either been based on only one contextual feature or have used a model that is simply assumed to characterize the interdependencies among multiple contextual features. In this paper, a different approach to formulating a probabili ..."
Abstract

Cited by 138 (19 self)
 Add to MetaCart
Most probabilistic classifiers used for word-sense disambiguation have either been based on only one contextual feature or have used a model that is simply assumed to characterize the interdependencies among multiple contextual features. In this paper, a different approach to formulating a probabilistic model is presented along with a case study of the performance of models produced in this manner for the disambiguation of the noun interest. We describe a method for formulating probabilistic models that use multiple contextual features for word-sense disambiguation, without requiring untested assumptions regarding the form of the model. Using this approach, the joint distribution of all variables is described by only the most systematic variable interactions, thereby limiting the number of parameters to be estimated, supporting computational efficiency, and providing an understanding of the data.
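As a deliberately simple illustration of the model family involved: the decomposable model whose cliques are the (sense, feature_i) pairs yields the familiar product-of-conditionals classifier below. This is only the simplest member of the family the paper searches over; the names and smoothing constant are assumptions of this sketch.

```python
from collections import Counter, defaultdict
from math import log

def train(cases):
    """Fit the decomposable model with cliques (sense, feature_i),
    so the joint factors as P(s) * prod_i P(f_i | s).
    cases: list of (sense, features) with discrete feature tuples."""
    sense_counts = Counter()
    feat_counts = defaultdict(Counter)  # i -> (sense, value) -> count
    vocab = defaultdict(set)            # i -> observed values of feature i
    for sense, feats in cases:
        sense_counts[sense] += 1
        for i, f in enumerate(feats):
            feat_counts[i][(sense, f)] += 1
            vocab[i].add(f)
    return sense_counts, feat_counts, vocab

def classify(feats, sense_counts, feat_counts, vocab, alpha=1.0):
    """Return the sense maximizing the smoothed estimated log joint."""
    n = sum(sense_counts.values())
    def log_joint(sense):
        c = sense_counts[sense]
        lp = log(c / n)
        for i, f in enumerate(feats):
            num = feat_counts[i][(sense, f)] + alpha       # Laplace smoothing
            den = c + alpha * len(vocab[i])
            lp += log(num / den)
        return lp
    return max(sense_counts, key=log_joint)
```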
Causality: Models, Reasoning, and Inference
, 2000
"... This paper explores the role of Directed Acyclic Graphs (DAGs) as a representation of conditional independence relationships. We show that DAGs offer polynomially sound and complete inference mechanisms for inferring conditional independence relationships from a given causal set of such relationship ..."
Abstract

Cited by 103 (15 self)
 Add to MetaCart
This paper explores the role of Directed Acyclic Graphs (DAGs) as a representation of conditional independence relationships. We show that DAGs offer polynomially sound and complete inference mechanisms for inferring conditional independence relationships from a given causal set of such relationships. As a consequence, d-separation, a graphical criterion for identifying independencies in a DAG, is shown to uncover more valid independencies than any other criterion. In addition, we employ the Armstrong property of conditional independence to show that the dependence relationships displayed by a DAG are inherently consistent, i.e. for every DAG D there exists some probability distribution P that embodies all the conditional independencies displayed in D and none other.

INTRODUCTION AND SUMMARY OF RESULTS

Networks employing Directed Acyclic Graphs (DAGs) have a long and rich tradition, starting with the geneticist Wright (1921). He developed a method called path analysis [Wright, 1934] which later on became an established representation of causal models in economics [Wold, 1964], sociology [Blalock, 1971] and psychology [Duncan, 1975]. Influence diagrams represent another application of
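d-separation itself is mechanically checkable. A common way to implement the test (not spelled out in this abstract, so treat the details as an assumption) is the ancestral-moral-graph reduction: restrict the DAG to the ancestors of the variables involved, moralize, delete the conditioning set, and test ordinary graph separation.

```python
def d_separated(parents, x, y, z):
    """True iff x and y are d-separated given the set z in the DAG
    defined by parents (dict: node -> iterable of parent nodes).
    Assumes x and y are not themselves in z."""
    z = set(z)
    # 1. ancestral closure of {x, y} | z
    anc = {x, y} | z
    stack = list(anc)
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in anc:
                anc.add(p)
                stack.append(p)
    # 2. moralize the induced subgraph (marry co-parents, drop directions)
    adj = {n: set() for n in anc}
    for n in anc:
        ps = list(parents.get(n, ()))
        for p in ps:
            adj[n].add(p)
            adj[p].add(n)
        for i in range(len(ps)):
            for j in range(i + 1, len(ps)):
                adj[ps[i]].add(ps[j])
                adj[ps[j]].add(ps[i])
    # 3. connectivity from x to y avoiding z
    seen, stack = {x}, [x]
    while stack:
        n = stack.pop()
        if n == y:
            return False  # connected in the moral graph => d-connected
        for m in adj[n]:
            if m not in z and m not in seen:
                seen.add(m)
                stack.append(m)
    return True
```

For the collider a -> c <- b, for example, d_separated({'c': ['a', 'b']}, 'a', 'b', []) is True, while conditioning on the collider, z = ['c'], opens the path and the function returns False.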
Learning Bayesian Networks from Data: An Information-Theory Based Approach
"... This paper provides algorithms that use an informationtheoretic analysis to learn Bayesian network structures from data. Based on our threephase learning framework, we develop efficient algorithms that can effectively learn Bayesian networks, requiring only polynomial numbers of conditional indepe ..."
Abstract

Cited by 93 (5 self)
 Add to MetaCart
This paper provides algorithms that use an information-theoretic analysis to learn Bayesian network structures from data. Based on our three-phase learning framework, we develop efficient algorithms that can effectively learn Bayesian networks, requiring only polynomial numbers of conditional independence (CI) tests in typical cases. We provide precise conditions that specify when these algorithms are guaranteed to be correct as well as empirical evidence (from real world applications and simulation tests) that demonstrates that these systems work efficiently and reliably in practice.
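The workhorse of such CI-test-based learners is an empirical conditional independence test. A minimal sketch using conditional mutual information with a fixed threshold follows; the threshold value and data layout are assumptions of this sketch, not the paper's specification.

```python
from collections import Counter
from math import log

def cond_mutual_info(data, x, y, z=()):
    """Empirical conditional mutual information I(X; Y | Z) in nats.
    data: list of dicts mapping attribute name -> discrete value;
    x, y: attribute names; z: tuple of conditioning attribute names."""
    n = len(data)
    c_xyz, c_xz, c_yz, c_z = Counter(), Counter(), Counter(), Counter()
    for case in data:
        zv = tuple(case[k] for k in z)
        c_xyz[(case[x], case[y], zv)] += 1
        c_xz[(case[x], zv)] += 1
        c_yz[(case[y], zv)] += 1
        c_z[zv] += 1
    mi = 0.0
    for (xv, yv, zv), c in c_xyz.items():
        # p(x,y,z) * log[ p(x,y,z) p(z) / (p(x,z) p(y,z)) ]; the 1/n factors cancel
        mi += (c / n) * log(c * c_z[zv] / (c_xz[(xv, zv)] * c_yz[(yv, zv)]))
    return mi

def ci_test(data, x, y, z=(), eps=0.01):
    """Declare X and Y conditionally independent given Z when the
    empirical CMI falls below a small threshold."""
    return cond_mutual_info(data, x, y, z) < eps
```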
Learning Belief Networks from Data: An Information Theory Based Approach
 In Proceedings of the Sixth ACM International Conference on Information and Knowledge Management
"... This paper presents an efficient algorithm for learning Bayesian belief networks from databases. The algorithm takes a database as input and constructs the belief network structure as output. The construction process is based on the computation of mutual information of attribute pairs. Given a data ..."
Abstract

Cited by 65 (7 self)
 Add to MetaCart
This paper presents an efficient algorithm for learning Bayesian belief networks from databases. The algorithm takes a database as input and constructs the belief network structure as output. The construction process is based on the computation of mutual information of attribute pairs. Given a data set that is large enough, this algorithm can generate a belief network very close to the underlying model, and at the same time, enjoys the time complexity of O(N^4) on conditional independence (CI) tests. When the data set has a normal DAG-Faithful (see Section 3.2) probability distribution, the algorithm guarantees that the structure of a perfect map [Pearl, 1988] of the underlying dependency model is generated. To evaluate this algorithm, we present the experimental results on three versions of the well-known ALARM network database, which has 37 attributes and 10,000 records. The results show that this algorithm is accurate and efficient. The proof of correctness and the analysis of c...
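The pairwise drafting step the abstract alludes to, ranking attribute pairs by empirical mutual information, might look like the sketch below, reusing the cond_mutual_info helper from the previous example with an empty conditioning set; the threshold is illustrative.

```python
from itertools import combinations

def rank_pairs(data, attributes, eps=0.01):
    """Drafting sketch: score every attribute pair by empirical mutual
    information and return the pairs above threshold, strongest first."""
    scored = []
    for a, b in combinations(attributes, 2):
        mi = cond_mutual_info(data, a, b)
        if mi > eps:
            scored.append((mi, a, b))
    scored.sort(reverse=True)
    return [(a, b) for _, a, b in scored]
```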
Dynamic Belief Networks for Discrete Monitoring
 IEEE Transactions on Systems, Man, and Cybernetics
, 1994
"... We describe the development of a monitoring system which uses sensor observation data about discrete events to construct dynamically a probabilistic model of the world. This model is a Bayesian network incorporating temporal aspects, which we call a Dynamic Belief Network; it is used to reason under ..."
Abstract

Cited by 54 (7 self)
 Add to MetaCart
We describe the development of a monitoring system which uses sensor observation data about discrete events to construct dynamically a probabilistic model of the world. This model is a Bayesian network incorporating temporal aspects, which we call a Dynamic Belief Network; it is used to reason under uncertainty about both the causes and consequences of the events being monitored. The basic dynamic construction of the network is data-driven. However, the model construction process combines sensor data about events with externally provided information about agents' behaviour, and knowledge already contained within the model, to control the size and complexity of the network. This means that both the network structure within a time interval, and the amount of history and detail maintained, can vary over time. We illustrate the system with the example domain of monitoring robot vehicles and people in a restricted dynamic environment using light-beam sensor data. In addition to presenting a ...
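The paper's network is built dynamically and data-driven, but the underlying representational idea can be sketched with the simplest fixed case: unrolling a two-slice template over T steps. This is an illustrative reduction, with assumed names, not the authors' construction procedure.

```python
def unroll(intra, inter, T):
    """Unroll a two-slice dynamic belief network template into a plain DAG.

    intra: list of (u, v) edges within one time slice
    inter: list of (u, v) edges from slice t to slice t+1
    Returns directed edges over nodes named (var, t)."""
    edges = []
    for t in range(T):
        edges += [((u, t), (v, t)) for u, v in intra]
        if t + 1 < T:
            edges += [((u, t), (v, t + 1)) for u, v in inter]
    return edges
```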
An Algorithm for Bayesian Belief Network Construction from Data
 In Proceedings of AI & STAT '97
, 1997
"... This paper presents an efficient algorithm for constructing Bayesian belief networks from databases. The algorithm takes a database and an attributes ordering (i.e., the causal attributes of an attribute should appear earlier in the order) as input and constructs a belief network structure as output ..."
Abstract

Cited by 43 (6 self)
 Add to MetaCart
This paper presents an efficient algorithm for constructing Bayesian belief networks from databases. The algorithm takes a database and an attribute ordering (i.e., the causal attributes of an attribute should appear earlier in the order) as input and constructs a belief network structure as output. The construction process is based on the computation of mutual information of attribute pairs. Given a data set which is large enough and has a DAG-Isomorphic probability distribution, this algorithm guarantees that the perfect map [1] of the underlying dependency model is generated, and at the same time, enjoys the time complexity of O(N^2) on conditional independence (CI) tests. To evaluate this algorithm, we present the experimental results on three versions of the well-known ALARM network database, which has 37 attributes and 10,000 records. The correctness proof and the analysis of computational complexity are also presented. We also discuss the features of our work and relate it to previous works.
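One concrete consequence of taking an attribute ordering as input is that edge orientation becomes trivial: any dependency found between two attributes can only point from the earlier attribute to the later one. A sketch, assuming drafted edges are plain pairs:

```python
def orient_by_ordering(edges, ordering):
    """Direct each undirected edge from the earlier node in the given
    causal ordering to the later one."""
    rank = {v: i for i, v in enumerate(ordering)}
    return [(a, b) if rank[a] < rank[b] else (b, a) for a, b in edges]
```

For example, with the hypothetical ordering ['rain', 'sprinkler', 'wet'], the edge ('wet', 'sprinkler') is returned as ('sprinkler', 'wet').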
Maximum Likelihood Bounded Treewidth Markov Networks
 Artificial Intelligence
, 2001
"... We study the problem of projecting a distribution onto (or finding a maximum likelihood distribution among) Markov networks of bounded treewidth. By casting it as the combinatorial optimization problem of finding a maximum weight hypertree, we prove that it is NPhard to solve exactly and provide a ..."
Abstract

Cited by 43 (4 self)
 Add to MetaCart
We study the problem of projecting a distribution onto (or finding a maximum likelihood distribution among) Markov networks of bounded treewidth. By casting it as the combinatorial optimization problem of finding a maximum weight hypertree, we prove that it is NP-hard to solve exactly and provide an approximation algorithm with a provable performance guarantee.
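For context on why the hardness result matters: the treewidth-1 case is classical and exactly solvable. Chow and Liu (1968) showed the maximum likelihood tree is a maximum-weight spanning tree under pairwise mutual-information edge weights. A sketch of that tractable special case (Kruskal's algorithm with union-find); input format is an assumption of this sketch.

```python
def chow_liu_tree(weighted_pairs):
    """weighted_pairs: iterable of (mi, a, b) with comparable node names.
    Returns the edges of a maximum-weight spanning tree, i.e. the
    Chow-Liu tree when mi is pairwise mutual information."""
    parent = {}
    def find(v):
        parent.setdefault(v, v)
        root = v
        while parent[root] != root:
            root = parent[root]
        while parent[v] != root:  # path compression
            parent[v], v = root, parent[v]
        return root
    tree = []
    for mi, a, b in sorted(weighted_pairs, reverse=True):
        ra, rb = find(a), find(b)
        if ra != rb:              # keep the edge only if it joins components
            parent[ra] = rb
            tree.append((a, b))
    return tree
```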
Learning Bayesian Networks from Data: An Efficient Approach Based on Information Theory
, 1997
"... This paper addresses the problem of learning Bayesian network structures from data by using an information theoretic dependency analysis approach. Based on our threephase construction mechanism, two efficient algorithms have been developed. One of our algorithms deals with a special case where the ..."
Abstract

Cited by 35 (0 self)
 Add to MetaCart
This paper addresses the problem of learning Bayesian network structures from data by using an information theoretic dependency analysis approach. Based on our three-phase construction mechanism, two efficient algorithms have been developed. One of our algorithms deals with a special case where the node ordering is given; it requires only O(N^2) CI tests and is correct given that the underlying model is DAG-Faithful [Spirtes et al., 1996]. The other algorithm deals with the general case and requires O(N^4) conditional independence (CI) tests. It is correct given that the underlying model is monotone DAG-Faithful (see Section 4.4). A system based on these algorithms has been developed and distributed through the Internet. The empirical results show that our approach is efficient and reliable.

1 Introduction

The Bayesian network is a powerful knowledge representation and reasoning tool under conditions of uncertainty. A Bayesian network is a directed acyclic graph ...
On the Markov Equivalence of Chain Graphs, Undirected Graphs, and Acyclic Digraphs
 Scandinavian Journal of Statistics
, 1994
"... Graphical Markov models use undirected graphs (UDGs), acyclic directed graphs (ADGs), or (mixed) chain graphs to represent possible dependencies among random variables in a multivariate distribution. Whereas a UDG is uniquely determined by its associated Markov model, this is not true for ADGs or fo ..."
Abstract

Cited by 30 (5 self)
 Add to MetaCart
Graphical Markov models use undirected graphs (UDGs), acyclic directed graphs (ADGs), or (mixed) chain graphs to represent possible dependencies among random variables in a multivariate distribution. Whereas a UDG is uniquely determined by its associated Markov model, this is not true for ADGs or for general chain graphs (which include both UDGs and ADGs as special cases). This paper addresses three questions regarding the equivalence of graphical Markov models: when is a given chain graph Markov equivalent (1) to some UDG? (2) to some (at least one) ADG? (3) to some decomposable UDG? The answers are obtained by means of an extension of Frydenberg's (1990) elegant graph-theoretic characterization of the Markov equivalence of chain graphs.

1 Introduction

The use of graphs to represent dependence relations among random variables, first introduced by Wright (1921), has generated considerable research activity, especially since the early 1980s. Particular attention has been devoted to gra...
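For the ADG-to-ADG corner of these equivalence questions there is a classical executable criterion (Verma and Pearl, 1990): two ADGs are Markov equivalent iff they have the same skeleton and the same v-structures. A sketch follows; the chain-graph generalization the paper develops via Frydenberg's characterization is not captured here.

```python
def skeleton(parents):
    """Undirected edge set of a DAG given as dict: child -> parent set."""
    return {frozenset((p, c)) for c, ps in parents.items() for p in ps}

def v_structures(parents):
    """Immoralities: triples a -> c <- b with a and b non-adjacent."""
    skel = skeleton(parents)
    vs = set()
    for c, ps in parents.items():
        ps = sorted(ps)  # assumes comparable (e.g. string) node names
        for i in range(len(ps)):
            for j in range(i + 1, len(ps)):
                a, b = ps[i], ps[j]
                if frozenset((a, b)) not in skel:
                    vs.add((a, c, b))
    return vs

def markov_equivalent(p1, p2):
    """Verma-Pearl criterion for two ADGs over the same vertex set:
    same skeleton and same v-structures."""
    return skeleton(p1) == skeleton(p2) and v_structures(p1) == v_structures(p2)
```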