A Bayesian method for the induction of probabilistic networks from data
Machine Learning, 1992
Cited by 877 (24 self)
Abstract. This paper presents a Bayesian method for constructing probabilistic networks from databases. In particular, we focus on constructing Bayesian belief networks. Potential applications include computer-assisted hypothesis testing, automated scientific discovery, and automated construction of probabilistic expert systems. We extend the basic method to handle missing data and hidden (latent) variables. We show how to perform probabilistic inference by averaging over the inferences of multiple belief networks. Results are presented of a preliminary evaluation of an algorithm for constructing a belief network from a database of cases. Finally, we relate the methods in this paper to previous work, and we discuss open problems.
Word-Sense Disambiguation Using Decomposable Models
In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, 1994
Cited by 124 (17 self)
Most probabilistic classifiers used for word-sense disambiguation have either been based on only one contextual feature or have used a model that is simply assumed to characterize the interdependencies among multiple contextual features. In this paper, a different approach to formulating a probabilistic model is presented along with a case study of the performance of models produced in this manner for the disambiguation of the noun "interest". We describe a method for formulating probabilistic models that use multiple contextual features for word-sense disambiguation, without requiring untested assumptions regarding the form of the model. Using this approach, the joint distribution of all variables is described by only the most systematic variable interactions, thereby limiting the number of parameters to be estimated, supporting computational efficiency, and providing an understanding of the data.
Learning Bayesian Networks from Data: An Information-Theory Based Approach
Cited by 67 (4 self)
This paper provides algorithms that use an information-theoretic analysis to learn Bayesian network structures from data. Based on our three-phase learning framework, we develop efficient algorithms that can effectively learn Bayesian networks, requiring only polynomial numbers of conditional independence (CI) tests in typical cases. We provide precise conditions that specify when these algorithms are guaranteed to be correct as well as empirical evidence (from real world applications and simulation tests) that demonstrates that these systems work efficiently and reliably in practice.
Dynamic Belief Networks for Discrete Monitoring
IEEE Transactions on Systems, Man, and Cybernetics, 1994
Cited by 50 (7 self)
We describe the development of a monitoring system which uses sensor observation data about discrete events to dynamically construct a probabilistic model of the world. This model is a Bayesian network incorporating temporal aspects, which we call a Dynamic Belief Network; it is used to reason under uncertainty about both the causes and consequences of the events being monitored. The basic dynamic construction of the network is data-driven. However, the model construction process combines sensor data about events with externally provided information about agents' behaviour, and knowledge already contained within the model, to control the size and complexity of the network. This means that both the network structure within a time interval, and the amount of history and detail maintained, can vary over time. We illustrate the system with the example domain of monitoring robot vehicles and people in a restricted dynamic environment using light-beam sensor data. In addition to presenting a ...
Learning Belief Networks from Data: An Information Theory Based Approach
In Proceedings of the Sixth ACM International Conference on Information and Knowledge Management
Cited by 48 (7 self)
This paper presents an efficient algorithm for learning Bayesian belief networks from databases. The algorithm takes a database as input and constructs the belief network structure as output. The construction process is based on the computation of mutual information of attribute pairs. Given a data set that is large enough, this algorithm can generate a belief network very close to the underlying model, and at the same time requires only $O(N^4)$ conditional independence (CI) tests. When the data set has a normal DAG-Faithful (see Section 3.2) probability distribution, the algorithm guarantees that the structure of a perfect map [Pearl, 1988] of the underlying dependency model is generated. To evaluate this algorithm, we present the experimental results on three versions of the well-known ALARM network database, which has 37 attributes and 10,000 records. The results show that this algorithm is accurate and efficient. The proof of correctness and the analysis of c...
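The abstract above says the construction process rests on the mutual information of attribute pairs. As a rough illustration only, the following Python sketch estimates pairwise mutual information from a table of discrete records and keeps the pairs above a threshold. The function names `mutual_information` and `rank_pairs` and the threshold `epsilon` are assumptions made for this example, not taken from the paper, and the sketch covers only the pairwise scoring step, not the authors' full algorithm.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y), in bits, of two discrete columns."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("columns must be non-empty and of equal length")
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_pairs(records, epsilon=0.01):
    """Return ((i, j), mi) for attribute pairs with mi > epsilon, strongest first.

    `epsilon` is a hypothetical cut-off chosen for illustration.
    """
    columns = list(zip(*records))
    scored = []
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            mi = mutual_information(columns[i], columns[j])
            if mi > epsilon:
                scored.append(((i, j), mi))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Calling `rank_pairs` on a list of equal-length tuples returns candidate attribute pairs ordered by dependence strength; pairs that look independent in the sample fall below the threshold and are dropped.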
An Algorithm for Bayesian Belief Network Construction from Data
In Proceedings of AI & STAT'97, 1997
Cited by 32 (6 self)
This paper presents an efficient algorithm for constructing Bayesian belief networks from databases. The algorithm takes a database and an attribute ordering (i.e., the causal attributes of an attribute should appear earlier in the order) as input and constructs a belief network structure as output. The construction process is based on the computation of mutual information of attribute pairs. Given a data set which is large enough and has a DAG-Isomorphic probability distribution, this algorithm guarantees that the perfect map [1] of the underlying dependency model is generated, and at the same time requires only $O(N^2)$ conditional independence (CI) tests. To evaluate this algorithm, we present the experimental results on three versions of the well-known ALARM network database, which has 37 attributes and 10,000 records. The correctness proof and the analysis of computational complexity are also presented. We also discuss the features of our work and relate it to previous work.
Maximum likelihood bounded tree-width Markov networks
Artificial Intelligence, 2001
Cited by 32 (3 self)
We study the problem of projecting a distribution onto (or finding a maximum likelihood distribution among) Markov networks of bounded tree-width. By casting it as the combinatorial optimization problem of finding a maximum weight hypertree, we prove that it is NP-hard to solve exactly and provide an approximation algorithm with a provable performance guarantee.
Learning Bayesian Networks from Data: An Efficient Approach Based on Information Theory
1997
Cited by 31 (0 self)
This paper addresses the problem of learning Bayesian network structures from data by using an information-theoretic dependency analysis approach. Based on our three-phase construction mechanism, two efficient algorithms have been developed. One of our algorithms deals with a special case where the node ordering is given; this algorithm requires only $O(N^2)$ CI tests and is correct given that the underlying model is DAG-Faithful [Spirtes et al., 1996]. The other algorithm deals with the general case and requires $O(N^4)$ conditional independence (CI) tests. It is correct given that the underlying model is monotone DAG-Faithful (see Section 4.4). A system based on these algorithms has been developed and distributed through the Internet. The empirical results show that our approach is efficient and reliable. 1 Introduction The Bayesian network is a powerful knowledge representation and reasoning tool under conditions of uncertainty. A Bayesian network is a directed acyclic graph ...
Probabilistic Network Construction Using the Minimum Description Length Principle
1994
Cited by 25 (1 self)
Probabilistic networks can be constructed from a database of cases by selecting the network that has the highest quality with respect to this database according to a given measure. A new measure is presented for this purpose based on a minimum description length (MDL) approach. This measure is compared with a commonly used measure based on a Bayesian approach from both a theoretical and an experimental point of view. We show that the two measures have the same properties for infinitely large databases. For smaller databases, however, the MDL measure assigns equal quality to networks that represent the same set of independencies while the Bayesian measure does not. Preliminary test results suggest that an algorithm for learning probabilistic networks using the minimum description length approach performs comparably to a learning algorithm using the Bayesian approach. However, the former is slightly faster.
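To make the idea of scoring a network by description length concrete, here is a minimal Python sketch of a generic, BIC-style MDL score for discrete data: the data cost is the negative log-likelihood under maximum-likelihood parameters, and the model cost charges (log2 n)/2 bits per free parameter. This is a common textbook form and is an assumption of this example; it is not necessarily the exact measure defined in the paper. The names `mdl_score`, `parents`, and `arities` are hypothetical.

```python
from collections import Counter
from math import log2


def mdl_score(records, parents, arities):
    """Description length in bits of `records` under a network structure.

    records: list of equal-length tuples of discrete values
    parents: dict mapping each attribute index to a tuple of parent indices
    arities: number of possible states of each attribute
    Lower scores are better. Generic BIC-style form, assumed for illustration.
    """
    n = len(records)
    total = 0.0
    for node, pa in parents.items():
        # Counts of (parent configuration, node value) and of parent configuration
        joint = Counter((tuple(r[p] for p in pa), r[node]) for r in records)
        pa_counts = Counter(tuple(r[p] for p in pa) for r in records)
        # Data cost: negative log-likelihood with maximum-likelihood parameters
        data_bits = -sum(
            c * log2(c / pa_counts[cfg]) for (cfg, _), c in joint.items()
        )
        # Model cost: (log2 n) / 2 bits for each free parameter of this node
        parent_configs = 1
        for p in pa:
            parent_configs *= arities[p]
        free_params = (arities[node] - 1) * parent_configs
        total += data_bits + 0.5 * log2(n) * free_params
    return total
```

On a small data set in which two binary attributes always agree, the two structures with a single edge (in either direction) receive the same score under this sketch, and both score lower than the structure with no edge. That matches the abstract's remark that an MDL measure assigns equal quality to networks representing the same independencies.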
On the Markov Equivalence of Chain Graphs, Undirected Graphs, and Acyclic Digraphs
Scandinavian Journal of Statistics, 1994
Cited by 23 (5 self)
Graphical Markov models use undirected graphs (UDGs), acyclic directed graphs (ADGs), or (mixed) chain graphs to represent possible dependencies among random variables in a multivariate distribution. Whereas a UDG is uniquely determined by its associated Markov model, this is not true for ADGs or for general chain graphs (which include both UDGs and ADGs as special cases). This paper addresses three questions regarding the equivalence of graphical Markov models: when is a given chain graph Markov equivalent (1) to some UDG? (2) to some (at least one) ADG? (3) to some decomposable UDG? The answers are obtained by means of an extension of Frydenberg's (1990) elegant graph-theoretic characterization of the Markov equivalence of chain graphs. 1 Introduction The use of graphs to represent dependence relations among random variables, first introduced by Wright (1921), has generated considerable research activity, especially since the early 1980s. Particular attention has been devoted to gra...

