Learning Bayesian Networks from Data: An Information-Theory Based Approach
, 2001
Cited by 101 (4 self)
This paper provides algorithms that use an information-theoretic analysis to learn Bayesian network structures from data. Based on our three-phase learning framework, we develop efficient algorithms that can effectively learn Bayesian networks, requiring only polynomial numbers of conditional independence (CI) tests in typical cases. We provide precise conditions that specify when these algorithms are guaranteed to be correct as well as empirical evidence (from real world applications and simulation tests) that demonstrates that these systems work efficiently and reliably in practice.
Comparing Bayesian Network Classifiers
, 1999
Cited by 83 (5 self)
In this paper, we empirically evaluate algorithms for learning four types of Bayesian network (BN) classifiers: Naïve-Bayes, tree augmented Naïve-Bayes, BN augmented Naïve-Bayes and general BNs, where the latter two are learned using two variants of a conditional-independence (CI) based BN-learning algorithm. Experimental results show the obtained classifiers, learned using the CI based algorithms, are competitive with (or superior to) the best known classifiers, based on both Bayesian networks and other formalisms; and that the computational time for learning and using these classifiers is relatively small. Moreover, these results also suggest a way to learn yet more effective classifiers; we demonstrate empirically that this new algorithm does work as expected. Collectively, these results argue that BN classifiers deserve more attention in machine learning and data mining communities. 1 INTRODUCTION Many tasks, including fault diagnosis, pattern recognition and forecasting, c...
Learning Belief Networks from Data: An Information Theory Based Approach
 In Proceedings of the Sixth ACM International Conference on Information and Knowledge Management
Cited by 66 (6 self)
This paper presents an efficient algorithm for learning Bayesian belief networks from databases. The algorithm takes a database as input and constructs the belief network structure as output. The construction process is based on the computation of mutual information of attribute pairs. Given a data set that is large enough, this algorithm can generate a belief network very close to the underlying model, and at the same time, enjoys the time complexity of $O(N^4)$ on conditional independence (CI) tests. When the data set has a normal DAG-Faithful (see Section 3.2) probability distribution, the algorithm guarantees that the structure of a perfect map [Pearl, 1988] of the underlying dependency model is generated. To evaluate this algorithm, we present the experimental results on three versions of the well-known ALARM network database, which has 37 attributes and 10,000 records. The results show that this algorithm is accurate and efficient. The proof of correctness and the analysis of c...
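As a rough illustration of the "mutual information of attribute pairs" mentioned in this abstract, the sketch below estimates mutual information from raw counts and lists the attribute pairs whose estimate exceeds a threshold. It is not the paper's algorithm: the function names `mutual_information` and `rank_pairs`, the record layout (a list of equal-length tuples), and the default threshold are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) of two discrete columns."""
    if len(xs) != len(ys):
        raise ValueError("columns must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))   # counts of (x, y) pairs
    count_x = Counter(xs)          # marginal counts of x
    count_y = Counter(ys)          # marginal counts of y
    total = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with counts
        total += (c / n) * log(c * n / (count_x[x] * count_y[y]))
    return total


def rank_pairs(records, threshold=0.01):
    """Return (i, j, mi) for attribute pairs with mi > threshold.

    `records` is a list of equal-length tuples (hypothetical layout);
    `threshold` is an arbitrary illustrative cut-off.
    Results are sorted by decreasing mutual information.
    """
    columns = list(zip(*records))
    scored = []
    for i, j in combinations(range(len(columns)), 2):
        mi = mutual_information(columns[i], columns[j])
        if mi > threshold:
            scored.append((i, j, mi))
    return sorted(scored, key=lambda item: item[2], reverse=True)
```

For example, `rank_pairs([(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)])` keeps only the pair of the first two attributes, since the third attribute is empirically independent of both.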
Learning Bayesian Belief Network Classifiers: Algorithms and System
 Proceedings of 14th Biennial conference of the
, 2001
Cited by 60 (3 self)
This paper investigates the methods for learning predictive classifiers based on Bayesian belief networks (BN), primarily unrestricted Bayesian networks and Bayesian multinets. We present our algorithms for learning these classifiers, and discuss how these methods address the overfitting problem and provide a natural method for feature subset selection. Using a set of standard classification problems, we empirically evaluate the performance of various BN-based classifiers. The results show that the proposed BN and Bayes multinet classifiers are competitive with (or superior to) the best known classifiers, based on both BN and other formalisms; and that the computational time for learning and using these classifiers is relatively small. These results argue that BN based classifiers deserve more attention in the data mining community. 1 Introduction Many tasks, including fault diagnosis, pattern recognition and forecasting, can be viewed as classification, as each r...
Learning Bayesian Networks from Data: An Efficient Approach Based on Information Theory
, 1997
Cited by 39 (0 self)
This paper addresses the problem of learning Bayesian network structures from data by using an information theoretic dependency analysis approach. Based on our three-phase construction mechanism, two efficient algorithms have been developed. One of our algorithms deals with a special case where the node ordering is given; this algorithm requires only $O(N^2)$ CI tests and is correct given that the underlying model is DAG-Faithful [Spirtes et al., 1996]. The other algorithm deals with the general case and requires $O(N^4)$ conditional independence (CI) tests. It is correct given that the underlying model is monotone DAG-Faithful (see Section 4.4). A system based on these algorithms has been developed and distributed through the Internet. The empirical results show that our approach is efficient and reliable. 1 Introduction The Bayesian network is a powerful knowledge representation and reasoning tool under conditions of uncertainty. A Bayesian network is a directed acyclic graph ...
Feature selection and transduction for prediction of molecular bioactivity for drug design
 Bioinformatics
Cited by 35 (4 self)
Motivation: In drug discovery a key task is to identify characteristics that separate active (binding) compounds from inactive (non-binding) ones. An automated prediction system can help reduce resources necessary to carry out this task. Results: Two methods for prediction of molecular bioactivity for drug design are introduced and shown to perform well in a data set previously studied as part of the KDD (Knowledge Discovery and Data Mining) Cup 2001. The data is characterized by very few positive examples, a very large number of features (describing three-dimensional properties of the molecules) and rather different distributions between training and test data. Two techniques are introduced specifically to tackle these problems: a feature selection method for unbalanced data and a classifier which adapts to the distribution of the unlabeled test data (a so-called transductive method). We show both techniques improve identification performance and in conjunction provide an improvement over using only one of the techniques. Our results suggest the importance of taking into account the characteristics in this data which may also be relevant in other problems of a similar type. Availability: Matlab source code is available at
A new approach for learning belief networks using independence criteria
 International Journal of Approximate Reasoning
, 2000
Searching for Bayesian Network Structures in the Space of Restricted Acyclic Partially Directed Graphs
 Journal of Artificial Intelligence Research
, 2003
Cited by 16 (2 self)
Although many algorithms have been designed to construct Bayesian network structures using different approaches and principles, they all employ only two methods: those based on independence criteria, and those based on a scoring function and a search procedure (although some methods combine the two). Within the score+search paradigm, the dominant approach uses local search methods in the space of directed acyclic graphs (DAGs), where the usual choices for defining the elementary modifications (local changes) that can be applied are arc addition, arc deletion, and arc reversal. In this paper, we propose a new local search method that uses a different search space, and which takes account of the concept of equivalence between network structures: restricted acyclic partially directed graphs (RPDAGs). In this way, the number of different configurations of the search space is reduced, thus improving efficiency. Moreover, although the final result must necessarily be a local optimum given the nature of the search method, the topology of the new search space, which avoids making early decisions about the directions of the arcs, may help to find better local optima than those obtained by searching in the DAG space.
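The three elementary modifications this abstract names for DAG-space local search (arc addition, arc deletion, arc reversal) can be sketched as a neighbor generator. This illustrates only the conventional DAG search space that the paper contrasts with, not its RPDAG method; the function names `has_cycle` and `dag_neighbors` and the edge-set representation are assumptions made for the example, and no scoring function is included.

```python
def has_cycle(nodes, edges):
    """Return True if the directed graph (nodes, edges) contains a cycle."""
    children = {v: [] for v in nodes}
    for u, v in edges:
        children[u].append(v)
    state = {v: 0 for v in nodes}  # 0 = unvisited, 1 = on stack, 2 = finished

    def visit(u):
        state[u] = 1
        for w in children[u]:
            # An arc back to a node still on the stack closes a cycle.
            if state[w] == 1 or (state[w] == 0 and visit(w)):
                return True
        state[u] = 2
        return False

    return any(state[v] == 0 and visit(v) for v in nodes)


def dag_neighbors(nodes, edges):
    """List (operation, arc, new_edge_set) for every legal one-arc change.

    Additions and reversals are kept only if the result is still acyclic;
    deleting an arc from a DAG can never create a cycle.
    """
    edges = frozenset(edges)
    result = []
    for u in nodes:
        for v in nodes:
            if u == v:
                continue
            if (u, v) in edges:
                without = edges - {(u, v)}
                result.append(("delete", (u, v), without))
                flipped = without | {(v, u)}
                if not has_cycle(nodes, flipped):
                    result.append(("reverse", (u, v), flipped))
            elif (v, u) not in edges:
                added = edges | {(u, v)}
                if not has_cycle(nodes, added):
                    result.append(("add", (u, v), added))
    return result
```

A local search would score each candidate edge set returned by `dag_neighbors` and move to the best one; the scoring function is left out here because the abstract does not specify it.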
Bayesian network classifiers for identifying the slope of the customer lifecycle of long-life customers
, 2004
Bayesian Belief Networks for Data Mining
 University of Magdeburg
, 1996
Cited by 14 (0 self)
In this paper we present a novel constraint based structural learning algorithm for causal networks. A set of conditional independence and dependence statements (CIDS) is derived from the data which describes the relationships among the variables. Although we implicitly assume that there exists a perfect map for the true, yet unknown, distribution, there does not need to be a perfect map for the CIDSs derived from the limited data. The reason is that the distribution of limited data might differ from the true probability distribution due to sampling noise. We derive a necessary condition for the existence of a perfect map given a set of CIDSs and utilize it to check for inconsistencies. If an inconsistency is detected, the algorithm finds all Bayesian networks with a minimum number of edges such that a maximum number of CIDSs is represented in each of the multiple solutions. The advantages of our approach are illustrated using the alarm network data set.