Results 11 - 20
of
176
Learning Belief Networks in the Presence of Missing Values and Hidden Variables
- Proceedings of the Fourteenth International Conference on Machine Learning
, 1997
"... In recent years there has been a flurry of works on learning probabilistic belief networks. Current state of the art methods have been shown to be successful for two learning scenarios: learning both network structure and parameters from complete data, and learning parameters for a fixed network fr ..."
Abstract
-
Cited by 107 (14 self)
- Add to MetaCart
In recent years there has been a flurry of works on learning probabilistic belief networks. Current state of the art methods have been shown to be successful for two learning scenarios: learning both network structure and parameters from complete data, and learning parameters for a fixed network from incomplete data---that is, in the presence of missing values or hidden variables. However, no method has yet been demonstrated to effectively learn network structure from incomplete data. In this paper, we propose a new method for learning network structure from incomplete data. This method is based on an extension of the Expectation-Maximization (EM) algorithm for model selection problems that performs search for the best structure inside the EM procedure. We prove the convergence of this algorithm, and adapt it for learning belief networks. We then describe how to learn networks in two scenarios: when the data contains missing values, and in the presence of hidden variables. We provide...
Bayesian Models for Keyhole Plan Recognition in an Adventure Game
, 1998
"... We present an approach to keyhole plan recognition which uses a dynamic belief (Bayesian) network to represent features of the domain that are needed to identify users' plans and goals. The application domain is a Multi-User Dungeon adventure game with thousands of possible actions and locations. W ..."
Abstract
-
Cited by 99 (10 self)
- Add to MetaCart
We present an approach to keyhole plan recognition which uses a dynamic belief (Bayesian) network to represent features of the domain that are needed to identify users' plans and goals. The application domain is a Multi-User Dungeon adventure game with thousands of possible actions and locations. We propose several network structures which represent the relations in the domain to varying extents, and compare their predictive power for predicting a user's current goal, next action and next location. The conditional probability distributions for each network are learned during a training phase, which dynamically builds these probabilities from observations of user behaviour. This approach allows the use of incomplete, sparse and noisy data during both training and testing. We then apply simple abstraction and learning techniques in order to speed up the performance of the most promising dynamic belief networks without a significant change in the accuracy of goal predictions. Our experi...
Answering Queries from Context-Sensitive Probabilistic Knowledge Bases
- Theoretical Computer Science
, 1996
"... We define a language for representing context-sensitive probabilistic knowledge. A knowledge base consists of a set of universally quantified probability sentences that include context constraints, which allow inference to be focused on only the relevant portions of the probabilistic knowledge. We p ..."
Abstract
-
Cited by 86 (0 self)
- Add to MetaCart
We define a language for representing context-sensitive probabilistic knowledge. A knowledge base consists of a set of universally quantified probability sentences that include context constraints, which allow inference to be focused on only the relevant portions of the probabilistic knowledge. We provide a declarative semantics for our language. We present a query answering procedure which takes a query Q and a set of evidence E and constructs a Bayesian network to compute P (QjE). The posterior probability is then computed using any of a number of Bayesian network inference algorithms. We use the declarative semantics to prove the query procedure sound and complete. We use concepts from logic programming to justify our approach. Keywords: reasoning under uncertainty, Bayesian networks, Probability model construction, logic programming Submitted to Theoretical Computer Science special issue on Uncertainty in Databases and Deductive Systems. This work was partially supported by NSF g...
Building Classifiers using Bayesian Networks
- In Proceedings of the thirteenth national conference on artificial intelligence
, 1996
"... Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with state of the art classifiers such as C4.5. This fact raises the question of whether a classifier with less restr ..."
Abstract
-
Cited by 72 (2 self)
- Add to MetaCart
Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with state of the art classifiers such as C4.5. This fact raises the question of whether a classifier with less restrictive assumptions can perform even better. In this paper we examine and evaluate approaches for inducing classifiers from data, based on recent results in the theory of learning Bayesian networks. Bayesian networks are factored representations of probability distributions that generalize the naive Bayes classifier and explicitly represent statements about independence. Among these approacheswe single out a method we call Tree Augmented Naive Bayes (TAN), which outperforms naive Bayes, yet at the same time maintains the computational simplicity (no search involved) and robustness which are characteristic of naive Bayes. We experimentally tested these approaches using benchmark problems from...
Learning Bayesian Networks from Data: An Information-Theory Based Approach
"... This paper provides algorithms that use an information-theoretic analysis to learn Bayesian network structures from data. Based on our three-phase learning framework, we develop efficient algorithms that can effectively learn Bayesian networks, requiring only polynomial numbers of conditional indepe ..."
Abstract
-
Cited by 67 (4 self)
- Add to MetaCart
This paper provides algorithms that use an information-theoretic analysis to learn Bayesian network structures from data. Based on our three-phase learning framework, we develop efficient algorithms that can effectively learn Bayesian networks, requiring only polynomial numbers of conditional independence (CI) tests in typical cases. We provide precise conditions that specify when these algorithms are guaranteed to be correct as well as empirical evidence (from real world applications and simulation tests) that demonstrates that these systems work efficiently and reliably in practice.
Extracting Comprehensible Models from Trained Neural Networks
, 1996
"... To Mom, Dad, and Susan, for their support and encouragement. ..."
Abstract
-
Cited by 65 (4 self)
- Add to MetaCart
To Mom, Dad, and Susan, for their support and encouragement.
Comparing Bayesian Network Classifiers
, 1999
"... In this paper, we empirically evaluate algorithms for learning four types of Bayesian network (BN) classifiers -- Naïve-Bayes, tree augmented Naïve-Bayes, BN augmented Naïve-Bayes and general BNs, where the latter two are learned using two variants of a conditional-independence (CI) based BNlearnin ..."
Abstract
-
Cited by 55 (4 self)
- Add to MetaCart
In this paper, we empirically evaluate algorithms for learning four types of Bayesian network (BN) classifiers -- Naïve-Bayes, tree augmented Naïve-Bayes, BN augmented Naïve-Bayes and general BNs, where the latter two are learned using two variants of a conditional-independence (CI) based BNlearning algorithm. Experimental results show the obtained classifiers, learned using the CI based algorithms, are competitive with (or superior to) the best known classifiers, based on both Bayesian networks and other formalisms; and that the computational time for learning and using these classifiers is relatively small. Moreover, these results also suggest a way to learn yet more effective classifiers; we demonstrate empirically that this new algorithm does work as expected. Collectively, these results argue that BN classifiers deserve more attention in machine learning and data mining communities. 1 INTRODUCTION Many tasks -- including fault diagnosis, pattern recognition and forecasting -- c...
Discretizing Continuous Attributes While Learning Bayesian Networks
- In Proc. ICML
, 1996
"... We introduce a method for learning Bayesian networks that handles the discretization of continuous variables as an integral part of the learning process. The main ingredient in this method is a new metric based on the Minimal Description Length principle for choosing the threshold values for the dis ..."
Abstract
-
Cited by 50 (4 self)
- Add to MetaCart
We introduce a method for learning Bayesian networks that handles the discretization of continuous variables as an integral part of the learning process. The main ingredient in this method is a new metric based on the Minimal Description Length principle for choosing the threshold values for the discretization while learning the Bayesian network structure. This score balances the complexity of the learned discretization and the learned network structure against how well they model the training data. This ensures that the discretization of each variable introduces just enough intervals to capture its interaction with adjacent variables in the network. We formally derive the new metric, study its main properties, and propose an iterative algorithm for learning a discretization policy. Finally, we illustrate its behavior in applications to supervised learning. 1 INTRODUCTION Bayesian networks provide efficient and effective representation of the joint probability distribution over a set ...
Learning Bayesian Belief Network Classifiers: Algorithms and System
- Proceedings of 14 th Biennial conference of the
, 2001
"... This paper investigates the methods for learning predictive classifiers based on Bayesian belief networks (BN) -- primarily unrestricted Bayesian networks and Bayesian multinets. We present our algorithms for learning these classifiers, and discuss how these methods address the overfitting proble ..."
Abstract
-
Cited by 45 (3 self)
- Add to MetaCart
This paper investigates the methods for learning predictive classifiers based on Bayesian belief networks (BN) -- primarily unrestricted Bayesian networks and Bayesian multinets. We present our algorithms for learning these classifiers, and discuss how these methods address the overfitting problem and provide a natural method for feature subset selection. Using a set of standard classification problems, we empirically evaluate the performance of various BN-based classifiers. The results show that the proposed BN and Bayes multi-net classifiers are competitive with (or superior to) the best known classifiers, based on both BN and other formalisms; and that the computational time for learning and using these classifiers is relatively small. These results argue that BN based classifiers deserve more attention in the data mining community. 1 In t roduct i on Many tasks -- including fault diagnosis, pattern recognition and forecasting -- can be viewed as classification, as each r...
On the Sample Complexity of Learning Bayesian Networks
, 1996
"... In recent years there has been an increasing interest in learning Bayesian networks from data. One of the most effective methods for learning such networks is based on the minimum description length (MDL) principle. Previous work has shown that this learning procedure is asymptotically successful: w ..."
Abstract
-
Cited by 41 (2 self)
- Add to MetaCart
In recent years there has been an increasing interest in learning Bayesian networks from data. One of the most effective methods for learning such networks is based on the minimum description length (MDL) principle. Previous work has shown that this learning procedure is asymptotically successful: with probability one, it will converge to the target distribution, given a sufficient number of samples. However, the rate of this convergence has been hitherto unknown. In this work we examine the sample complexity of MDL based learning procedures for Bayesian networks. We show that the number of samples needed to learn an ffl-close approximation (in terms of entropy distance) with confidence ffi is O i ( 1 ffl ) 4 3 log 1 ffl log 1 ffi log log 1 ffi j . This means that the sample complexity is a low-order polynomial in the error threshold and sublinear in the confidence bound. We also discuss how the constants in this term depend on the complexity of the target distribution. F...

