Results 1–10 of 24
Learning recursive Bayesian multinets for data clustering by means of constructive induction
, 2001
Abstract

Cited by 19 (7 self)
This paper introduces and evaluates a new class of knowledge model, the recursive Bayesian multinet (RBMN), which encodes the joint probability distribution of a given database. RBMNs extend Bayesian networks (BNs) as well as partitional clustering systems. Briefly, an RBMN is a decision tree with component BNs at the leaves. An RBMN is learnt using a greedy, heuristic approach akin to that used by many supervised decision tree learners, but where BNs are learnt at the leaves using constructive induction. A key idea is to treat expected data as real data. This allows us to complete the database and to take advantage of a closed form for the marginal likelihood of the expected complete data that factorizes into separate marginal likelihoods for each family (a node and its parents). Our approach is evaluated on synthetic and real-world databases.
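The family-wise factorization the abstract relies on is the standard closed-form marginal likelihood for complete discrete data under Dirichlet priors (the Cooper-Herskovits form). As an illustrative sketch only, not the paper's implementation, one family's contribution might be computed as follows; the function name and data layout are hypothetical:

```python
from math import lgamma
from collections import Counter

def family_log_marginal_likelihood(data, child, parents, arities, alpha=1.0):
    """Log marginal likelihood of one family (a node and its parents)
    for complete discrete data, under a uniform Dirichlet prior with
    equivalent sample size `alpha` per parent configuration."""
    r = arities[child]                 # number of states of the child
    counts = Counter()                 # (parent_config, child_value) -> N_ijk
    parent_counts = Counter()          # parent_config -> N_ij
    for row in data:                   # rows are dicts: variable -> value
        pa = tuple(row[p] for p in parents)
        counts[(pa, row[child])] += 1
        parent_counts[pa] += 1
    score = 0.0
    for pa, n_ij in parent_counts.items():
        score += lgamma(alpha) - lgamma(alpha + n_ij)
        for k in range(r):             # Counter yields 0 for unseen cells
            n_ijk = counts[(pa, k)]
            score += lgamma(alpha / r + n_ijk) - lgamma(alpha / r)
    return score
```

Because the total score of a network is the sum of such per-family terms, a search procedure can rescore a candidate structure locally, which is what makes the greedy decision-tree-style learning described above tractable.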
Learning Conditional Probabilities from Incomplete Data: An Experimental Comparison
 In: Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics
, 1999
Abstract

Cited by 11 (0 self)
This paper compares three methods, the EM algorithm, Gibbs sampling, and Bound and Collapse (BC), to estimate conditional probabilities from incomplete databases in a controlled experiment. Results show a substantial equivalence of the estimates provided by the three methods and a dramatic gain in efficiency using BC. Reprinted from: Proceedings of Uncertainty '99: Seventh International Workshop on Artificial Intelligence and Statistics, Morgan Kaufmann, San Mateo, CA, 1999. Authors: Marco Ramoni (Knowledge Media Institute, The Open University) and Paola Sebastiani (Statistics Department, The Open University). Address: Marco Ramoni, Knowledge Media Institute, The Open University, Milton Keynes, United Kingdom MK7 6AA. Phone: +44 (1908) 655721, fax: +44 (1908) 653169, email: m.ramoni@open.ac.uk, url: http://kmi.open.ac.uk/people/marco.
Bayesian Methods
, 1999
Abstract

Cited by 7 (4 self)
Classical statistics provides methods to analyze data, from simple descriptive measures to complex and sophisticated models. The available data are processed and then conclusions are drawn about a hypothetical population, of which the available data are supposed to be a representative sample. It is not hard to imagine situations, however, in which data are not the only available source of information about the population. Suppose, for example, we need to guess the outcome of an experiment that consists of tossing a coin. How many biased coins have we ever seen? Probably not many, and hence we are ready to believe that the coin is fair and that the outcome of the experiment can be either heads or tails with the same probability. On the other hand, imagine that someone told us that the coin is forged so that it is more likely to land heads. How can we take this information into account in the analysis of our data? This question becomes critical when we are consi…
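The coin example is the textbook conjugate Beta-Binomial update: prior knowledge, such as the tip that the coin is forged, enters as pseudo-counts that are pooled with the observed tosses. A minimal sketch, with a hypothetical function name:

```python
def posterior_mean_heads(prior_heads, prior_tails, heads, tails):
    """Posterior mean of P(heads) under a Beta(prior_heads, prior_tails)
    prior after observing `heads` and `tails` tosses (conjugate update)."""
    return (prior_heads + heads) / (prior_heads + prior_tails + heads + tails)
```

With a fair-coin prior Beta(1, 1) and 7 heads out of 10 tosses, the posterior mean is 8/12 ≈ 0.67; a forged-coin prior such as Beta(8, 2) pulls the same data towards heads, giving 15/20 = 0.75.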
Building Quality Estimation models with Fuzzy Threshold Values
 In: L’OBJET
, 2001
Abstract

Cited by 4 (0 self)
This work presents an approach to circumvent one of the major problems with techniques to build and apply software quality estimation models, namely the use of precise metric threshold values. We used a fuzzy-logic-based approach to investigate the stability of a reusable class library interface, using structural metrics as stability indicators. To evaluate this new approach, we conducted a study on three versions of a commercial C++ class library. The obtained results are very promising when compared to those of two classical machine learning (ML) approaches: Top-Down Induction of Decision Trees and Bayesian classifiers.
Geographical Clustering of Cancer Incidence by Means of Bayesian Networks and Conditional Gaussian Networks
, 2001
Abstract

Cited by 4 (3 self)
With the aim of improving knowledge on the geographical distribution and characterization of malignant tumors in the Autonomous Community of the Basque Country (Spain), age-standardized cancer incidence rates of the 6 most frequent cancer types for patients of each sex between 1986 and 1994 are analyzed, in relation to the towns of the Community. Concretely, we perform a geographical clustering of the towns of the Community by means of Bayesian networks and conditional Gaussian networks. We present several maps that show the clusterings encoded by the learnt models. In addition, we outline the cancer incidence profile for each of the obtained clusters.
Learning Bayesian Networks from Incomplete Data: An Efficient Method for Generating Approximate Predictive Distributions
Abstract

Cited by 3 (1 self)
We present an efficient method for learning Bayesian network models and parameters from incomplete data. Our approach obtains an approximation of the predictive distribution; by way of this distribution, any learning algorithm that works for complete data can easily be adapted to work for incomplete data as well. Our method exploits the dependence relations between the variables, explicitly given by the Bayesian network model, to predict missing values. Based on strength of influence and predictive quality, a subset of those predictor variables is selected, from which an approximate predictive distribution is generated by taking the observed part of the data into consideration. The approximate predictive distribution is obtained by traversing the data sample only twice, and no iteration is required. Therefore our algorithm is more efficient than iterative algorithms such as EM and SEM. Our experiments show that the method performs well for both parameter learning and model learning compared to EM and SEM.
Optimal Parametric Density Estimation by Minimizing an Analytic Distance Measure
 In: 10th International Conference on Information Fusion
, 2007
A survey of Bayesian Data Mining. Part I: Discrete and semi-discrete Data Matrices
 SICS TR T99:08, ISSN 1100-3154, ISRN: SICS-T--99/08-SE
, 1999
Abstract

Cited by 2 (0 self)
This tutorial summarises the use of Bayesian analysis and Bayes factors for finding significant properties of discrete (categorical and ordinal) data. It overviews methods for finding dependencies and graphical models, latent variables, robust decision trees, and association rules.
Learning the Tree Augmented Naive Bayes Classifier from incomplete datasets
Abstract

Cited by 2 (1 self)
The Bayesian network formalism is becoming increasingly popular in many areas, such as decision aid or diagnosis, in particular thanks to its inference capabilities, even when data are incomplete. For classification tasks, Naive Bayes and Augmented Naive Bayes classifiers have shown excellent performance. Learning a Naive Bayes classifier from incomplete datasets is not difficult, as only parameter learning has to be performed. But there are not many methods to efficiently learn Tree Augmented Naive Bayes classifiers from incomplete datasets. In this paper, we take up the structural EM algorithm principle introduced by Friedman (1997) to propose an algorithm that answers this question.
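For complete data, the structure step of TAN learning is the classical Chow-Liu procedure: weight each attribute pair by its conditional mutual information given the class, then keep a maximum-weight spanning tree. The structural-EM variant described in the abstract extends this to incomplete data; the sketch below covers only the complete-data step, with a hypothetical record layout (one dict per row):

```python
from collections import Counter
from math import log

def cond_mutual_info(data, xi, xj, c):
    """I(Xi; Xj | C) estimated from complete discrete data -- the edge
    weight used by the Chow-Liu step of TAN learning."""
    n = len(data)
    nxyc = Counter((r[xi], r[xj], r[c]) for r in data)
    nxc = Counter((r[xi], r[c]) for r in data)
    nyc = Counter((r[xj], r[c]) for r in data)
    nc = Counter(r[c] for r in data)
    return sum((nxy / n) * log(nxy * nc[cv] / (nxc[(x, cv)] * nyc[(y, cv)]))
               for (x, y, cv), nxy in nxyc.items())

def tan_tree(data, attrs, c):
    """Maximum-weight spanning tree over the attributes (Prim's
    algorithm), with conditional mutual information as edge weight."""
    w = {(a, b): cond_mutual_info(data, a, b, c)
         for i, a in enumerate(attrs) for b in attrs[i + 1:]}
    def weight(a, b):
        return w.get((a, b), w.get((b, a), 0.0))
    in_tree, edges = {attrs[0]}, []
    while len(in_tree) < len(attrs):
        a, b = max(((u, v) for u in in_tree for v in attrs if v not in in_tree),
                   key=lambda e: weight(*e))
        edges.append((a, b))
        in_tree.add(b)
    return edges
```

Orienting the tree edges away from an arbitrary root, and adding the class as a parent of every attribute, then yields the TAN structure; the paper's contribution is doing this when the counts themselves must be expected counts.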
Use of Bayesian Network in Information Extraction from Unstructured Data Sources
Abstract

Cited by 2 (1 self)
This paper applies Bayesian networks to support information extraction from unstructured, ungrammatical, and incoherent data sources for semantic annotation. A tool has been developed that combines ontologies, machine learning, information extraction, and probabilistic reasoning techniques to support the extraction process. Data acquisition is performed with the aid of knowledge specified in the form of an ontology. Due to the variable size of information available on different data sources, it is often the case that the extracted data contain missing values for certain variables of interest. It is desirable in such situations to predict the missing values. The methodology presented in this paper first learns a Bayesian network from the training data and then uses it to predict missing data and to resolve conflicts. Experiments have been conducted to analyze the performance of the presented methodology. The results look promising, as the methodology achieves a high degree of precision and recall for information extraction and reasonably good accuracy for predicting missing values.