Results 1–10 of 15
A Guide to the Literature on Learning Probabilistic Networks From Data
1996
"... This literature review discusses different methods under the general rubric of learning Bayesian networks from data, and includes some overlapping work on more general probabilistic networks. Connections are drawn between the statistical, neural network, and uncertainty communities, and between the ..."
Abstract

Cited by 171 (0 self)
Abstract: This literature review discusses different methods under the general rubric of learning Bayesian networks from data, and includes some overlapping work on more general probabilistic networks. Connections are drawn between the statistical, neural network, and uncertainty communities, and between the different methodological communities, such as Bayesian, description length, and classical statistics. Basic concepts for learning and Bayesian networks are introduced and methods are then reviewed. Methods are discussed for learning the parameters of a probabilistic network, for learning the structure, and for learning hidden variables. The presentation avoids formal definitions and theorems, as these are plentiful in the literature, and instead illustrates key concepts with simplified examples. Keywords: Bayesian networks, graphical models, hidden variables, learning, learning structure, probabilistic networks, knowledge discovery. I. Introduction: Probabilistic networks or probabilistic gra...
Bayesian Approaches to Gaussian Mixture Modelling
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998
"... A Bayesianbased methodology is presented which automatically penalises overcomplex models being fitted to unknown data. We show that, with a Gaussian mixture model, the approach is able to select an `optimal' number of components in the model and so partition data sets. The performance of the Baye ..."
Abstract

Cited by 73 (2 self)
Abstract: A Bayesian-based methodology is presented which automatically penalises over-complex models being fitted to unknown data. We show that, with a Gaussian mixture model, the approach is able to select an 'optimal' number of components in the model and so partition data sets. The performance of the Bayesian method is compared to other methods of optimal model selection and found to give good results. The methods are tested on synthetic and real data sets. Introduction: Scientific disciplines generate data. In the attempt to understand the patterns present in such data sets, methods which perform some form of unsupervised partitioning or modelling are particularly useful. Such an approach is only of use, however, if it offers a less complex representation of the data than the data set itself. This introduces an apparent conflict, as any model improves its fit to the data monotonically with increases in its complexity (the number of model parameters): a model as complex as the data...
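The conflict described in this abstract (fit improves monotonically with the number of parameters) is exactly what a complexity penalty resolves. The paper derives its penalty from a Bayesian argument; as a rough stand-in for illustration only, the sketch below fits Gaussian mixtures with an increasing number of components and keeps the one minimizing BIC, a related likelihood-plus-penalty criterion. The synthetic data and the use of scikit-learn are assumptions, not the paper's implementation.

```python
# Illustrative sketch (not the paper's method): choose the number of
# mixture components by penalizing model complexity, here via BIC.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic data: three well-separated 2-D Gaussian clusters.
X = np.vstack([
    rng.normal(loc=[0, 0], scale=0.5, size=(200, 2)),
    rng.normal(loc=[4, 0], scale=0.5, size=(200, 2)),
    rng.normal(loc=[2, 3], scale=0.5, size=(200, 2)),
])

# Fit mixtures with 1..8 components.  More components always improve the
# raw likelihood, so without a penalty the loop would always prefer k = 8.
scores = {}
for k in range(1, 9):
    gmm = GaussianMixture(n_components=k, n_init=3, random_state=0).fit(X)
    scores[k] = gmm.bic(X)  # lower BIC = better fit/complexity trade-off

best_k = min(scores, key=scores.get)
print(f"selected {best_k} components")  # expect 3 on this data
```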
Unsupervised Learning Using MML
In Machine Learning: Proceedings of the Thirteenth International Conference (ICML 96), 1996
"... This paper discusses the unsupervised learning problem. An important part of the unsupervised learning problem is determining the number of constituent groups (components or classes) which best describes some data. We apply the Minimum Message Length (MML) criterion to the unsupervised learning prob ..."
Abstract

Cited by 41 (5 self)
Abstract: This paper discusses the unsupervised learning problem. An important part of the unsupervised learning problem is determining the number of constituent groups (components or classes) which best describes some data. We apply the Minimum Message Length (MML) criterion to the unsupervised learning problem, modifying an earlier such MML application. We give an empirical comparison of criteria prominent in the literature for estimating the number of components in a data set. We conclude that the Minimum Message Length criterion performs better than the alternatives on the data considered here for unsupervised learning tasks.
Introduction to Minimum Encoding Inference
Dept. of Statistics, Open University, Walton Hall, Milton Keynes, 1994
"... This paper examines the minimumencoding approaches to inference, Minimum Message Length (MML) and Minimum Description Length (MDL). This paper was written with the objective of providing an introduction to this area for statisticians. We describe coding techniques for data, and examine how these tec ..."
Abstract

Cited by 23 (4 self)
Abstract: This paper examines the minimum-encoding approaches to inference, Minimum Message Length (MML) and Minimum Description Length (MDL). It was written with the objective of providing an introduction to this area for statisticians. We describe coding techniques for data, and examine how these techniques can be applied to perform inference and model selection.
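As a minimal sketch of the shared idea behind both approaches (the precise message-length estimates differ between MML and MDL; see the later entry in this list on their similarities and differences), a two-part code first states a hypothesis and then the data encoded under it:

```latex
% Two-part message length: state the model, then the data given the model.
% Minimizing the total trades complexity (first term) against fit (second).
\mathrm{MsgLen}(H, D) \;=\; \underbrace{-\log_2 P(H)}_{\text{model}}
  \;+\; \underbrace{-\log_2 P(D \mid H)}_{\text{data given model}}
```

An over-complex model shortens the second term but lengthens the first, so minimizing the total length enforces the parsimony these papers are after.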
Causal Discovery via MML
In Proceedings of the Thirteenth International Conference on Machine Learning, 1996
"... Automating the learning of causal models from sample data is a key step toward incorporating machine learning into decisionmaking and reasoning under uncertainty. This paper presents a Bayesian approach to the discovery of causal models, using a Minimum Message Length (MML) method. We have developed ..."
Abstract

Cited by 20 (10 self)
Abstract: Automating the learning of causal models from sample data is a key step toward incorporating machine learning into decision-making and reasoning under uncertainty. This paper presents a Bayesian approach to the discovery of causal models, using a Minimum Message Length (MML) method. We have developed encoding and search methods for discovering linear causal models. The initial experimental results presented in this paper show that the MML induction approach can recover causal models from generated data which are quite accurate reflections of the original models, and compare favorably with those of TETRAD II (Spirtes et al. 1994) even when it is supplied with prior temporal information and MML is not.
Bayesian Estimation of the von Mises Concentration Parameter
Proceedings of the Fifteenth International Workshop on Maximum Entropy and Bayesian Methods
"... The von Mises distribution is a maximum entropy distribution. It corresponds to the distribution of an angle of a compass needle in a uniform magnetic field of direction, , with concentration parameter, . The concentration parameter, , is the ratio of the field strength to the temperature of thermal ..."
Abstract

Cited by 7 (5 self)
Abstract: The von Mises distribution is a maximum entropy distribution. It corresponds to the distribution of the angle of a compass needle in a uniform magnetic field of direction μ, with concentration parameter κ. The concentration parameter κ is the ratio of the field strength to the temperature of thermal fluctuations. Previously, we obtained a Bayesian estimator for the von Mises distribution parameters using the information-theoretic Minimum Message Length (MML) principle. Here, we examine a variety of Bayesian estimation techniques by examining the posterior distribution in both polar and Cartesian coordinates. We compare the MML estimator with these other Bayesian techniques, and with a range of classical estimators. We find that the Bayesian estimators outperform the classical estimators.
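For context on what the classical estimators being compared look like, here is a sketch of the standard maximum-likelihood estimate of the concentration κ; this is not the paper's MML estimator, and the sample size, brackets, and helper name kappa_mle are illustrative assumptions.

```python
# Illustrative sketch: the classical maximum-likelihood estimate of the
# von Mises concentration kappa (not the paper's MML estimator).
# The MLE solves  I1(kappa)/I0(kappa) = Rbar,  where Rbar is the mean
# resultant length of the sample angles.
import numpy as np
from scipy.special import i0e, i1e  # scaled Bessel; scaling cancels in ratio
from scipy.optimize import brentq

def kappa_mle(angles):
    C, S = np.mean(np.cos(angles)), np.mean(np.sin(angles))
    rbar = np.hypot(C, S)  # mean resultant length, in [0, 1]
    if rbar >= 1.0:
        raise ValueError("degenerate sample: all angles identical")
    # A(kappa) = I1/I0 is increasing in kappa; bracket and solve.
    # (The bracket suffices for moderately concentrated data.)
    return brentq(lambda k: i1e(k) / i0e(k) - rbar, 1e-8, 1e4)

rng = np.random.default_rng(1)
sample = rng.vonmises(mu=0.3, kappa=5.0, size=500)
print(kappa_mle(sample))  # close to 5 for a sample this size
```

The ML estimate of κ is known to be biased upward for small samples, which is part of what motivates Bayesian alternatives such as the MML estimator compared here.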
A unified view on clustering binary data
 Machine Learning
"... Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. This paper studies the problem of clustering binary data. Binary data have been occupying a special place in the domain of dat ..."
Abstract

Cited by 7 (1 self)
Abstract: Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. This paper studies the problem of clustering binary data. Binary data occupy a special place in the domain of data analysis. A unified view of binary data clustering is presented by examining the connections among various clustering criteria. Experimental studies are conducted to empirically verify the relationships.
MDL and MML: Similarities and Differences (Introduction to Minimum Encoding Inference, Part III)
1994
"... This paper continues the introduction to minimum encoding inductive inference given by Oliver and Hand. This series of papers was written with the objective of providing an introduction to this area for statisticians. We describe the message length estimates used in Wallace's Minimum Message Length ..."
Abstract

Cited by 6 (0 self)
Abstract: This paper continues the introduction to minimum-encoding inductive inference given by Oliver and Hand. This series of papers was written with the objective of providing an introduction to this area for statisticians. We describe the message length estimates used in Wallace's Minimum Message Length (MML) inference and Rissanen's Minimum Description Length (MDL) inference. The differences in the message length estimates of the two approaches are explained, and the implications of these differences for applications are discussed.
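As a rough sketch of the two message-length estimates being contrasted (standard textbook forms, stated here as background rather than taken from this paper):

```latex
% Wallace-Freeman (MML87) message length for a model with d continuous
% parameters, prior h(\theta), likelihood f(x|\theta), Fisher information
% determinant |F(\theta)|, and lattice quantization constant \kappa_d:
\mathrm{MsgLen}(\theta, x) \approx -\log h(\theta)
  + \tfrac{1}{2}\log\lvert F(\theta)\rvert
  - \log f(x \mid \theta) + \tfrac{d}{2}\bigl(1 + \log \kappa_d\bigr)

% Rissanen's two-part MDL code length, evaluated at the maximum-likelihood
% estimate and expanded to leading order in the sample size n:
L(x) = -\log f(x \mid \hat{\theta}) + \tfrac{d}{2}\log n + O(1)
```

Note that MML retains an explicit prior and states an estimate θ, while the MDL form depends only on the maximized likelihood and a complexity term in n; this is one of the differences the paper explains.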
Bayesian Approaches to Segmenting a Simple Time Series
1997
"... The segmentation problem arises in many applications in data mining, A.I. and statistics. In this paper, we consider segmenting simple time series. We develop two Bayesian approaches for segmenting a time series, namely the Bayes Factor approach, and the Minimum Message Length (MML) approach. We per ..."
Abstract

Cited by 6 (1 self)
Abstract: The segmentation problem arises in many applications in data mining, AI, and statistics. In this paper, we consider segmenting simple time series. We develop two Bayesian approaches for segmenting a time series, namely the Bayes Factor approach and the Minimum Message Length (MML) approach. We perform simulations comparing these Bayesian approaches, and then perform a comparison with other classical approaches, namely AIC, MDL and BIC. We conclude that the MML criterion is the preferred criterion. We then apply the segmentation method to financial time series data. Introduction: In this paper, we consider the problem of segmenting simple time series. We consider time series of the form y_{t+1} = y_t + μ_j + ε_t, where we are given N data points (y_1, ..., y_N), we assume there are C + 1 segments (j ∈ {0, ..., C}), and each ε_t is Gaussian with mean zero and variance σ_j². We wish to estimate the number of segments, C + 1, the segment boundaries, {v_1, ...
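To make the abstract's model concrete, the sketch below scores a candidate segmentation of the differenced series under that model (increments Gaussian with per-segment mean and variance). BIC is used as a simple stand-in penalty; the paper's MML and Bayes-factor criteria differ, so the penalty, the synthetic data, and the helper name bic_score are assumptions for illustration.

```python
# Illustrative sketch of the model in the abstract: within segment j the
# increments d_t = y_{t+1} - y_t are N(mu_j, sigma_j^2).  A candidate set
# of segment boundaries is scored with BIC (a stand-in for MML here).
import numpy as np

def bic_score(y, boundaries):
    """BIC of a segmentation; boundaries index into the increments diff(y)."""
    d = np.diff(y)
    edges = [0] + sorted(boundaries) + [len(d)]
    loglik, n_params = 0.0, 0
    for a, b in zip(edges[:-1], edges[1:]):
        seg = d[a:b]
        var = seg.var()  # per-segment MLE of sigma_j^2
        # Gaussian log-likelihood at the MLE: -n/2 * (log(2*pi*var) + 1)
        loglik += -0.5 * len(seg) * (np.log(2 * np.pi * var) + 1)
        n_params += 2    # mu_j and sigma_j^2 for each segment
    return -2 * loglik + n_params * np.log(len(d))

# Random-walk data with two drift regimes: mu = 0, then mu = 1.
rng = np.random.default_rng(2)
increments = rng.normal(loc=[0.0] * 100 + [1.0] * 100, scale=0.5)
y = np.concatenate([[0.0], np.cumsum(increments)])
# The true boundary at t = 100 should score (strictly) lower than no split.
print(bic_score(y, [100]), bic_score(y, []))
```

Searching over the number and placement of boundaries with a criterion like this is the computation the paper compares across MML, Bayes factors, AIC, MDL, and BIC.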
Incremental Methods for Bayesian Network Learning
Department de ..., 1999
"... In this work we analyze the most relevant, in our opinion, algorithms for learning Bayesian Networks. We analyze methods that use goodnessoffit tests between tentative networks and data. Within this sort of learning algorithms we distinguish batch and incremental methods. Finally, we propose a sys ..."
Abstract

Cited by 6 (1 self)
Abstract: In this work we analyze the most relevant, in our opinion, algorithms for learning Bayesian networks. We analyze methods that use goodness-of-fit tests between tentative networks and data. Within this sort of learning algorithm we distinguish batch and incremental methods. Finally, we propose a system, called BANDOLER, that incrementally learns Bayesian networks from data and prior knowledge. The incremental nature of the system allows the learning strategy to be modified, and new prior knowledge to be introduced, during the learning process in the light of the already learnt structure. Introduction: The aim of this work is twofold. On the one hand, we introduce the state of the art on learning Bayesian networks; it is intended to be a tutorial on the learning methods based on goodness-of-fit tests. We present the most significant learning algorithms found in the literature, in our opinion, as well as the theory they are based on. On the other hand, we propose a research framework. The fiel...