Bayesian Approaches to Gaussian Mixture Modelling
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1998
Cited by 73 (2 self)
A Bayesian-based methodology is presented which automatically penalises over-complex models being fitted to unknown data. We show that, with a Gaussian mixture model, the approach is able to select an 'optimal' number of components in the model and so partition data sets. The performance of the Bayesian method is compared to other methods of optimal model selection and found to give good results. The methods are tested on synthetic and real data sets.
Introduction
Scientific disciplines generate data. In the attempt to understand the patterns present in such data sets, methods which perform some form of unsupervised partitioning or modelling are particularly useful. Such an approach is only of use, however, if it offers a less complex representation of the data than the data set itself. This introduces an apparent conflict, as any model improves its fit to the data monotonically with increases in its complexity (the number of model parameters); a model as complex as the data...
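The trade-off between fit and complexity described in this abstract can be illustrated with a small sketch. Note the assumptions: the paper's own Bayesian criterion is not given in the abstract, so the Bayesian Information Criterion (BIC) is swapped in here as a stand-in penalty, the mixture is one-dimensional, and the function names (`fit_gmm_1d`, `bic_by_k`) and the toy data are hypothetical.

```python
import math

def log_likelihood(data, weights, means, variances):
    """Total log-likelihood of 1-D data under a Gaussian mixture."""
    total = 0.0
    for x in data:
        p = sum(w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances))
        total += math.log(p + 1e-300)  # tiny constant guards against log(0)
    return total

def fit_gmm_1d(data, k, iters=100, var_floor=1e-3):
    """Fit a k-component 1-D mixture with plain EM (assumes 1 <= k <= len(data))."""
    n = len(data)
    xs = sorted(data)
    # Initialise the means at evenly spaced quantiles of the data.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(data) / n
    spread = max(sum((x - grand_mean) ** 2 for x in data) / n, var_floor)
    variances = [spread] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            dens = [w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            norm = sum(dens) + 1e-300
            resp.append([d / norm for d in dens])
        # M-step: re-estimate weights, means and (floored) variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, var_floor)
            weights[j] = nj / n
    return weights, means, variances

def bic_by_k(data, max_k):
    """BIC score for each candidate number of components (lower is better)."""
    n = len(data)
    scores = {}
    for k in range(1, max_k + 1):
        w, m, v = fit_gmm_1d(data, k)
        n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
        scores[k] = -2.0 * log_likelihood(data, w, m, v) + n_params * math.log(n)
    return scores

# Toy data (an assumption): two well-separated clusters around 0 and 10.
data = [i * 0.1 for i in range(-10, 11)] + [10 + i * 0.1 for i in range(-10, 11)]
scores = bic_by_k(data, 3)
best_k = min(scores, key=scores.get)
```

The log-likelihood alone tends to keep improving as components are added; the penalty term is what lets a criterion of this kind stop at a small number of components.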
On predictive distributions and Bayesian networks
 Statistics and Computing
, 2000
Cited by 38 (29 self)
In this paper we are interested in discrete prediction problems for a decision-theoretic setting, where the
On Pruning and Averaging Decision Trees
 In Proceedings of the Twelfth International Conference on Machine Learning
, 1995
Cited by 37 (0 self)
Pruning a decision tree is considered by some researchers to be the most important part of tree building in noisy domains. While there are many approaches to pruning, an alternative approach of averaging over decision trees has not received as much attention. We perform an empirical comparison of pruning with the approach of averaging over decision trees. For this comparison we use a computationally efficient method of averaging, namely averaging over the extended fanned set of a tree. Since there is a wide range of approaches to pruning, we compare tree averaging with a traditional pruning approach, along with an optimal pruning approach.
Introduction to Minimum Encoding Inference
 DEPT. OF STATISTICS, OPEN UNIVERSITY, WALTON HALL, MILTON
, 1994
Cited by 23 (4 self)
This paper examines the minimum-encoding approaches to inference, Minimum Message Length (MML) and Minimum Description Length (MDL). This paper was written with the objective of providing an introduction to this area for statisticians. We describe coding techniques for data, and examine how these techniques can be applied to perform inference and model selection.
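As a rough illustration of the idea common to minimum-encoding methods, the length of a two-part message that first states a hypothesis and then the data encoded under that hypothesis can be written as below. The symbols H (hypothesis), D (data) and I (message length in bits) are notation assumed here, not taken from the paper.

```latex
I(H, D) = I(H) + I(D \mid H) = -\log_2 P(H) - \log_2 P(D \mid H)
```

Minimising the total length trades the cost of stating the model (first term) against how well the model compresses the data (second term).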
Minimum Encoding Approaches for Predictive Modeling
 Proceedings of the 14th International Conference on Uncertainty in Artificial Intelligence (UAI'98)
, 1998
Cited by 19 (13 self)
We analyze differences between two information-theoretically motivated approaches to statistical inference and model selection: the Minimum Description Length (MDL) principle, and the Minimum Message Length (MML) principle. Based on this analysis, we present two revised versions of MML: a pointwise estimator which gives the MML-optimal single parameter model, and a volumewise estimator which gives the MML-optimal region in the parameter space. Our empirical results suggest that with small data sets, the MDL approach yields more accurate predictions than the MML estimators. The empirical results also demonstrate that the revised MML estimators introduced here perform better than the original MML estimator suggested by Wallace and Freeman.
1 INTRODUCTION
Two related but distinct approaches to statistical inference and model selection are the Minimum Description Length (MDL) principle (Rissanen, 1978, 1987, 1996), and the Minimum Message Length (MML) principle (Wallace & Boulton, 1968; W...
Circular Clustering Of Protein Dihedral Angles By Minimum Message Length
 In Proceedings of the 1st Pacific Symposium on Biocomputing (PSB1
, 1996
Cited by 14 (11 self)
this paper is given in [DADH95] and is available from ftp://www.cs.monash.edu.au/www/publications/1995/TR237.ps.Z.) Section 2 introduces the MML principle and how it can be used for this circular clustering problem. The remaining sections give the results of the secondary structure groups [KaSa83] that resulted from applying Snob to cluster our dihedral angle data.
Automatic Bias Learning: An Inquiry into the Inductive Basis of Induction
, 1999
Cited by 10 (5 self)
This thesis combines an epistemological concern about induction with a computational exploration of inductive mechanisms. It aims to investigate how inductive performance could be improved by using induction to select appropriate generalisation procedures. The thesis revolves around a metalearning system, called designed to investigate how inductive performances could be improved by using induction to select appropriate generalisation procedures. The performance of is discussed against the background of epistemological issues concerning induction, such as the role of theoretical vocabularies and the value of simplicity.
Minimum Message Length Grouping of Ordered Data
 Proceedings of the Eleventh International Conference on Algorithmic Learning Theory (ALT2000), LNAI
, 2000
Cited by 7 (4 self)
Explicit segmentation is the partitioning of data into homogeneous regions by specifying cutpoints. W. D. Fisher (1958) gave an early example of explicit segmentation based on the minimisation of squared error. Fisher called this the grouping problem and came up with a polynomial time Dynamic Programming Algorithm (DPA). Oliver, Baxter and colleagues (1996, 1997, 1998) have applied the information-theoretic Minimum Message Length (MML) principle to explicit segmentation. Given a series of multivariate data, approximate it by a piecewise constant function. How many cutpoints are there? What are the means and variances of each segment? Where should the cutpoints be placed? The simplest model is a single segment. The most complex model has one segment per data point. The best model is generally somewhere between these extremes. Only by considering model complexity can a reasonable inference be made.
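The squared-error grouping problem mentioned above can be sketched with a standard dynamic programme. This is a minimal illustration under stated assumptions: the data are univariate, the number of segments `k` is fixed in advance, and the function names are hypothetical. It implements the squared-error criterion only, not the MML criterion, which would also choose the number of segments.

```python
def segment_sse(prefix, prefix_sq, i, j):
    """Sum of squared errors of data[i:j] about its own mean."""
    n = j - i
    s = prefix[j] - prefix[i]
    sq = prefix_sq[j] - prefix_sq[i]
    return sq - s * s / n

def fisher_grouping(data, k):
    """Split an ordered sequence into k contiguous segments minimising total SSE.

    Returns (minimal total SSE, sorted list of cutpoint indices), where a
    cutpoint c means a new segment starts at data[c].
    """
    n = len(data)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(data)")
    # Prefix sums let each segment cost be computed in O(1).
    prefix, prefix_sq = [0.0], [0.0]
    for x in data:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)
    inf = float("inf")
    # cost[m][j]: best SSE for splitting data[:j] into m segments.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = cost[m - 1][i] + segment_sse(prefix, prefix_sq, i, j)
                if c < cost[m][j]:
                    cost[m][j] = c
                    back[m][j] = i
    # Walk the back-pointers to recover where each segment starts.
    cuts, j = [], n
    for m in range(k, 0, -1):
        i = back[m][j]
        if i > 0:
            cuts.append(i)
        j = i
    cuts.reverse()
    return cost[k][n], cuts

# Example: a step from 1 to 5 is recovered as one cutpoint at index 3.
total_sse, cutpoints = fisher_grouping([1, 1, 1, 5, 5, 5], 2)
```

With `k` fixed, the total squared error can only stay the same or fall as `k` grows, which is why a complexity-aware criterion is needed to decide how many cutpoints to use.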
Groupwise non-rigid registration: The minimum description length approach
 PROCEEDINGS OF THE BRITISH MACHINE VISION CONFERENCE (BMVC
, 2004
Cited by 6 (2 self)
The principled non-rigid registration of groups of images requires a fully groupwise objective function. We consider the problem as one of finding the optimal dense correspondence between the images in the set, where optimality is defined using the Minimum Description Length (MDL) principle, that the transmission of a model of the data, together with the parameters of that model, should be as short as possible. We demonstrate that this approach provides a suitable objective function by applying it to the task of non-rigid registration of a set of 2D T1-weighted MR images of the human brain. Furthermore, we show that even in the case when substantial portions of the images are missing, the algorithm not only converges to the correct solution, but also allows meaningful integration of image data across the training set, allowing the original image to be reconstructed.
MML and Bayesianism: Similarities and Differences (Introduction to Minimum Encoding Inference - Part II)
, 1994
Cited by 6 (0 self)
This paper continues the introduction to minimum encoding inference given by Oliver and Hand. This series of papers was written with the objective of providing an introduction to this area for statisticians. We examine the relationship between Bayesianism and Minimum Message Length (MML) inference. We argue that MML augments Bayesian methods by providing a sound Bayesian method for point estimation which is invariant under nonlinear transformations. We explore the issues of invariance of estimators under nonlinear transformations, the role of the Fisher Information matrix in MML inference, and the apparent similarity between MML and the adoption of a Jeffreys' Prior. We then compare MML to an approximate method of Bayesian Model Class Selection. Despite apparent similarities in their expressions, the properties of the two approaches can be different.
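The role of the Fisher Information matrix, and the resemblance to a Jeffreys' Prior, can be seen in the commonly quoted Wallace and Freeman approximation to the message length. The form below is a sketch from the general MML literature, not reproduced from this paper; the notation is assumed here: h is the prior density, f the likelihood, F the Fisher Information matrix, d the number of parameters and \kappa_d a lattice constant.

```latex
I(\theta, x) \approx -\log h(\theta) + \tfrac{1}{2}\log\lvert F(\theta)\rvert
  - \log f(x \mid \theta) + \tfrac{d}{2}\left(1 + \log \kappa_d\right)
```

If the prior happens to be a Jeffreys' Prior, h(\theta) \propto \sqrt{\lvert F(\theta)\rvert}, the first two terms reduce to a constant and minimising the expression over \theta coincides with maximising the likelihood. For other priors the two generally differ, which is one way to read the "apparent similarity" noted in the abstract.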