Results 1–10 of 12
Minimum Message Length and Kolmogorov Complexity
Computer Journal, 1999
Cited by 104 (25 self)
Abstract:
The aim of this paper is to describe some of the relationships among the different streams and to try to clarify some of the important differences in their assumptions and development. Other studies mentioning the relationships appear in [1, Section IV, pp. 1038–1039], [2, Sections 5.2, 5.5] and [3, p. 465].
MML clustering of multistate, Poisson, von Mises circular and Gaussian distributions
Statistics and Computing, 2000
Cited by 32 (10 self)
Abstract:
Minimum Message Length (MML) is an invariant Bayesian point estimation technique which is also statistically consistent and efficient. We provide a brief overview of MML inductive inference.
A Fast and Robust General Purpose Clustering Algorithm
In Pacific Rim International Conference on Artificial Intelligence, 2000
Cited by 16 (2 self)
Abstract:
General-purpose and highly applicable clustering methods are usually required during the early stages of knowledge discovery exercises. k-Means has been adopted as the prototype of iterative model-based clustering because of its speed, simplicity and capability to work within the format of very large databases. However, k-Means has several disadvantages derived from its statistical simplicity. We propose an algorithm that remains very efficient, generally applicable and multidimensional, but is more robust to noise and outliers. We achieve this by using the discrete median rather than the mean as the estimator of the center of a cluster. Comparison with k-Means, Expectation Maximization and Gibbs sampling demonstrates the advantages of our algorithm.
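The abstract's key move (using a discrete median, i.e. an actual data element, in place of the mean as the cluster centre) can be sketched roughly as below. The Lloyd-style iteration, function names and tie-breaking here are illustrative assumptions, not the authors' published algorithm:

```python
import random

def discrete_median(points):
    """Return the cluster member minimising the total Euclidean distance
    to all other members (the 'discrete median': always a data element)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(points, key=lambda p: sum(dist(p, q) for q in points))

def k_medians(points, k, iters=20, seed=0):
    """Lloyd-style iteration with discrete medians as cluster centres."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: sum((a - b) ** 2
                                            for a, b in zip(p, centres[i])))
            clusters[nearest].append(p)
        # An empty cluster keeps its old centre.
        centres = [discrete_median(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return centres
```

Because the centre is always a data element, a single distant outlier shifts it far less than it shifts a mean, which is the robustness the abstract claims.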
Bayes not Bust! Why Simplicity is no Problem for Bayesians
2007
Cited by 13 (10 self)
Abstract:
The advent of formal definitions of the simplicity of a theory has important implications for model selection. But what is the best way to define simplicity? Forster and Sober ([1994]) advocate the use of Akaike’s Information Criterion (AIC), a non-Bayesian formalisation of the notion of simplicity. This forms an important part of their wider attack on Bayesianism in the philosophy of science. We defend a Bayesian alternative: the simplicity of a theory is to be characterised in terms of Wallace’s Minimum Message Length (MML). We show that AIC is inadequate for many statistical problems where MML performs well. Whereas MML is always defined, AIC can be undefined. Whereas MML is not known ever to be statistically inconsistent, AIC can be. Even when defined and consistent, AIC performs worse than MML on small sample sizes. MML is statistically invariant under one-to-one reparametrisation, thus avoiding a common criticism of Bayesian approaches. We also show that MML provides answers to many of Forster’s objections to Bayesianism. Hence an important part of the attack on ...
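For readers unfamiliar with the criterion under discussion, AIC = 2k − 2 ln L̂, where k is the number of free parameters and L̂ the maximised likelihood. A toy comparison (Gaussian data with an assumed known unit variance; the data values are purely illustrative) shows how the penalty term can favour the simpler model:

```python
import math

def aic(log_likelihood, k):
    """Akaike's Information Criterion: AIC = 2k - 2 ln(max likelihood)."""
    return 2 * k - 2 * log_likelihood

def gaussian_loglik(data, mu, sigma=1.0):
    """Log-likelihood of data under N(mu, sigma^2)."""
    return sum(-0.5 * math.log(2 * math.pi * sigma ** 2)
               - (x - mu) ** 2 / (2 * sigma ** 2) for x in data)

data = [0.1, -0.2, 0.05, 0.3, -0.1]

# Model A: mean fixed at 0 (no free parameters).
aic_fixed = aic(gaussian_loglik(data, 0.0), k=0)

# Model B: mean fitted by maximum likelihood (one free parameter).
mu_hat = sum(data) / len(data)
aic_fitted = aic(gaussian_loglik(data, mu_hat), k=1)

# The tiny likelihood gain does not pay for the extra parameter,
# so AIC prefers the fixed-mean model on this data.
```

The abstract's point is that MML plays this complexity-penalising role too, but (it argues) with better statistical properties.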
Robust DistanceBased Clustering with Applications to Spatial Data Mining
1999
Cited by 12 (2 self)
Abstract:
In this paper, we present a method for clustering georeferenced data suitable for applications in spatial data mining, based on the medoid method. The medoid method is related to k-Means, with the restriction that cluster representatives be chosen from among the data elements. Although the medoid method in general produces clusters of high quality, especially in the presence of noise, it is often criticized for the Ω(n²) time that it requires. Our method incorporates both proximity and density information to achieve high-quality clusters in subquadratic time; it does not require that the user specify the number of clusters in advance. The time bound is achieved by means of a fast approximation to the medoid objective function, using Delaunay triangulations to store proximity information.
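For concreteness, the medoid objective being approximated can be written in a few lines. The exhaustive search below only pins down the objective (it is even slower than the Ω(n²) swap heuristics the abstract criticises); the names are illustrative and the paper's Delaunay-based approximation is not reproduced:

```python
import math
from itertools import combinations

def medoid_cost(points, medoids):
    """k-medoid objective: total distance from each point to its
    nearest medoid (medoids must themselves be data elements)."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points)

def best_medoids_exhaustive(points, k):
    """Naive exact minimisation over all size-k subsets of the data.
    Exponential in k -- shown only to define what is being optimised."""
    return min(combinations(points, k),
               key=lambda ms: medoid_cost(points, ms))
```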
Minimum Message Length Grouping of Ordered Data
Proceedings of the Eleventh International Conference on Algorithmic Learning Theory (ALT 2000), LNAI, 2000
Cited by 7 (4 self)
Abstract:
Explicit segmentation is the partitioning of data into homogeneous regions by specifying cut-points. W. D. Fisher (1958) gave an early example of explicit segmentation based on the minimisation of squared error. Fisher called this the grouping problem and came up with a polynomial-time Dynamic Programming Algorithm (DPA). Oliver, Baxter and colleagues (1996, 1997, 1998) have applied the information-theoretic Minimum Message Length (MML) principle to explicit segmentation. Given a series of multivariate data, approximate it by a piecewise constant function. How many cut-points are there? What are the means and variances of each segment? Where should the cut-points be placed? The simplest model is a single segment. The most complex model has one segment per data point. The best model is generally somewhere between these extremes. Only by considering model complexity can a reasonable inference be made.
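Fisher's grouping DPA referred to above is easy to sketch for the univariate, fixed-k, squared-error case. This is a generic least-squares segmenter under those stated assumptions, not the MML-scored version the paper develops:

```python
def fisher_grouping(xs, k):
    """W. D. Fisher's dynamic program: partition the ordered sequence xs
    into k contiguous groups minimising total within-group squared error.
    Returns (optimal cost, sorted list of cut indices)."""
    n = len(xs)
    # Prefix sums make each segment's squared error an O(1) lookup.
    prefix, prefix2 = [0.0], [0.0]
    for x in xs:
        prefix.append(prefix[-1] + x)
        prefix2.append(prefix2[-1] + x * x)

    def sse(i, j):  # squared error of segment xs[i:j], j > i
        s = prefix[j] - prefix[i]
        s2 = prefix2[j] - prefix2[i]
        return s2 - s * s / (j - i)

    INF = float("inf")
    cost = [[INF] * (k + 1) for _ in range(n + 1)]
    back = [[0] * (k + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for j in range(1, n + 1):
        for g in range(1, min(j, k) + 1):
            for i in range(g - 1, j):
                c = cost[i][g - 1] + sse(i, j)
                if c < cost[j][g]:
                    cost[j][g] = c
                    back[j][g] = i
    # Recover the cut-points by walking the back-pointers.
    cuts, j, g = [], n, k
    while g > 1:
        j = back[j][g]
        cuts.append(j)
        g -= 1
    return cost[n][k], sorted(cuts)
```

With the prefix sums, each segment cost is O(1), giving O(k·n²) time overall, which is the polynomial bound the abstract alludes to.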
ChangePoint Estimation Using New Minimum Message Length Approximations
Proc. PRICAI, 2002
Cited by 5 (2 self)
Abstract:
This paper investigates the coding of change-points in the information-theoretic Minimum Message Length (MML) framework. Change-point coding regions affect model selection and parameter estimation in problems such as time series segmentation and decision trees. The Minimum Message Length (MML) and Minimum Description Length (MDL78) approaches to change-point problems have been shown to perform well by several authors. In this paper we compare some published MML and MDL78 methods and introduce some new MML approximations called 'MMLDc' and 'MMLDF'. These new approximations are empirically compared with Strict MML (SMML), Fairly Strict MML (FSMML), MML68, the Minimum Expected Kullback-Leibler Distance (MEKLD) loss function and MDL78 on a tractable binomial change-point problem.
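To make the two-part coding idea concrete for the binomial change-point setting, here is an MDL78-flavoured sketch: each model pays its negative log-likelihood plus an assumed (1/2)·log₂ n bits per parameter, and the one-change model also pays log₂ n bits to state the cut-point. This is a hedged illustration only; it is not SMML, FSMML, MML68, MEKLD, or the paper's MMLDc/MMLDF approximations:

```python
import math

def nll_bits(seq):
    """-log2 likelihood of a binary segment under its MLE bias."""
    n, k = len(seq), sum(seq)
    if k in (0, n):
        return 0.0
    p = k / n
    return -(k * math.log2(p) + (n - k) * math.log2(1 - p))

def codelength_no_change(seq):
    """One binomial throughout: data cost plus one parameter cost."""
    return nll_bits(seq) + 0.5 * math.log2(len(seq))

def codelength_one_change(seq):
    """Best single change-point: cut-point cost plus two segments,
    each with its own parameter cost. Returns (bits, cut index)."""
    n = len(seq)
    best, best_c = float("inf"), None
    for c in range(1, n):
        L = (math.log2(n)                                  # state the cut
             + nll_bits(seq[:c]) + 0.5 * math.log2(c)      # left segment
             + nll_bits(seq[c:]) + 0.5 * math.log2(n - c)) # right segment
        if L < best:
            best, best_c = L, c
    return best, best_c
```

On a sequence with an abrupt shift, the one-change message is much shorter, so the change-point model is selected; on homogeneous data, the extra cut-point and parameter costs make it lose.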
MML, HYBRID BAYESIAN NETWORK GRAPHICAL MODELS, STATISTICAL CONSISTENCY, INVARIANCE AND UNIQUENESS
Cited by 4 (3 self)
Abstract:
The problem of statistical — or inductive — inference pervades a large number of human activities and a large number of (human and non-human) actions requiring ‘intelligence’. Human and other ‘intelligent’ activity often entails making inductive inferences, remembering and recording observations from which one can make ...
Bayesian Posterior Comprehension via Message from Monte Carlo
2003
Cited by 2 (2 self)
Abstract:
We discuss the problem of producing an epitome, or brief summary, of a Bayesian posterior distribution, and then investigate a general solution based on the Minimum Message Length (MML) principle. Clearly, the optimal criterion for choosing such an epitome is determined by the epitome's intended use. The interesting general case is where this use is unknown since, in order to be practical, the choice of epitome criterion becomes subjective. We identify a number of desirable properties that an epitome could have: facilitation of point estimation, human comprehension, and fast approximation of posterior expectations. We call these the properties of Bayesian Posterior Comprehension and show that the Minimum Message Length principle can be viewed as an epitome criterion that produces epitomes having these properties. We then present and extend Message from Monte Carlo as a means for constructing instantaneous Minimum Message Length codebooks (and epitomes) using Markov Chain Monte Carlo methods. The Message from Monte Carlo methodology is illustrated for binary regression, generalised linear model, and multiple change-point problems.
Convex Group Clustering of Large Georeferenced Data Sets
In Abstracts for the Eleventh Canadian Conference on Computational Geometry (CCCG'99), 1999
Cited by 1 (0 self)
Abstract:
Clustering partitions a data set S = {s₁, …, sₙ} ⊆ ℝᵐ into groups of nearby points. Distance-based clustering uses optimisation criteria for defining the quality of the partition. Formulations using representatives (means or medians of groups) have received much more attention than minimisation of the total within-group distance (TWGD). However, this non-representative approach has attractive properties while remaining distance-based. While representative approaches produce partitions with non-overlapping clusters, TWGD does not. We investigate the restriction of TWGD to producing convex-hull disjoint groups and show that this problem is NP-complete in the Euclidean case as soon as m ≥ 2. Nevertheless we provide efficient algorithms for solving it approximately. Keywords: clustering, optimisation, computational geometry, problem complexity, data mining in spatial databases. 1 Introduction. Clustering is a fundamental task in data analysis since it identifies groups in heterog...
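The TWGD criterion contrasted here with representative-based formulations is simply the sum of pairwise distances inside each group, with no means or medians involved. A minimal sketch (Euclidean distance assumed; function name illustrative):

```python
import math
from itertools import combinations

def twgd(groups, dist=math.dist):
    """Total within-group distance: sum of pairwise distances inside
    each group; no cluster representatives (means/medians) appear."""
    return sum(dist(a, b) for g in groups for a, b in combinations(g, 2))

# Two groups of two points each: within-group distances 3 and 4, TWGD = 7.
example = twgd([[(0, 0), (0, 3)], [(4, 0), (4, 4)]])
```

Note that the criterion only ever looks inside groups, which is why, as the abstract observes, minimising it does not by itself force the groups to be spatially disjoint.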