Minimum Message Length and Kolmogorov Complexity
 Computer Journal
, 1999
Abstract

this paper is to describe some of the relationships among the different streams and to try to clarify some of the important differences in their assumptions and development. Other studies mentioning the relationships appear in [1, Section IV, pp. 1038-1039], [2, sections 5.2, 5.5] and [3, p. 465].
Dynamic clustering using particle swarm optimization with application in unsupervised image segmentation
 2005
Abstract

A new dynamic clustering approach (DCPSO), based on Particle Swarm Optimization, is proposed. This approach is applied to unsupervised image classification. The proposed approach automatically determines the "optimum" number of clusters and simultaneously clusters the data set with minimal user interference. The algorithm starts by partitioning the data set into a relatively large number of clusters to reduce the effects of initial conditions. Using binary particle swarm optimization, the "best" number of clusters is selected. The centers of the chosen clusters are then refined via the K-means clustering algorithm. The experiments conducted show that the proposed approach generally found the "optimum" number of clusters on the tested images.
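The final refinement step mentioned in this abstract can be sketched as a plain K-means (Lloyd) iteration. This is a minimal one-dimensional illustration, not the DCPSO implementation: the binary PSO selection of centers is not shown, and the function name `kmeans_refine`, the sample points and the starting centers are all hypothetical.

```python
def kmeans_refine(points, centers, iterations=20):
    """Refine 1-D cluster centers with Lloyd's K-means iteration."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        groups = [[] for _ in centers]
        for x in points:
            nearest = min(range(len(centers)), key=lambda j: (x - centers[j]) ** 2)
            groups[nearest].append(x)
        # Update step: move each center to the mean of its group.
        # An empty group keeps its previous center.
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    return centers


# Hypothetical data: two well-separated groups and two rough starting centers.
sample_points = [0.0, 0.2, 0.4, 5.0, 5.2, 5.4]
refined_centers = kmeans_refine(sample_points, [1.0, 4.0])
```

In the approach described above, the starting centers would be the ones chosen by the binary particle swarm rather than fixed by hand.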
MML mixture modelling of multistate, Poisson, von Mises circular and Gaussian distributions
 In Proc. 6th Int. Workshop on Artif. Intelligence and Statistics
, 1997
Abstract

Minimum Message Length (MML) is an invariant Bayesian point estimation technique which is also consistent and efficient. We provide a brief overview of MML inductive inference (Wallace and Boulton (1968), Wallace and Freeman (1987)), and how it has both an information-theoretic and a Bayesian interpretation. We then outline how MML is used for statistical parameter estimation, and how the MML mixture modelling program, Snob (Wallace and Boulton (1968), Wallace (1986), Wallace and Dowe (1994)) uses the message lengths from various parameter estimates to enable it to combine parameter estimation with selection of the number of components. The message length is (to within a constant) the negative logarithm of the posterior probability of the theory. So, the MML theory can also be regarded as the theory with the highest posterior probability. Snob currently assumes that variables are uncorrelated, and permits multivariate data from Gaussian, discrete multistate, Poisson and von Mises circular dist...
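The link between message length and posterior probability can be illustrated with a small discrete example. The sketch below assumes a finite set of candidate theories (coin biases) with a uniform prior; the hypotheses, the data counts and the function names are hypothetical. It is not the computation performed by Snob, which deals with continuous parameters.

```python
import math


def message_length_bits(prior, likelihood):
    """Two-part message length: bits to state the theory, then the data given it."""
    return -math.log2(prior) - math.log2(likelihood)


def coin_likelihood(p_heads, heads, tails):
    """Probability of one particular sequence with the given head and tail counts."""
    return p_heads ** heads * (1.0 - p_heads) ** tails


def shortest_message(priors, heads, tails):
    """Return the theory with the minimum message length, plus all lengths."""
    lengths = {
        p: message_length_bits(prior, coin_likelihood(p, heads, tails))
        for p, prior in priors.items()
    }
    return min(lengths, key=lengths.get), lengths


# Hypothetical setup: three candidate biases, uniform prior, 7 heads and 3 tails.
priors = {0.25: 1 / 3, 0.5: 1 / 3, 0.75: 1 / 3}
best_theory, lengths = shortest_message(priors, heads=7, tails=3)
```

Because the length is the negative logarithm of prior times likelihood, the theory with the shortest message is also the one with the highest posterior probability among the candidates.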
Intrinsic Classification by MML—the Snob Program
 Proc. Seventh Australian Joint Conf. Artificial Intelligence
, 1994
Abstract

We provide a brief overview of Minimum Message Length (MML) inductive inference (Wallace and Boulton (1968), Wallace and Freeman (1987)). We then outline how MML is used for statistical parameter estimation, and how the MML intrinsic classification program, Snob (Wallace and Boulton (1968), Wallace (1986), Wallace (1990)) uses the message lengths from various parameter estimates to enable it to combine parameter estimation with model selection in intrinsic classification. We mention here the most recent extensions to Snob, permitting Poisson and von Mises circular distributions. We also survey some applications of Snob (albeit briefly), and further provide some documentation on how the user can guide Snob’s search through various models of the given data to try to obtain that model whose message length is a minimum.
CIRCULAR CLUSTERING BY MINIMUM MESSAGE LENGTH OF PROTEIN DIHEDRAL ANGLES
, 1995
Abstract

Early work on proteins identified the existence of helices and extended sheets in protein secondary structures, a high-level classification which remains popular today. Using the Snob program for information-theoretic Minimum Message Length (MML) intrinsic classification, we are able to take the protein dihedral angles as determined by X-ray crystallography, and cluster sets of dihedral angles into groups. Previous work by Hunter and States had applied a similar Bayesian classification method, AutoClass, to protein data with site position represented by 3 Cartesian coordinates for each of the α-Carbon, β-Carbon and Nitrogen, totalling 9 coordinates. By using the von Mises circular distribution in the Snob program rather than the Normal distribution in the Hunter and States model, we are instead able to represent local site properties by the two dihedral angles, φ and ψ. Since each site can be modelled as having 2 degrees of freedom, this orientation-invariant dihedral angle representation of the data is more compact than that of nine highly correlated Cartesian coordinates. Using the information-theoretic message length concepts discussed in the paper, such a more concise model is more likely to represent the underlying generating process from which the data comes. We report on the results of our classification, plotting the classes in (φ,ψ)-space and introducing a symmetric information-theoretic distance measure to build a minimum spanning tree between the classes. We also give a transition matrix between the classes and note the existence of three classes in the region φ ≈ −1.09 rad and ψ ≈ −0.75 rad which are close on the spanning tree and have high inter-transition probabilities. These properties give rise to a tight, abundant, self-perpetuating, α-helical structure.
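To show why a circular distribution suits dihedral angles, here is a minimal sketch of the von Mises density and of a class density over two angles treated as independent. The concentration values, the function names and the series-based Bessel routine are assumptions made for illustration; only the mean angles are taken from the abstract above.

```python
import math


def bessel_i0(x, terms=60):
    """Modified Bessel function I0(x) via its power series sum(((x/2)^k / k!)^2)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        if k > 0:
            term *= (x / 2.0) / k
        total += term * term
    return total


def von_mises_pdf(theta, mu, kappa):
    """Von Mises density on the circle with mean direction mu and concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))


def dihedral_density(phi, psi, mu_phi, mu_psi, kappa_phi, kappa_psi):
    """Density of one class over (phi, psi), treating the two angles as independent."""
    return von_mises_pdf(phi, mu_phi, kappa_phi) * von_mises_pdf(psi, mu_psi, kappa_psi)


# Mean angles from the text; the concentrations (8.0) are hypothetical.
helix_like = dihedral_density(-1.0, -0.8, mu_phi=-1.09, mu_psi=-0.75,
                              kappa_phi=8.0, kappa_psi=8.0)
```

The density depends on the angle only through cos(theta - mu), so angles just either side of ±π are treated as neighbours. A Normal distribution on the raw angle values would place them far apart.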
Applying the EM-algorithm to Classification of Bacteria
 Proceedings of the International ICSC Congress on Intelligent Systems and Applications
, 2000
Abstract

In the present paper we study the use of the expectation maximization (EM) algorithm in classification. The EM-algorithm is used to calculate the probability of each vector belonging to each class. If we assign each vector to the class of maximal probability we get a classification minimizing a certain log-likelihood function. By analyzing these probabilities we get a clearer picture of how well the data fit the classification than by traditional classification methods. We define a vector to be well classified in the classification if its probability of belonging to some class is above a prescribed value 1 − ε. Then we set up the experimental procedure to filter out elements that are not well classified in a large data set describing strains of bacteria belonging to the family Enterobacteriaceae. We compare classifications with a subset of the data (containing only well classified elements) to classifications done with randomly chosen subsets. We note that classifications done with w...
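The "well classified" criterion can be sketched as follows. This shows only the class-membership probabilities for fixed parameters, not a full EM fit, and it assumes binary feature vectors with independent Bernoulli features per class; that model, the parameter values and the function names are hypothetical. The threshold is one minus a small tolerance, called `eps` in the code.

```python
def membership_probabilities(x, weights, thetas):
    """Posterior probability of each class for binary vector x.

    weights[k] is the mixing proportion of class k and thetas[k][j] is the
    probability that feature j equals 1 in class k (assumed independent).
    """
    joint = []
    for weight, theta in zip(weights, thetas):
        likelihood = weight
        for bit, p in zip(x, theta):
            likelihood *= p if bit else (1.0 - p)
        joint.append(likelihood)
    total = sum(joint)
    return [value / total for value in joint]


def is_well_classified(probs, eps):
    """True if some class has membership probability above 1 - eps."""
    return max(probs) > 1.0 - eps


# Hypothetical two-class model over three binary features.
weights = [0.5, 0.5]
thetas = [[0.9, 0.9, 0.1], [0.1, 0.2, 0.8]]
clear_case = membership_probabilities([1, 1, 0], weights, thetas)
ambiguous_case = membership_probabilities([1, 0, 0], weights, thetas)
```

Filtering a data set then amounts to keeping only the vectors for which `is_well_classified` returns True at the chosen `eps`.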
BinClass: A Software Package for Classifying Binary Vectors User's Guide
Abstract

In this document we introduce a software package BinClass for the classification of binary vectors and analysis of the classification results. First we will give a brief introduction to the mathematical foundations and theory of clustering, cumulative classification and mixture classification. We also introduce methods for analysis of the classifications including trees (dendrograms), comparison of the classifications and bootstrapping. A few pseudo-algorithms are presented. These methods are included in the software package. The third and fourth chapters are the user's guide to the actual software package. Finally a short sample session is presented to give insight into how the software actually works and to illustrate the function of some of the many parameters. Apart from being a user's guide to the software package, this document can be seen as a review of and tutorial on the classification methodology of binary data. This is due to extensive research done on the subject at our department.
Clustering Using the Minimum Message Length Criterion and Simulated Annealing
 in Proceedings of the 3rd International A.I. Workshop
Abstract

Clustering has many uses such as the generation of taxonomies and concept formation. It is essentially a search through a model space to maximise a given criterion. The criterion aims to guide the search to find models that are suitable for a purpose. The search's aim is to efficiently and consistently find the model that gives the optimal criterion value. Considerable research has occurred into the criteria to use but minimal research has studied how to best search the model space. We describe how we have used simulated annealing to search the model space to optimise the minimum message length criterion.
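The search strategy can be sketched as a generic simulated annealing loop over cluster assignments. The cost function here is a hypothetical stand-in (within-cluster squared error plus a fixed penalty per cluster), not the minimum message length criterion; the cooling schedule, the move set and all names and parameter values are likewise assumptions.

```python
import math
import random


def cost(points, labels, penalty=1.0):
    """Stand-in criterion: within-cluster squared error plus a per-cluster penalty."""
    clusters = {}
    for x, label in zip(points, labels):
        clusters.setdefault(label, []).append(x)
    squared_error = 0.0
    for members in clusters.values():
        mean = sum(members) / len(members)
        squared_error += sum((x - mean) ** 2 for x in members)
    return squared_error + penalty * len(clusters)


def anneal(points, max_clusters=4, steps=5000, t_start=5.0, t_end=0.01, seed=0):
    """Search label assignments, accepting uphill moves with probability exp(-delta/t)."""
    rng = random.Random(seed)
    labels = [rng.randrange(max_clusters) for _ in points]
    current = cost(points, labels)
    best_labels, best = labels[:], current
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / (steps - 1))
        i = rng.randrange(len(points))
        old_label = labels[i]
        labels[i] = rng.randrange(max_clusters)  # propose moving one point
        candidate = cost(points, labels)
        delta = candidate - current
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current = candidate
            if current < best:
                best_labels, best = labels[:], current
        else:
            labels[i] = old_label  # reject the move
    return best_labels, best


# Hypothetical one-dimensional data with two obvious groups.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
best_labels, best_cost = anneal(points)
```

Swapping in a different criterion only requires replacing `cost`; the search loop itself does not depend on how the criterion is defined.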
Evolving a Fuzzy Rule-Base for Image Segmentation
Abstract

A new method for color image segmentation using fuzzy logic is proposed in this paper. Our aim here is to automatically produce a fuzzy system for color classification and image segmentation with the fewest rules and the minimum error rate. Particle swarm optimization is a subclass of evolutionary algorithms inspired by the social behavior of fish, bees, birds and other animals that live together in colonies. We use the comprehensive learning particle swarm optimization (CLPSO) technique to find optimal fuzzy rules and membership functions because it discourages premature convergence. Here each particle of the swarm codes a set of fuzzy rules. During evolution, a population member tries to maximize a fitness criterion, which here is a high classification rate and a small number of rules. Finally, the particle with the highest fitness value is selected as the best set of fuzzy rules for image segmentation. Our results using this method for soccer field image segmentation in RoboCup contests show 89% performance. Less computational load is needed when using this method compared with other methods like ANFIS, because it generates a smaller number of fuzzy rules. The large and varied training dataset makes the proposed method invariant to illumination noise. Keywords: comprehensive learning particle swarm optimization, fuzzy classification.
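The swarm update underlying this family of methods can be sketched with a plain global-best particle swarm. This is not CLPSO, which changes how each particle chooses the exemplars it learns from, and a simple sphere function stands in for the fuzzy-rule fitness. The function names and every parameter value below are hypothetical.

```python
import random


def pso_minimize(objective, dim, bounds, particles=20, iterations=200,
                 inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Plain global-best PSO: each particle is pulled toward its own best
    position and toward the best position found by the whole swarm."""
    rng = random.Random(seed)
    low, high = bounds
    pos = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + cognitive * r1 * (best_pos[i][d] - pos[i][d])
                             + social * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            value = objective(pos[i])
            if value < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], value
                if value < g_val:
                    g_pos, g_val = pos[i][:], value
    return g_pos, g_val


def sphere(x):
    """Stand-in objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


solution, value = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
```

In the setting described above, each position vector would encode a set of fuzzy rules and membership-function parameters, and the objective would combine classification error with the number of rules.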