Results 1–10 of 15
Decision Graphs: An Extension of Decision Trees, 1993
Abstract

Cited by 35 (1 self)
In this paper, we examine Decision Graphs, a generalization of decision trees. We present an inference scheme to construct decision graphs using the Minimum Message Length Principle. Empirical tests demonstrate that this scheme compares favourably with other decision tree inference schemes. This work provides a metric for comparing the relative merit of the decision tree and decision graph formalisms for a particular domain.

1 Introduction. In this paper, we examine the problem of inferring a decision procedure from a set of examples. We examine the decision graph [5, 1, 16, 15, 14], a generalization of the decision tree [3, 18], and propose a method to construct decision graphs based upon Wallace's Minimum Message Length Principle (MMLP) [24, 10, 25]. The MMLP is related to Rissanen's Minimum Description Length Principle (MDLP) [21, 22, 20]. For the reader unfamiliar with minimum encoding methods (MML and MDL), a good introduction to the area is given by Georgeff [10]. We formalize ...
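The two-part message-length comparison underlying this kind of inference can be sketched in a few lines. This is a simplified illustration, not the paper's actual encoding scheme: the model costs (one bit for a leaf, three bits for a split with two leaves) and the toy data below are invented assumptions, and a real MML scheme would also charge for stating each leaf's class probability.

```python
import math

def data_cost_bits(pos, neg):
    """Cost in bits of encoding pos + neg binary outcomes under the
    leaf's maximum-likelihood class probability. A full MML scheme
    would also charge for stating that probability; this sketch uses
    only the -log-likelihood term."""
    n = pos + neg
    if pos == 0 or neg == 0:
        return 0.0  # degenerate leaf: outcomes are certain under the ML estimate
    p = pos / n
    return -(pos * math.log2(p) + neg * math.log2(1 - p))

# Toy data: (feature_bit, label) pairs -- invented for illustration.
data = [(0, 0)] * 40 + [(0, 1)] * 10 + [(1, 1)] * 35 + [(1, 0)] * 15

# Hypothesis A: a single leaf (no split). Assumed model cost: 1 bit.
pos = sum(lbl for _, lbl in data)
neg = len(data) - pos
len_leaf = 1 + data_cost_bits(pos, neg)

# Hypothesis B: split on the feature, two leaves. Assumed model cost: 3 bits.
left = [lbl for f, lbl in data if f == 0]
right = [lbl for f, lbl in data if f == 1]
len_split = (3 + data_cost_bits(sum(left), len(left) - sum(left))
               + data_cost_bits(sum(right), len(right) - sum(right)))

# The hypothesis with the shorter total two-part message is preferred.
print(f"single leaf: {len_leaf:.1f} bits, split: {len_split:.1f} bits")
```

With these toy counts the split pays for its extra model bits by compressing the labels better, so the two-part message selects it; on noisier data the single leaf would win.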
A Nonbehavioural, Computational Extension to the Turing Test
In International Conference on Computational Intelligence & Multimedia Applications (ICCIMA '98), 1998
Abstract

Cited by 33 (18 self)
We also ask the following question: Given two programs H1 and H2, respectively of lengths l1 and l2 with l1 < l2, if H1 and H2 perform equally well (to date) on a Turing Test, which, if either, should be preferred for the future? We also set a challenge. If humans can presume intelligence in their ability to set the Turing Test, then we issue the additional challenge to researchers to get machines to administer the Turing Test.
Circular Clustering Of Protein Dihedral Angles By Minimum Message Length
In Proceedings of the 1st Pacific Symposium on Biocomputing (PSB1), 1996
Abstract

Cited by 14 (11 self)
this paper is given in [DADH95] and is available from ftp://www.cs.monash.edu.au/www/publications/1995/TR237.ps.Z.) Section 2 introduces the MML principle and how it can be used for this circular clustering problem. The remaining sections give the results of the secondary structure groups [KaSa83] that resulted from applying Snob to cluster our dihedral angle data.
MML mixture modelling of multistate, Poisson, von Mises circular and Gaussian distributions
In Proc. 6th Int. Workshop on Artif. Intelligence and Statistics, 1997
Abstract

Cited by 8 (5 self)
Minimum Message Length (MML) is an invariant Bayesian point estimation technique which is also consistent and efficient. We provide a brief overview of MML inductive inference (Wallace and Boulton (1968), Wallace and Freeman (1987)), and how it has both an information-theoretic and a Bayesian interpretation. We then outline how MML is used for statistical parameter estimation, and how the MML mixture modelling program, Snob (Wallace and Boulton (1968), Wallace (1986), Wallace and Dowe (1994)) uses the message lengths from various parameter estimates to enable it to combine parameter estimation with selection of the number of components. The message length is (to within a constant) the logarithm of the posterior probability of the theory. So, the MML theory can also be regarded as the theory with the highest posterior probability. Snob currently assumes that variables are uncorrelated, and permits multivariate data from Gaussian, discrete multistate, Poisson and von Mises circular dist...
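The abstract's claim that the message length is, to within a constant, the negative log of the posterior probability can be shown with a toy sketch. The candidate component counts, priors, and likelihood values below are invented for illustration; they are not Snob's internal quantities.

```python
import math

# Hypothetical candidates: model -> (prior, likelihood of the data under it).
candidates = {
    "1 component":  (0.50, 1e-12),
    "2 components": (0.30, 4e-9),
    "3 components": (0.20, 5e-9),
}

def message_length_bits(prior, likelihood):
    # Two-part message: -log2(prior) bits to state the model,
    # plus -log2(likelihood) bits to state the data given the model.
    return -math.log2(prior) - math.log2(likelihood)

lengths = {m: message_length_bits(p, l) for m, (p, l) in candidates.items()}
best = min(lengths, key=lengths.get)

# Minimising message length maximises prior * likelihood, i.e. the
# (unnormalised) posterior -- the same ranking in different units.
posterior_best = max(candidates, key=lambda m: candidates[m][0] * candidates[m][1])
assert best == posterior_best
print(best)
```

Note the higher-likelihood 3-component model loses here: its smaller prior costs more bits to state than its extra data compression saves.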
Unsupervised Learning of Gamma Mixture Models Using Minimum Message Length
Proc. Third IASTED Conf. Artificial Intelligence and Applications, 2003
Abstract

Cited by 6 (3 self)
Mixture modelling or unsupervised classification is a problem of identifying and modelling components in a body of data. Earlier work in mixture modelling using Minimum Message Length (MML) includes the multinomial and Gaussian distributions (Wallace and Boulton, 1968), the von Mises circular and Poisson distributions (Wallace and Dowe, 1994, 2000) and the distribution (Agusta and Dowe, 2002a, 2002b). In this paper, we extend this research by considering MML mixture modelling using the Gamma distribution. The point estimation of the distribution was performed using the MML approximation proposed by Wallace and Freeman (1987) and gives impressive results compared to Maximum Likelihood (ML). We then considered mixture modelling on artificially generated datasets and compared the results with two other criteria, AIC and BIC. In terms of the resulting number of components, the results were again impressive. Application to the Heming Pike dataset was then examined and the results were compared in terms of the probability bit-costings, showing that the proposed MML method performs better than AIC and BIC. A further application also shows that our method works well with datasets containing left-skewed components such as the Palm Valley (Australia) image dataset.
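The two baseline criteria mentioned here have standard closed forms: AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of free parameters and n the sample size. A small sketch with invented log-likelihoods (not the paper's results) shows how BIC's stronger penalty can favour fewer components than AIC:

```python
import math

def aic(log_lik, k):
    """Akaike Information Criterion: 2k - 2*ln(L). Lower is better."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian Information Criterion: k*ln(n) - 2*ln(L). Lower is better."""
    return k * math.log(n) - 2 * log_lik

n = 200  # hypothetical sample size
# Hypothetical fits: components -> (log-likelihood, free parameters).
# A c-component Gamma mixture has 3c - 1 free parameters
# (shape and scale per component, plus c - 1 mixing weights).
fits = {1: (-520.0, 2), 2: (-470.0, 5), 3: (-466.0, 8)}

best_aic = min(fits, key=lambda c: aic(*fits[c]))
best_bic = min(fits, key=lambda c: bic(*fits[c], n))
# With these invented numbers AIC selects 3 components, BIC selects 2:
# the small log-likelihood gain of the third component outweighs AIC's
# 2-per-parameter penalty but not BIC's ln(200) per parameter.
print(best_aic, best_bic)
```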
Intrinsic Classification by MML—the Snob Program
Proc. Seventh Australian Joint Conf. Artificial Intelligence, 1994
Abstract

Cited by 6 (0 self)
Abstract: We provide a brief overview of Minimum Message Length (MML) inductive inference (Wallace and Boulton (1968), Wallace and Freeman (1987)). We then outline how MML is used for statistical parameter estimation, and how the MML intrinsic classification program, Snob (Wallace and Boulton (1968), Wallace (1986), Wallace (1990)) uses the message lengths from various parameter estimates to enable it to combine parameter estimation with model selection in intrinsic classification. We mention here the most recent extensions to Snob, permitting Poisson and von Mises circular distributions. We also survey some applications of Snob (albeit briefly), and further provide some documentation on how the user can guide Snob’s search through various models of the given data to try to obtain that model whose message length is a minimum.
MML, HYBRID BAYESIAN NETWORK GRAPHICAL MODELS, STATISTICAL CONSISTENCY, INVARIANCE AND UNIQUENESS
Abstract

Cited by 4 (3 self)
The problem of statistical — or inductive — inference pervades a large number of human activities and a large number of (human and non-human) actions requiring ‘intelligence’. Human and other ‘intelligent’ activity often entails making inductive inferences, remembering and recording observations from which one can make ...
CIRCULAR CLUSTERING BY MINIMUM MESSAGE LENGTH OF PROTEIN DIHEDRAL ANGLES, 1995
Abstract

Cited by 4 (4 self)
Early work on proteins identified the existence of helices and extended sheets in protein secondary structures, a high-level classification which remains popular today. Using the Snob program for information-theoretic Minimum Message Length (MML) intrinsic classification, we are able to take the protein dihedral angles as determined by X-ray crystallography, and cluster sets of dihedral angles into groups. Previous work by Hunter and States had applied a similar Bayesian classification method, AutoClass, to protein data with site position represented by 3 Cartesian coordinates for each of the α-Carbon, β-Carbon and Nitrogen, totalling 9 coordinates. By using the von Mises circular distribution in the Snob program rather than the Normal distribution in the Hunter and States model, we are instead able to represent local site properties by the two dihedral angles, φ and ψ. Since each site can be modelled as having 2 degrees of freedom, this orientation-invariant dihedral angle representation of the data is more compact than that of nine highly correlated Cartesian coordinates. Using the information-theoretic message length concepts discussed in the paper, such a more concise model is more likely to represent the underlying generating process from which the data comes. We report on the results of our classification, plotting the classes in (φ, ψ)-space and introducing a symmetric information-theoretic distance measure to build a minimum spanning tree between the classes. We also give a transition matrix between the classes and note the existence of three classes in the region φ ≈ −1.09 rad and ψ ≈ −0.75 rad which are close on the spanning tree and have high inter-transition probabilities. These properties give rise to a tight, abundant, self-perpetuating, α-helical structure.
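The von Mises circular distribution used for the dihedral angles has the standard density f(θ; μ, κ) = exp(κ cos(θ − μ)) / (2π I₀(κ)), where μ is the mean direction and κ the concentration. A minimal sketch follows; the mean and concentration values are invented for illustration, and the series for I₀ is the textbook expansion, not Snob's implementation:

```python
import math

def bessel_i0(kappa, terms=30):
    """Modified Bessel function of the first kind, order 0,
    via its power series: I0(x) = sum_k (x/2)^(2k) / (k!)^2."""
    return sum((kappa / 2) ** (2 * k) / math.factorial(k) ** 2
               for k in range(terms))

def von_mises_pdf(theta, mu, kappa):
    """von Mises density on the circle:
    f(theta) = exp(kappa * cos(theta - mu)) / (2*pi * I0(kappa))."""
    return math.exp(kappa * math.cos(theta - mu)) / (2 * math.pi * bessel_i0(kappa))

# Hypothetical class centred near the phi region reported in the abstract.
mu_phi, kappa = -1.09, 4.0
# Density is highest at the mean direction and falls off around the circle.
print(von_mises_pdf(-1.09, mu_phi, kappa), von_mises_pdf(1.5, mu_phi, kappa))
```

Unlike a Gaussian, this density wraps: angles just below −π and just above π are treated as neighbours, which is exactly what dihedral-angle data requires.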
Turing Tests with Turing Machines
Abstract

Cited by 3 (3 self)
Comparative tests work by finding the difference (or the absence of difference) between a reference subject and an evaluee. The Turing Test, in its standard interpretation, takes (a subset of) the human species as a reference. Motivated by recent findings and developments in the area of machine intelligence evaluation, we discuss what it would be like to have a Turing Test where the reference and the interrogator subjects are replaced by Turing Machines. This question sets the focus on several issues that are usually disregarded when dealing with the Turing Test, such as the degree of intelligence of reference and interrogator, the role of imitation (and not only prediction) in intelligence, its view from the perspective of game theory and others. Around these issues, this paper finally brings the Turing Test to the realm of Turing machines.