Results 1–5 of 5
Bridging the gap between distance and generalisation: Symbolic learning in metric spaces, 2008
Abstract

Cited by 7 (4 self)
Distance-based and generalisation-based methods are two families of artificial intelligence techniques that have been successfully used over a wide range of real-world problems. In the first case, general algorithms can be applied to any data representation by just changing the distance. The metric space sets the search and learning space, which is generally instance-oriented. In the second case, models can be obtained for a given pattern language, which can be comprehensible. The generality-ordered space sets the search and learning space, which is generally model-oriented. However, the concepts of distance and generalisation clash in many different ways, especially when knowledge representation is complex (e.g. structured data). This work establishes a framework where these two fields can be integrated in a consistent way. We introduce the concept of distance-based generalisation, which connects all the generalised examples in such a way that all of them are reachable inside the generalisation by using straight paths in the metric space. This makes the metric space and the generality-ordered space coherent (or even dual). Additionally, we also introduce a definition of minimal distance-based generalisation that can be seen as the first formulation of the Minimum Description Length (MDL)/Minimum Message Length (MML) principle in terms of a distance function. We instantiate and develop the framework for the most common data representations and distances, where we show that consistent instances can be found for numerical data, nominal data, sets, lists, tuples, graphs, first-order atoms and clauses. As a result, general learning methods that integrate the best from distance-based and generalisation-based methods can be defined and adapted to any specific problem by appropriately choosing the distance, the pattern language and the generalisation operator.
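For numerical data under the Euclidean distance, the idea of a distance-based generalisation can be illustrated with axis-parallel hyperrectangles, which are convex and therefore contain the straight path between any two covered examples. The sketch below (the function names are ours, not the paper's) samples points along each pairwise segment to check this property:

```python
import numpy as np

def hyperrectangle(examples):
    """Least general axis-parallel box covering the examples
    (a common pattern language for numerical data)."""
    X = np.asarray(examples, dtype=float)
    return X.min(axis=0), X.max(axis=0)

def contains(box, point):
    lo, hi = box
    return bool(np.all(lo <= point) and np.all(point <= hi))

def is_distance_based(box, examples, steps=50):
    """Check (by sampling) that every straight Euclidean path between
    two covered examples stays inside the generalisation."""
    X = np.asarray(examples, dtype=float)
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            for t in np.linspace(0.0, 1.0, steps):
                if not contains(box, (1 - t) * X[i] + t * X[j]):
                    return False
    return True

examples = [[0.0, 1.0], [2.0, 3.0], [1.0, 0.5]]
box = hyperrectangle(examples)
print(is_distance_based(box, examples))  # True: boxes are convex
```

For non-convex pattern languages (or non-Euclidean metrics over structured data) the check can fail, which is exactly the clash between distance and generalisation the abstract describes.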
Decision forests with oblique decision trees, 2006
Abstract

Cited by 3 (2 self)
Ensemble learning schemes have shown impressive increases in prediction accuracy over single model schemes. We introduce a new decision forest learning scheme, whose base learners are Minimum Message Length (MML) oblique decision trees. Unlike other tree inference algorithms, MML oblique decision tree learning does not overgrow the inferred trees. The resultant trees thus tend to be shallow and do not require pruning. MML decision trees are known to be resistant to overfitting and excellent at probabilistic predictions. A novel weighted averaging scheme is also proposed which takes advantage of high probabilistic prediction accuracy produced by MML oblique decision trees. The experimental results show that the new weighted averaging offers solid improvement over other averaging schemes, such as majority vote. Our MML decision forests scheme also returns favourable results compared to other ensemble learning algorithms on data sets with binary classes.
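The advantage of averaging class probabilities over counting hard votes can be seen in a toy ensemble where one confident tree outweighs two marginal ones. This is an illustrative sketch, not the paper's exact weighting scheme:

```python
import numpy as np

def majority_vote(class_preds, n_classes):
    """Each tree casts one hard vote for its predicted class."""
    counts = np.bincount(class_preds, minlength=n_classes)
    return int(np.argmax(counts))

def weighted_average(prob_preds):
    """Average the trees' class-probability vectors and pick the
    class with the highest mean probability."""
    mean = np.mean(prob_preds, axis=0)
    return int(np.argmax(mean))

# Three trees, two classes: two marginal votes for class 0,
# one confident vote for class 1.
probs = np.array([[0.55, 0.45],
                  [0.55, 0.45],
                  [0.10, 0.90]])
print(majority_vote(np.argmax(probs, axis=1), 2))  # 0
print(weighted_average(probs))                     # 1
```

The two rules disagree precisely when probabilistic estimates are well calibrated, which is where the abstract argues MML oblique trees excel.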
On Oblique Random Forests
Abstract

Cited by 3 (1 self)
In his original paper on random forests, Breiman proposed two different decision tree ensembles: one generated from “orthogonal” trees with thresholds on individual features in every split, and one from “oblique” trees separating the feature space by randomly oriented hyperplanes. In spite of a rising interest in the random forest framework, however, ensembles built from orthogonal trees (RF) have gained most, if not all, attention so far. In the present work we propose to employ “oblique” random forests (oRF) built from multivariate trees which explicitly learn optimal split directions at internal nodes using linear discriminative models, rather than using random coefficients as in the original oRF. This oRF outperforms RF, as well as other classifiers, on nearly all data sets but those with discrete factorial features. Learned node models perform distinctly better than random splits. An oRF feature importance score proves preferable to standard RF feature importance scores such as Gini or permutation importance. The topology of the oRF decision space appears to be smoother and better adapted to the data, resulting in improved generalization performance. Overall, the oRF proposed here may be preferred over standard RF on most learning tasks involving numerical and spectral data.
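A minimal version of a learned oblique split can be sketched with Fisher's linear discriminant, assuming two classes and a midpoint threshold; the authors' full oRF additionally grows whole trees and ensembles from such nodes, and this sketch is ours rather than their implementation:

```python
import numpy as np

def oblique_split(X, y):
    """Learn an oblique split w.x > t at a node using Fisher's linear
    discriminant: find the direction that best separates the two
    classes, then threshold at the midpoint of the projected means."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter; regularised so the solve always succeeds.
    Sw = np.cov(X0.T) + np.cov(X1.T) + 1e-6 * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, m1 - m0)
    t = w @ (m0 + m1) / 2.0
    return w, t

# Two Gaussian blobs separated along the diagonal: an axis-parallel
# ("orthogonal") split would need two cuts, one oblique split suffices.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], 0.3, (50, 2)),
               rng.normal([1, 1], 0.3, (50, 2))])
y = np.repeat([0, 1], 50)
w, t = oblique_split(X, y)
acc = np.mean((X @ w > t) == y)
print(acc)
```

The hyperplane direction here is learned from the data at the node, which is the key difference from Breiman's randomly oriented oblique splits that the abstract highlights.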
A Preliminary MML Linear Classifier using Principal Components for Multiple Classes
Abstract

Cited by 2 (2 self)
In this paper we improve on the supervised classification method developed in Kornienko et al. (2002) by the introduction of Principal Component Analysis to the inference process. We also extend the classifier from dealing with binomial (two-class) problems only to multinomial (multi-class) problems. The application to which the MML criterion has been applied in this paper is the classification of objects via a linear hyperplane, where the objects are able to come from any multi-class distribution. The inclusion of Principal Component Analysis to the original inference scheme reduces the bias present in the classifier’s search technique. Such improvements lead to a method which, when compared against three commercial Support Vector Machine (SVM) classifiers on binary data, was found to be as good as the most successful SVM tested. Furthermore, the new scheme is able to classify objects of a multi-class distribution with just one hyperplane, whereas SVMs require several hyperplanes.
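The two-stage idea, projecting onto principal components and then fitting a single separating hyperplane, can be sketched as follows. The least-squares fit is a simple stand-in for the MML-scored hyperplane search the abstract describes, and all function names are ours:

```python
import numpy as np

def pca(X, k):
    """Project centred data onto its top-k principal components."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return (X - mean) @ Vt[:k].T, Vt[:k], mean

def fit_hyperplane(Z, y):
    """Least-squares linear classifier in PC space (an illustrative
    stand-in for the MML-scored hyperplane search)."""
    A = np.hstack([Z, np.ones((len(Z), 1))])  # add a bias column
    coef, *_ = np.linalg.lstsq(A, 2.0 * y - 1.0, rcond=None)
    return coef

# Two classes in 3-D, reduced to 2 principal components.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1.0, 0.5, (40, 3)),
               rng.normal(+1.0, 0.5, (40, 3))])
y = np.repeat([0, 1], 40)
Z, components, mean = pca(X, 2)
coef = fit_hyperplane(Z, y)
pred = (np.hstack([Z, np.ones((len(Z), 1))]) @ coef > 0).astype(int)
print(np.mean(pred == y))
```

Searching for the hyperplane in the rotated PC basis rather than the raw feature axes is what reduces the search bias mentioned in the abstract.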
Stock Market Simulation and Inference Technique
Abstract

Cited by 1 (1 self)
We present an agent-based stock market simulation in which traders utilise a hybrid mixture of common information-criterion-based inference procedures, including minimum message length (MML) inference. Traders in our model compete with each other using a range of different inference techniques to infer the parameters and appropriate order of simple autoregressive (AR) models of stock price evolution. We show that such traders are initially profitable while a significant population of random traders exists, and that MML inference traders outperform other inference traders in the presence of a noisy AR signal.
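Inferring the parameters and order of an AR model can be sketched with a least-squares fit and BIC as a coarse stand-in for the MML score the paper's traders compute; the helper names below are ours:

```python
import numpy as np

def fit_ar(x, p):
    """Least-squares fit of x[t] = a1*x[t-1] + ... + ap*x[t-p] + e;
    returns the coefficients and the residual variance."""
    n = len(x)
    # Row for time t holds [x[t-1], ..., x[t-p]], for t = p..n-1.
    X = np.column_stack([x[p - k: n - k] for k in range(1, p + 1)])
    y = x[p:]
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma2 = np.mean((y - X @ a) ** 2)
    return a, sigma2

def bic_order(x, max_p=6):
    """Pick the AR order minimising BIC, trading fit (residual
    variance) against model complexity (number of coefficients)."""
    n = len(x)
    scores = {p: n * np.log(fit_ar(x, p)[1]) + p * np.log(n)
              for p in range(1, max_p + 1)}
    return min(scores, key=scores.get)

# Simulate an AR(2) price signal and try to recover its order.
rng = np.random.default_rng(2)
x = np.zeros(2000)
for t in range(2, 2000):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + rng.normal()
print(bic_order(x))
```

Different criteria (AIC, BIC, MML) penalise the extra coefficients differently, which is exactly the axis along which the paper's traders compete.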