Results 1-10 of 199
The Random Subspace Method for Constructing Decision Forests
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1998
Cited by 351 (9 self)
Much of previous attention on decision trees focuses on the splitting criteria and optimization of tree sizes. The dilemma between overfitting and achieving maximum accuracy is seldom resolved. We propose a method to construct a decision tree based classifier that maintains highest accuracy on training data and improves on generalization accuracy as it grows in complexity. The classifier consists of multiple trees constructed systematically by pseudorandomly selecting subsets of components of the feature vector, that is, trees constructed in randomly chosen subspaces. The subspace method is compared to single-tree classifiers and other forest construction methods by experiments on publicly available datasets, where the method's superiority is demonstrated. We also discuss independence between trees in a forest and relate that to the combined classification accuracy. Keywords: pattern recognition, decision tree, decision forest, stochastic discrimination, decision combination, classif...
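The subspace idea in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the base learner is a one-split stump standing in for a full tree, the combination rule is a plain majority vote, and all function names and the toy data are hypothetical.

```python
import random
from collections import Counter

def random_subspaces(n_features, subspace_size, n_trees, seed=0):
    """Pseudorandomly pick one subset of feature indices per tree."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_features), subspace_size))
            for _ in range(n_trees)]

def fit_stump(X, y, features):
    """Stand-in for a tree: best single-threshold split, restricted to `features`."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue  # threshold does not actually split the data
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:
        # No usable split in this subspace: always predict the majority class.
        maj = Counter(y).most_common(1)[0][0]
        return (features[0], float("inf"), maj, maj)
    return best[1:]

def predict_stump(stump, row):
    f, t, l_lab, r_lab = stump
    return l_lab if row[f] <= t else r_lab

def fit_forest(X, y, subspace_size, n_trees, seed=0):
    """Train one stump per randomly chosen feature subspace."""
    n_features = len(X[0])
    subspaces = random_subspaces(n_features, subspace_size, n_trees, seed)
    return [fit_stump(X, y, feats) for feats in subspaces]

def predict_forest(forest, row):
    """Combine the individual predictions by majority vote (an assumption)."""
    votes = Counter(predict_stump(s, row) for s in forest)
    return votes.most_common(1)[0][0]

# Toy data: features 0-2 separate the classes, feature 3 is constant noise.
X = [[0, 0, 0, 5], [1, 1, 1, 5], [0, 0, 0, 5], [1, 1, 1, 5]]
y = ["a", "b", "a", "b"]
forest = fit_forest(X, y, subspace_size=2, n_trees=5, seed=1)
```

Because every two-feature subspace of the toy data contains at least one informative feature, each stump separates the classes and the vote is unanimous; on real data the trees would disagree, which is where the combination step matters.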
A Comparison of Prediction Accuracy, Complexity, and Training Time of Thirty-three Old and New Classification Algorithms
, 2000
Cited by 167 (7 self)
Twenty-two decision tree, nine statistical, and two neural network algorithms are compared on thirty-two datasets in terms of classification accuracy, training time, and (in the case of trees) number of leaves. Classification accuracy is measured by mean error rate and mean rank of error rate. Both criteria place a statistical, spline-based algorithm called Polyclass at the top, although it is not statistically significantly different from twenty other algorithms. Another statistical algorithm, logistic regression, is second with respect to the two accuracy criteria. The most accurate decision tree algorithm is Quest with linear splits, which ranks fourth and fifth, respectively. Although spline-based statistical algorithms tend to have good accuracy, they also require relatively long training times. Polyclass, for example, is third last in terms of median training time. It often requires hours of training compared to seconds for other algorithms. The Quest and logistic regression algor...
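The two accuracy criteria named in the abstract above, mean error rate and mean rank of error rate, can be computed as in the sketch below. The error values are made up for illustration, the function names are hypothetical, and giving tied algorithms the average of their ranks is an assumption; the paper may treat ties differently.

```python
def mean_error_rates(errors):
    """Average each algorithm's error rate over all datasets."""
    return {alg: sum(vals) / len(vals) for alg, vals in errors.items()}

def mean_ranks(errors):
    """Rank algorithms per dataset (1 = lowest error), then average the ranks.

    Tied algorithms share the average of the ranks they occupy.
    """
    algs = list(errors)
    n_datasets = len(next(iter(errors.values())))
    totals = {alg: 0.0 for alg in algs}
    for d in range(n_datasets):
        ordered = sorted(algs, key=lambda a: errors[a][d])
        i = 0
        while i < len(ordered):
            j = i
            # Extend j over the run of algorithms tied with ordered[i].
            while (j + 1 < len(ordered)
                   and errors[ordered[j + 1]][d] == errors[ordered[i]][d]):
                j += 1
            shared_rank = ((i + 1) + (j + 1)) / 2
            for alg in ordered[i:j + 1]:
                totals[alg] += shared_rank
            i = j + 1
    return {alg: totals[alg] / n_datasets for alg in algs}

# Hypothetical error rates: one list entry per dataset.
errors = {"A": [0.1, 0.2], "B": [0.2, 0.2], "C": [0.3, 0.1]}
avg_err = mean_error_rates(errors)
avg_rank = mean_ranks(errors)
```

In this toy table B and C have the same mean error rate, yet C has the better mean rank, which shows why the two criteria can order algorithms differently.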
Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey
 Data Mining and Knowledge Discovery
, 1997
Cited by 146 (1 self)
Decision trees have proved to be valuable tools for the description, classification and generalization of data. Work on constructing decision trees from data exists in multiple disciplines such as statistics, pattern recognition, decision theory, signal processing, machine learning and artificial neural networks. Researchers in these disciplines, sometimes working on quite different problems, identified similar issues and heuristics for decision tree construction. This paper surveys existing work on decision tree construction, attempting to identify the important issues involved, directions the work has taken and the current state of the art. Keywords: classification, tree-structured classifiers, data compaction 1. Introduction Advances in data collection methods, storage and processing technology are providing a unique challenge and opportunity for automated data exploration techniques. Enormous amounts of data are being collected daily from major scientific projects e.g., Human Genome...
Split Selection Methods for Classification Trees
 STATISTICA SINICA
, 1997
Cited by 75 (9 self)
Classification trees based on exhaustive search algorithms tend to be biased towards selecting variables that afford more splits. As a result, such trees should be interpreted with caution. This article presents an algorithm called QUEST that has negligible bias. Its split selection strategy shares similarities with the FACT method, but it yields binary splits and the final tree can be selected by a direct stopping rule or by pruning. Real and simulated data are used to compare QUEST with the exhaustive search approach. QUEST is shown to be substantially faster and the size and classification accuracy of its trees are typically comparable to those of exhaustive search.
Extracting Conserved Gene Expression Motifs From Gene Expression Data
 Pac. Symp. Biocomput
, 2003
Cited by 61 (2 self)
We propose a representation for gene expression data called conserved gene expression motifs or xMOTIFs. A gene's expression level is conserved across a set of samples if the gene is expressed with the same abundance in all the samples. A conserved gene expression motif is a subset of genes that is simultaneously conserved across a subset of samples.
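The definition above can be illustrated with a small check. This sketch assumes expression levels have already been discretized into named states, so "same abundance" becomes "same state"; it only tests a given sample subset and is not the paper's search algorithm. The function name and data are hypothetical.

```python
def conserved_genes(states, samples):
    """Return the genes whose state is identical across all given samples.

    `states` maps a gene name to a list of discretized expression states,
    one per sample; `samples` is a list of sample indices.
    """
    return [gene for gene, row in states.items()
            if len({row[s] for s in samples}) == 1]

# Hypothetical discretized data: 3 genes measured in 4 samples.
states = {
    "g1": ["up", "up", "down", "up"],
    "g2": ["down", "down", "down", "down"],
    "g3": ["up", "down", "up", "up"],
}
motif = conserved_genes(states, samples=[0, 1, 3])
```

Here genes g1 and g2 together with samples 0, 1 and 3 form a conserved motif in the sense of the definition: each of the two genes holds a single state across the whole sample subset.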
ScalParC: A new scalable and efficient parallel classification algorithm for mining large datasets
 In Proc. of the International Parallel Processing Symposium, 1998
Cited by 60 (5 self)
In this paper, we present ScalParC (Scalable Parallel Classifier), a new parallel formulation of a decision tree based classification process. Like other state-of-the-art decision tree classifiers such as SPRINT, ScalParC is suited for handling large datasets. We show that the existing parallel formulation of SPRINT is unscalable, whereas ScalParC is shown to be scalable in both runtime and memory requirements. We present the experimental results of classifying up to 6.4 million records on up to 128 processors of a Cray T3D, in order to demonstrate the scalable behavior of ScalParC. A key component of ScalParC is the parallel hash table. The proposed parallel hashing paradigm can be used to parallelize other algorithms that require many concurrent updates to a large hash table.
Lookahead and Pathology in Decision Tree Induction
 Proceedings of the 14th International Joint Conference on Artificial Intelligence
, 1995
Cited by 52 (2 self)
The standard approach to decision tree induction is a top-down, greedy algorithm that makes locally optimal, irrevocable decisions at each node of a tree. In this paper, we study an alternative approach, in which the algorithms use limited lookahead to decide what test to use at a node. We systematically compare, using a very large number of decision trees, the quality of decision trees induced by the greedy approach to that of trees induced using lookahead. The main results of our experiments are: (i) the greedy approach produces trees that are just as accurate as trees produced with the much more expensive lookahead step; and (ii) decision tree induction exhibits pathology, in the sense that lookahead can produce trees that are both larger and less accurate than trees produced without it. 1. Introduction The standard algorithm for constructing decision trees from a set of examples is greedy induction: a tree is induced top-down with locally optimal choices made at each node, with...
An Implementation of Logical Analysis of Data
 IEEE Transactions on Knowledge and Data Engineering
, 2000
Cited by 47 (25 self)
The paper describes a new, logic-based methodology for analyzing observations. The key features of the Logical Analysis of Data (LAD) are the discovery of minimal sets of features necessary for explaining all observations and the detection of hidden patterns in the data capable of distinguishing observations describing positive outcome events from negative outcome events. Combinations of such patterns are used for developing general classification procedures. An implementation of this methodology is described in the paper along with the results of numerical experiments demonstrating the classification performance of LAD in comparison with the reported results of other procedures. In the final section, we describe three pilot studies on applications of LAD to oil exploration, psychometric testing, and the analysis of developments in the Chinese transitional economy. These pilot studies demonstrate not only the classification power of LAD, but also its flexibility and capability t...
Classification trees with unbiased multiway splits
 Journal of the American Statistical Association
, 2001
Cited by 42 (8 self)
Two univariate split methods and one linear combination split method are proposed for the construction of classification trees with multiway splits. Examples are given where the trees are more compact and hence easier to interpret than binary trees. A major strength of the univariate split methods is that they have negligible bias in variable selection, both when the variables differ in the number of splits they offer and when they differ in number of missing values. This is an advantage because inferences from the tree structures can be adversely affected by selection bias. The new methods are shown to be highly competitive in terms of computational speed and classification accuracy of future observations. Key words and phrases: Decision tree, linear discriminant analysis, missing value, selection bias.
Local Cascade Generalization
, 1998
Cited by 39 (1 self)
In a previous work we have presented Cascade Generalization, a new general method for merging classifiers. The basic idea of Cascade Generalization is to sequentially run the set of classifiers, at each step performing an extension of the original data by the insertion of new attributes. The new attributes are derived from the probability class distribution given by a base classifier. This constructive step extends the representational language for the high level classifiers, relaxing their bias. In this paper we extend this work by applying Cascade locally. At each iteration of a divide and conquer algorithm, a reconstruction of the instance space occurs by the addition of new attributes. Each new attribute represents the probability that an example belongs to a class given by a base classifier. We have implemented three Local Generalization Algorithms. The first merges a linear discriminant with a decision tree, the second merges a naive Bayes with a decision tree, and the third mer...
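The constructive step described above, extending each example with the class probabilities given by a base classifier, can be sketched as follows. The base classifier here is a hypothetical stand-in (a logistic score on the first attribute), not one of the learners used in the paper, and the function names are illustrative only.

```python
import math

def toy_base_proba(row):
    """Hypothetical base classifier: logistic score on the first attribute.

    Returns [P(class 0), P(class 1)] for one example.
    """
    p1 = 1.0 / (1.0 + math.exp(-row[0]))
    return [1.0 - p1, p1]

def cascade_extend(X, base_proba):
    """Append the base classifier's class probabilities as new attributes."""
    return [list(row) + list(base_proba(row)) for row in X]

# Two original attributes per example; two probability attributes are added.
X = [[-2.0, 7.0], [0.0, 3.0], [2.0, 1.0]]
X_ext = cascade_extend(X, toy_base_proba)
```

A higher-level classifier would then be trained on `X_ext` rather than `X`, so it can split on the base classifier's probabilities as well as on the original attributes.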