Results 1-10 of 234
An Empirical Comparison of Supervised Learning Algorithms
 In Proc. 23rd Intl. Conf. Machine Learning (ICML’06)
, 2006
Cited by 212 (6 self)
A number of supervised learning methods have been introduced in the last decade. Unfortunately, the last comprehensive empirical evaluation of supervised learning was the Statlog Project in the early 90’s. We present a large-scale empirical comparison between ten supervised learning methods: SVMs, neural nets, logistic regression, naive Bayes, memory-based learning, random forests, decision trees, bagged trees, boosted trees, and boosted stumps. We also examine the effect that calibrating the models via Platt Scaling and Isotonic Regression has on their performance. An important aspect of our study is the use of a variety of performance criteria to evaluate the learning methods.
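Of the two calibration methods named in this abstract, isotonic regression is commonly fitted with the pool-adjacent-violators (PAV) procedure. The following is a minimal sketch of that procedure, assuming the 0/1 labels have already been sorted by increasing model score; the function name `pav` is hypothetical, and nothing here is taken from the study's own implementation.

```python
def pav(values):
    """Pool Adjacent Violators: least-squares non-decreasing fit.

    `values` are assumed to be labels (or targets) ordered by increasing
    model score. Returns one fitted value per input, in the same order.
    """
    blocks = []  # each block is [sum_of_values, count]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)  # every member of a block gets the block mean
    return fitted


# Labels ordered by increasing score; the out-of-order pair (1, 0) is pooled.
calibrated = pav([0, 1, 0, 1, 1])
print(calibrated)
```

The fitted values can be read as calibrated probabilities for the corresponding scores; scores between training points would need an interpolation rule, which this sketch leaves out.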
Tree Induction for Probability-based Ranking
, 2002
Cited by 161 (4 self)
Tree induction is one of the most effective and widely used methods for building classification models. However, many applications require cases to be ranked by the probability of class membership. Probability estimation trees (PETs) have the same attractive features as classification trees (e.g., comprehensibility, accuracy and efficiency in high dimensions and on large data sets). Unfortunately, decision trees have been found to provide poor probability estimates. Several techniques have been proposed to build more accurate PETs, but, to our knowledge, there has not been a systematic experimental analysis of which techniques actually improve the probability-based rankings, and by how much. In this paper we first discuss why the decision-tree representation is not intrinsically inadequate for probability estimation. Inaccurate probabilities are partially the result of decision-tree induction algorithms that focus on maximizing classification accuracy and minimizing tree size (for example via reduced-error pruning). Larger trees can be better for probability estimation, even if the extra size is superfluous for accuracy maximization. We then present the results of a comprehensive set of experiments, testing some straightforward methods for improving probability-based rankings. We show that using a simple, common smoothing method, the Laplace correction, uniformly improves probability-based rankings. In addition, bagging substantially improves the rankings, and is even more effective for this purpose than for improving accuracy. We conclude that PETs, with these simple modifications, should be considered when rankings based on class-membership probability are required.
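The Laplace correction mentioned in this abstract is usually written as (k + 1) / (n + C), where k is the number of leaf examples in the class of interest, n is the total number of examples at the leaf, and C is the number of classes. A minimal sketch under that standard definition follows; the function names are hypothetical and the code is an illustration, not the authors' implementation.

```python
def raw_estimate(class_count: int, leaf_total: int) -> float:
    """Unsmoothed leaf frequency k / n."""
    return class_count / leaf_total


def laplace_estimate(class_count: int, leaf_total: int, num_classes: int = 2) -> float:
    """Laplace-corrected leaf probability (k + 1) / (n + C)."""
    if leaf_total < 0 or not 0 <= class_count <= leaf_total:
        raise ValueError("need 0 <= class_count <= leaf_total")
    return (class_count + 1) / (leaf_total + num_classes)


# Two pure leaves: the raw frequency is 1.0 for both, so they tie in a ranking.
# The corrected estimate ranks the leaf backed by more examples higher.
small_leaf = laplace_estimate(2, 2)    # (2 + 1) / (2 + 2)
large_leaf = laplace_estimate(50, 50)  # (50 + 1) / (50 + 2)
print(raw_estimate(2, 2), raw_estimate(50, 50), small_leaf, large_leaf)
```

The smoothing pulls estimates from small leaves toward the uniform prior 1/C, which breaks ties between pure leaves of different sizes when cases are ranked.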
A Preliminary Performance Comparison of Five Machine Learning Algorithms for Practical IP Traffic Flow Classification
 COMPUTER COMMUNICATION REVIEW
, 2006
Cited by 113 (4 self)
The identification of network applications through observation of associated packet traffic flows is vital to the areas of network management and surveillance. Currently popular methods such as port number and payload-based identification exhibit a number of shortfalls. An alternative is to use machine learning (ML) techniques and identify network applications based on per-flow statistics, derived from payload-independent features such as packet length and inter-arrival time distributions. The performance impact of feature set reduction, using Consistency-based and Correlation-based feature selection, is demonstrated on Naïve Bayes, C4.5, Bayesian Network and Naïve Bayes Tree algorithms. We then show that it is useful to differentiate algorithms based on computational performance rather than classification accuracy alone, as although classification accuracy between the algorithms is similar, computational performance can differ significantly.
Tree induction vs. logistic regression: A learning-curve analysis
 CEDER WORKING PAPER #IS0102, STERN SCHOOL OF BUSINESS
, 2001
Cited by 86 (16 self)
Tree induction and logistic regression are two standard, off-the-shelf methods for building models for classification. We present a large-scale experimental comparison of logistic regression and tree induction, assessing classification accuracy and the quality of rankings based on class-membership probabilities. We use a learning-curve analysis to examine the relationship of these measures to the size of the training set. The results of the study show several remarkable things. (1) Contrary to prior observations, logistic regression does not generally outperform tree induction. (2) More specifically, and not surprisingly, logistic regression is better for smaller training sets and tree induction for larger data sets. Importantly, this often holds for training sets drawn from the same domain (i.e., the learning curves cross), so conclusions about induction-algorithm superiority on a given domain must be based on an analysis of the learning curves. (3) Contrary to conventional wisdom, tree induction is effective at producing probability-based rankings, although apparently comparatively less so for a given training-set size than at making classifications. Finally, (4) the domains on which tree induction and logistic regression are ultimately preferable can be characterized surprisingly well by a simple measure of signal-to-noise ratio.
Classification trees with unbiased multiway splits
 Journal of the American Statistical Association
, 2001
Cited by 74 (11 self)
Two univariate split methods and one linear combination split method are proposed for the construction of classification trees with multiway splits. Examples are given where the trees are more compact and hence easier to interpret than binary trees. A major strength of the univariate split methods is that they have negligible bias in variable selection, both when the variables differ in the number of splits they offer and when they differ in number of missing values. This is an advantage because inferences from the tree structures can be adversely affected by selection bias. The new methods are shown to be highly competitive in terms of computational speed and classification accuracy of future observations. Key words and phrases: Decision tree, linear discriminant analysis, missing value, selection bias.
Revisiting the foundations of Artificial Immune Systems: a problem-oriented perspective
 Hart (Eds.) Artificial Immune Systems (Proc. ICARIS2003), LNCS 2787
, 2003
Cited by 62 (26 self)
This paper advocates a problem-oriented approach for the design of Artificial Immune Systems (AIS) for data mining. By problem-oriented approach we mean that, in real-world data mining applications, the design of an AIS should take into account the characteristics of the data to be mined together with the application domain: the components of the AIS – such as its representation, affinity function and immune process – should be tailored for the data and the application. This is in contrast with the majority of the literature, where a very generic AIS algorithm for data mining is developed and there is little or no concern in tailoring the components of the AIS for the data to be mined or the application domain. To support this problem-oriented approach, we provide an extensive critical review of the current literature on AIS for data mining, focusing on the data mining tasks of classification and anomaly detection. We discuss several important lessons to be taken from the natural immune system to design new AIS that are considerably more adaptive than current AIS. Finally, we conclude the paper with a summary of seven limitations of current AIS for data mining and 10 suggested research directions.
Toward Intelligent Assistance for a Data Mining Process: An Ontology-Based Approach for Cost-Sensitive Classification
 IEEE Transactions on Knowledge and Data Engineering
, 2005
Well-Trained PETs: Improving Probability Estimation Trees
, 2000
Cited by 53 (6 self)
Decision trees are one of the most effective and widely used classification methods. However, many applications require class probability estimates, and probability estimation trees (PETs) have the same attractive features as classification trees (e.g., comprehensibility, accuracy and efficiency in high dimensions and on large data sets). Unfortunately, decision trees have been found to provide poor probability estimates. Several techniques have been proposed to build more accurate PETs, but, to our knowledge, there has not been a systematic experimental analysis of which techniques actually improve the probability estimates, and by how much. In this paper we first discuss why the decision-tree representation is not intrinsically inadequate for probability estimation. Inaccurate probabilities are partially the result of decision-tree induction algorithms that focus on maximizing classification accuracy and minimizing tree size (for example via reduced-error pruning). Larger tree...
Top-Down Induction of Decision Trees Classifiers: A Survey
 Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on
, 2005
Cited by 52 (4 self)
Decision trees are considered to be one of the most popular approaches for representing classifiers. Researchers from various disciplines such as statistics, machine learning, pattern recognition, and data mining have considered the issue of growing a decision tree from available data. This paper presents an updated survey of current methods for constructing decision tree classifiers in a top-down manner. The paper suggests a unified algorithmic framework for presenting these algorithms and describes the various splitting criteria and pruning methodologies. Index Terms: Classification, decision trees, pruning methods, splitting criteria.
Representing classification problems in genetic programming
 In Proceedings of the 2001 Congress on Evolutionary Computation, Seoul, Korea
, 2001
Cited by 50 (4 self)
In this paper five alternative methods are proposed to perform multiclass classification tasks using genetic programming. These methods are: binary decomposition, in which the problem is decomposed into a set of binary problems and standard genetic programming methods are applied; static range selection, where the set of real values returned by a genetic program is divided into class boundaries using arbitrarily chosen division points; dynamic range selection, in which a subset of training examples is used to determine where, over the set of reals, class boundaries lie; class enumeration, which constructs programs similar in syntactic structure to a decision tree; and evidence accumulation, which allows separate branches of the program to add to the certainty of any given class. Results showed that the dynamic range selection method was well suited to the task of multiclass classification and was capable of producing classifiers more accurate than the other methods tried when comparable training times were allowed. Accuracy of the generated classifiers was comparable to alternative approaches over several datasets.
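Static range selection, as described in this abstract, maps the single real value returned by an evolved program to a class by checking which fixed interval it falls into. The sketch below illustrates that idea only; the boundary values, the function name `static_range_class`, and the convention that a value equal to a boundary belongs to the upper interval are all assumptions, not details from the paper.

```python
from bisect import bisect_right


def static_range_class(output: float, boundaries: list[float]) -> int:
    """Return the index of the interval that `output` falls into.

    `boundaries` must be sorted ascending; k boundaries define k + 1 classes.
    A value equal to a boundary is assigned to the interval above it
    (an arbitrary convention chosen for this sketch).
    """
    return bisect_right(boundaries, output)


# Hypothetical division points splitting the reals into three classes:
# class 0 below -1.0, class 1 from -1.0 up to (not including) 1.0, class 2 from 1.0.
BOUNDARIES = [-1.0, 1.0]

for program_output in (-5.0, -1.0, 0.3, 1.0, 42.0):
    print(program_output, static_range_class(program_output, BOUNDARIES))
```

Dynamic range selection would replace the hard-coded `BOUNDARIES` with division points derived from a subset of the training examples; how the paper derives them is not stated in the abstract, so that step is not sketched here.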