A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms (2000)

by T Lim, W Loh, Y Shih
Venue: Machine Learning
Results 1 - 10 of 234

An Empirical Comparison of Supervised Learning Algorithms

by Rich Caruana, Alexandru Niculescu-Mizil - In Proc. 23rd Intl. Conf. Machine Learning (ICML’06), 2006
"... A number of supervised learning methods have been introduced in the last decade. Unfortunately, the last comprehensive empirical evaluation of supervised learning was the Statlog Project in the early 90’s. We present a large-scale empirical comparison between ten supervised learning methods: SVMs, n ..."
Abstract - Cited by 212 (6 self) - Add to MetaCart
A number of supervised learning methods have been introduced in the last decade. Unfortunately, the last comprehensive empirical evaluation of supervised learning was the Statlog Project in the early 90’s. We present a large-scale empirical comparison between ten supervised learning methods: SVMs, neural nets, logistic regression, naive bayes, memory-based learning, random forests, decision trees, bagged trees, boosted trees, and boosted stumps. We also examine the effect that calibrating the models via Platt Scaling and Isotonic Regression has on their performance. An important aspect of our study is the use of a variety of performance criteria to evaluate the learning methods. 1.
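The calibration step this abstract mentions is easy to reproduce with off-the-shelf tools. Below is a minimal sketch using scikit-learn rather than the authors' code; the synthetic dataset and the random-forest base model are assumptions made only for self-containedness. method="sigmoid" corresponds to Platt Scaling and method="isotonic" to Isotonic Regression.

```python
# Post-hoc probability calibration: wrap an uncalibrated model in
# CalibratedClassifierCV and compare Platt Scaling vs. Isotonic Regression.
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base = RandomForestClassifier(n_estimators=100, random_state=0)
for method in ("sigmoid", "isotonic"):  # Platt Scaling vs. Isotonic Regression
    calibrated = CalibratedClassifierCV(base, method=method, cv=5)
    calibrated.fit(X_train, y_train)
    probs = calibrated.predict_proba(X_test)[:, 1]
    # Lower Brier score means better-calibrated probability estimates.
    print(method, "Brier score:", round(brier_score_loss(y_test, probs), 4))
```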

Tree Induction for Probability-based Ranking

by Foster Provost, Pedro Domingos, 2002
"... Tree induction is one of the most effective and widely used methods for building classification models. However, many applications require cases to be ranked by the probability of class membership. Probability estimation trees (PETs) have the same attractive features as classification trees (e.g., c ..."
Abstract - Cited by 161 (4 self) - Add to MetaCart
Tree induction is one of the most effective and widely used methods for building classification models. However, many applications require cases to be ranked by the probability of class membership. Probability estimation trees (PETs) have the same attractive features as classification trees (e.g., comprehensibility, accuracy and efficiency in high dimensions and on large data sets). Unfortunately, decision trees have been found to provide poor probability estimates. Several techniques have been proposed to build more accurate PETs, but, to our knowledge, there has not been a systematic experimental analysis of which techniques actually improve the probability-based rankings, and by how much. In this paper we first discuss why the decision-tree representation is not intrinsically inadequate for probability estimation. Inaccurate probabilities are partially the result of decision-tree induction algorithms that focus on maximizing classification accuracy and minimizing tree size (for example via reduced-error pruning). Larger trees can be better for probability estimation, even if the extra size is superfluous for accuracy maximization. We then present the results of a comprehensive set of experiments, testing some straghtforward methods for improving probability-based rankings. We show that using a simple, common smoothing method--the Laplace correction--uniformly improves probability-based rankings. In addition, bagging substantioJly improves the rankings, and is even more effective for this purpose than for improving accuracy. We conclude that PETs, with these simple modifications, should be considered when rankings based on class-membership probability are required.
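The Laplace correction referred to in this abstract is a one-line smoothing rule: a leaf holding k examples of a class out of n total, with C classes, estimates the class probability as (k + 1)/(n + C) instead of the raw frequency k/n. A minimal sketch (a generic textbook formulation, not the paper's code):

```python
def laplace_estimate(class_counts):
    """Laplace-corrected class probabilities at a decision-tree leaf.

    Each class with count k out of n examples gets (k + 1) / (n + C),
    smoothing away the extreme 0/1 estimates that small leaves produce.
    """
    n = sum(class_counts)
    C = len(class_counts)
    return [(k + 1) / (n + C) for k in class_counts]

# A leaf with 3 positives and 0 negatives: the raw estimate is 1.0,
# while the Laplace-corrected estimate is (3+1)/(3+2) = 0.8.
print(laplace_estimate([3, 0]))   # [0.8, 0.2]
```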

A Preliminary Performance Comparison of Five Machine Learning Algorithms for Practical IP Traffic Flow Classification

by Nigel Williams, Sebastian Zander, Grenville Armitage - COMPUTER COMMUNICATION REVIEW, 2006
"... The identification of network applications through observation of associated packet traffic flows is vital to the areas of network management and surveillance. Currently popular methods such as port number and payload-based identification exhibit a number of shortfalls. An alternative is to use mach ..."
Abstract - Cited by 113 (4 self) - Add to MetaCart
The identification of network applications through observation of associated packet traffic flows is vital to the areas of network management and surveillance. Currently popular methods such as port number and payload-based identification exhibit a number of shortfalls. An alternative is to use machine learning (ML) techniques and identify network applications based on per-flow statistics, derived from payload-independent features such as packet length and inter-arrival time distributions. The performance impact of feature set reduction, using Consistencybased and Correlation-based feature selection, is demonstrated on Naïve Bayes, C4.5, Bayesian Network and Naïve Bayes Tree algorithms. We then show that it is useful to differentiate algorithms based on computational performance rather than classification accuracy alone, as although classification accuracy between the algorithms is similar, computational performance can differ significantly.
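A hedged sketch of the kind of payload-independent per-flow statistics the abstract describes; the feature names and tuple format are illustrative assumptions, not the paper's exact feature set:

```python
from statistics import mean, stdev

def flow_features(packets):
    """Per-flow statistics from payload-independent packet metadata.

    `packets` is a list of (timestamp_seconds, length_bytes) tuples
    for one flow, in arrival order.
    """
    times = [t for t, _ in packets]
    lengths = [l for _, l in packets]
    gaps = [b - a for a, b in zip(times, times[1:])]   # inter-arrival times
    return {
        "pkt_count": len(packets),
        "mean_len": mean(lengths),
        "std_len": stdev(lengths) if len(lengths) > 1 else 0.0,
        "mean_iat": mean(gaps) if gaps else 0.0,
        "duration": times[-1] - times[0],
    }

pkts = [(0.00, 60), (0.02, 1500), (0.05, 1500), (0.09, 60)]
print(flow_features(pkts))
```

Vectors like these, one per flow, become the training examples fed to the classifiers the paper compares.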

Tree induction vs. logistic regression: A learning-curve analysis

by Claudia Perlich, Foster Provost, Jeffrey S. Simonoff - CEDER WORKING PAPER #IS-01-02, STERN SCHOOL OF BUSINESS, 2001
"... Tree induction and logistic regression are two standard, off-the-shelf methods for building models for classi cation. We present a large-scale experimental comparison of logistic regression and tree induction, assessing classification accuracy and the quality of rankings based on class-membership pr ..."
Abstract - Cited by 86 (16 self) - Add to MetaCart
Tree induction and logistic regression are two standard, off-the-shelf methods for building models for classi cation. We present a large-scale experimental comparison of logistic regression and tree induction, assessing classification accuracy and the quality of rankings based on class-membership probabilities. We use a learning-curve analysis to examine the relationship of these measures to the size of the training set. The results of the study show several remarkable things. (1) Contrary to prior observations, logistic regression does not generally outperform tree induction. (2) More specifically, and not surprisingly, logistic regression is better for smaller training sets and tree induction for larger data sets. Importantly, this often holds for training sets drawn from the same domain (i.e., the learning curves cross), so conclusions about induction-algorithm superiority on a given domain must be based on an analysis of the learning curves. (3) Contrary to conventional wisdom, tree induction is effective atproducing probability-based rankings, although apparently comparatively less so foragiven training{set size than at making classifications. Finally, (4) the domains on which tree induction and logistic regression are ultimately preferable canbecharacterized surprisingly well by a simple measure of signal-to-noise ratio.

Citation Context

...f probability estimation trees, model selection applied to logistic regression, biased ("ridge") logistic regression, and bagging applied to both methods. [Footnote: In fact, logistic regression has been shown to be extremely competitive with other learning methods (Lim, Loh, and Shih, 2000), as we discuss in detail.] 3. To compare the learning curves of the different types of algorithm, in order to explore the relationship betwe...
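The learning-curve methodology this paper uses is straightforward to reproduce. Below is a hedged sketch with scikit-learn on synthetic data (an assumption; the authors used many real domains). Accuracy is estimated at several training-set sizes for each learner, so that crossing curves become visible:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

for name, model in [("logistic regression", LogisticRegression(max_iter=1000)),
                    ("tree induction", DecisionTreeClassifier(random_state=0))]:
    # Cross-validated accuracy at 5 increasing training-set sizes.
    sizes, _, test_scores = learning_curve(
        model, X, y, train_sizes=np.linspace(0.1, 1.0, 5), cv=5)
    # If the two curves cross, which learner "wins" depends on how much
    # training data is available -- the paper's central observation.
    print(name, dict(zip(sizes, test_scores.mean(axis=1).round(3))))
```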

Classification trees with unbiased multiway splits

by Hyunjoong Kim, Wei-Yin Loh - Journal of the American Statistical Association, 2001
"... Two univariate split methods and one linear combination split method are proposed for the construction of classification trees with multiway splits. Examples are given where the trees are more compact and hence easier to interpret than binary trees. A major strength of the univariate split methods i ..."
Abstract - Cited by 74 (11 self) - Add to MetaCart
Two univariate split methods and one linear combination split method are proposed for the construction of classification trees with multiway splits. Examples are given where the trees are more compact and hence easier to interpret than binary trees. A major strength of the univariate split methods is that they have negligible bias in variable selection, both when the variables differ in the number of splits they offer and when they differ in number of missing values. This is an advantage because inferences from the tree structures can be adversely affected by selection bias. The new methods are shown to be highly competitive in terms of computational speed and classification accuracy of future observations. Key words and phrases: Decision tree, linear discriminant analysis, missing value, selection bias. 1

Citation Context

.... CRUISE has the following desirable properties. 1. Its trees often have prediction accuracy at least as high as those of CART and QUEST, two highly accurate algorithms according to a recent study (Lim et al., 2000). 2. It has fast computation speed. Because it employs multiway splits, this precludes the use of greedy search methods. 3. It is practically free of selection bias. QUEST has little bias when the le...
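The unbiased-selection idea behind QUEST/CRUISE-style trees is to choose the split variable with a per-variable statistical test rather than by exhaustively searching all split points, so variables offering more candidate splits gain no built-in advantage. A sketch of that principle only, using a one-way ANOVA F-test per feature; this is an illustration, not the papers' exact algorithm:

```python
import numpy as np
from scipy.stats import f_oneway

def select_split_variable(X, y):
    """Pick the split column whose class-conditional means differ most.

    Each column is scored by an ANOVA F-test across the classes; the
    column with the smallest p-value is selected, independent of how
    many candidate split points it offers.
    """
    classes = np.unique(y)
    pvalues = []
    for j in range(X.shape[1]):
        groups = [X[y == c, j] for c in classes]
        pvalues.append(f_oneway(*groups).pvalue)
    return int(np.argmin(pvalues))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 2] > 0).astype(int)        # only column 2 carries signal
print(select_split_variable(X, y))   # -> 2
```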

Revisiting the foundations of Artificial Immune Systems: a problem-oriented perspective

by Alex A. Freitas, Jon Timmis - Hart (Eds.) Artificial Immune Systems (Proc. ICARIS-2003), LNCS 2787, 2003
"... This paper advocates a problem-oriented approach for the design of Artificial Immune Systems (AIS) for data mining. By problem-oriented approach we mean that, in real-world data mining applications, the design of an AIS should take into account the characteristics of the data to be mined together wi ..."
Abstract - Cited by 62 (26 self) - Add to MetaCart
This paper advocates a problem-oriented approach for the design of Artificial Immune Systems (AIS) for data mining. By problem-oriented approach we mean that, in real-world data mining applications, the design of an AIS should take into account the characteristics of the data to be mined together with the application domain: the components of the AIS – such as its representation, affinity function and immune process – should be tailored for the data and the application. This is in contrast with the majority of the literature, where a very generic AIS algorithm for data mining is developed and there is little or no concern in tailoring the components of the AIS for the data to be mined or the application domain. To support this problem-oriented approach, we provide an extensive critical review of the current literature on AIS for data mining, focusing on the data mining tasks of classification and anomaly detection. We discuss several important lessons to be taken from the natural immune system to design new AIS that are considerably more adaptive than current AIS. Finally, we conclude the paper with a summary of seven limitations of current AIS for data mining and 10 suggested research directions.

Toward Intelligent Assistance for a Data Mining Process: An Ontology-Based Approach for Cost-Sensitive Classification

by Abraham Bernstein, Foster Provost, Shawndra Hill - IEEE Transactions on Knowledge and Data Engineering, 2005
"... For more information, please visit our website at ..."
Abstract - Cited by 55 (3 self) - Add to MetaCart
For more information, please visit our website at

Citation Context

... processes? Indeed, it suggests processes that use fast induction algorithms, such as C4.5 (shown to be very fast for memory-resident data, as compared to a wide variety of other induction algorithms [13]). It also produces suggestions not commonly considered [14]. For example, the enumeration contains plans that use discretization as a preprocess. Research has shown that discretization as a preproces...

Well-Trained PETs: Improving Probability Estimation Trees

by Foster Provost, Pedro Domingos, 2000
"... Decision trees are one of the most effective and widely used classification methods. However, many applications require class probability estimates, and probability estimation trees (PETs) have the same attractive features as classification trees (e.g., comprehensibility, accuracy and efficiency in ..."
Abstract - Cited by 53 (6 self) - Add to MetaCart
Decision trees are one of the most effective and widely used classification methods. However, many applications require class probability estimates, and probability estimation trees (PETs) have the same attractive features as classification trees (e.g., comprehensibility, accuracy and efficiency in high dimensions and on large data sets). Unfortunately, decision trees have been found to provide poor probability estimates. Several techniques have been proposed to build more accurate PETs, but, to our knowledge, there has not been a systematic experimental analysis of which techniques actually improve the probability estimates, and by how much. In this paper we first discuss why the decision-tree representation is not intrinsically inadequate for probability estimation. Inaccurate probabilities are partially the result of decision-tree induction algorithms that focus on maximizing classification accuracy and minimizing tree size (for example via reduced-error pruning). Larger tree...

Top-Down Induction of Decision Trees Classifiers - A Survey

by Lior Rokach, Oded Maimon - IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 2005
"... Abstract—Decision trees are considered to be one of the most popular approaches for representing classifiers. Researchers from various disciplines such as statistics, machine learning, pattern recognition, and data mining considered the issue of growing a decision tree from available data. This pape ..."
Abstract - Cited by 52 (4 self) - Add to MetaCart
Abstract—Decision trees are considered to be one of the most popular approaches for representing classifiers. Researchers from various disciplines such as statistics, machine learning, pattern recognition, and data mining considered the issue of growing a decision tree from available data. This paper presents an updated survey of current methods for constructing decision tree classifiers in a top-down manner. The paper suggests a unified algorithmic framework for presenting these algorithms and describes the various splitting criteria and pruning methodologies. Index Terms—Classification, decision trees, pruning methods, splitting criteria. I.

Citation Context

...ed in this table. Nevertheless, most of these algorithms are variations of the algorithmic framework presented above. A profound comparison of the above algorithms and many others has been conducted in [72]. ...
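As a concrete illustration of the splitting criteria this survey catalogs, the sketch below computes the two most common ones, entropy-based information gain and Gini impurity. It is a generic textbook formulation, not code from the survey:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a label multiset, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity: probability two random draws disagree on class."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, children):
    """Impurity reduction of a split: H(parent) - weighted H(children)."""
    n = len(parent)
    return entropy(parent) - sum(len(ch) / n * entropy(ch) for ch in children)

# A pure split of a balanced binary node recovers the full 1 bit of entropy:
print(information_gain([0, 0, 1, 1], [[0, 0], [1, 1]]))   # 1.0
print(gini([0, 0, 1, 1]))                                 # 0.5
```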

Representing classification problems in genetic programming

by Thomas Loveard, Victor Ciesielski - In Proceedings of the 2001 Congress on Evolutionary Computation, Seoul, Korea, 2001
"... Abstract-In this paper five alternative methods are proposed to perform multi-class classification tasks using genetic programming. These methods are: Binary decomposition, in which the problem is decomposed into a set of binary problems and standard genetic programming methods are applied; Static ..."
Abstract - Cited by 50 (4 self) - Add to MetaCart
Abstract-In this paper five alternative methods are proposed to perform multi-class classification tasks using genetic programming. These methods are: Binary decomposition, in which the problem is decomposed into a set of binary problems and standard genetic programming methods are applied; Static range selection, where the set of real values returned by a genetic program is divided into class boundaries using arbitrarily chosen division points; Dynamic range selection in which a subset of training examples are used to determine where, over the set of reals, class boundaries lie; Class enumeration which constructs programs similar in syntactic structure to a decision tree; and evidence accumulation which allows separate branches of the program to add to the certainty of any given class. Results showed that the dynamic range selection method was well suited to the task of multi-class classification and was capable of producing classifiers more accurate than the other methods tried when comparable training times were allowed. Accuracy of the generated classifiers was comparable to alternative approaches over several datasets.

Citation Context

...ss was chosen. Because of these factors the GP method is seen to be applicable to tasks where accuracy is the most important factor in classification, and training times and understandability are seen as relatively unimportant. 2. The Datasets: A set of six datasets were chosen from the UCI Machine Learning repository [1]. These datasets were chosen because they show variety in their domain, size and in the difficulty of classification. They also vary in the number of target classes for classification. All datasets were comprised of numeric or binary attributes. These datasets were also used in [7] and therefore allow direct comparison of results of GP classifiers to results of well known classification methods. The data sets are as follows: 1. Wisconsin Breast Cancer [8] (W.B.C): Consists of 2 classes and 10 numerical attributes with 699 instances. Sixteen instances containing missing values were removed for the purposes of classification in this investigation. Error rates were estimated using ten fold cross validation. 2. BUPA Liver Disorders (BUPA): Consists of 2 classes and 6 numerical attributes with 345 instances. Error rates were estimated using ten fold cross validation. 3. Pima...
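The range-selection methods in this abstract reduce to mapping a genetic program's real-valued output to a class via boundary points on the real line. A minimal sketch of the static variant, with arbitrary boundary values as in the paper's description; dynamic range selection would instead fit the boundaries to a subset of training examples:

```python
import bisect

def static_range_select(output, boundaries):
    """Map a genetic program's real-valued output to a class label.

    With boundaries [b0, b1]: output < b0 -> class 0,
    b0 <= output < b1 -> class 1, otherwise class 2. The boundary
    values are arbitrarily chosen, as in static range selection.
    """
    return bisect.bisect_right(boundaries, output)

print(static_range_select(-3.7, [-1.0, 1.0]))  # class 0
print(static_range_select(0.2, [-1.0, 1.0]))   # class 1
print(static_range_select(5.0, [-1.0, 1.0]))   # class 2
```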
