Results 1–7 of 7
ROC Graphs: Notes and Practical Considerations for Researchers, 2004
Cited by 227 (1 self)

Abstract:
Receiver Operating Characteristics (ROC) graphs are a useful technique for organizing classifiers and visualizing their performance. ROC graphs are commonly used in medical decision making, and in recent years have been increasingly adopted in the machine learning and data mining research communities. Although ROC graphs are apparently simple, there are some common misconceptions and pitfalls when using them in practice. This article serves both as a tutorial introduction to ROC graphs and as a practical guide for using them in research.
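As a concrete illustration of the mechanics the tutorial covers, the following sketch (my own, not taken from the paper) builds an ROC curve by sweeping a decision threshold over classifier scores and computes the AUC by the trapezoidal rule. Score ties are not handled specially here, so treat it as a minimal demonstration only.

```python
def roc_points(labels, scores):
    """Sweep the threshold over scores (descending) and emit (FPR, TPR) points."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    positives = sum(labels)
    negatives = len(labels) - positives
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _score, label in pairs:
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve via the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

labels = [1, 1, 0, 1, 0, 0]                  # 1 = positive class
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]      # classifier scores
pts = roc_points(labels, scores)
print(auc(pts))  # 0.888... (8/9)
```

The curve always starts at (0, 0) and ends at (1, 1); a random-guessing classifier traces the diagonal with AUC 0.5.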
Learning when Training Data are Costly: The Effect of Class Distribution on Tree Induction, 2002
Cited by 109 (9 self)

Abstract:
For large, real-world inductive learning problems, the number of training examples often must be limited due to the costs associated with procuring, preparing, and storing the data and/or the computational costs associated with learning from the data. One question of practical importance is: if n training examples are going to be selected, in what proportion should the classes be represented? In this article we analyze the relationship between the marginal class distribution of training data and the performance of classification trees induced from these data, when the size of the training set is fixed. We study twenty-six data sets and, for each, determine the best class distribution for learning. Our results show that, for a fixed number of training examples, it is often possible to obtain improved classifier performance by training with a class distribution other than the naturally occurring class distribution. For example, we show that to build a classifier robust to different misclassification costs, a balanced class distribution generally performs quite well. We also describe and evaluate a budget-sensitive progressive-sampling algorithm that selects training examples such that the resulting training set has a good (near-optimal) class distribution for learning.
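The core question the abstract poses, in what proportion to draw n examples, can be sketched as follows. The function and parameter names below are illustrative assumptions, not the paper's budget-sensitive progressive-sampling algorithm: this just draws a fixed budget of examples at a chosen positive-class fraction.

```python
import random

def sample_with_distribution(pos_pool, neg_pool, n, pos_fraction, seed=0):
    """Draw n training examples with a target fraction of positives.

    pos_pool / neg_pool: available examples of each class.
    Caps each class at its pool size, so the result may fall short of n
    when a pool is too small.
    """
    rng = random.Random(seed)
    n_pos = min(round(n * pos_fraction), len(pos_pool))
    n_neg = min(n - n_pos, len(neg_pool))
    return rng.sample(pos_pool, n_pos) + rng.sample(neg_pool, n_neg)

# Balanced 50/50 sample of 50 examples from an imbalanced pool.
train = sample_with_distribution(list(range(100)), list(range(100, 1000)), 50, 0.5)
print(len(train))  # 50: 25 positives, 25 negatives
```

In practice one would evaluate classifiers trained at several values of `pos_fraction` and keep the distribution that scores best on the target measure.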
The Effect of Class Distribution on Classifier Learning: An Empirical Study, 2001
Cited by 82 (2 self)

Abstract:
In this article we analyze the effect of class distribution on classifier learning. We begin by describing the different ways in which class distribution affects learning and how it affects the evaluation of learned classifiers. We then present the results of two comprehensive experimental studies. The first study compares the performance of classifiers generated from unbalanced data sets with the performance of classifiers generated from balanced versions of the same data sets. This comparison allows us to isolate and quantify the effect that the training set's class distribution has on learning and contrast the performance of the classifiers on the minority and majority classes. The second study assesses what distribution is "best" for training, with respect to two performance measures: classification accuracy and the area under the ROC curve (AUC). A tacit assumption behind much research on classifier induction is that the class distribution of the training data should match the "natural" distribution of the data. This study shows that the naturally occurring class distribution often is not best for learning, and often substantially better performance can be obtained by using a different class distribution. Understanding how classifier performance is affected by class distribution can help practitioners to choose training data: in real-world situations the number of training examples often must be limited due to computational costs or the costs associated with procuring and preparing the data.
The Effect Of Small Disjuncts And Class Distribution On Decision Tree Learning, Rutgers University, 2003
Cited by 5 (0 self)

Abstract:
The main goal of classifier learning is to generate a model that makes few misclassification errors. Given this emphasis on error minimization, it makes sense to try to understand how the induction process gives rise to classifiers that make errors and whether we can identify those parts of the classifier that generate most of the errors. In this thesis we provide the first comprehensive studies of two major sources of classification errors. The first study concerns small disjuncts, which are those disjuncts within a classifier that cover only a few training examples. An analysis of classifiers induced from thirty data sets shows that these small disjuncts are extremely error prone and often account for the majority of all classification errors. Because small disjuncts largely determine classifier performance, we use them as a "lens" through which to study classifier induction. Factors such as pruning, training-set size, noise and class imbalance are each analyzed to determine how they affect small disjuncts and, more generally, classifier learning. The second ...
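The claim that small disjuncts account for a disproportionate share of errors can be made concrete with a toy calculation. The function and data below are illustrative assumptions, not from the thesis: each leaf of an induced tree is a disjunct, summarized here as a (covered, errors) pair.

```python
def error_share_of_small_disjuncts(leaves, size_threshold):
    """Fraction of all errors produced by disjuncts covering few examples.

    leaves: list of (n_covered, n_errors) pairs, one per disjunct (leaf).
    A disjunct is "small" when n_covered <= size_threshold.
    """
    total_errors = sum(e for _, e in leaves)
    small_errors = sum(e for n, e in leaves if n <= size_threshold)
    return small_errors / total_errors if total_errors else 0.0

# Three large disjuncts and two small ones: the small ones, covering only
# 6 of 301 training examples, nonetheless hold most of the errors.
leaves = [(120, 2), (95, 1), (80, 1), (4, 3), (2, 2)]
print(error_share_of_small_disjuncts(leaves, size_threshold=5))  # 0.555... (5/9)
```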
A Study of the Influence of Rule Measures in Classifiers Induced by Evolutionary Algorithms
Abstract:
Abstract—The Pittsburgh representation is a well-known encoding for symbolic classifiers in evolutionary algorithms, where each individual represents one symbolic classifier, and each symbolic classifier is composed of a rule set. These rule sets can be interpreted as ordered or unordered sets. The major difference between the two approaches is whether rule ordering defines a rule precedence relationship. Although ordered rule sets are simple to implement in a computer system, the rule set is difficult for human domain experts to interpret, since rules are not independent of each other. In contrast, unordered rule sets are more flexible regarding their interpretation. Rules are independent of each other and can be individually presented to a human domain expert. However, the algorithm that decides the classification of a given example is more complex. As rules have no precedence, an example should be presented to all rules at once, and some criterion must be established to decide the final classification based on all fired rules. A simple approach to deciding which rule should provide the final classification is to select the rule that has the best rating according to a chosen quality measure. Dozens of measures have been proposed in the literature; however, it is not clear whether any of them provides better classification performance. This work performs a comparative study of rule performance measures for unordered symbolic classifiers induced by evolutionary algorithms. We compare 9 rule quality measures on 10 data sets. Our experiments indicate that confidence (also known as precision) presented the best mean results, although most of the rule quality measures yielded comparable classification performance as assessed with the area under the ROC curve (AUC). Index Terms—Symbolic classification, evolutionary algorithm, rule quality measures.
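The confidence (precision) measure the study found best on average can be sketched as follows. The rule representation here, a (predicate, predicted class) pair, is a hypothetical stand-in, not the paper's encoding: confidence is simply the fraction of examples covered by the rule that it classifies correctly.

```python
def confidence(rule, examples):
    """Confidence (precision) of a rule: covered-and-correct / covered.

    rule: (predicate, predicted_class), where predicate(features) -> bool.
    examples: list of (features, label) pairs.
    """
    predicate, predicted = rule
    covered = [(x, y) for x, y in examples if predicate(x)]
    if not covered:
        return 0.0
    correct = sum(1 for _, y in covered if y == predicted)
    return correct / len(covered)

examples = [({"a": 1}, "pos"), ({"a": 1}, "neg"), ({"a": 0}, "neg")]
rule = (lambda x: x["a"] == 1, "pos")   # fires on 2 examples, 1 correct
print(confidence(rule, examples))  # 0.5
```

In an unordered rule set, each fired rule's confidence would be computed this way and the classification of the best-rated rule kept.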
Methodological issues in the development of automatic systems, 2006