Results 1  10
of
790
Evaluating collaborative filtering recommender systems
 ACM TRANSACTIONS ON INFORMATION SYSTEMS
, 2004
"... ..."
An introduction to roc analysis
 Proc. Natl. Acad
, 2006
"... Receiver operating characteristics (ROC) graphs are useful for organizing classifiers and visualizing their performance. ROC graphs are commonly used in medical decision making, and in recent years have been used increasingly in machine learning and data mining research. Although ROC graphs are appa ..."
Abstract

Cited by 704 (1 self)
 Add to MetaCart
(Show Context)
Receiver operating characteristics (ROC) graphs are useful for organizing classifiers and visualizing their performance. ROC graphs are commonly used in medical decision making, and in recent years have been used increasingly in machine learning and data mining research. Although ROC graphs are apparently simple, there are some common misconceptions and pitfalls when using them in practice. The purpose of this article is to serve as an introduction to ROC graphs and as a guide for using them in research.
The use of the area under the ROC curve in the evaluation of machine learning algorithms
 Pattern Recognition
, 1997
"... AbstractIn this paper we investigate the use of the area under the receiver operating characteristic (ROC) curve (AUC) as a performance measure for machine learning algorithms. As a case study we evaluate six machine learning algorithms (C4.5, Multiscale Classifier, Perceptron, Multilayer Percept ..."
Abstract

Cited by 585 (3 self)
 Add to MetaCart
(Show Context)
AbstractIn this paper we investigate the use of the area under the receiver operating characteristic (ROC) curve (AUC) as a performance measure for machine learning algorithms. As a case study we evaluate six machine learning algorithms (C4.5, Multiscale Classifier, Perceptron, Multilayer Perceptron, kNearest Neighbours, and a Quadratic Discriminant Function) on six &quot;real world &quot; medical diagnostics data sets. We compare and discuss the use of AUC to the more conventional overall accuracy and find that AUC exhibits a number of desirable properties when compared to overall accuracy: increased sensitivity in Analysis of Variance (ANOVA) tests; a standard error that decreased as both AUC and the number of test samples increased; decision threshold independent; and it is invariant to a priori class probabilities. The paper concludes with the recommendation that AUC be used in preference to overall accuracy for &quot;single number &quot; evaluation of machine
ROC Graphs: Notes and Practical Considerations for Researchers
, 2004
"... Receiver Operating Characteristics (ROC) graphs are a useful technique for organizing classifiers and visualizing their performance. ROC graphs are commonly used in medical decision making, and in recent years have been increasingly adopted in the machine learning and data mining research communitie ..."
Abstract

Cited by 330 (1 self)
 Add to MetaCart
Receiver Operating Characteristics (ROC) graphs are a useful technique for organizing classifiers and visualizing their performance. ROC graphs are commonly used in medical decision making, and in recent years have been increasingly adopted in the machine learning and data mining research communities. Although ROC graphs are apparently simple, there are some common misconceptions and pitfalls when using them in practice. This article serves both as a tutorial introduction to ROC graphs and as a practical guide for using them in research.
Robust Classification for Imprecise Environments
, 1989
"... In realworld environments it is usually difficult to specify target operating conditions precisely. This uncertainty makes building robust classification systems problematic. We present a method for the comparison of classifier performance that is robust to imprecise class distributions and misclas ..."
Abstract

Cited by 311 (15 self)
 Add to MetaCart
(Show Context)
In realworld environments it is usually difficult to specify target operating conditions precisely. This uncertainty makes building robust classification systems problematic. We present a method for the comparison of classifier performance that is robust to imprecise class distributions and misclassification costs. The ROC convex hull method combines techniques from ROC analysis, decision analysis and computational geometry, and adapts them to the particulars of analyzing learned classifiers. The method is efficient and incremental, minimizes the management of classifier performance data, and allows for clear visual comparisons and sensitivity analyses. We then show that it is possible to build a hybrid classifier that will perform at least as well as the best available classifier for any target conditions. This robust performance extends across a wide variety of comparison frameworks, including the optimization of metrics such as accuracy, expected cost, lift, precision, recall, and ...
Novel methods improve prediction of species’ distributions from occurrence data
 Ecography
, 2006
"... occurrence data ..."
(Show Context)
ROC graphs: Notes and practical considerations for data mining researchers
, 2003
"... Receiver Operating Characteristics (ROC) graphs are a useful technique for organizing classifiers and visualizing their performance. ROC graphs are commonly used in medical decision making, and in recent years have been increasingly adopted in the machine learning and data mining research communitie ..."
Abstract

Cited by 189 (0 self)
 Add to MetaCart
(Show Context)
Receiver Operating Characteristics (ROC) graphs are a useful technique for organizing classifiers and visualizing their performance. ROC graphs are commonly used in medical decision making, and in recent years have been increasingly adopted in the machine learning and data mining research communities. Although ROC graphs are apparently simple, there are some common misconceptions and pitfalls when using them in practice. This article serves both as a tutorial introduction to ROC graphs and as a practical guide for using them in research. Keywords: 1
Tree Induction for Probabilitybased Ranking
, 2002
"... Tree induction is one of the most effective and widely used methods for building classification models. However, many applications require cases to be ranked by the probability of class membership. Probability estimation trees (PETs) have the same attractive features as classification trees (e.g., c ..."
Abstract

Cited by 151 (4 self)
 Add to MetaCart
Tree induction is one of the most effective and widely used methods for building classification models. However, many applications require cases to be ranked by the probability of class membership. Probability estimation trees (PETs) have the same attractive features as classification trees (e.g., comprehensibility, accuracy and efficiency in high dimensions and on large data sets). Unfortunately, decision trees have been found to provide poor probability estimates. Several techniques have been proposed to build more accurate PETs, but, to our knowledge, there has not been a systematic experimental analysis of which techniques actually improve the probabilitybased rankings, and by how much. In this paper we first discuss why the decisiontree representation is not intrinsically inadequate for probability estimation. Inaccurate probabilities are partially the result of decisiontree induction algorithms that focus on maximizing classification accuracy and minimizing tree size (for example via reducederror pruning). Larger trees can be better for probability estimation, even if the extra size is superfluous for accuracy maximization. We then present the results of a comprehensive set of experiments, testing some straghtforward methods for improving probabilitybased rankings. We show that using a simple, common smoothing methodthe Laplace correctionuniformly improves probabilitybased rankings. In addition, bagging substantioJly improves the rankings, and is even more effective for this purpose than for improving accuracy. We conclude that PETs, with these simple modifications, should be considered when rankings based on classmembership probability are required.
Data Mining for Direct Marketing: Problems and Solutions
 Proceedings of
, 1998
"... Direct marketing is a process of identifying likely buyers of certain products and promoting the products accordingly. It is increasingly used by banks, insurance companies, and the retail industry. Data mining can provide an effective tool for direct marketing. During data mining, several specific ..."
Abstract

Cited by 150 (0 self)
 Add to MetaCart
(Show Context)
Direct marketing is a process of identifying likely buyers of certain products and promoting the products accordingly. It is increasingly used by banks, insurance companies, and the retail industry. Data mining can provide an effective tool for direct marketing. During data mining, several specific problems arise. For example, the class distribution is extremely imbalanced (the response rate is about 1~), the predictive accuracy is no longer suitable for evaluating learning methods, and the number of examples can be too large. In this paper, we discuss methods of coping with these problems based on our experience on directmarketing projects using data mining. 1