Maximizing the Area under the ROC Curve with Decision Lists and Rule Sets

by Unknown Authors

Abstract:

Decision lists (or ordered rule sets) have two attractive properties compared to unordered rule sets: they require a simpler classification procedure and they allow for a more compact representation. However, it is an open question what effect these properties have on the area under the ROC curve (AUC). Two ways of forming decision lists are considered in this study: generating a sequence of rules with a default rule for one of the classes, and imposing an order upon rules that have been generated for all classes. An empirical investigation shows that the latter method gives a significantly higher AUC than the former, demonstrating that the compactness obtained by using one of the classes as a default is indeed associated with a cost. Furthermore, using all applicable rules rather than only the first applicable rule in an ordered set yields a further significant improvement in AUC, demonstrating that the simple classification procedure is also associated with a cost. The observed gains in AUC for unordered rule sets over decision lists can be explained by the fact that learning rules for all classes, and combining multiple rules, allows examples to be ranked on a finer-grained scale than applying rules in a fixed order with a default rule for one of the classes.
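The contrast the abstract draws between using the first applicable rule and using all applicable rules can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the rules, their coverage counts, the six toy examples, the default score, the use of a Laplace-corrected estimate, and the choice to combine rules by summing their coverage counts. AUC is computed as the fraction of positive/negative pairs ranked correctly, with ties counted as one half.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical rule format: (condition, positives covered, negatives covered),
# where the counts are assumed to come from some training set.
Rule = Tuple[Callable[[Dict[str, int]], bool], int, int]


def laplace(pos: int, neg: int) -> float:
    """Laplace-corrected estimate of P(positive) for two classes."""
    return (pos + 1) / (pos + neg + 2)


def score_first_match(rules: List[Rule], default: float, x: Dict[str, int]) -> float:
    """Decision-list scoring: only the first applicable rule is used."""
    for cond, pos, neg in rules:
        if cond(x):
            return laplace(pos, neg)
    return default


def score_all_matches(rules: List[Rule], default: float, x: Dict[str, int]) -> float:
    """Unordered scoring: pool the coverage counts of every applicable rule.

    Summing counts is one simple combination scheme, assumed here.
    """
    matched = [(pos, neg) for cond, pos, neg in rules if cond(x)]
    if not matched:
        return default
    return laplace(sum(p for p, _ in matched), sum(n for _, n in matched))


def auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy rules: two that mostly cover positives, one that mostly covers negatives.
rules: List[Rule] = [
    (lambda x: x.get("a", 0) == 1, 8, 2),
    (lambda x: x.get("b", 0) == 1, 6, 4),
    (lambda x: x.get("c", 0) == 1, 1, 9),
]
DEFAULT_SCORE = 0.3  # assumed score when no rule applies

examples = [
    {"a": 1, "b": 1}, {"a": 1}, {"a": 1, "c": 1},
    {"b": 1}, {"b": 1, "c": 1}, {},
]
labels = [1, 1, 0, 1, 0, 0]

first = [score_first_match(rules, DEFAULT_SCORE, x) for x in examples]
pooled = [score_all_matches(rules, DEFAULT_SCORE, x) for x in examples]

auc_first = auc(first, labels)
auc_all = auc(pooled, labels)
print(f"first applicable rule: AUC = {auc_first:.3f}")
print(f"all applicable rules:  AUC = {auc_all:.3f}")
```

In this toy setting the first-match scorer can only emit one score per rule plus the default, so examples covered by the same leading rule are tied regardless of which other rules apply. Pooling all applicable rules separates those examples, which is the finer-grained ranking the abstract refers to. The numbers say nothing about the paper's empirical results.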

Citations

2244 UCI Repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html – Merz, Murphy - 1998
662 Fast Effective Rule Induction – Cohen - 1995
621 The CN2 induction algorithm – Clark, Niblett - 1989
312 Learning decision lists – Rivest - 1987
252 Rule induction with CN2: some recent improvements – Clark, Boswell - 1991
240 Use of the area under the ROC curve in the evaluation of machine learning algorithms – Bradley - 1997
218 The case against accuracy estimation for comparing induction algorithms – Provost, Fawcett, et al. - 1998
94 Generating accurate rule sets without global optimization – Frank, Witten - 1998
87 ROC graphs: Notes and practical considerations for data mining researchers – Fawcett - 2003
84 Incremental reduced error pruning – Fürnkranz, Widmer - 1994
83 Separate-and-conquer rule learning – Fürnkranz - 1999
71 Tree induction for probability-based ranking – Provost, Domingos - 2009
48 On estimating probabilities in tree pruning – Cestnik, Bratko - 1991
28 Using Rule Sets to Maximize ROC Performance – Fawcett - 2001
23 ROC ’n’ rule learning – towards a better understanding of covering algorithms – Fürnkranz, Flach
20 Subgroup discovery with CN2-SD – Lavrač, Kavšek, et al. - 1999
19 Decision Tree with Better Ranking – Ling, Yan - 2003
9 Improving the AUC of Probabilistic Estimation Trees – Ferri, Flach, et al. - 2003
4 Resolving rule conflicts with double induction – Lindgren, Boström - 2004
3 IREP++, a faster rule learning algorithm – Dain, Cunningham, et al. - 2004
3 ROCCER: A ROC convex hull rule learning algorithm – Prati, Flach - 2004
2 Pruning and exclusion criteria for unordered incremental reduced error pruning – Boström - 2004
2 Classification with intersecting rules – Lindgren, Boström - 2002