Learning when Training Data are Costly: The Effect of Class Distribution on Tree Induction (2002)

by Gary M. Weiss, Foster Provost
Citations: 109 - 9 self

Documents Related by Co-Citation

301 SMOTE: Synthetic Minority Over-sampling Technique – Nitesh V. Chawla, Kevin W. Bowyer, Lawrence O. Hall, W. Philip Kegelmeyer - 2002
77 C4.5, Class Imbalance, and Cost Sensitivity: Why Under-sampling beats Over-sampling – Chris Drummond, Robert C. Holte - 2003
301 MetaCost: A General Method for Making Classifiers Cost-Sensitive – Pedro Domingos - 1999
182 The class imbalance problem: A systematic study – N Japkowicz, S Stephen
267 The Foundations of Cost-Sensitive Learning – Charles Elkan - 2001
111 Mining with Rarity: A Unifying Framework – Gary M. Weiss
106 Cost-Sensitive Learning by Cost-Proportionate Example Weighting – Bianca Zadrozny, John Langford, Naoki Abe - 2003
121 Toward Scalable Learning with Non-uniform Distributions: Effects and a Multi-classifier Approach – Philip Chan, Salvatore J. Stolfo - 1999
4905 C4.5: Programs for Machine Learning – J R Quinlan - 1993
91 A Study of the Behavior of Several Methods for Balancing Machine Learning Training Data – Gustavo E. A. P. A. Batista, Ronaldo C. Prati, Maria Carolina Monard - 2004
45 Evaluating Boosting Algorithms to Classify Rare Classes: Comparison And Improvements – Mahesh V. Joshi, Vipin Kumar, Ramesh C. Agarwal - 2001
46 Learning when data sets are imbalanced and when costs are unequal and unknown – Marcus A. Maloof - 2003
30 C4.5 and imbalanced data sets: investigating the effect of sampling method, probabilistic estimate, and decision tree structure – Nitesh V. Chawla - 2003
40 Learning from imbalanced data sets with boosting and data generation: The DataBoost-IM approach – Hongyu Guo - 2004
261 Analysis and Visualization of Classifier Performance: Comparison under Imprecise Class and Cost Distributions – Foster Provost, Tom Fawcett - 1997
36 Extreme rebalancing for SVMs: a case study – B Raskutti, A Kowalczyk
58 SMOTEBoost: improving prediction of the minority class in boosting – Nitesh V. Chawla, Aleksandar Lazarevic, Lawrence O. Hall, Kevin W. Bowyer - 2003
436 The use of the area under the ROC curve in the evaluation of machine learning algorithms – Andrew P. Bradley - 1997
154 Addressing the Curse of Imbalanced Training Sets: One-Sided Selection – Miroslav Kubat, Stan Matwin - 1997