Results 1–10 of 34
An Efficient Boosting Algorithm for Combining Preferences
, 1999
Cited by 524 (18 self)
Abstract:
The problem of combining preferences arises in several applications, such as combining the results of different search engines. This work describes an efficient algorithm for combining multiple preferences. We first give a formal framework for the problem. We then describe and analyze a new boosting algorithm for combining preferences called RankBoost. We also describe an efficient implementation of the algorithm for certain natural cases. We discuss two experiments we carried out to assess the performance of RankBoost. In the first experiment, we used the algorithm to combine different WWW search strategies, each of which is a query expansion for a given domain. For this task, we compare the performance of RankBoost to the individual search strategies. The second experiment is a collaborative-filtering task for making movie recommendations. Here, we present results comparing RankBoost to nearest-neighbor and regression algorithms.
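RankBoost itself iteratively reweights misordered preference pairs; the quantity it drives down is the weighted pairwise misranking loss. A minimal sketch of that loss for a fixed weighted combination of rankers (names and toy data are hypothetical, not from the paper):

```python
# Pairwise misranking loss: the fraction of preference pairs (a, b),
# meaning "a should rank above b", that a scoring function gets wrong.
# RankBoost minimizes a weighted version of this loss; here we only
# evaluate it for a fixed linear combination of two base rankers.

def misranking_loss(scores, pref_pairs):
    """scores: dict item -> real score; pref_pairs: list of (above, below)."""
    wrong = sum(1 for a, b in pref_pairs if scores[a] <= scores[b])
    return wrong / len(pref_pairs)

def combine(rankers, weights):
    """Weighted sum of per-ranker scores: a simple stand-in for the
    final combination of weak rankings produced by boosting."""
    items = rankers[0].keys()
    return {x: sum(w * r[x] for w, r in zip(weights, rankers)) for x in items}

# Two toy search strategies scoring documents d1..d3.
r1 = {"d1": 3.0, "d2": 1.0, "d3": 2.0}
r2 = {"d1": 1.0, "d2": 3.0, "d3": 2.0}
pairs = [("d1", "d2"), ("d1", "d3"), ("d3", "d2")]  # desired order: d1 > d3 > d2

combined = combine([r1, r2], [0.8, 0.2])
print(misranking_loss(combined, pairs))  # -> 0.0: the combination orders all pairs correctly
```

Note that `r2` alone misorders all three pairs; the weighted combination fixes this, which is exactly the effect the combining algorithm aims for.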
Extremely Randomized Trees
 Machine Learning
, 2003
Cited by 133 (35 self)
Abstract:
This paper presents a new learning algorithm based on decision tree ensembles. In contrast to the classical decision tree induction method, the trees of the ensemble are built by selecting the tests during their induction fully at random. This extreme ...
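The core departure from classical induction is the split-selection step: instead of exhaustively searching all thresholds, a few random feature/cut-point candidates are drawn and the best of those is kept. A simplified sketch of that step (helper names and the one-threshold-per-candidate simplification are mine, not the paper's exact procedure):

```python
import random

def gini(labels):
    """Gini impurity of a list of binary labels 0/1 (0 = pure node)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def random_split(X, y, n_candidates=3):
    """Extremely-randomized split selection: for a few randomly chosen
    features, draw ONE random cut-point each, then keep the candidate
    with the lowest weighted impurity -- no exhaustive threshold search
    as in classical CART-style induction."""
    n_features = len(X[0])
    best = None
    for _ in range(n_candidates):
        f = random.randrange(n_features)
        lo = min(row[f] for row in X)
        hi = max(row[f] for row in X)
        t = random.uniform(lo, hi)
        left = [yi for row, yi in zip(X, y) if row[f] < t]
        right = [yi for row, yi in zip(X, y) if row[f] >= t]
        score = -(gini(left) * len(left) + gini(right) * len(right))
        if best is None or score > best[0]:
            best = (score, f, t)
    return best[1], best[2]  # chosen feature index and threshold

random.seed(1)
X = [[0.0, 5.0], [1.0, 6.0], [0.0, 7.0], [1.0, 8.0]]
y = [0, 0, 1, 1]
f, t = random_split(X, y)
```

Averaging many trees built this way trades a small increase in bias for a large variance reduction, which is the rationale behind the ensemble.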
Measures of Diversity in Classifier Ensembles and Their Relationship with the Ensemble Accuracy
, 2003
Cited by 129 (0 self)
Abstract:
Diversity among the members of a team of classifiers is deemed to be a key issue in classifier combination. However, measuring diversity is not straightforward because there is no generally accepted formal definition. We have found and studied ten statistics which can measure diversity among binary classifier outputs (correct or incorrect vote for the class label): four averaged pairwise measures (the Q statistic, the correlation, the disagreement and the double fault) and six non-pairwise measures (the entropy of the votes, the difficulty index, the Kohavi-Wolpert variance, the inter-rater agreement, the generalized diversity, and the coincident failure diversity). Four experiments have been designed to examine the relationship between the accuracy of the team and the measures of diversity, and among the measures themselves. Although there are proven connections between diversity and accuracy in some special cases, our results raise some doubts about the usefulness of diversity measures in building classifier ensembles in real-life pattern recognition problems.
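Two of the pairwise measures listed above, the Q statistic and the disagreement measure, are simple to compute from the 2x2 table of joint correct/incorrect counts. A small illustration (the toy correctness vectors are hypothetical):

```python
def pairwise_diversity(c1, c2):
    """Q statistic and disagreement measure for two classifiers'
    correctness vectors (1 = correct vote, 0 = incorrect).
    Counts n_ab: n11 both correct, n00 both wrong, n10/n01 one wrong.
    (Assumes the Q denominator n11*n00 + n01*n10 is nonzero.)"""
    n11 = sum(1 for a, b in zip(c1, c2) if a == 1 and b == 1)
    n00 = sum(1 for a, b in zip(c1, c2) if a == 0 and b == 0)
    n10 = sum(1 for a, b in zip(c1, c2) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(c1, c2) if a == 0 and b == 1)
    q = (n11 * n00 - n01 * n10) / (n11 * n00 + n01 * n10)
    disagreement = (n01 + n10) / len(c1)
    return q, disagreement

# Two classifiers that agree on 4 of 6 examples:
c1 = [1, 1, 0, 0, 1, 0]
c2 = [1, 0, 1, 0, 1, 0]
q, dis = pairwise_diversity(c1, c2)
print(q, dis)  # -> 0.6 and 1/3
```

Q ranges over [-1, 1]; identical error patterns give Q = 1, while statistically independent classifiers give Q near 0, which is why low or negative Q is read as high diversity.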
Learning and Making Decisions When Costs and Probabilities are Both Unknown
 In Proceedings of the Seventh International Conference on Knowledge Discovery and Data Mining
, 2001
Cited by 97 (9 self)
Abstract:
In many machine learning domains, misclassification costs are different for different examples, in the same way that class membership probabilities are example-dependent. In these domains, both costs and probabilities are unknown for test examples, so both cost estimators and probability estimators must be learned. This paper first discusses how to make optimal decisions given cost and probability estimates, and then presents decision tree learning methods for obtaining well-calibrated probability estimates. The paper then explains how to obtain unbiased estimators for example-dependent costs, taking into account the difficulty that, in general, probabilities and costs are not independent random variables, and the training examples for which costs are known are not representative of all examples. The latter problem is called sample selection bias in econometrics. Our solution to it is based on Nobel prize-winning work due to the economist James Heckman. We show that the methods we propose are s...
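The "optimal decisions given cost and probability estimates" step is the standard minimum-expected-cost rule: predict the class whose expected cost under the estimated class probabilities is smallest. A minimal sketch (the cost matrix is a hypothetical example, not from the paper):

```python
def min_expected_cost_class(probs, costs):
    """Predict the class i minimizing sum_j probs[j] * costs[i][j],
    where costs[i][j] is the cost of predicting i when the truth is j.
    With a 0/1 cost matrix this reduces to picking the argmax probability."""
    n = len(probs)
    expected = [sum(probs[j] * costs[i][j] for j in range(n)) for i in range(n)]
    return min(range(n), key=lambda i: expected[i])

# Rare but expensive class 1: even at P(class 1) = 0.2 we predict it.
probs = [0.8, 0.2]
costs = [[0, 10],   # predict 0: free if right, costs 10 if truth is 1
         [1, 0]]    # predict 1: costs 1 if truth is 0
print(min_expected_cost_class(probs, costs))  # -> 1
```

Expected costs here are 0.2 * 10 = 2.0 for predicting class 0 versus 0.8 * 1 = 0.8 for predicting class 1, so the cheap false alarm wins; this is why the rule needs calibrated probabilities, not just the correct argmax.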
Online Bagging and Boosting
 In Artificial Intelligence and Statistics 2001
, 2001
Cited by 82 (1 self)
Abstract:
Bagging and boosting are well-known ensemble learning methods. They combine multiple learned base models with the aim of improving generalization performance. To date, they have been used primarily in batch mode, and no effective online versions have been proposed. We present simple online bagging and boosting algorithms that we claim perform as well as their batch counterparts.
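The online bagging algorithm of this paper is usually described via a Poisson(1) approximation to the bootstrap: each arriving example is shown k ~ Poisson(1) times to each base model. A minimal sketch of that idea (the counting "base model" is a hypothetical stand-in for a real online learner):

```python
import math
import random

def poisson1(rng):
    """Sample from Poisson(lambda=1) by Knuth's multiply-uniforms method."""
    L, k, p = math.exp(-1.0), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

class OnlineBagging:
    """Each incoming example is presented k ~ Poisson(1) times to each
    base model, approximating the with-replacement resampling of batch
    bagging without storing the data set."""
    def __init__(self, models, seed=0):
        self.models = models
        self.rng = random.Random(seed)

    def update(self, x, y):
        for m in self.models:
            for _ in range(poisson1(self.rng)):
                m.update(x, y)

class CountingModel:
    """Toy base model that just counts how many times it was updated."""
    def __init__(self):
        self.n_updates = 0
    def update(self, x, y):
        self.n_updates += 1

bag = OnlineBagging([CountingModel() for _ in range(3)], seed=42)
for i in range(200):
    bag.update(x=i, y=i % 2)
# Each model sees each example once on average, mimicking a bootstrap sample.
```

The Poisson(1) trick works because in a bootstrap sample of size N, the number of copies of any fixed example is Binomial(N, 1/N), which converges to Poisson(1) as N grows.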
Adaptive Sampling Methods for Scaling Up Knowledge Discovery Algorithms
 Data Mining and Knowledge Discovery
, 1999
Cited by 47 (7 self)
Abstract:
Scalability is a key requirement for any KDD and data mining algorithm, and one of the biggest research challenges is to develop methods that allow the use of large amounts of data. One possible approach for dealing with huge amounts of data is to take a random sample and do data mining on it, since for many data mining applications approximate answers are acceptable. However, as argued by several researchers, random sampling is difficult to use due to the difficulty of determining an appropriate sample size. In this paper, we take a sequential sampling approach to this difficulty, and propose an adaptive sampling method that solves a general problem covering many actual problems arising in applications of discovery science. An algorithm following this method obtains examples sequentially in an online fashion, and it determines from the obtained examples whether it has already seen a large enough number of examples. Thus, the sample size is not fixed a priori; instead, it adaptively depends on the situation. Due to this adaptiveness, if we are not in a worst-case situation, as fortunately happens in many practical applications, then we can solve the problem with a number of examples much smaller than that required in the worst case. We prove the correctness of our method and estimate its efficiency theoretically. To illustrate its usefulness, we consider one concrete example of using sampling, provide an algorithm based on our method, and show its efficiency by experimental evaluation.
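A simplified illustration of the sequential idea: draw examples one at a time and stop as soon as a Hoeffding bound certifies the running mean is accurate enough. (This sketch applies the fixed-n Hoeffding bound at the stopping time; a rigorous sequential method, like the paper's, must account for the data-dependent stopping rule. Names and parameters are illustrative.)

```python
import math

def adaptive_mean(sample_stream, eps=0.1, delta=0.05, max_n=10**6):
    """Sequential sampling sketch: keep drawing examples until a
    Hoeffding bound suggests the running mean is within eps of the
    true mean with probability >= 1 - delta (values assumed in [0, 1]).
    The stopping point adapts to eps and delta instead of being a
    fixed, worst-case sample size chosen a priori."""
    total, n = 0.0, 0
    for x in sample_stream:
        total += x
        n += 1
        # Hoeffding: |mean_hat - mean| <= sqrt(ln(2/delta) / (2n)) w.p. >= 1 - delta
        if math.sqrt(math.log(2 / delta) / (2 * n)) <= eps:
            return total / n, n
        if n >= max_n:
            break
    return total / n, n

# A million-element stream of alternating 0/1: we stop after a few
# hundred draws instead of reading the whole stream.
mean_hat, n_used = adaptive_mean((i % 2 for i in range(10**6)), eps=0.05)
```

For eps = 0.05 and delta = 0.05 the stopping condition triggers after roughly 740 draws, independent of the stream's total size, which is the scalability payoff described above.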
A Comparative Study of Cost-Sensitive Boosting Algorithms
 In Proceedings of the 17th International Conference on Machine Learning
Cited by 41 (2 self)
Abstract:
This paper describes a study of different adaptations of boosting algorithms for cost-sensitive classification. The purpose of the study is to improve our understanding of the behavior of various cost-sensitive boosting algorithms and how variations in the boosting procedure affect misclassification cost and high-cost error. We find that boosting can be simplified for cost-sensitive classification. A new variant, which excludes a factor used in ordinary boosting, performs best at minimizing high-cost errors, and it almost always performs better than AdaBoost. We also find that cost-sensitive boosting seeks to minimize high-cost errors rather than cost, and a minimum expected cost criterion, applied during classification, greatly enhances the performance of all cost-sensitive adaptations of boosting algorithms. We show a strong correlation between an algorithm that produces small model size and its success in reducing high-cost errors. For a recently proposed method, AdaCost,...
Well-Trained PETs: Improving Probability Estimation Trees
, 2000
Cited by 36 (6 self)
Abstract:
Decision trees are one of the most effective and widely used classification methods. However, many applications require class probability estimates, and probability estimation trees (PETs) have the same attractive features as classification trees (e.g., comprehensibility, accuracy and efficiency in high dimensions and on large data sets). Unfortunately, decision trees have been found to provide poor probability estimates. Several techniques have been proposed to build more accurate PETs, but, to our knowledge, there has not been a systematic experimental analysis of which techniques actually improve the probability estimates, and by how much. In this paper we first discuss why the decision-tree representation is not intrinsically inadequate for probability estimation. Inaccurate probabilities are partially the result of decision-tree induction algorithms that focus on maximizing classification accuracy and minimizing tree size (for example via reduced-error pruning). Larger tree...
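One technique commonly examined in this line of work is Laplace smoothing of the raw leaf frequencies, which fixes the most glaring calibration problem: a small pure leaf claiming probability exactly 1.0. A sketch of the correction (whether it is among this paper's winning techniques is not stated in the excerpt above):

```python
def leaf_probability(k, n, n_classes=2):
    """Class-probability estimate at a decision-tree leaf holding
    k positives out of n examples. The raw frequency k/n is poorly
    calibrated at small leaves; the Laplace correction
    (k + 1) / (n + n_classes) shrinks estimates toward uniform,
    with the shrinkage vanishing as the leaf grows."""
    raw = k / n if n else 1.0 / n_classes
    laplace = (k + 1) / (n + n_classes)
    return raw, laplace

print(leaf_probability(5, 5))    # raw 1.0, Laplace 6/7: a small pure leaf is tempered
print(leaf_probability(50, 50))  # raw 1.0, Laplace 51/52: a large pure leaf stays confident
```

The design intuition: accuracy-oriented induction only needs the argmax at each leaf, so it happily produces tiny leaves with extreme frequencies; smoothing (and less aggressive pruning) repairs the probabilities without changing the tree's classifications.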
Ten Measures of Diversity in Classifier Ensembles: Limits for Two Classifiers
 In Proc. of IEE Workshop on Intelligent Sensor Processing
, 2001
Cited by 26 (4 self)
Abstract:
Independence and dependence of classifier outputs have been debated in the recent literature, giving rise to notions such as diversity, complementarity, orthogonality, etc. There seems to be no consensus on the meaning of these notions beyond the intuitive perception. Here we summarize 10 measures of classifier diversity: 4 pairwise and 6 non-pairwise measures. We derive the limits of the measures for 2 classifiers of equal accuracy.