An Efficient Boosting Algorithm for Combining Preferences
1999
Cited by 383 (13 self)
The problem of combining preferences arises in several applications, such as combining the results of different search engines. This work describes an efficient algorithm for combining multiple preferences. We first give a formal framework for the problem. We then describe and analyze a new boosting algorithm for combining preferences called RankBoost. We also describe an efficient implementation of the algorithm for certain natural cases. We discuss two experiments we carried out to assess the performance of RankBoost. In the first experiment, we used the algorithm to combine different WWW search strategies, each of which is a query expansion for a given domain. For this task, we compare the performance of RankBoost to the individual search strategies. The second experiment is a collaborative-filtering task for making movie recommendations. Here, we present results comparing RankBoost to nearest-neighbor and regression algorithms.
Extremely Randomized Trees
Machine Learning, 2003
Cited by 88 (30 self)
This paper presents a new learning algorithm based on decision tree ensembles. In opposition to the classical decision tree induction method, the trees of the ensemble are built by selecting the tests during their induction fully at random. This extreme
Measures of Diversity in Classifier Ensembles and Their Relationship with the Ensemble Accuracy
2003
Cited by 81 (0 self)
Diversity among the members of a team of classifiers is deemed to be a key issue in classifier combination. However, measuring diversity is not straightforward because there is no generally accepted formal definition. We have found and studied ten statistics which can measure diversity among binary classifier outputs (correct or incorrect vote for the class label): four averaged pairwise measures (the Q statistic, the correlation, the disagreement and the double fault) and six non-pairwise measures (the entropy of the votes, the difficulty index, the Kohavi-Wolpert variance, the interrater agreement, the generalized diversity, and the coincident failure diversity). Four experiments have been designed to examine the relationship between the accuracy of the team and the measures of diversity, and among the measures themselves. Although there are proven connections between diversity and accuracy in some special cases, our results raise some doubts about the usefulness of diversity measures in building classifier ensembles in real-life pattern recognition problems.
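The four pairwise measures named in this abstract can be sketched in a few lines. The formulas below are the commonly cited definitions based on counts of jointly correct and incorrect votes; they are not spelled out in the abstract itself, and the function name `pairwise_diversity` is a hypothetical choice for illustration.

```python
from math import sqrt


def pairwise_diversity(correct_a, correct_b):
    """Pairwise diversity statistics for two classifiers.

    correct_a and correct_b are equal-length sequences of booleans, True
    where the classifier voted for the correct class label.
    Assumed standard definitions (not taken from the abstract):
      Q            = (n11*n00 - n01*n10) / (n11*n00 + n01*n10)
      correlation  = (n11*n00 - n01*n10) / sqrt of the product of the margins
      disagreement = (n01 + n10) / n
      double fault = n00 / n
    """
    n = len(correct_a)
    if n == 0 or n != len(correct_b):
        raise ValueError("inputs must be non-empty and of equal length")

    pairs = list(zip(correct_a, correct_b))
    n11 = sum(1 for a, b in pairs if a and b)          # both correct
    n00 = sum(1 for a, b in pairs if not a and not b)  # both wrong
    n10 = sum(1 for a, b in pairs if a and not b)      # only A correct
    n01 = sum(1 for a, b in pairs if not a and b)      # only B correct

    cross = n11 * n00 - n01 * n10
    q_den = n11 * n00 + n01 * n10
    rho_den = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    nan = float("nan")  # returned when a ratio is undefined

    return {
        "Q": cross / q_den if q_den else nan,
        "correlation": cross / rho_den if rho_den else nan,
        "disagreement": (n01 + n10) / n,
        "double_fault": n00 / n,
    }
```

For an ensemble of more than two classifiers, the averaged pairwise versions would take the mean of each statistic over all classifier pairs.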
Learning and Making Decisions When Costs and Probabilities are Both Unknown
In Proceedings of the Seventh International Conference on Knowledge Discovery and Data Mining, 2001
Cited by 73 (8 self)
In many machine learning domains, misclassification costs are different for different examples, in the same way that class membership probabilities are example-dependent. In these domains, both costs and probabilities are unknown for test examples, so both cost estimators and probability estimators must be learned. This paper first discusses how to make optimal decisions given cost and probability estimates, and then presents decision tree learning methods for obtaining well-calibrated probability estimates. The paper then explains how to obtain unbiased estimators for example-dependent costs, taking into account the difficulty that in general, probabilities and costs are not independent random variables, and the training examples for which costs are known are not representative of all examples. The latter problem is called sample selection bias in econometrics. Our solution to it is based on Nobel prize-winning work due to the economist James Heckman. We show that the methods we propose are s...
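As a rough illustration of making a decision from cost and probability estimates, the sketch below applies the generic minimum-expected-cost rule: choose the action whose cost, averaged over the estimated class probabilities, is lowest. This is a textbook rule given here under assumptions, not necessarily the exact procedure of the paper; the function name and the cost-matrix layout are hypothetical.

```python
def min_expected_cost_action(class_probs, cost):
    """Pick the action with the lowest expected cost for one example.

    class_probs[j] is the estimated P(class j | x) for this example.
    cost[i][j] is the estimated cost of taking action i when the true
    class is j; it may differ from example to example.
    Returns (index of the best action, list of expected costs).
    """
    expected = [
        sum(p * c for p, c in zip(class_probs, row))  # expectation over classes
        for row in cost
    ]
    best = min(range(len(expected)), key=expected.__getitem__)
    return best, expected


# Example: class 1 is unlikely, but ignoring it (action 0) is very costly.
action, costs = min_expected_cost_action([0.9, 0.1], [[0, 10], [1, 0]])
```

Because the cost matrix is passed in per call, the same function works when costs are themselves estimates that vary with the example.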
Online Bagging and Boosting
In Artificial Intelligence and Statistics 2001, 2001
Cited by 53 (1 self)
Bagging and boosting are well-known ensemble learning methods. They combine multiple learned base models with the aim of improving generalization performance. To date, they have been used primarily in batch mode, and no effective online versions have been proposed. We present simple online bagging and boosting algorithms that we claim perform as well as their batch counterparts.
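A minimal sketch of online bagging follows. It assumes the usual description of the method, in which each incoming example is presented to every base model k times with k drawn from a Poisson(1) distribution, approximating bootstrap resampling one example at a time. That detail is not stated in the abstract, and the classes `MajorityClassLearner` and `OnlineBagging` are hypothetical names used only for illustration.

```python
import math
import random
from collections import Counter


def poisson(lam, rng):
    """Draw a Poisson(lam) sample using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class MajorityClassLearner:
    """Toy incremental base model: predicts the most frequent label seen."""

    def __init__(self):
        self.counts = Counter()

    def update(self, x, y):
        self.counts[y] += 1

    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None


class OnlineBagging:
    """Ensemble that processes one example at a time (assumed Poisson(1) scheme)."""

    def __init__(self, make_learner, n_models, seed=0):
        self.rng = random.Random(seed)
        self.models = [make_learner() for _ in range(n_models)]

    def update(self, x, y):
        # Each model sees the example k ~ Poisson(1) times, possibly zero.
        for model in self.models:
            for _ in range(poisson(1.0, self.rng)):
                model.update(x, y)

    def predict(self, x):
        # Unweighted majority vote; models that have seen no data abstain.
        votes = Counter(m.predict(x) for m in self.models)
        votes.pop(None, None)
        return votes.most_common(1)[0][0] if votes else None
```

Any base model exposing incremental `update` and `predict` methods could be substituted for the toy learner.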
Adaptive Sampling Methods for Scaling Up Knowledge Discovery Algorithms
Data Mining and Knowledge Discovery, 1999
Cited by 35 (7 self)
Scalability is a key requirement for any KDD and data mining algorithm, and one of the biggest research challenges is to develop methods that allow the use of large amounts of data. One possible approach for dealing with huge amounts of data is to take a random sample and do data mining on it, since for many data mining applications approximate answers are acceptable. However, as argued by several researchers, random sampling is difficult to use due to the difficulty of determining an appropriate sample size. In this paper, we take a sequential sampling approach for solving this difficulty, and propose an adaptive sampling method that solves a general problem covering many actual problems arising in applications of discovery science. An algorithm following this method obtains examples sequentially in an online fashion, and it determines from the obtained examples whether it has already seen a large enough number of examples. Thus, sample size is not fixed a priori; instead, it adaptively depends on the situation. Due to this adaptiveness, if we are not in a worst-case situation, as fortunately happens in many practical applications, then we can solve the problem with a number of examples much smaller than that required in the worst case. We prove the correctness of our method and estimate its efficiency theoretically. To illustrate its usefulness, we consider one concrete example of using sampling, provide an algorithm based on our method, and show its efficiency by experimental evaluation.
Well-Trained PETs: Improving Probability Estimation Trees
2000
Cited by 30 (5 self)
Decision trees are one of the most effective and widely used classification methods. However, many applications require class probability estimates, and probability estimation trees (PETs) have the same attractive features as classification trees (e.g., comprehensibility, accuracy and efficiency in high dimensions and on large data sets). Unfortunately, decision trees have been found to provide poor probability estimates. Several techniques have been proposed to build more accurate PETs, but, to our knowledge, there has not been a systematic experimental analysis of which techniques actually improve the probability estimates, and by how much. In this paper we first discuss why the decision-tree representation is not intrinsically inadequate for probability estimation. Inaccurate probabilities are partially the result of decision-tree induction algorithms that focus on maximizing classification accuracy and minimizing tree size (for example via reduced-error pruning). Larger tree...
A Comparative Study of Cost-Sensitive Boosting Algorithms
In Proceedings of the 17th International Conference on Machine Learning
Cited by 24 (1 self)
This paper describes a study of different adaptations of boosting algorithms for cost-sensitive classification. The purpose of the study is to improve our understanding of the behavior of various cost-sensitive boosting algorithms and how variations in the boosting procedure affect misclassification cost and high cost error. We find that boosting can be simplified for cost-sensitive classification. A new variant, which excludes a factor used in ordinary boosting, performs best at minimizing high cost errors and it almost always performs better than AdaBoost. We also find that cost-sensitive boosting seeks to minimize high cost errors rather than cost, and a minimum expected cost criterion, applied during classification, greatly enhances the performance of all cost-sensitive adaptations of boosting algorithms. We show a strong correlation between an algorithm that produces small model size and its success in reducing high cost errors. For a recently proposed method, AdaCost,...
Ten Measures of Diversity in Classifier Ensembles: Limits for Two Classifiers
In Proc. of IEE Workshop on Intelligent Sensor Processing, 2001
Cited by 22 (4 self)
Independence and dependence of classifier outputs have been debated in the recent literature giving rise to notions such as diversity, complementarity, orthogonality, etc. There seems to be no consensus on the meaning of these notions beyond the intuitive perception. Here we summarize 10 measures of classifier diversity: 4 pairwise and 6 non-pairwise measures. We derive the limits of the measures for 2 classifiers of equal accuracy.

