Results 1–10 of 10
Extremely Randomized Trees
 Machine Learning
, 2003
Abstract

Cited by 130 (34 self)
This paper presents a new learning algorithm based on decision tree ensembles. In opposition to the classical decision tree induction method, the trees of the ensemble are built by selecting the tests during their induction fully at random. This extreme ...
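The fully random test selection that distinguishes this method from classical impurity-optimizing induction can be sketched as follows (a minimal illustration under our own function names, not the authors' implementation):

```python
import random

def random_split(X, features, rng):
    """Choose a test fully at random: pick an attribute, then a cut-point
    uniformly between its observed min and max, with no impurity-based
    optimization at all."""
    f = rng.choice(features)
    column = [row[f] for row in X]
    cut = rng.uniform(min(column), max(column))
    return f, cut

def partition(X, y, f, cut):
    """Split the labeled samples on the randomly chosen test."""
    left = [(x, t) for x, t in zip(X, y) if x[f] < cut]
    right = [(x, t) for x, t in zip(X, y) if x[f] >= cut]
    return left, right
```

An ensemble is then grown by applying this recursively to build many trees and averaging their predictions, so the variance introduced by the random tests cancels out across the ensemble.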
Statistical Behavior and Consistency of Classification Methods based on Convex Risk Minimization
, 2001
Abstract

Cited by 112 (6 self)
We study how closely the optimal Bayes error rate can be approached using a classification algorithm that computes a classifier by minimizing a convex upper bound of the classification error function. The measurement of closeness is characterized by the loss function used in the estimation. We show that such a classification scheme can generally be regarded as a (non-maximum-likelihood) conditional in-class probability estimate, and we use this analysis to compare various convex loss functions that have appeared in the literature. Furthermore, the theoretical insight allows us to design good loss functions with desirable properties. Another aspect of our analysis is to demonstrate the consistency of certain classification methods using convex risk minimization.
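The convex upper bounds in question can be made concrete: each surrogate below dominates the 0-1 classification error as a function of the margin y·f(x) (a small illustration of the idea; the base-2 scaling of the logistic loss is our choice so that it is a genuine upper bound):

```python
import math

def zero_one(margin):
    """The true classification error: 1 iff the margin is non-positive."""
    return 1.0 if margin <= 0 else 0.0

def hinge(margin):
    """Hinge loss, the convex surrogate used by SVMs."""
    return max(0.0, 1.0 - margin)

def exponential(margin):
    """Exponential loss, the surrogate minimized by AdaBoost."""
    return math.exp(-margin)

def logistic(margin):
    """Logistic loss, scaled by log 2 so it equals 1 at margin 0
    and upper-bounds the 0-1 loss everywhere."""
    return math.log(1.0 + math.exp(-margin), 2)
```

Minimizing any of these convex surrogates is tractable, and the paper's question is how much Bayes-optimality is lost by substituting them for the non-convex 0-1 loss.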
An experimental study on diversity for bagging and boosting with linear classifiers
 Information Fusion
, 2002
MEGA: The Maximizing Expected Generalization Algorithm for Learning Complex Query Concepts
 ACM Transactions on Information Systems
, 2000
Abstract

Cited by 16 (9 self)
Specifying exact query concepts has become increasingly challenging to end-users. This is because many query concepts (e.g., those for looking up a multimedia object) can be hard to articulate, and articulation can be subjective. In this study, we propose a query-concept learner that learns query criteria through an intelligent sampling process. Our concept learner aims to fulfill two primary design objectives: (1) it has to be expressive in order to model most practical query concepts, and (2) it must learn a concept quickly and with a small number of labeled data, since online users tend to be too impatient to provide much feedback. To fulfill the first goal, we model query concepts in k-CNF, which can express almost all practical query concepts. To fulfill the second design goal, we propose our maximizing expected generalization algorithm (MEGA), which converges to target concepts quickly through its two complementary steps: sample selection and concept refinement. We also propose a divide-and-conquer method that divides the concept-learning task into G subtasks to achieve speedup. We notice that a task must be divided carefully, or search accuracy may suffer. We thus employ a genetic-based mining algorithm to discover good feature groupings. Through analysis and mining results, we observe that organizing image features in a multiresolution manner, and minimizing intra-group feature correlation, can speed up query-concept learning substantially while maintaining high search accuracy. Through examples, analysis, experiments, and a prototype implementation, we show that MEGA converges to query concepts significantly faster than traditional methods. Keywords: query concept, relevance feedback, active learning, data mining.
Boosting and Microarray Data
 Machine Learning
, 2003
Abstract

Cited by 16 (1 self)
We have found one reason why AdaBoost tends not to perform well on gene expression data, and identified simple modifications that improve its ability to find accurate class prediction rules. These modifications appear especially to be needed when there is a strong association between expression profiles and class designations. Cross-validation analysis of six microarray datasets with different characteristics suggests that, suitably modified, boosting provides competitive classification accuracy in general. Sometimes the goal ...
Racing Committees for Large Datasets
 In Proceedings of the International Conference on Discovery Science
, 2002
Abstract

Cited by 5 (3 self)
This paper proposes a method for generating classifiers from large datasets by building a committee of simple base classifiers using a standard boosting algorithm. It permits the processing of large datasets even if the underlying base learning algorithm cannot efficiently do so.
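The committee construction referred to here follows standard boosting. A compact sketch of AdaBoost with decision-stump base classifiers (our own minimal illustration of the standard algorithm, not the paper's racing mechanism) looks like this:

```python
import math

def stump_predict(x, feature, threshold, polarity):
    """A decision stump: a one-split classifier returning +1 or -1."""
    return polarity if x[feature] >= threshold else -polarity

def train_stump(X, y, w):
    """Exhaustively pick the stump minimizing weighted training error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(xi, f, thr, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def adaboost(X, y, rounds=10):
    """Build a weighted committee of stumps; labels must be +1/-1."""
    n = len(X)
    w = [1.0 / n] * n
    committee = []
    for _ in range(rounds):
        err, f, thr, pol = train_stump(X, y, w)
        err = max(err, 1e-10)       # avoid log(0) on a perfect stump
        if err >= 0.5:
            break                   # base learner no better than chance
        alpha = 0.5 * math.log((1 - err) / err)
        committee.append((alpha, f, thr, pol))
        # up-weight misclassified examples, down-weight correct ones
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, f, thr, pol))
             for xi, yi, wi in zip(X, y, w)]
        s = sum(w)
        w = [wi / s for wi in w]
    return committee

def predict(committee, x):
    """Sign of the weighted committee vote."""
    score = sum(a * stump_predict(x, f, thr, pol)
                for a, f, thr, pol in committee)
    return 1 if score >= 0 else -1
```

Because each stump is trained in a single pass over the data, the committee as a whole can scale to datasets that a more expensive base learner could not process directly, which is the premise the racing approach builds on.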
Just how good is maximum entropy? An empirical investigation using ensembles of MEMD models for attribute-value grammars
Abstract

Cited by 3 (0 self)
Maximum entropy has been theoretically argued as being the principled way to estimate models that are only partially determined by some set of empirically observed constraints. However, such arguments hinge upon large sample behaviour, and it is unclear how well maximum entropy performs when this assumption is violated by small samples. Within the maximum entropy / minimum divergence (MEMD) framework, and when operating in the domain of parse selection, we estimate lower and upper bounds on the performance of such models. Maximum entropy, even when samples are small, is shown to produce models near the upper bound. In addition to prediction using single models, we also investigate how well maximum entropy compares with ensembles of MEMD models. Maximum entropy is found to be competitive with such ensembles. Since ensemble learning requires substantially more computational resources than single model learning, yet delivers similar results to maximum entropy, this is a useful finding.
Generalization Error of Combined Classifiers
 Journal of Computer and System Sciences
, 1997
Abstract

Cited by 3 (0 self)
In this paper we present an upper bound on the generalization error of any thresholded convex combination of functions, which are themselves thresholded convex combinations of functions, in terms of the margin and the average complexity of the combined functions. Furthermore, by considering a single hidden-layer threshold network as a convex combination of single perceptrons, we obtain a similar bound on the generalization error of such networks in terms of the margin and the average complexity of the perceptrons (where the average is taken over the weights assigned to the perceptrons). The complexity of each perceptron in this result is related to the proportion of training examples which are close to the perceptron's threshold. The measure of complexity suggested by existing VC bounds for threshold networks (see, for example, [3]) is related to the number of weights in the network. If a network classifies most examples with a large margin and the network's perceptrons have few examples close to threshold, then our measure of complexity can be considerably smaller.
Overfit Bounds for Classification Algorithms
, 2000
Abstract
A major issue in machine learning is managing the overfit of a learning algorithm. The overfit of an algorithm is the degree to which the concept learned is representative of the data available at the time the learning takes place, but not of the mechanism which generated the data. In the context of classification, overfit is expressed as the difference between the degree of success in classifying the training set and in classifying a test set. This dissertation deals with the analysis of the overfit behavior of several classes of classification algorithms. The analysis provides insight into the sources of overfit and yields bounds which can be used to control the overfit of some well-known algorithms, such as classification trees, the perceptron and edited nearest neighbors. The structure of the dissertation is as ...