Extremely Randomized Trees
Machine Learning, 2003
Cited by 88 (30 self)
This paper presents a new learning algorithm based on decision tree ensembles. In opposition to the classical decision tree induction method, the trees of the ensemble are built by selecting the tests during their induction fully at random. This extreme
Statistical Behavior and Consistency of Classification Methods based on Convex Risk Minimization
2001
Cited by 85 (4 self)
We study how closely the optimal Bayes error rate can be approached by a classification algorithm that computes a classifier by minimizing a convex upper bound of the classification error function. The measurement of closeness is characterized by the loss function used in the estimation. We show that such a classification scheme can be generally regarded as a (non maximum-likelihood) conditional in-class probability estimate, and we use this analysis to compare various convex loss functions that have appeared in the literature. Furthermore, the theoretical insight allows us to design good loss functions with desirable properties. Another aspect of our analysis is to demonstrate the consistency of certain classification methods using convex risk minimization.
Boosting and Microarray Data
Machine Learning, 2003
Cited by 14 (1 self)
We have found one reason why AdaBoost tends not to perform well on gene expression data, and identified simple modifications that improve its ability to find accurate class prediction rules. These modifications appear especially to be needed when there is a strong association between expression profiles and class designations. Cross-validation analysis of six microarray datasets with different characteristics suggests that, suitably modified, boosting provides competitive classification accuracy in general. Sometimes the goal
MEGA --- The Maximizing Expected Generalization Algorithm for Learning Complex Query Concepts
ACM Transactions on Information Systems, 2000
Cited by 14 (7 self)
Specifying exact query concepts has become increasingly challenging to end-users. This is because many query concepts (e.g., those for looking up a multimedia object) can be hard to articulate, and articulation can be subjective. In this study, we propose a query-concept learner that learns query criteria through an intelligent sampling process. Our concept learner aims to fulfill two primary design objectives: 1) it has to be expressive in order to model most practical query concepts, and 2) it must learn a concept quickly and with a small amount of labeled data, since online users tend to be too impatient to provide much feedback. To fulfill the first goal, we model query concepts in k-CNF, which can express almost all practical query concepts. To fulfill the second design goal, we propose our maximizing expected generalization algorithm (MEGA), which converges to target concepts quickly by its two complementary steps: sample selection and concept refinement. We also propose a divide-and-conquer method that divides the concept-learning task into G subtasks to achieve speedup. We notice that a task must be divided carefully, or search accuracy may suffer. We thus employ a genetic-based mining algorithm to discover good feature groupings. Through analysis and mining results, we observe that organizing image features in a multi-resolution manner, and minimizing intragroup feature correlation, can speed up query-concept learning substantially while maintaining high search accuracy. Through examples, analysis, experiments, and a prototype implementation, we show that MEGA converges to query concepts significantly faster than traditional methods. Keywords: query concept, relevance feedback, active learning, data mining.
An experimental study on diversity for bagging and boosting with linear classifiers
Information Fusion, 2002
Racing Committees for Large Datasets
In Proceedings of the International Conference on Discovery Science, 2002
Cited by 5 (3 self)
This paper proposes a method for generating classifiers from large datasets by building a committee of simple base classifiers using a standard boosting algorithm. It permits the processing of large datasets even if the underlying base learning algorithm cannot efficiently do so.
Just how good is maximum entropy? An empirical investigation using ensembles of MEMD models for attribute-value grammars
Cited by 3 (0 self)
Maximum entropy has been theoretically argued as being the principled way to estimate models that are only partially determined by some set of empirically observed constraints. However, such arguments hinge upon large sample behaviour, and it is unclear how well maximum entropy performs when this assumption is violated by small samples. Within the maximum entropy / minimum divergence (MEMD) framework, and when operating in the domain of parse selection, we estimate lower and upper bounds on the performance of such models. Maximum entropy, even when samples are small, is shown to produce models near the upper bound. In addition to prediction using single models, we also investigate how well maximum entropy compares with ensembles of MEMD models. Maximum entropy is found to be competitive with such ensembles. Since ensemble learning requires substantially more computational resources than single model learning, yet delivers similar results to maximum entropy, this is a useful finding.
Generalization Error of Combined Classifiers
Journal of Computer and System Sciences, 1997
Cited by 3 (0 self)
In this paper we present an upper bound on the generalization error of any thresholded convex combination of functions which are themselves thresholded convex combinations of functions in terms of the margin and the average complexity of the combined functions. Furthermore, by considering a single hidden-layer threshold network as a convex combination of single perceptrons, we obtain a similar bound on the generalization error of such networks in terms of the margin and the average complexity of the perceptrons (where the average is in terms of the weights assigned to the perceptrons). The complexity of each perceptron in this result is related to the proportion of training examples which are close to the perceptron threshold. The measure of complexity suggested by existing VC bounds for threshold networks (see, for example, [3]) is related to the number of weights in the network. If a network classifies most examples with a large margin and the network's perceptrons have few examples close to threshold, then our measure of complexity can be considerably smaller
Overfit Bounds for Classification Algorithms
2000
A major issue in machine learning is managing the overfit of a learning algorithm. The overfit of an algorithm is the degree to which the concept learned is representative of the data available at the time the learning takes place, but not of the mechanism which generated the data. In the context of classification, the overfit is expressed as the difference between the degree of success of classification of the training set and that of the classification of a test set. This dissertation deals with the analysis of the overfit behavior of several classes of classification algorithms. The analysis provides insight into the sources of overfit and yields bounds which can be used to control the overfit of some well-known algorithms, such as classification trees, the perceptron and edited nearest neighbors. The structure of the dissertation is as ...

