Error Correlation and Error Reduction in Ensemble Classifiers
, 1996
Abstract

Cited by 172 (21 self)
Using an ensemble of classifiers, instead of a single classifier, can lead to improved generalization. The gains obtained by combining, however, are often affected more by the selection of what is presented to the combiner than by the actual combining method that is chosen. In this paper we focus on data selection and classifier training methods, in order to "prepare" classifiers for combining. We review a combining framework for classification problems that quantifies the need for reducing the correlation among individual classifiers. Then, we discuss several methods that make the classifiers in an ensemble more complementary. Experimental results are provided to illustrate the benefits and pitfalls of reducing the correlation among classifiers, especially when the training data is in limited supply.
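The data-selection idea described above can be sketched in a few lines. This is a minimal illustration, not the paper's method: each ensemble member is a nearest-centroid classifier trained on its own bootstrap sample of the data (so the members see different inputs and their errors decorrelate), and the combiner is a plain majority vote. All names and the toy data are invented.

```python
import random
from collections import Counter

def nearest_centroid_fit(points, labels):
    """Per-class centroids of a 1-D feature."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def nearest_centroid_predict(centroids, x):
    return min(centroids, key=lambda y: abs(centroids[y] - x))

def bagged_ensemble(points, labels, n_members=15, seed=0):
    """Each member is trained on its own bootstrap sample (data selection)."""
    rng = random.Random(seed)
    n = len(points)
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        members.append(nearest_centroid_fit([points[i] for i in idx],
                                            [labels[i] for i in idx]))
    return members

def vote(members, x):
    """Majority vote over the ensemble members."""
    return Counter(nearest_centroid_predict(c, x)
                   for c in members).most_common(1)[0][0]

# Two well-separated 1-D classes.
xs = [0.10, 0.20, 0.30, 0.15, 0.90, 1.00, 1.10, 0.85]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
members = bagged_ensemble(xs, ys)
print(vote(members, 0.12), vote(members, 1.05))
```

Because each member is fit on a different resample, their individual mistakes differ, which is exactly the complementarity the abstract argues the combiner needs.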
An empirical comparison of pattern recognition, neural nets, and machine learning classification methods
 In Proceedings of the Eleventh International Joint Conference on Artificial Intelligence
, 1989
Abstract

Cited by 131 (2 self)
Classification methods from statistical pattern recognition, neural nets, and machine learning were applied to four real-world data sets. Each of these data sets has been previously analyzed and reported in the statistical, medical, or machine learning literature. The data sets are characterized by statistical uncertainty; there is no completely accurate solution to these problems. Training-and-testing or resampling techniques are used to estimate the true error rates of the classification methods. Detailed attention is given to the analysis of performance of the neural nets using back propagation. For these problems, which have relatively few hypotheses and features, the machine learning procedures for rule induction or tree induction clearly performed best.
Dimensionality Reduction Using Genetic Algorithms
, 2000
Abstract

Cited by 117 (8 self)
Pattern recognition generally requires that objects be described in terms of a set of measurable features. The selection and quality of the features representing each pattern has a considerable bearing on the success of subsequent pattern classification. Feature extraction is the process of deriving new features from the original features in order to reduce the cost of feature measurement, increase classifier efficiency, and allow higher classification accuracy. Many current feature extraction techniques involve linear transformations of the original pattern vectors to new vectors of lower dimensionality. While this is useful for data visualization and increasing classification efficiency, it does not necessarily reduce the number of features that must be measured, since each new feature may be a linear combination of all of the features in the original pattern vector. Here we present a new approach to feature extraction in which feature selection, feature extraction, and classifier training are performed simultaneously using a genetic algorithm. The genetic algorithm optimizes a vector of feature weights, which are used to scale the individual features in the original pattern vectors in either a linear or a nonlinear fashion. A masking vector is also employed to perform simultaneous selection of a subset of the features. We employ this technique in combination with the k-nearest-neighbor classification rule, and compare the results with classical feature selection and extraction techniques, including sequential floating forward feature selection, and linear discriminant analysis. We also present results for identification of favorable water binding sites on protein surfaces, an important problem in biochemistry and drug design.
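The weight-and-mask idea can be made concrete in a toy sketch. This is illustrative code, not the authors' implementation: a weight vector rescales each feature and a 0/1 mask selects a subset before a 1-nearest-neighbor rule, and a crude random search stands in for the genetic algorithm that the paper actually uses to optimize the weights. All names and the synthetic data are invented.

```python
import random

def knn_accuracy(weights, mask, data, labels):
    """Leave-one-out accuracy of a 1-NN rule on masked, weighted features."""
    def dist(a, b):
        return sum(m * (w * (x - y)) ** 2
                   for m, w, x, y in zip(mask, weights, a, b))
    correct = 0
    for i, (a, ya) in enumerate(zip(data, labels)):
        j = min((k for k in range(len(data)) if k != i),
                key=lambda k: dist(a, data[k]))
        correct += labels[j] == ya
    return correct / len(data)

# Feature 0 separates the two classes; feature 1 is pure noise.
rng = random.Random(1)
data = [(rng.random() * 0.2, rng.random()) for _ in range(10)] + \
       [(1.0 + rng.random() * 0.2, rng.random()) for _ in range(10)]
labels = [0] * 10 + [1] * 10

# Random search over (weights, mask) as a stand-in for the GA's evolution.
best_score, best_mask = -1.0, None
for _ in range(50):
    w = [rng.random(), rng.random()]
    m = [rng.randint(0, 1), rng.randint(0, 1)]
    score = knn_accuracy(w, m, data, labels)
    if score > best_score:
        best_score, best_mask = score, m
print(best_score, best_mask)  # the search should learn to keep feature 0
```

The fitness function (leave-one-out 1-NN accuracy) is what couples feature selection and classifier training into a single optimization, which is the point the abstract emphasizes.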
Linear and Order Statistics Combiners for Pattern Classification
 Combining Artificial Neural Nets
, 1999
Abstract

Cited by 72 (8 self)
Several researchers have experimentally shown that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers. This chapter provides an analytical framework to quantify the improvements in classification results due to combining. The results apply to both linear combiners and order statistics combiners. We first show that, to a first-order approximation, the error rate obtained over and above the Bayes error rate is directly proportional to the variance of the actual decision boundaries around the Bayes optimum boundary. Combining classifiers in output space reduces this variance, and hence reduces the "added" error. If N unbiased classifiers are combined by simple averaging, the added error rate can be reduced by a factor of N if the individual errors in approximating the decision boundaries are uncorrelated. Expressions are then derived for linear combiners which are biased or correlated, and the effect of output correlations on ensemble performance is quantified. For order statistics based nonlinear combiners, we derive expressions that indicate how much the median, the maximum, and in general the ith order statistic can improve classifier performance. The analysis presented here facilitates the understanding of the relationships among error rates, classifier boundary distributions, and combining in output space. Experimental results on several public domain data sets are provided to illustrate the benefits of combining and to support the analytical results.
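The factor-of-N claim is easy to check numerically. The toy check below is not tied to the chapter's derivation: it simply averages N independent, unbiased, Gaussian-noise estimates of a quantity and compares the variance of the average against that of a single estimate.

```python
# Averaging N unbiased, uncorrelated estimators should cut variance ~N-fold.
import random
import statistics

rng = random.Random(0)
N, trials, true_value = 10, 20000, 0.0

# Variance of a single noisy estimate vs. the average of N such estimates.
single = [rng.gauss(true_value, 1.0) for _ in range(trials)]
averaged = [statistics.fmean(rng.gauss(true_value, 1.0) for _ in range(N))
            for _ in range(trials)]

var_single = statistics.pvariance(single)
var_avg = statistics.pvariance(averaged)
print(var_single / var_avg)  # close to N = 10
```

The ratio hovers near N, matching the uncorrelated-errors case in the abstract; correlated errors would shrink the gain, which is what the chapter's correlated-combiner expressions quantify.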
Cross-validation and the bootstrap: estimating the error rate of the predicting rule
, 1995
Small Sample Statistics for Classification Error Rates I: Error Rate Measurements
 Dept. of Inf. and Comp. Sci
, 1996
Abstract

Cited by 31 (1 self)
Several methods (independent subsamples, leave-one-out, cross-validation, and bootstrapping) have been proposed for estimating the error rates of classifiers. The rationale behind the various estimators and the causes of the sometimes conflicting claims regarding their bias and precision are explored in this paper. The biases and variances of each of the estimators are examined empirically. Cross-validation, 10-fold or greater, seems to be the best approach; the other methods are biased, have poorer precision, or are inconsistent. Though unbiased for linear discriminant classifiers, the 632b bootstrap estimator is biased for nearest neighbors classifiers, more so for single nearest neighbor than for three nearest neighbors. The 632b estimator is also biased for CART-style decision trees. Weiss' loo* estimator is unbiased and has better precision than cross-validation for discriminant and nearest neighbors classifiers, but its lack of bias and improved precision for those classifiers do...
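The k-fold cross-validation estimator favored above can be sketched in a few lines. This is illustrative code, not the paper's experiments: the data are shuffled into k folds, each fold serves once as the test set, and the k test error rates are averaged. The trivial threshold "classifier" is a placeholder for whatever learner is being evaluated.

```python
import random

def k_fold_error(data, labels, fit, k=10, seed=0):
    """Average test error over k folds; each point is tested exactly once."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    errors = []
    for fold in folds:
        train = [i for i in idx if i not in fold]
        model = fit([data[i] for i in train], [labels[i] for i in train])
        wrong = sum(model(data[i]) != labels[i] for i in fold)
        errors.append(wrong / len(fold))
    return sum(errors) / k

# Placeholder learner: a fixed threshold rule on a single feature.
def fit_threshold(xs, ys):
    return lambda x: int(x > 0.5)

xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9, 0.15, 0.85]
ys = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]
print(k_fold_error(xs, ys, fit_threshold, k=5))  # 0.0: classes are separable
```

Refitting the model inside each fold is what keeps the estimate (nearly) unbiased, in contrast to the resubstitution error that the competing estimators try to correct in other ways.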
Bootstrap-inspired techniques in computational intelligence
 Signal Processing Magazine, IEEE
, 2007
Abstract

Cited by 21 (5 self)
[Ensemble of classifiers for incremental learning, data fusion, and missing feature analysis] This article is about the success story of a seemingly simple yet extremely powerful approach that has recently reached a celebrity status in statistical and engineering sciences. The hero of this story—bootstrap resampling—is relatively young, but the story itself is a familiar one within the scientific community: a mathematician or a statistician conceives and formulates a theory that is first developed by fellow mathematicians and then brought to fame by other professionals, typically engineers, who point to many applications that can benefit from just such an approach. Signal processing boasts some of the finest examples of such stories, such as the classic story of Fourier transforms or the more contemporary tale of wavelet transforms. Originally developed for estimating sampling distributions of statistical estimators from limited data, bootstrap techniques have since found applications in many areas of engineering—including signal processing—several examples of which appear elsewhere in this issue. This article, however, is about bootstrap-inspired techniques in computational...
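The core bootstrap move mentioned above (estimating the sampling distribution of a statistic from limited data) fits in a few lines. This is an illustrative sketch, not the article's code: resample the data with replacement many times, recompute the statistic on each replicate, and use the spread of the replicates as an estimate of the statistic's sampling variability.

```python
import random
import statistics

def bootstrap_std_err(sample, stat=statistics.fmean, n_boot=2000, seed=0):
    """Bootstrap estimate of the standard error of `stat`."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = [stat([sample[rng.randrange(n)] for _ in range(n)])
                  for _ in range(n_boot)]
    return statistics.pstdev(replicates)

sample = [2.1, 1.9, 2.4, 2.0, 2.2, 1.8, 2.3, 2.1, 2.0, 2.2]
se = bootstrap_std_err(sample)
# Classical standard error of the mean, s / sqrt(n), for comparison:
classical = statistics.stdev(sample) / len(sample) ** 0.5
print(se, classical)  # the two estimates should be of the same order
```

For the mean the bootstrap answer roughly reproduces the textbook formula; its value is that the same resampling loop works for statistics with no closed-form standard error, which is what the engineering applications in the article exploit.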
Learning pattern classification – a survey
 IEEE TRANS. INFORM. THEORY
, 1998
Abstract

Cited by 19 (4 self)
Classical and recent results in statistical pattern recognition and learning theory are reviewed in a two-class pattern classification setting. This basic model best illustrates intuition and analysis techniques while still containing the essential features and serving as a prototype for many applications. Topics discussed include nearest neighbor, kernel, and histogram methods, Vapnik–Chervonenkis theory, and neural networks. The presentation and the large (though nonexhaustive) list of references are geared to provide a useful overview of this field for both specialists and nonspecialists.
Classifier Combining: Analytical Results and Implications
 In Proceedings of the AAAI-96 Workshop on Integrating Multiple Learned Models for Improving and Scaling Machine Learning Algorithms
, 1995
Abstract

Cited by 19 (0 self)
Several researchers have experimentally shown that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers. This paper summarizes our recent theoretical results that quantify the improvements due to multiple classifier combining. Furthermore, we present an extension of this theory that leads to an estimate of the Bayes error rate. Practical aspects such as expressing the confidences in decisions and determining the best data partition/classifier selection are also discussed. Keywords: linear combining, order statistics combining, Bayes error, error correlation, error reduction, ensemble networks, performance limits.
Unbiased Estimation of Ellipses by Bootstrapping
 IEEE PAMI
, 1996
Abstract

Cited by 17 (2 self)
A general method for eliminating the bias of nonlinear estimators using the bootstrap is presented. Instead of the traditional mean bias we consider a definition of bias based on the median. The method is applied to the problem of fitting ellipse segments to noisy data. No assumption beyond being independent identically distributed (i.i.d.) is made about the error distribution, and experiments with both synthetic and real data prove the effectiveness of the technique. Index terms: implicit models, curve fitting, bootstrap, low-level processing.
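The median-based bias correction described above can be sketched generically. This is an illustrative sketch in the spirit of the paper, not its ellipse-fitting code: the median bias of an estimator is approximated by the median of its bootstrap replicates minus the original estimate, and then subtracted off. The deliberately biased spread estimator and the synthetic data are stand-ins for the nonlinear ellipse estimator and the noisy edge data of the paper.

```python
import random
import statistics

def median_bias_corrected(sample, estimator, n_boot=2000, seed=0):
    """Subtract the bootstrap estimate of the *median* bias of `estimator`."""
    rng = random.Random(seed)
    n = len(sample)
    theta = estimator(sample)
    reps = [estimator([sample[rng.randrange(n)] for _ in range(n)])
            for _ in range(n_boot)]
    bias = statistics.median(reps) - theta  # median bias, not mean bias
    return theta - bias

# Stand-in biased estimator: the population (1/n) standard deviation,
# which systematically underestimates the spread of small samples.
biased_std = statistics.pstdev

rng = random.Random(42)
sample = [rng.gauss(0.0, 1.0) for _ in range(30)]
print(biased_std(sample), median_bias_corrected(sample, biased_std))
```

Using the median of the replicates rather than their mean is what makes the correction robust to the heavy-tailed replicate distributions that nonlinear estimators often produce, which is the paper's motivation for departing from the traditional mean-bias definition.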