On combining classifiers
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998
Cited by 749 (21 self)
We develop a common theoretical framework for combining classifiers which use distinct pattern representations and show that many existing schemes can be considered as special cases of compound classification where all the pattern representations are used jointly to make a decision. An experimental comparison of various classifier combination schemes demonstrates that the combination rule developed under the most restrictive assumptions, the sum rule, outperforms the other classifier combination schemes. A sensitivity analysis of the various schemes to estimation errors is carried out to show that this finding can be justified theoretically.
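As a rough illustration of the combination rules this abstract compares, the sketch below applies several fixed rules to per-classifier posterior estimates. It assumes equal class priors, so the sum rule reduces to picking the class with the largest summed posterior; the function name `combine_posteriors` and the toy numbers are hypothetical and not taken from the paper.

```python
import numpy as np

def combine_posteriors(posteriors, rule="sum"):
    """Combine per-classifier posterior estimates with a fixed rule.

    posteriors: array-like of shape (n_classifiers, n_classes).
    Assumes equal class priors (an assumption made for this sketch).
    Returns (winning class index, per-class combined scores).
    """
    p = np.asarray(posteriors, dtype=float)
    if rule == "sum":
        scores = p.sum(axis=0)
    elif rule == "product":
        scores = p.prod(axis=0)
    elif rule == "max":
        scores = p.max(axis=0)
    elif rule == "min":
        scores = p.min(axis=0)
    elif rule == "median":
        scores = np.median(p, axis=0)
    else:
        raise ValueError(f"unknown rule: {rule}")
    return int(np.argmax(scores)), scores

# Toy case: two classifiers favour class 0, a third badly underestimates it.
toy = [[0.90, 0.10],
       [0.80, 0.20],
       [0.01, 0.99]]

sum_winner, _ = combine_posteriors(toy, "sum")          # sums: 1.71 vs 1.29
prod_winner, _ = combine_posteriors(toy, "product")     # products: 0.0072 vs 0.0198
```

In this toy case a single near-zero estimate is enough to flip the product rule, while the sum rule keeps the decision of the two agreeing classifiers. That is one way to picture the sensitivity to estimation errors that the abstract mentions.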
An extended set of Haar-like features for rapid object detection
- IEEE ICIP
Cited by 250 (4 self)
Recently Viola et al. [5] have introduced a rapid object detection scheme based on a boosted cascade of simple feature classifiers. In this paper we introduce a novel set of rotated Haar-like features. These novel features significantly enrich the simple features of [5] and can also be calculated efficiently. With these new rotated features our sample face detector shows on average a 10% lower false alarm rate at a given hit rate. We also present a novel post-optimization procedure for a given boosted cascade, improving the false alarm rate on average by a further 12.5%.
Combining Multiple Representations and Classifiers for Handwritten Digit Recognition
- Proceedings of the International Conference on Document Analysis and Recognition, ICDAR, 1997
Cited by 19 (3 self)
We investigate techniques to combine multiple representations of a handwritten digit to increase classification accuracy without significantly increasing system complexity or recognition time. We compare multiexpert and multistage combination techniques and discuss in detail in a comparative manner methods for combining multiple learners: Voting, mixture of experts, stacking, boosting and cascading. In pen-based handwritten character recognition, the input is the dynamic movement of the pentip over the pressure sensitive tablet. There is also the image formed as a result of this movement. On a real-world database, we notice that the two multi-layer perceptron (MLP) neural network-based classifiers using separately these representations make errors on different patterns implying that a suitable combination of the two would lead to higher accuracy. Thus we implement and compare voting, mixture of experts, stacking and cascading. Combined classifiers have an error percentage less than ind...
A Theoretical Analysis of the Limits of Majority Voting Errors for Multiple Classifier Systems
- 2000
Cited by 16 (9 self)
The robustness of combining diverse classifiers by majority voting has recently been illustrated in the pattern recognition literature. Furthermore, negatively correlated classifiers turned out to offer further improvement of the majority voting performance, even compared to the idealised model with independent classifiers. However, negatively correlated classifiers represent a very unlikely situation in real-world classification problems and their benefits usually remain out of reach. Nevertheless, it is theoretically possible to obtain 0% majority voting error using a finite number of classifiers at an error level lower than 50%. We attempt to show that structuring classifiers into relevant multistage organisations can widen this boundary as well as the limits of majority voting error even more. Introducing discrete error distributions for analysis, we show how majority voting errors and their limits depend on parameters of a multiple classifier system with hardened binary outputs (correct/incorrect). Moreover, we investigate the sensitivity of boundary distributions of classifier outputs to small discrepancies modelled by random changes of votes and propose new, more stable patterns of boundary distributions. Finally, we show how organising classifiers into different structures can be used to widen the limits of majority voting errors and how this phenomenon can be effectively exploited.
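To make the two reference points in this abstract concrete, the sketch below computes the majority voting error under the idealised independence model (a standard binomial tail) and then evaluates a hand-built pattern of hardened correct/incorrect outputs in which errors never coincide. The function names and the 3x3 pattern are illustrative assumptions, not the discrete error distributions used in the paper.

```python
from math import comb
import numpy as np

def majority_vote_error(n, p):
    """Majority voting error for n independent classifiers, each with error p.

    n must be odd so that ties cannot occur. The vote is wrong when more
    than half of the classifiers are wrong.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd")
    k_min = n // 2 + 1  # smallest number of wrong votes that loses the vote
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

def majority_error_from_outcomes(correct):
    """Majority voting error from a (n_samples, n_classifiers) 0/1 matrix.

    correct[i, j] == 1 means classifier j is right on sample i.
    """
    correct = np.asarray(correct)
    n_classifiers = correct.shape[1]
    votes_right = correct.sum(axis=1)
    return float(np.mean(2 * votes_right <= n_classifiers))

# Independent model: three classifiers at 30% error.
independent_error = majority_vote_error(3, 0.3)

# Hypothetical negatively correlated pattern: each classifier errs on one
# third of the samples, but never on the same sample as another classifier.
pattern = np.array([[0, 1, 1],
                    [1, 0, 1],
                    [1, 1, 0]])
individual_errors = 1 - pattern.mean(axis=0)
pattern_error = majority_error_from_outcomes(pattern)
```

With the hypothetical pattern every sample still receives two correct votes out of three, so the combined error is zero even though each classifier is wrong a third of the time. The independent model at the same size cannot reach that.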
Cascading Classifiers
- Kybernetika, 1998
Cited by 12 (3 self)
We propose a multistage recognition method built as a cascade of a multi-layer perceptron (MLP) and a k-nearest neighbor (k-NN) classifier. MLP, being a distributed method, generalizes to learn a "rule" and the k-NN, being a local method, learns the localized "exceptions" rejected by the "rule." Because the rule-learner handles a large percentage of the examples using a simple and general rule, only a small subset of the training set is stored as exceptions during training. Similarly during testing, most patterns are handled by the MLP and few are handled by k-NN thus causing only a small increase in memory and computation. A multistage method like cascading is a better approach than multiexpert methods like voting and stacking where all learners are used for all cases; the extra computation and memory for the second learner is unnecessary if we are sufficiently certain that the first one's response is correct. We discuss how such a system can be trained using cross validation. This me...
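The rule-plus-exceptions idea in this abstract can be sketched roughly as follows. This is a minimal sketch under several assumptions: the first stage is any callable returning class posteriors (a toy logistic function stands in for the MLP here), a pattern counts as rejected when its largest posterior falls below a threshold `theta`, and only rejected training patterns are stored for k-NN. All names and numbers are hypothetical, and the cross-validation training procedure mentioned in the abstract is not reproduced.

```python
import numpy as np

def fit_exceptions(rule_posteriors, X_train, y_train, theta):
    """Store only the training patterns that the rule-learner rejects."""
    confidence = np.array([rule_posteriors(x).max() for x in X_train])
    rejected = confidence < theta
    return X_train[rejected], y_train[rejected]

def knn_predict(x, X_ref, y_ref, k=3):
    """Plain k-nearest-neighbour vote with Euclidean distance."""
    distances = np.linalg.norm(X_ref - x, axis=1)
    nearest = np.argsort(distances)[:k]
    labels, counts = np.unique(y_ref[nearest], return_counts=True)
    return int(labels[np.argmax(counts)])

def cascade_predict(x, rule_posteriors, X_exc, y_exc, theta, k=3):
    """Use the rule when it is confident, otherwise fall back to k-NN."""
    p = rule_posteriors(x)
    if p.max() >= theta or len(X_exc) == 0:
        return int(np.argmax(p)), "rule"
    return knn_predict(x, X_exc, y_exc, k), "knn"

def toy_rule(x):
    """Stand-in for a trained MLP: a fixed logistic on the first feature."""
    s = 1.0 / (1.0 + np.exp(-4.0 * x[0]))
    return np.array([1.0 - s, s])

# Toy 1-D data: two confident points and two exceptions near the boundary.
X_train = np.array([[-2.0], [-0.1], [0.1], [2.0]])
y_train = np.array([0, 1, 0, 1])

X_exc, y_exc = fit_exceptions(toy_rule, X_train, y_train, theta=0.9)
near_boundary = cascade_predict(np.array([-0.12]), toy_rule, X_exc, y_exc, theta=0.9, k=1)
far_from_boundary = cascade_predict(np.array([3.0]), toy_rule, X_exc, y_exc, theta=0.9, k=1)
```

Only the two boundary patterns end up in the k-NN store, which mirrors the abstract's point that the second stage adds little memory and computation when the first stage handles most cases.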
Techniques for Combining Multiple Learners
- Proceedings of Engineering of Intelligent Systems, 1998
Cited by 12 (0 self)
Learners based on different paradigms can be combined for improved accuracy. Each learning method assumes a certain model that comes with a set of assumptions which may lead to error if the assumptions do not hold. Learning is an ill-posed problem and with finite data each algorithm converges to a different solution and fails under different circumstances. Our previous experience with statistical and neural classifiers was that classifiers based on these paradigms do generalize differently, fail on different patterns and to a certain extent complement each other and thus we look for ways to combine them for higher accuracy. One way to get complementary classifiers is by using different input representations. The methods we investigate are voting, mixture of experts, stacking and cascading. We do experiments on three real-world applications: optical handwritten digit recognition, pen-based handwritten digit recognition and the estimation of road travel distances which is a regression pr...
Combining multiple representations for pen-based handwritten digit recognition
- ELEKTRIK: Turkish Journal of Electrical Engineering and Computer Sciences, 2001
Cited by 7 (0 self)
We investigate techniques to combine multiple representations of a handwritten digit to increase classification accuracy without significantly increasing system complexity or recognition time. In pen-based recognition, the input is the dynamic movement of the pen tip over the pressure-sensitive tablet. There is also the image formed as a result of this movement. On a real-world database containing more than 11,000 handwritten digits, we notice that the two multi-layer perceptron (MLP) based classifiers using these representations make errors on different patterns, implying that a suitable combination of the two would lead to higher accuracy. We implement and compare voting, mixture of experts, stacking and cascading. Combining the two MLP classifiers, we indeed get higher accuracy because the two classifiers/representations fail on different patterns. We especially advocate a multistage cascading scheme where the second, costlier image-based classifier is employed only in a small percentage of cases.
Theoretical and Experimental Analysis of a Two-Stage System for Classification
- IEEE Trans. on Pattern Analysis and Machine Intelligence
Cited by 6 (0 self)
We consider a popular approach to multicategory classification tasks: a two-stage system based on a first (global) classifier with rejection followed by a (local) nearest-neighbor classifier. Patterns which are not rejected by the first classifier are classified according to its output. Rejected patterns are passed to the nearest-neighbor classifier together with the top-h ranking classes returned by the first classifier. The nearest-neighbor classifier, looking at patterns in the top-h classes, classifies the rejected pattern. An editing strategy for the nearest-neighbor reference database, controlled by the first classifier, is also considered. We analyze this system, showing that even if the first level and nearest-neighbor classifiers are not optimal in a Bayes sense, the system as a whole may be optimal. Moreover, we formally relate the response time of the system to the rejection rate of the first classifier and to the other system parameters. The error-response time trade-off is also discussed. Finally, we experimentally study two instances of the system applied to the recognition of handwritten digits. In one system, the first classifier is a fuzzy basis functions network, while in the second system it is a feed-forward neural network. Classification results as well as response times for different settings of the system parameters are reported for both systems. Index Terms: Multicategory classification, rejection, global and local classification, hierarchical classifier, Bayes classifier.
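The decision flow described in this abstract might be sketched as below. The rejection criterion (top score below a threshold `reject_below`), the fallback when no reference pattern belongs to the top-h classes, and all names and data are assumptions made for illustration. The abstract does not specify the rejection rule, and the editing strategy for the reference database is not shown.

```python
import numpy as np

def two_stage_classify(scores, x, X_ref, y_ref, reject_below, h):
    """Two-stage decision: global classifier with rejection, then local 1-NN.

    scores: per-class outputs of the first classifier for pattern x.
    Returns (predicted class, whether the pattern was rejected).
    """
    scores = np.asarray(scores, dtype=float)
    ranking = np.argsort(scores)[::-1]  # classes from best to worst score
    if scores[ranking[0]] >= reject_below:
        return int(ranking[0]), False   # accepted: trust the first classifier

    # Rejected: search only reference patterns from the top-h ranked classes.
    in_top_h = np.isin(y_ref, ranking[:h])
    if not in_top_h.any():
        return int(ranking[0]), True    # assumed fallback, not from the source
    distances = np.linalg.norm(X_ref[in_top_h] - x, axis=1)
    return int(y_ref[in_top_h][np.argmin(distances)]), True

# Toy reference database with one pattern per class.
X_ref = np.array([[0.0, 0.0], [1.0, 1.0], [0.1, 0.1]])
y_ref = np.array([0, 1, 2])

accepted = two_stage_classify([0.90, 0.05, 0.05], np.array([0.5, 0.5]),
                              X_ref, y_ref, reject_below=0.5, h=2)
restricted = two_stage_classify([0.40, 0.35, 0.25], np.array([0.12, 0.12]),
                                X_ref, y_ref, reject_below=0.5, h=2)
overruled = two_stage_classify([0.40, 0.35, 0.25], np.array([0.9, 0.9]),
                               X_ref, y_ref, reject_below=0.5, h=2)
```

In the `restricted` call the closest reference pattern overall belongs to class 2, but class 2 is outside the top two classes, so the search never considers it. In the `overruled` call the nearest-neighbor stage changes the first classifier's top choice from class 0 to class 1.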
MultiStage Cascading of Multiple Classifiers: One Man's Noise is Another Man's Data
- In Proceedings of the 17th International Conference on Machine Learning (ICML-2000), 2000
Cited by 5 (1 self)
For building implementable and industry-valuable classification solutions, machine learning methods must focus not only on accuracy but also on computational and space complexity. We discuss a multistage method, namely cascading, where there is a sequence of classifiers ordered in terms of increasing complexity and specificity such that early classifiers are simple and general whereas later ones are more complex and specific, being localized on patterns rejected by the previous classifiers. We present the technique and its rationale and validate its use by comparing it with the individual classifiers as well as the widely accepted ensemble methods bagging and AdaBoost on eight data sets from the UCI repository. We do see that cascading increases accuracy without the concomitant increase in complexity and cost.
Sequential Selection Of Discrete Features For Neural Networks - A Bayesian Approach To Building A Cascade
- 1999
Cited by 4 (2 self)
A feature selection procedure is used to successively remove features one-by-one from a statistical classifier by an iterative backward search. Each classifier uses a smaller subset of features than the classifier in the previous iteration. The classifiers are subsequently combined into a cascade. Each classifier in the cascade should classify cases to which a reliable class label can be assigned. Other cases should be propagated to the next classifier, which also uses the value of a new feature. Experiments demonstrate the feasibility of building cascades of classifiers (neural networks for prediction of atrial fibrillation (FA)) using a backward search scheme for feature selection. © 1999 Elsevier Science B.V. All rights reserved.

