Results 1  10
of
57
SupportVector Networks
 Machine Learning
, 1995
"... The supportvector network is a new learning machine for twogroup classification problems. The machine conceptually implements the following idea: input vectors are nonlinearly mapped to a very highdimension feature space. In this feature space a linear decision surface is constructed. Special pr ..."
Abstract

Cited by 2188 (31 self)
 Add to MetaCart
The supportvector network is a new learning machine for twogroup classification problems. The machine conceptually implements the following idea: input vectors are nonlinearly mapped to a very highdimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensures high generalization ability of the learning machine. The idea behind the supportvector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to nonseparable training data.
A Comparison of Methods for Multiclass Support Vector Machines
 IEEE TRANS. NEURAL NETWORKS
, 2002
"... Support vector machines (SVMs) were originally designed for binary classification. How to effectively extend it for multiclass classification is still an ongoing research issue. Several methods have been proposed where typically we construct a multiclass classifier by combining several binary class ..."
Abstract

Cited by 571 (15 self)
 Add to MetaCart
Support vector machines (SVMs) were originally designed for binary classification. How to effectively extend it for multiclass classification is still an ongoing research issue. Several methods have been proposed where typically we construct a multiclass classifier by combining several binary classifiers. Some authors also proposed methods that consider all classes at once. As it is computationally more expensive to solve multiclass problems, comparisons of these methods using largescale problems have not been seriously conducted. Especially for methods solving multiclass SVM in one step, a much larger optimization problem is required so up to now experiments are limited to small data sets. In this paper we give decomposition implementations for two such “alltogether” methods. We then compare their performance with three methods based on binary classifications: “oneagainstall,” “oneagainstone,” and directed acyclic graph SVM (DAGSVM). Our experiments indicate that the “oneagainstone” and DAG methods are more suitable for practical use than the other methods. Results also show that for large problems methods by considering all data at once in general need fewer support vectors.
An introduction to kernelbased learning algorithms
 IEEE TRANSACTIONS ON NEURAL NETWORKS
, 2001
"... This paper provides an introduction to support vector machines (SVMs), kernel Fisher discriminant analysis, and ..."
Abstract

Cited by 377 (48 self)
 Add to MetaCart
This paper provides an introduction to support vector machines (SVMs), kernel Fisher discriminant analysis, and
Shape quantization and recognition with randomized trees
 Neural Computation
, 1997
"... We explore a new approach to shape recognition based on a virtually in nite family of binary features (\queries") of the image data, designed to accommodate prior information about shape invariance and regularity. Each query corresponds to a spatial arrangement ofseveral local topographic code ..."
Abstract

Cited by 183 (17 self)
 Add to MetaCart
We explore a new approach to shape recognition based on a virtually in nite family of binary features (\queries") of the image data, designed to accommodate prior information about shape invariance and regularity. Each query corresponds to a spatial arrangement ofseveral local topographic codes (\tags") which are in themselves too primitive and common to be informative about shape. All the discriminating power derives from relative angles and distances among the tags. The important attributes of the queries are (i) a natural partial ordering corresponding to increasing structure and complexity � (ii) semiinvariance, meaning that most shapes of a given class will answer the same way totwo queries which are successive in the ordering � and (iii) stability, since the queries are not based on distinguished points and substructures. No classi er based on the full feature set can be evaluated and it is impossible to determine a priori which arrangements are informative. Our approach istoselect informative features and build tree classi ers at the same time by inductive learning. In e ect, each tree provides an approximation to the full posterior where the features
Face Recognition: A Convolutional Neural Network Approach
 IEEE Transactions on Neural Networks
, 1997
"... Faces represent complex, multidimensional, meaningful visual stimuli and developing a computational model for face recognition is difficult [43]. We present a hybrid neural network solution which compares favorably with other methods. The system combines local image sampling, a selforganizing map n ..."
Abstract

Cited by 155 (0 self)
 Add to MetaCart
Faces represent complex, multidimensional, meaningful visual stimuli and developing a computational model for face recognition is difficult [43]. We present a hybrid neural network solution which compares favorably with other methods. The system combines local image sampling, a selforganizing map neural network, and a convolutional neural network. The selforganizing map provides a quantization of the image samples into a topological space where inputs that are nearby in the original space are also nearby in the output space, thereby providing dimensionality reduction and invariance to minor changes in the image sample, and the convolutional neural network provides for partial invariance to translation, rotation, scale, and deformation. The convolutional network extracts successively larger features in a hierarchical set of layers. We present results using the KarhunenLoeve transform in place of the selforganizing map, and a multilayer perceptron in place of the convolutional netwo...
Simplified Support Vector Decision Rules
, 1996
"... A Support Vector Machine (SVM) is a universal learning machine whose decision surface is parameterized by a set of support vectors, and by a set of corresponding weights. An SVM is also characterized by a kernel function. Choice of the kernel determines whether the resulting SVM is a polynomial cla ..."
Abstract

Cited by 146 (4 self)
 Add to MetaCart
A Support Vector Machine (SVM) is a universal learning machine whose decision surface is parameterized by a set of support vectors, and by a set of corresponding weights. An SVM is also characterized by a kernel function. Choice of the kernel determines whether the resulting SVM is a polynomial classifier, a twolayer neural network, a radial basis function machine, or some other learning machine. SVMs are currently considerably slower in test phase than other approaches with similar generalization performance. To address this, we present a general method to significantly decrease the complexity of the decision rule obtained using an SVM. The proposed method computes an approximation to the decision rule in terms of a reduced set of vectors. These reduced set vectors are not support vectors and can in some cases be computed analytically. We give experimental results for three pattern recognition problems. The results show that the method can decrease the computational complexity of th...
Fast Kernel Classifiers With Online And Active Learning
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2005
"... Very high dimensional learning systems become theoretically possible when training examples are abundant. The computing cost then becomes the limiting factor. Any efficient learning algorithm should at least take a brief look at each example. But should all examples be given equal attention? This ..."
Abstract

Cited by 104 (17 self)
 Add to MetaCart
Very high dimensional learning systems become theoretically possible when training examples are abundant. The computing cost then becomes the limiting factor. Any efficient learning algorithm should at least take a brief look at each example. But should all examples be given equal attention? This contribution proposes an empirical answer. We first present an online SVM algorithm based on this premise. LASVM yields competitive misclassification rates after a single pass over the training examples, outspeeding stateoftheart SVM solvers. Then we show how active example selection can yield faster training, higher accuracies, and simpler models, using only a fraction of the training example labels.
Joint Induction of Shape Features and Tree Classifiers
 IEEE Trans. PAMI
, 1997
"... We introduce a very large family of binary features for twodimensional shapes. The salient ones for separating particular shapes are determined by inductive learning during the construction of classi cation trees. There is a feature for every possible geometric arrangement of local topographic code ..."
Abstract

Cited by 75 (6 self)
 Add to MetaCart
We introduce a very large family of binary features for twodimensional shapes. The salient ones for separating particular shapes are determined by inductive learning during the construction of classi cation trees. There is a feature for every possible geometric arrangement of local topographic codes. The arrangements express coarse constraints on relative angles and distances among the code locations and are nearly invariant to substantial a ne and nonlinear deformations. They are also partially ordered, which makes it possible to narrow the search for informative ones at each node of the tree. Di erent trees correspond to di erent aspects of shape. They are statistically weakly dependent due to randomization and are aggregated in a simple way. Adapting the algorithm to a shape family is then fully automatic once training samples are provided. As an illustration, we classify handwritten digits from the NIST database � the error rate is:7%.
Convolutional Networks for Images, Speech, and TimeSeries
, 1995
"... INTRODUCTION The ability of multilayer backpropagation networks to learn complex, highdimensional, nonlinear mappings from large collections of examples makes them obvious candidates for image recognition or speech recognition tasks (see PATTERN RECOGNITION AND NEURAL NETWORKS). In the traditional ..."
Abstract

Cited by 74 (3 self)
 Add to MetaCart
INTRODUCTION The ability of multilayer backpropagation networks to learn complex, highdimensional, nonlinear mappings from large collections of examples makes them obvious candidates for image recognition or speech recognition tasks (see PATTERN RECOGNITION AND NEURAL NETWORKS). In the traditional model of pattern recognition, a handdesigned feature extractor gathers relevant information from the input and eliminates irrelevant variabilities. A trainable classifier then categorizes the resulting feature vectors (or strings of symbols) into classes. In this scheme, standard, fullyconnected multilayer networks can be used as classifiers. A potentially more interesting scheme is to eliminate the feature extractor, feeding the network with "raw" inputs (e.g. normalized images), and to rely on backpropagation to turn the first few layers into an appropriate feature extractor. While this can be done with an ordinary fully connected feedforward network with some success for tasks