Results 1 - 9 of 9
Multicategory Classification by Support Vector Machines
 Computational Optimization and Applications
, 1999
"... We examine the problem of how to discriminate between objects of three or more classes. Specifically, we investigate how twoclass discrimination methods can be extended to the multiclass case. We show how the linear programming (LP) approaches based on the work of Mangasarian and quadratic programm ..."
Abstract

Cited by 56 (0 self)
We examine the problem of how to discriminate between objects of three or more classes. Specifically, we investigate how two-class discrimination methods can be extended to the multiclass case. We show how the linear programming (LP) approaches based on the work of Mangasarian and quadratic programming (QP) approaches based on Vapnik's Support Vector Machines (SVM) can be combined to yield two new approaches to the multiclass problem. In LP multiclass discrimination, a single linear program is used to construct a piecewise-linear classification function. In our proposed multiclass SVM method, a single quadratic program is used to construct a piecewise-nonlinear classification function. Each piece of this function can take the form of a polynomial, radial basis function, or even a neural network. For k > 2 class problems, the SVM method as originally proposed required the construction of a two-class SVM to separate each class from the remaining classes. Similarly, k two-class linear programs can be used for the multiclass problem. We performed an empirical study of the original LP method, the proposed k-LP method, the proposed single-QP method and the original k-QP methods. We discuss the advantages and disadvantages of each approach.
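At prediction time, both the one-classifier-per-class schemes and the single-program piecewise classifiers described above assign a point to the class whose scoring function is largest. A minimal Python sketch of that decision rule, with hand-picked affine scorers standing in for trained two-class SVMs or LPs (all weights here are hypothetical):

```python
# Decision rule shared by the k-classifier and single-program approaches:
# assign x to the class whose scoring function is largest. The affine
# scorers below are hypothetical stand-ins for trained models.

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def predict(scorers, x):
    """Index of the class whose affine score w.x + b is largest."""
    scores = [dot(w, x) + b for (w, b) in scorers]
    return max(range(len(scores)), key=scores.__getitem__)

# Three toy classes in the plane, one hand-picked scorer each.
scorers = [((1.0, 0.0), 0.0),
           ((-1.0, 0.0), 0.0),
           ((0.0, 1.0), 0.0)]
```

With a kernel SVM the affine score would be replaced by a kernel expansion, but the argmax decision rule is the same.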
Multicategory Discrimination via Linear Programming
 OPTIMIZATION METHODS AND SOFTWARE
, 1992
"... A single linear program is proposed for discriminating between the elements of k disjoint point sets in the ndimensional real space R n : When the conical hulls of the k sets are (k \Gamma 1)point disjoint in R n+1 , a kpiece piecewiselinear surface generated by the linear program completely ..."
Abstract

Cited by 23 (2 self)
A single linear program is proposed for discriminating between the elements of k disjoint point sets in the n-dimensional real space R^n. When the conical hulls of the k sets are (k − 1)-point disjoint in R^{n+1}, a k-piece piecewise-linear surface generated by the linear program completely separates the k sets. This improves on a previous linear programming approach which required that each set be linearly separable from the remaining k − 1 sets. When the conical hulls of the k sets are not (k − 1)-point disjoint, the proposed linear program generates an error-minimizing piecewise-linear separator for the k sets. For this case it is shown that the null solution is never a unique solver of the linear program and occurs only under the rather rare condition that the mean of each point set equals the mean of the means of the other k − 1 sets. This makes the proposed linear programming formulation useful for approximately discriminating between k sets...
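The LP's feasibility conditions can be written as one inequality per (point, wrong-class) pair: a point x in class i should satisfy w_i·x − γ_i ≥ w_j·x − γ_j + 1 for every j ≠ i. A small sketch, with hypothetical planes and toy 1-D data, that counts how many of these inequalities a candidate piecewise-linear separator violates:

```python
# Sketch of the inequality system behind the multicategory LP. The planes
# below are hypothetical; the LP would choose them to minimize violations.

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def violations(points_by_class, planes):
    """Count violated inequalities w_i.x - g_i >= w_j.x - g_j + 1."""
    count = 0
    for i, points in enumerate(points_by_class):
        wi, gi = planes[i]
        for x in points:
            si = dot(wi, x) - gi
            for j, (wj, gj) in enumerate(planes):
                if j != i and si < dot(wj, x) - gj + 1.0:
                    count += 1
    return count

# Two toy 1-D classes and a separating pair of planes.
points = [[(-2.0,), (-1.0,)], [(1.0,), (2.0,)]]
planes = [((-1.0,), 0.0), ((1.0,), 0.0)]
```

When all inequalities hold, the max-of-affine classifier built from the planes separates the k sets exactly.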
An ordering algorithm for pattern presentation in Fuzzy ARTMAP that tends to improve generalization performance
 IEEE Transactions on Neural Networks
, 1999
"... Abstract — In this paper we introduce a procedure, based on the max–min clustering method, that identifies a fixed order of training pattern presentation for fuzzy adaptive resonance theory mapping (ARTMAP). This procedure is referred to as the ordering algorithm, and the combination of this procedu ..."
Abstract

Cited by 13 (0 self)
Abstract — In this paper we introduce a procedure, based on the max–min clustering method, that identifies a fixed order of training pattern presentation for fuzzy adaptive resonance theory mapping (ARTMAP). This procedure is referred to as the ordering algorithm, and the combination of this procedure with fuzzy ARTMAP is referred to as ordered fuzzy ARTMAP. Experimental results demonstrate that ordered fuzzy ARTMAP exhibits a generalization performance that is better than the average generalization performance of fuzzy ARTMAP, and in certain cases as good as, or better than, the best fuzzy ARTMAP generalization performance. We also calculate the number of operations required by the ordering algorithm and compare it to the number of operations required by the training phase of fuzzy ARTMAP. We show that, under mild assumptions, the number of operations required by the ordering algorithm is a fraction of the number of operations required by fuzzy ARTMAP. Index Terms — Fuzzy ARTMAP, generalization, learning, max–min clustering.
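A plausible reading of the max–min step is a greedy ordering that repeatedly selects the pattern farthest, in minimum-distance terms, from the patterns already chosen. A sketch under that assumption (the paper's exact initialization and distance measure may differ):

```python
# Greedy max-min ordering (assumed reading of the ordering algorithm):
# start from a chosen first pattern, then repeatedly pick the pattern whose
# minimum squared distance to the already-ordered patterns is largest.

def sqdist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def maxmin_order(patterns, first=0):
    order = [first]
    remaining = set(range(len(patterns))) - {first}
    while remaining:
        nxt = max(remaining,
                  key=lambda i: min(sqdist(patterns[i], patterns[j])
                                    for j in order))
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Two tight pairs far apart: the ordering visits both regions early.
patterns = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (10.5, 0.0)]
```

Presenting patterns in this order spreads the early training examples across the input space, which is the intuition behind the reported generalization gains.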
Serial and Parallel Multicategory Discrimination
 SIAM Journal on Optimization
, 1994
"... A parallel algorithm is proposed for a fundamental problem of machine learning, that of multicategory discrimination. The algorithm is based on minimizing an error function associated with a set of highly structured linear inequalities. These inequalities characterize piecewiselinear separation of k ..."
Abstract

Cited by 8 (4 self)
A parallel algorithm is proposed for a fundamental problem of machine learning, that of multicategory discrimination. The algorithm is based on minimizing an error function associated with a set of highly structured linear inequalities. These inequalities characterize piecewise-linear separation of k sets by the maximum of k affine functions. The error function has a Lipschitz continuous gradient that allows the use of fast serial and parallel unconstrained minimization algorithms. A serial quasi-Newton algorithm is considerably faster than previous linear programming formulations. A parallel gradient distribution algorithm is used to parallelize the error-minimization problem. Preliminary computational results are given for both a DECstation 5000/125 and a Thinking Machines Corporation CM-5 multiprocessor.

1 Introduction

We consider a fundamental problem of machine learning and pattern recognition, that of discriminating between k sets. Given k disjoint sets A_i, i = 1, ..., k, ...
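A squared-hinge penalty on the separation inequalities gives one example of an error function with a Lipschitz continuous gradient, amenable to the unconstrained minimization the abstract describes. A toy sketch (the paper's exact error function may differ; data and planes are hypothetical):

```python
# Smooth (squared-hinge) error for the inequalities
# w_i.x - g_i >= w_j.x - g_j + 1. Each term is differentiable with a
# Lipschitz continuous gradient, so gradient or quasi-Newton methods apply.

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def smooth_error(points_by_class, planes):
    """Sum of squared hinge violations over all (point, wrong class) pairs."""
    err = 0.0
    for i, points in enumerate(points_by_class):
        wi, gi = planes[i]
        for x in points:
            si = dot(wi, x) - gi
            for j, (wj, gj) in enumerate(planes):
                if j != i:
                    v = max(0.0, dot(wj, x) - gj + 1.0 - si)
                    err += v * v
    return err

points = [[(-2.0,), (-1.0,)], [(1.0,), (2.0,)]]
good = [((-1.0,), 0.0), ((1.0,), 0.0)]
```

Because the sum decomposes over points, the gradient can be accumulated in parallel across data partitions, which is the structure a gradient distribution algorithm exploits.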
The Performance Of Statistical Pattern Recognition Methods In High Dimensional Settings
 IEEE Signal Processing Workshop on Higher Order Statistics, Caesarea
, 1994
"... We report on an extensive simulation study comparing eight statistical classification methods, focusing on problems where the number of observations is less than the number of variables. Using a wide range of artificial and real data, two types of classifiers were contrasted; methods that classify u ..."
Abstract

Cited by 2 (0 self)
We report on an extensive simulation study comparing eight statistical classification methods, focusing on problems where the number of observations is less than the number of variables. Using a wide range of artificial and real data, two types of classifiers were contrasted: methods that classify using all variables, and methods that first reduce the number of dimensions to two or three. The full-feature-space methods include linear, quadratic and regularized discriminant analysis, and the nearest neighbour method. The four dimensionality-reducing classifiers are characterized by the transform they implement. The four transforms compared are the Fisher discriminant plane, the Fisher-Fukunaga-Koontz, the Fisher-radius, and the Fisher-variance transforms. The Fisher-Fukunaga-Koontz and the Fisher-radius transform based classifiers have recently been proposed for two-class classification problems. We also present an extension to these transforms such that they can be applied to classification pro...
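The dimensionality-reducing classifiers share a common shape: project onto a small precomputed basis, then classify in the reduced space. A sketch with a hypothetical 2 × n projection matrix standing in for, e.g., the Fisher discriminant plane, followed by nearest-neighbour classification:

```python
# Project-then-classify sketch: P is a hypothetical 2 x n basis standing in
# for a computed discriminant plane; classification is 1-nearest-neighbour
# in the 2-D projected space.

def project(P, x):
    return tuple(sum(p * xi for p, xi in zip(row, x)) for row in P)

def nn_classify(train, labels, P, x):
    """Label of the training point nearest to x in the projected space."""
    z = project(P, x)
    d = [sum((a - b) ** 2 for a, b in zip(project(P, t), z)) for t in train]
    return labels[min(range(len(d)), key=d.__getitem__)]

# Toy basis keeping the first two coordinates; the third is discarded,
# illustrating how the reduced-space classifier ignores dropped directions.
P = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
train = [(0.0, 0.0, 9.0), (1.0, 1.0, -9.0)]
labels = ["a", "b"]
```

In the study, only the choice of P differs between the four transform-based classifiers.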
Exploratory observation machine (XOM) with Kullback-Leibler divergence for dimensionality reduction and visualization
 European Symposium on Artificial Neural Networks (ESANN 2010)
, 2010
"... Abstract. We present an extension of the Exploratory Observation Machine (XOM) for structurepreserving dimensionality reduction. Based on minimizing the KullbackLeibler divergence of neighborhood functions in data and image spaces, this Neighbor Embedding XOM (NEXOM) creates a link between fast s ..."
Abstract

Cited by 1 (1 self)
Abstract. We present an extension of the Exploratory Observation Machine (XOM) for structure-preserving dimensionality reduction. Based on minimizing the Kullback-Leibler divergence of neighborhood functions in data and image spaces, this Neighbor Embedding XOM (NE-XOM) creates a link between fast sequential online learning known from topology-preserving mappings and principled direct divergence optimization approaches. We quantitatively evaluate our method on real-world data using multiple embedding quality measures. In this comparison, NE-XOM provides a competitive trade-off between high embedding quality and low computational expense, which motivates its further use in real-world settings throughout science and engineering.
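The core quantity is the Kullback-Leibler divergence between neighborhood distributions in data space and image space. A minimal sketch, using a simple Gaussian-style kernel to turn distances into a neighborhood distribution (kernel choice and bandwidth are assumptions, not the paper's exact definitions):

```python
from math import exp, log

def neighborhood(dists):
    """Distances -> probability distribution via a Gaussian-style kernel."""
    w = [exp(-d) for d in dists]
    s = sum(w)
    return [wi / s for wi in w]

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Matching neighborhoods give zero divergence; mismatched ones, positive.
p = neighborhood([0.0, 1.0, 4.0])
q = neighborhood([4.0, 1.0, 0.0])
```

An embedding method of this family moves image-space points so that their neighborhood distribution approaches the data-space one, driving the divergence toward zero.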
From Approximative to Descriptive Models: A Realistic Case Study
, 2000
"... This paper presents the results of an application study on an effective and efficient technique which translates rules that use approximative sets to rules that use descriptive sets and linguistic hedges of predefined meaning. The translated descriptive rules are established to be functionally equiv ..."
Abstract
This paper presents the results of an application study on an effective and efficient technique which translates rules that use approximative sets to rules that use descriptive sets and linguistic hedges of predefined meaning. The translated descriptive rules are established to be functionally equivalent to the original approximative ones, or the closest equivalence possible, while reflecting their underlying semantics. It is shown that descriptive models can be obtained by taking advantage of existing approaches to approximative modelling that are efficient and accurate.
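Linguistic hedges of predefined meaning are commonly modeled as operators on membership functions, e.g. Zadeh's concentration ("very") and dilation ("more or less") hedges. A sketch with a hypothetical triangular fuzzy set (the paper's exact hedge definitions may differ):

```python
# Classic fuzzy hedges applied to a toy triangular membership function.

def triangular(a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu

def very(mu):
    """Concentration hedge: 'very A' squares the membership of A."""
    return lambda x: mu(x) ** 2

def more_or_less(mu):
    """Dilation hedge: 'more or less A' takes the square root."""
    return lambda x: mu(x) ** 0.5

mu = triangular(0.0, 1.0, 2.0)
```

Composing a small vocabulary of such hedges with descriptive sets lets a translated rule approximate an arbitrary approximative set while keeping a readable linguistic form.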
MKNN: Modified K-Nearest Neighbor
"... Abstract — In this paper, a new classification method for enhancing the performance of KNearest Neighbor is proposed which uses robust neighbors in training data. This new classification method is called Modified KNearest Neighbor, MKNN. Inspired the traditional KNN algorithm, the main idea is cla ..."
Abstract
Abstract — In this paper, a new classification method for enhancing the performance of K-Nearest Neighbor is proposed which uses robust neighbors in the training data. This new classification method is called Modified K-Nearest Neighbor (MKNN). Inspired by the traditional KNN algorithm, the main idea is to classify the test samples according to the tags of their neighbors. This method is a kind of weighted KNN in which the weights are determined by a different procedure: for each training sample, the procedure computes the fraction of same-labeled neighbors out of its total number of neighbors. The proposed method is evaluated on five different data sets. Experiments show an excellent improvement in accuracy in comparison with the KNN method.
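The per-sample weight described above, the fraction of a training sample's neighbors that share its label, can be sketched as follows (brute-force neighbor search on toy 1-D data; the paper's exact weighting of the final vote is not reproduced here):

```python
# Validity of each training sample: the fraction of its k nearest *other*
# training samples that share its label.

def knn_indices(train, x, m):
    d = [sum((a - b) ** 2 for a, b in zip(t, x)) for t in train]
    return sorted(range(len(train)), key=d.__getitem__)[:m]

def validity(train, labels, k):
    vals = []
    for i, x in enumerate(train):
        # Ask for k+1 neighbors so the sample itself can be excluded.
        neighbors = [j for j in knn_indices(train, x, k + 1) if j != i][:k]
        vals.append(sum(labels[j] == labels[i] for j in neighbors) / k)
    return vals

train = [(0.0,), (0.1,), (5.0,), (5.1,)]
```

Samples deep inside their class get validity near 1 and dominate the weighted vote; samples in noisy border regions get validity near 0 and are largely ignored.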
Adaptive Local Dissimilarity Measures for Discriminative Dimension Reduction of Labeled Data
"... Due to the tremendous increase of electronic information with respect to the size of data sets as well as their dimension, dimension reduction and visualization of highdimensional data has become one of the key problems of data mining. Since embedding in lower dimensions necessarily includes a loss ..."
Abstract
Due to the tremendous increase of electronic information, with respect both to the size of data sets and to their dimension, dimension reduction and visualization of high-dimensional data has become one of the key problems of data mining. Since embedding in lower dimensions necessarily entails a loss of information, methods that explicitly control the information kept by a specific dimension reduction technique are highly desirable. The incorporation of supervised class information constitutes an important specific case: the aim is to preserve, and potentially enhance, the discrimination of classes in lower dimensions. In this contribution we use an extension of prototype-based local distance learning, which results in a nonlinear discriminative dissimilarity measure for a given labeled data manifold. The learned local distance measure can be used as the basis for other unsupervised dimension reduction techniques that take neighborhood information into account. We show the combination of different dimension reduction techniques with a discriminative similarity measure learned by an extension of Learning Vector Quantization (LVQ), and their behavior with different parameter settings. The methods are introduced and discussed in terms of artificial and real-world data sets.
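A simple instance of a learned discriminative dissimilarity is a relevance-weighted distance, as adapted by relevance-LVQ variants: dimensions irrelevant to the class structure receive weight near zero and drop out of the metric. A sketch with a hypothetical relevance vector (training would learn it from the labeled data):

```python
# Relevance-weighted squared distance: a simple discriminative metric of
# the kind adapted by relevance-LVQ variants. The relevance vector lam is
# hypothetical here, not a learned one.

def relevance_distance(lam, x, y):
    return sum(l * (a - b) ** 2 for l, a, b in zip(lam, x, y))

# With lam = (1, 0), the second (irrelevant) dimension drops out entirely,
# so two points far apart along it can still be close under the metric.
```

Feeding such a learned dissimilarity into a neighborhood-based embedding makes the resulting low-dimensional visualization emphasize class-discriminative structure.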