Results 1-10 of 33
Multicategory Classification by Support Vector Machines
Computational Optimization and Applications, 1999
Cited by 56 (0 self)
Abstract
We examine the problem of how to discriminate between objects of three or more classes. Specifically, we investigate how two-class discrimination methods can be extended to the multiclass case. We show how the linear programming (LP) approaches based on the work of Mangasarian and the quadratic programming (QP) approaches based on Vapnik's Support Vector Machines (SVM) can be combined to yield two new approaches to the multiclass problem. In LP multiclass discrimination, a single linear program is used to construct a piecewise-linear classification function. In our proposed multiclass SVM method, a single quadratic program is used to construct a piecewise-nonlinear classification function. Each piece of this function can take the form of a polynomial, a radial basis function, or even a neural network. For k > 2 class problems, the SVM method as originally proposed required the construction of a two-class SVM to separate each class from the remaining classes. Similarly, k two-class linear programs can be used for the multiclass problem. We performed an empirical study of the original LP method, the proposed k-LP method, the proposed single-QP method, and the original k-QP methods. We discuss the advantages and disadvantages of each approach.
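The one-per-class scheme the abstract mentions (k two-class problems, each separating one class from the rest) can be sketched with plain perceptrons standing in for the paper's LP/QP solvers; the data, names, and training parameters below are illustrative assumptions, not the authors' method:

```python
# One-vs-rest sketch: train one two-class perceptron per class, then label a
# point by the largest linear score w.x + b.

def train_perceptron(X, y, epochs=500, lr=0.1):
    """Two-class perceptron; y entries are +1 or -1."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, t in zip(X, y):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if t * score <= 0:                       # wrong side: update
                w = [wi + lr * t * xi for wi, xi in zip(w, x)]
                b += lr * t
                mistakes += 1
        if mistakes == 0:                            # converged
            break
    return w, b

def one_vs_rest(X, labels, classes):
    """One two-class problem per class: class c versus all the others."""
    return {c: train_perceptron(X, [1 if l == c else -1 for l in labels])
            for c in classes}

def predict(models, x):
    """Assign the class whose separating plane gives the largest score."""
    def score(c):
        w, b = models[c]
        return sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(models, key=score)

# Three well-separated toy clusters in the plane.
X = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1)]
labels = [0, 0, 1, 1, 2, 2]
models = one_vs_rest(X, labels, classes=[0, 1, 2])
print([predict(models, x) for x in X])  # -> [0, 0, 1, 1, 2, 2]
```

With the perceptrons swapped for the paper's single QP or single LP, all k pieces of the classifier would instead be found jointly rather than one at a time.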
Mathematical Programming for Data Mining: Formulations and Challenges
INFORMS Journal on Computing, 1998
Cited by 47 (0 self)
Abstract
This paper is intended to serve as an overview of a rapidly emerging research and applications area. In addition to providing a general overview and motivating the importance of data mining problems within the area of knowledge discovery in databases, our aim is to list some of the pressing research challenges and outline opportunities for contributions by the optimization research communities. Towards these goals, we include formulations of the basic categories of data mining methods as optimization problems. We also provide examples of successful mathematical programming approaches to some data mining problems. Keywords: data analysis, data mining, mathematical programming methods, challenges for massive data sets, classification, clustering, prediction, optimization. To appear: INFORMS Journal on Computing, special issue on Data Mining, A. Basu and B. Golden (guest editors). Also appears as Mathematical Programming Technical Report 9801, Computer Sciences Department, University of Wi...
Constructive Neural Network Learning Algorithms for Pattern Classification
2000
Cited by 45 (14 self)
Abstract
Constructive learning algorithms offer an attractive approach for the incremental construction of near-minimal neural-network architectures for pattern classification. They help overcome the need for ad hoc and often inappropriate choices of network topology in algorithms that search for suitable weights in a priori fixed network architectures. Several such algorithms have been proposed in the literature and shown to converge to zero classification errors (under certain assumptions) on tasks that involve learning a binary-to-binary mapping (i.e., classification problems involving binary-valued input attributes and two output categories). We present two constructive learning algorithms, MPyramid-real and MTiling-real, that extend the pyramid and tiling algorithms, respectively, for learning real-to-M-ary mappings (i.e., classification problems involving real-valued input attributes and multiple output classes). We prove the convergence of these algorithms and empirically demonstrate their applicability to practical pattern classification problems. Additionally, we show how the incorporation of a local pruning step can eliminate several redundant neurons from MTiling-real networks.
Misclassification Minimization
Journal of Global Optimization, 1994
Cited by 40 (13 self)
Abstract
The problem of minimizing the number of points misclassified by a plane, attempting to separate two point sets with intersecting convex hulls in n-dimensional real space, is formulated as a linear program with equilibrium constraints (LPEC). This general LPEC can be converted to an exact penalty problem with a quadratic objective and linear constraints. A Frank-Wolfe-type algorithm is proposed for the penalty problem that terminates at a stationary point or a global solution. Novel aspects of the approach include: (i) a linear complementarity formulation of the step function that "counts" misclassifications; (ii) an exact penalty formulation without boundedness, nondegeneracy, or constraint qualification assumptions; (iii) an exact solution extraction from the sequence of minimizers of the penalty function for a finite value of the penalty parameter for the general LPEC, and an explicitly exact solution for the LPEC with uncoupled constraints; and (iv) a parametric quadratic programming form...
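The step function that "counts" misclassifications can be evaluated directly for any fixed plane; the data, plane, and threshold sweep below are made-up toy choices illustrating the objective, not the paper's LPEC machinery:

```python
# Misclassification-counting objective for a plane w.x = gamma: points of A
# should lie strictly on the positive side, points of B strictly on the
# negative side. The count is a step function of (w, gamma), which is why the
# paper needs an LPEC/penalty reformulation rather than smooth optimization.

def count_misclassified(w, gamma, A, B):
    dot = lambda x: sum(wi * xi for wi, xi in zip(w, x))
    errors = sum(1 for a in A if dot(a) - gamma <= 0)   # A on wrong side
    errors += sum(1 for b in B if dot(b) - gamma >= 0)  # B on wrong side
    return errors

# Two 1-D sets whose convex hulls intersect: no plane separates them exactly.
A = [(4,), (7,), (8,)]          # should satisfy x > gamma
B = [(1,), (2,), (6,)]          # should satisfy x < gamma
w = (1,)
best = min((count_misclassified(w, g, A, B), g)
           for g in (p[0] + 0.5 for p in A + B))
print(best)  # -> (1, 2.5): at best one point remains misclassified
```

In one dimension a sweep over thresholds is exhaustive; in n dimensions the plane itself must be optimized, which is the NP-hard problem the LPEC formulation addresses.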
Mathematical Programming in Neural Networks
ORSA Journal on Computing, 1993
Cited by 40 (13 self)
Abstract
This paper highlights the role of mathematical programming, particularly linear programming, in training neural networks. A neural network description is given in terms of separating planes in the input space that suggests the use of linear programming for determining these planes. A more standard description in terms of a mean square error in the output space is also given, which leads to the use of unconstrained minimization techniques for training a neural network. The linear programming approach is demonstrated by a brief description of a system for breast cancer diagnosis that has been in use for the last four years at a major medical facility.

1 What is a Neural Network?

A neural network is a representation of a map between an input space and an output space. A principal aim of such a map is to discriminate between the elements of a finite number of disjoint sets in the input space. Typically one wishes to discriminate between the elements of two disjoint point sets in the n-dim...
Bilinear Separation of Two Sets in n-Space
Computational Optimization and Applications, 1993
Cited by 35 (17 self)
Abstract
The NP-complete problem of determining whether two disjoint point sets in the n-dimensional real space R^n can be separated by two planes is cast as a bilinear program, that is, minimizing the scalar product of two linear functions on a polyhedral set. The bilinear program, which has a vertex solution, is processed by an iterative linear programming algorithm that terminates in a finite number of steps at a point satisfying a necessary optimality condition or at a global minimum. Encouraging computational experience on a number of test problems is reported.
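The "iterative linear programming" step has a simple flavor when the bilinear objective splits over two variable blocks with separate constraints; the one-dimensional boxes below are a hypothetical illustration of that alternation, not the paper's separation model:

```python
# With one block fixed, a bilinear objective is linear in the other block, so
# each subproblem is an LP whose optimum sits at a vertex (here, an interval
# endpoint). Alternating can stall at a stationary point short of the global
# minimum, matching the termination guarantee quoted in the abstract.

def best_endpoint(coef, lo, hi):
    """Minimize coef * t over t in [lo, hi]: an endpoint is optimal."""
    return lo if coef >= 0 else hi

def alternate(x, x_box, y_box, iters=20):
    """Minimize x * y by alternating exact one-block minimizations."""
    for _ in range(iters):
        y = best_endpoint(x, *y_box)   # LP in y with x fixed
        x = best_endpoint(y, *x_box)   # LP in x with y fixed
    return x, y

# Minimize x * y over x in [-1, 2], y in [-1, 3].
print(alternate(1.0, (-1, 2), (-1, 3)))   # -> (2, -1), value -2 (stationary)
print(alternate(-1.0, (-1, 2), (-1, 3)))  # -> (-1, 3), value -3 (global min)
```

The two runs show the dependence on the starting point: both terminate finitely at points satisfying the necessary optimality condition, but only the second reaches the global minimum.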
Mathematical Programming in Data Mining
Data Mining and Knowledge Discovery, 1996
Cited by 26 (3 self)
Abstract
Mathematical programming approaches to three fundamental problems will be described: feature selection, clustering, and robust representation. The feature selection problem considered is that of discriminating between two sets while recognizing irrelevant and redundant features and suppressing them. This creates a lean model that often generalizes better to new unseen data. Computational results on real data confirm improved generalization of leaner models. Clustering is exemplified by the unsupervised learning of patterns and clusters that may exist in a given database and is a useful tool for knowledge discovery in databases (KDD). A mathematical programming formulation of this problem is proposed that is theoretically justifiable and computationally implementable in a finite number of steps. A resulting k-Median Algorithm is utilized to discover very useful survival curves for breast cancer patients from a medical database. Robust representation is concerned with minimizing trained m...
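A common way to realize a k-Median iteration (an assumption about the general scheme; the paper's exact finite formulation may differ) is to alternate 1-norm assignments with coordinate-wise medians, as sketched on hypothetical data below:

```python
# Alternating k-Median sketch: assign each point to its nearest center in the
# 1-norm, then move each center to the coordinate-wise median of its cluster.
# Neither step can increase the total 1-norm cost, so it settles quickly on
# toy data.
import statistics

def l1(a, b):
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

def k_median(points, centers, iters=10):
    dims = range(len(points[0]))
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:                         # assignment step
            i = min(range(len(centers)), key=lambda j: l1(p, centers[j]))
            clusters[i].append(p)
        updated = []
        for i, cluster in enumerate(clusters):   # update step
            if cluster:
                updated.append(tuple(statistics.median(p[d] for p in cluster)
                                     for d in dims))
            else:
                updated.append(centers[i])       # keep an empty cluster's center
        centers = updated
    return centers

points = [(0, 0), (1, 0), (0, 1), (9, 9), (10, 9), (9, 10)]
print(sorted(k_median(points, centers=[(0, 0), (5, 5)])))  # -> [(0, 0), (9, 9)]
```

Using the 1-norm and medians (rather than the 2-norm and means of k-Means) is what makes each subproblem a linear program and the overall procedure finite.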
Extracting Rules From Pruned Neural Networks for Breast Cancer Diagnosis
Artificial Intelligence in Medicine, 1996
Cited by 25 (3 self)
Abstract
A new algorithm for neural network pruning is presented. Using this algorithm, networks with a small number of connections and high accuracy rates for breast cancer diagnosis are obtained. We then describe how rules can be extracted from a pruned network by considering only a finite number of hidden unit activation values. The accuracy of the extracted rules is as high as the accuracy of the pruned network. For the breast cancer diagnosis problem, the concise rules extracted from the network achieve an accuracy rate of more than 95% on both the training data set and the test data set. Keywords: neural network pruning; penalty function; rule extraction; breast cancer diagnosis.

1 Introduction

Neural network techniques have recently been applied to many medical diagnostic problems [1, 2, 4, 5, 11, 22]. Although the predictive accuracy of neural networks is often higher than that of other methods or human experts, it is generally difficult to understand how the network arrives a...
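The "finite number of hidden unit activation values" idea can be sketched on a tiny hand-built network; the weights, thresholds, and two-level discretization below are illustrative assumptions, not the paper's pruned networks:

```python
# Rule-extraction sketch: restrict each hidden activation to a small set of
# discrete levels, then tabulate which discrete hidden patterns drive the
# output unit. The table *is* the rule set; here it reads "output 1 iff at
# least one hidden unit is active".
import itertools
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# A tiny fixed network standing in for a pruned one: 2 hidden units, 1 output.
W_OUT, B_OUT = (4.0, 4.0), -2.0

def output(hidden):
    return sigmoid(sum(w * h for w, h in zip(W_OUT, hidden)) + B_OUT)

# Enumerate the finite discrete hidden patterns (levels 0 and 1) and read off
# a rule for each one.
rules = {h: int(output(h) >= 0.5) for h in itertools.product((0, 1), repeat=2)}
print(rules)  # -> {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}
```

Because the enumeration is over discrete hidden patterns rather than raw inputs, the extracted table matches the network exactly wherever the discretization is faithful, which is why the rules can be as accurate as the pruned network itself.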
Multicategory Discrimination via Linear Programming
Optimization Methods and Software, 1992
Cited by 23 (2 self)
Abstract
A single linear program is proposed for discriminating between the elements of k disjoint point sets in the n-dimensional real space R^n. When the conical hulls of the k sets are (k-1)-point disjoint in R^{n+1}, a k-piece piecewise-linear surface generated by the linear program completely separates the k sets. This improves on a previous linear programming approach which required that each set be linearly separable from the remaining k-1 sets. When the conical hulls of the k sets are not (k-1)-point disjoint, the proposed linear program generates an error-minimizing piecewise-linear separator for the k sets. For this case it is shown that the null solution is never a unique solver of the linear program and occurs only under the rather rare condition that the mean of each point set equals the mean of the means of the other k-1 sets. This makes the proposed linear programming formulation useful for approximately discriminating between k sets...
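The classifier such a single linear program produces has a simple form: one linear score per class, with the k-piece piecewise-linear surface falling where the top scores tie. The weights below are hand-picked for illustration, not an LP solution:

```python
# Argmax-of-linear-scores classifier: class c scores w_c . x + b_c, and a
# point takes the class with the largest score. Ties between the top two
# scores trace out the piecewise-linear separating surface.

def classify(x, W, b):
    scores = [sum(wi * xi for wi, xi in zip(w, x)) + bc
              for w, bc in zip(W, b)]
    return scores.index(max(scores))

# Three classes laid out along the x-axis (hand-picked weights).
W = [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]
b = [0.0, 2.0, -5.0]
print([classify(x, W, b) for x in [(-4, 0), (0, 1), (8, 2)]])  # -> [0, 1, 2]
```

The single LP finds all k (w_c, b_c) pairs jointly, which is what distinguishes it from solving k separate one-vs-rest problems.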
A FuzzyGenetic Approach to Breast Cancer Diagnosis
1999
Cited by 23 (7 self)
Abstract
The automatic diagnosis of breast cancer is an important, real-world medical problem. In this paper we focus on the Wisconsin breast cancer diagnosis (WBCD) problem, combining two methodologies, fuzzy systems and evolutionary algorithms, so as to automatically produce diagnostic systems. We find that our fuzzy-genetic approach produces systems exhibiting two prime characteristics: first, they attain high classification performance (the best shown to date), with the possibility of attributing a confidence measure to the output diagnosis; second, the resulting systems involve a few simple rules and are therefore (human) interpretable. Keywords: fuzzy systems; genetic algorithms; breast cancer diagnosis.