Results 1–10 of 66
A training algorithm for optimal margin classifiers
 PROCEEDINGS OF THE 5TH ANNUAL ACM WORKSHOP ON COMPUTATIONAL LEARNING THEORY
, 1992
Abstract

Cited by 1277 (43 self)
A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented. The technique is applicable to a wide variety of classification functions, including Perceptrons, polynomials, and Radial Basis Functions. The effective number of parameters is adjusted automatically to match the complexity of the problem. The solution is expressed as a linear combination of supporting patterns. These are the subset of training patterns that are closest to the decision boundary. Bounds on the generalization performance based on the leave-one-out method and the VC-dimension are given. Experimental results on optical character recognition problems demonstrate the good generalization obtained when compared with other learning algorithms.
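As an illustration of the margin notion in this abstract (a minimal sketch on hypothetical toy data, using a plain perceptron rather than the paper's maximal-margin training algorithm), one can compute the geometric margin of each training pattern and pick out the closest ones, which play the role of the supporting patterns:

```python
import numpy as np

# Toy linearly separable data in 2-D (hypothetical example, not from the paper).
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([1, 1, -1, -1])

# Train an ordinary perceptron, one of the classifier families the paper covers.
w, b = np.zeros(2), 0.0
for _ in range(100):
    for xi, yi in zip(X, y):
        if yi * (w @ xi + b) <= 0:
            w += yi * xi
            b += yi

# Geometric margin of each training pattern w.r.t. the learned boundary;
# the "supporting patterns" are those attaining the minimum margin.
margins = y * (X @ w + b) / np.linalg.norm(w)
support = X[np.isclose(margins, margins.min())]
```

Note that the perceptron only finds some separating boundary; the paper's algorithm additionally maximizes the minimum margin computed above.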
Connectionist Learning Procedures
 ARTIFICIAL INTELLIGENCE
, 1989
Abstract

Cited by 338 (6 self)
A major goal of research on networks of neuron-like processing units is to discover efficient learning procedures that allow these networks to construct complex internal representations of their environment. The learning procedures must be capable of modifying the connection strengths in such a way that internal units which are not part of the input or output come to represent important features of the task domain. Several interesting gradient-descent procedures have recently been discovered. Each connection computes the derivative, with respect to the connection strength, of a global measure of the error in the performance of the network. The strength is then adjusted in the direction that decreases the error. These relatively simple gradient-descent learning procedures work well for small tasks, and the new challenge is to find ways of improving their convergence rate and their generalization abilities so that they can be applied to larger, more realistic tasks.
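A minimal sketch of the gradient-descent procedure described above, assuming a single sigmoid unit and a squared-error measure (a toy setup, not taken from the paper): each connection strength is adjusted along the negative derivative of the global error.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy task: learn Boolean OR with one sigmoid unit (hypothetical example).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0.0, 1.0, 1.0, 1.0])

rng = np.random.default_rng(0)
w, b = rng.normal(size=2), 0.0
lr = 1.0
for _ in range(5000):
    y = sigmoid(X @ w + b)
    err = y - t                  # derivative of squared error w.r.t. output
    grad_z = err * y * (1 - y)   # chain rule through the sigmoid
    w -= lr * X.T @ grad_z       # adjust each connection strength downhill
    b -= lr * grad_z.sum()

preds = np.round(sigmoid(X @ w + b))
```

The same derivative-then-adjust loop, with the chain rule extended through hidden units, is what backpropagation-style procedures apply to multi-layer networks.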
Mathematical Programming in Neural Networks
 ORSA Journal on Computing
, 1993
Abstract

Cited by 40 (13 self)
This paper highlights the role of mathematical programming, particularly linear programming, in training neural networks. A neural network description is given in terms of separating planes in the input space that suggests the use of linear programming for determining these planes. A more standard description in terms of a mean square error in the output space is also given, which leads to the use of unconstrained minimization techniques for training a neural network. The linear programming approach is demonstrated by a brief description of a system for breast cancer diagnosis that has been in use for the last four years at a major medical facility.

1. What is a Neural Network?

A neural network is a representation of a map between an input space and an output space. A principal aim of such a map is to discriminate between the elements of a finite number of disjoint sets in the input space. Typically one wishes to discriminate between the elements of two disjoint point sets in the n-dim...
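The separating-planes idea can be sketched as a small linear program. The sketch below uses SciPy's `linprog` on hypothetical toy data (it is not the paper's system): it asks for a plane w·x = g with the two point sets strictly on opposite sides, posed as a feasibility LP with a zero objective.

```python
import numpy as np
from scipy.optimize import linprog

# Two linearly separable point sets (hypothetical toy data).
A = np.array([[2.0, 2.0], [3.0, 1.0]])     # want  A w >= g + 1
B = np.array([[-1.0, -2.0], [-2.0, 0.0]])  # want  B w <= g - 1

# Variables z = (w1, w2, g). Inequalities in A_ub @ z <= b_ub form:
#   -A w + g <= -1   and   B w - g <= -1
A_ub = np.vstack([np.hstack([-A, np.ones((len(A), 1))]),
                  np.hstack([B, -np.ones((len(B), 1))])])
b_ub = -np.ones(len(A) + len(B))

# Zero objective: any feasible point yields a separating plane.
res = linprog(c=np.zeros(3), A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * 3)  # variables are free, not nonnegative
w, g = res.x[:2], res.x[2]
```

Mangasarian-style formulations instead minimize an average misclassification error so that the LP remains meaningful when the sets overlap; the feasibility version above suffices to show how a separating plane falls out of an LP.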
Bilinear Separation of Two Sets in n-Space
 COMPUTATIONAL OPTIMIZATION AND APPLICATIONS
, 1993
Abstract

Cited by 35 (17 self)
The NP-complete problem of determining whether two disjoint point sets in the n-dimensional real space R^n can be separated by two planes is cast as a bilinear program, that is, minimizing the scalar product of two linear functions on a polyhedral set. The bilinear program, which has a vertex solution, is processed by an iterative linear programming algorithm that terminates in a finite number of steps at a point satisfying a necessary optimality condition or at a global minimum. Encouraging computational experience on a number of test problems is reported.
Learning Linear Threshold Functions in the Presence of Classification Noise
 In Proceedings of the Seventh Annual Workshop on Computational Learning Theory
, 1994
Abstract

Cited by 30 (3 self)
I show that linear threshold functions are polynomially learnable in the presence of classification noise, i.e., polynomial in n, 1/ε, 1/δ, and 1/σ, where n is the number of Boolean attributes, ε and δ are the usual accuracy and confidence parameters, and σ indicates the minimum distance of any example from the target hyperplane, which is assumed to be lower than the average distance of the examples from any hyperplane. This result is achieved by modifying the Perceptron algorithm: for each update, a weighted average of a sample of misclassified examples and a correction vector is added to the current weight vector. Similar modifications are shown for the Weighted Majority algorithm. The correction vector is simply the mean of the normalized examples. In the special case of Boolean threshold functions, the modified Perceptron algorithm performs O(n²ε⁻²) iterations over O(n⁴ε⁻² ln(n/(δε))) examples. This improves on the pre...
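A hedged sketch of the modified-Perceptron idea summarized above, on synthetic data. This is a simplified illustration, not the paper's exact procedure: each update adds the average of the currently misclassified examples plus a correction vector, taken here (as in the abstract) to be the mean of the normalized examples.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
X = rng.choice([-1.0, 1.0], size=(200, n))    # Boolean attributes as +/-1
w_true = rng.normal(size=n)                   # hidden target hyperplane (synthetic)
y = np.sign(X @ w_true)
flip = rng.random(200) < 0.1                  # 10% classification noise
y_noisy = np.where(flip, -y, y)

norms = np.linalg.norm(X, axis=1, keepdims=True)
correction = (X / norms).mean(axis=0)         # mean of the normalized examples

w = np.zeros(n)
for _ in range(500):
    mis = y_noisy * (X @ w) <= 0              # misclassified under noisy labels
    if not mis.any():
        break
    avg = (y_noisy[mis, None] * X[mis]).mean(axis=0)
    w += avg + correction                     # averaged update + correction vector

# Alignment of the learned weights with the hidden target direction.
cos = w @ w_true / (np.linalg.norm(w) * np.linalg.norm(w_true))
```

Averaging over misclassified examples, rather than updating on one example at a time, is what keeps the random label noise from dominating any single update.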
Support vector machine soft margin classifiers: Error analysis
 Journal of Machine Learning Research
, 2004
Abstract

Cited by 23 (14 self)
The purpose of this paper is to provide a PAC error analysis for the q-norm soft margin classifier, a support vector machine classification algorithm. It consists of two parts: regularization error and sample error. While many techniques are available for treating the sample error, much less is known for the regularization error and the corresponding approximation error for reproducing kernel Hilbert spaces. We are mainly concerned about the regularization error. It is estimated for general distributions by a K-functional in weighted L^q spaces. For weakly separable distributions (i.e., the margin may be zero) satisfactory convergence rates are provided by means of separating functions. A projection operator is introduced, which leads to better sample error estimates, especially for small complexity kernels. The misclassification error is bounded by the V-risk associated with a general class of loss functions V. The difficulty of bounding the offset is overcome. Polynomial kernels and Gaussian kernels are used to demonstrate the main results. The choice of the regularization parameter plays an important role in our analysis.
Computational Complexity Of Neural Networks: A Survey
, 1994
Abstract

Cited by 22 (6 self)
We survey some of the central results in the complexity theory of discrete neural networks, with pointers to the literature. Our main emphasis is on the computational power of various acyclic and cyclic network models, but we also discuss briefly the complexity aspects of synthesizing networks from examples of their behavior.

CR Classification: F.1.1 [Computation by Abstract Devices]: Models of Computation - neural networks, circuits; F.1.3 [Computation by Abstract Devices]: Complexity Classes - complexity hierarchies

Key words: Neural networks, computational complexity, threshold circuits, associative memory

1. Introduction

The once again very active field of computation by "neural" networks has opened up a wealth of fascinating research topics in the computational complexity analysis of the models considered. While much of the general appeal of the field stems not so much from new computational possibilities, but from the possibility of "learning", or synthesizing networks...
Large Vocabulary Recognition of Online Handwritten Cursive Words
, 1995
Abstract

Cited by 20 (1 self)
A critical feature of any computer system is its interface with the user. This has led to the development of user interface technologies such as mouse, touch-screen and pen-based input devices. Since handwriting is one of the most familiar communication media, pen-based interfaces combined with automatic handwriting recognition offer a very easy and natural input method. Pen-based interfaces are also essential in mobile computing because they are scalable. Recent advances in pen-based hardware and wireless communication have been influential factors in the renewed interest in online recognition systems. Online handwriting recognition is fundamentally a pattern classification task; the objective is to take an input pattern, the handwritten signal collected online via a digitizing device, and classify it as one of a prespecified set of words (i.e., the system's lexicon). Because exact recognition is very difficult, a lexicon is used to constrain the recognition output to a known vocab...
Condition Number Complexity of an Elementary Algorithm for Resolving a Conic Linear System
, 1997
Abstract

Cited by 17 (4 self)
We develop an algorithm for resolving a conic linear system (FP_d), which is a system of the form (FP_d): b − Ax ∈ C_Y, x ∈ C_X, where C_X and C_Y are closed convex cones, and the data for the system is d = (A, b).
An analytical method for multiclass molecular cancer classification
 SIAM Review
, 2003
Cited by 17 (1 self)