Results 1 - 10
of
23
Proximal support vector machine classifiers
- Proceedings KDD-2001: Knowledge Discovery and Data Mining
, 2001
"... Abstract—A new approach to support vector machine (SVM) classification is proposed wherein each of two data sets are proximal to one of two distinct planes that are not parallel to each other. Each plane is generated such that it is closest to one of the two data sets and as far as possible from the ..."
Abstract
-
Cited by 81 (11 self)
- Add to MetaCart
Abstract—A new approach to support vector machine (SVM) classification is proposed wherein each of two data sets are proximal to one of two distinct planes that are not parallel to each other. Each plane is generated such that it is closest to one of the two data sets and as far as possible from the other data set. Each of the two nonparallel proximal planes is obtained by a single MATLAB command as the eigenvector corresponding to a smallest eigenvalue of a generalized eigenvalue problem. Classification by proximity to two distinct nonlinear surfaces generated by a nonlinear kernel also leads to two simple generalized eigenvalue problems. The effectiveness of the proposed method is demonstrated by tests on simple examples as well as on a number of public data sets. These examples show the advantages of the proposed approach in both computation time and test set correctness. Index Terms—Support vector machines, proximal classification, generalized eigenvalues. 1
Lagrangian Support Vector Machines
, 2000
"... An implicit Lagrangian for the dual of a simple reformulation of the standard quadratic program of a linear support vector machine is proposed. This leads to the minimization of an unconstrained differentiable convex function in a space of dimensionality equal to the number of classified points. Thi ..."
Abstract
-
Cited by 72 (11 self)
- Add to MetaCart
An implicit Lagrangian for the dual of a simple reformulation of the standard quadratic program of a linear support vector machine is proposed. This leads to the minimization of an unconstrained differentiable convex function in a space of dimensionality equal to the number of classified points. This problem is solvable by an extremely simple linearly convergent Lagrangian support vector machine (LSVM) algorithm. LSVM requires the inversion at the outset of a single matrix of the order of the much smaller dimensionality of the original input space plus one. The full algorithm is given in this paper in 11 lines of MATLAB code without any special optimization tools such as linear or quadratic programming solvers. This LSVM code can be used "as is" to solve classification problems with millions of points. For example, 2 million points in 10 dimensional input space were classified by a linear surface in 82 minutes on a Pentium III 500 MHz notebook with 384 megabytes of memory (and additional swap space), and in 7 minutes on a 250 MHz UltraSPARC II processor with 2 gigabytes of memory. Other standard classification test problems were also solved. Nonlinear kernel classification can also be solved by LSVM. Although it does not scale up to very large problems, it can handle any positive semidefinite kernel and is guaranteed to converge.
Support Vector Machines: Hype or Hallelujah?
- SIGKDD Explorations
, 2003
"... Support Vector Machines (SVMs) and related kernel methods have become increasingly popular tools for data mining tasks such as classification, regression, and novelty detection. The goal of this tutorial is to provide an intuitive explanation of SVMs from a geometric perspective. The classification ..."
Abstract
-
Cited by 63 (0 self)
- Add to MetaCart
Support Vector Machines (SVMs) and related kernel methods have become increasingly popular tools for data mining tasks such as classification, regression, and novelty detection. The goal of this tutorial is to provide an intuitive explanation of SVMs from a geometric perspective. The classification problem is used to investigate the basic concepts behind SVMs and to examine their strengths and weaknesses from a data mining perspective. While this overview is not comprehensive, it does provide resources for those interested in further exploring SVMs.
Random features for large-scale kernel machines
- In Neural Infomration Processing Systems
, 2007
"... To accelerate the training of kernel machines, we propose to map the input data to a randomized low-dimensional feature space and then apply existing fast linear methods. Our randomized features are designed so that the inner products of the transformed data are approximately equal to those in the f ..."
Abstract
-
Cited by 49 (3 self)
- Add to MetaCart
To accelerate the training of kernel machines, we propose to map the input data to a randomized low-dimensional feature space and then apply existing fast linear methods. Our randomized features are designed so that the inner products of the transformed data are approximately equal to those in the feature space of a user specified shift-invariant kernel. We explore two sets of random features, provide convergence bounds on their ability to approximate various radial basis kernels, and show that in large-scale classification and regression tasks linear machine learning algorithms that use these features outperform state-of-the-art large-scale kernel machines. 1
Object-oriented software for quadratic programming
- ACM Transactions on Mathematical Software
, 2001
"... The object-oriented software package OOQP for solving convex quadratic programming problems (QP) is described. The primal-dual interior point algorithms supplied by OOQP are implemented in a way that is largely independent of the problem structure. Users may exploit problem structure by supplying li ..."
Abstract
-
Cited by 44 (2 self)
- Add to MetaCart
The object-oriented software package OOQP for solving convex quadratic programming problems (QP) is described. The primal-dual interior point algorithms supplied by OOQP are implemented in a way that is largely independent of the problem structure. Users may exploit problem structure by supplying linear algebra, problem data, and variable classes that are customized to their particular applications. The OOQP distribution contains default implementations that solve several important QP problem types, including general sparse and dense QPs, bound-constrained QPs, and QPs arising from support vector machines and Huber regression. The implementations supplied with the OOQP distribution are based on such well known linear algebra packages as MA27/57, LAPACK, and PETSc. OOQP demonstrates the usefulness of object-oriented design in optimization software development, and establishes standards that can be followed in the design of software packages for other classes of optimization problems. A number of the classes in OOQP may also be reusable directly in other codes.
Active Support Vector Machine Classification
- Advances in Neural Information Processing Systems 13
, 2000
"... An active set strategy is applied to the dual of a simple reformulation of the standard quadratic program of a linear support vector machine. This application generates a fast new dual algorithm that consists of solving a finite number of linear equations, with a typically large dimensionality equal ..."
Abstract
-
Cited by 19 (6 self)
- Add to MetaCart
An active set strategy is applied to the dual of a simple reformulation of the standard quadratic program of a linear support vector machine. This application generates a fast new dual algorithm that consists of solving a finite number of linear equations, with a typically large dimensionality equal to the number of points to be classified. However, by making novel use of the Sherman-Morrison-Woodbury formula, a much smaller matrix of the order of the original input space is inverted at each step. Thus, a problem with a 32-dimensional input space and 7 million points required inverting positive definite symmetric matrices of size 33 x 33 with a total running time of 96 minutes on a 400 MHz Pentium II. The algorithm requires no specialized quadratic or linear programming code, but merely a linear equation solver which is publicly available.
Exploiting Structure in Parallel Implementation of Interior Point Methods for Optimization
- School of Mathematics, University of Edinburgh, Edinburgh
, 2004
"... ..."
Large Linear Classification When Data Cannot Fit In Memory ABSTRACT
"... Recent advances in linear classification have shown that for applications such as document classification, the training can be extremely efficient. However, most of the existing training methods are designed by assuming that data can be stored in the computer memory. These methods cannot be easily a ..."
Abstract
-
Cited by 6 (0 self)
- Add to MetaCart
Recent advances in linear classification have shown that for applications such as document classification, the training can be extremely efficient. However, most of the existing training methods are designed by assuming that data can be stored in the computer memory. These methods cannot be easily applied to data larger than the memory capacity due to the random access to the disk. We propose and analyze a block minimization framework for data larger than the memory size. At each step a block of data is loaded from the disk and handled by certain learning methods. We investigate two implementations of the proposed framework for primal and dual SVMs, respectively. As data cannot fit in memory, many design considerations are very different from those for traditional algorithms. Experiments using data sets 20 times larger than the memory demonstrate the effectiveness of the proposed method.
Large-scale linear programming techniques for the design of protein folding potentials
- Mathematical Programming Ser. B
"... Abstract. We present large-scale optimization techniques to model the energy function that underlies the folding process of proteins. Linear Programming is used to identify parameters in the energy function model, the objective being that the model predict the structure of known proteins correctly. ..."
Abstract
-
Cited by 5 (3 self)
- Add to MetaCart
Abstract. We present large-scale optimization techniques to model the energy function that underlies the folding process of proteins. Linear Programming is used to identify parameters in the energy function model, the objective being that the model predict the structure of known proteins correctly. Such trained functions can then be used either for ab-initio prediction or for recognition of unknown structures. In order to obtain good energy models we need to be able to solve dense Linear Programming Problems with tens (possibly hundreds) of millions of constraints in a few hundred parameters, which we achieve by tailoring and parallelizing the interior-point code PCx.

