Results 1 - 10
of
72
Proximal support vector machine classifiers
- Proceedings KDD-2001: Knowledge Discovery and Data Mining
, 2001
"... Abstract—A new approach to support vector machine (SVM) classification is proposed wherein each of two data sets are proximal to one of two distinct planes that are not parallel to each other. Each plane is generated such that it is closest to one of the two data sets and as far as possible from the ..."
Abstract
-
Cited by 80 (11 self)
- Add to MetaCart
Abstract—A new approach to support vector machine (SVM) classification is proposed wherein each of two data sets are proximal to one of two distinct planes that are not parallel to each other. Each plane is generated such that it is closest to one of the two data sets and as far as possible from the other data set. Each of the two nonparallel proximal planes is obtained by a single MATLAB command as the eigenvector corresponding to a smallest eigenvalue of a generalized eigenvalue problem. Classification by proximity to two distinct nonlinear surfaces generated by a nonlinear kernel also leads to two simple generalized eigenvalue problems. The effectiveness of the proposed method is demonstrated by tests on simple examples as well as on a number of public data sets. These examples show the advantages of the proposed approach in both computation time and test set correctness. Index Terms—Support vector machines, proximal classification, generalized eigenvalues. 1
Lagrangian Support Vector Machines
, 2000
"... An implicit Lagrangian for the dual of a simple reformulation of the standard quadratic program of a linear support vector machine is proposed. This leads to the minimization of an unconstrained differentiable convex function in a space of dimensionality equal to the number of classified points. Thi ..."
Abstract
-
Cited by 72 (11 self)
- Add to MetaCart
An implicit Lagrangian for the dual of a simple reformulation of the standard quadratic program of a linear support vector machine is proposed. This leads to the minimization of an unconstrained differentiable convex function in a space of dimensionality equal to the number of classified points. This problem is solvable by an extremely simple linearly convergent Lagrangian support vector machine (LSVM) algorithm. LSVM requires the inversion at the outset of a single matrix of the order of the much smaller dimensionality of the original input space plus one. The full algorithm is given in this paper in 11 lines of MATLAB code without any special optimization tools such as linear or quadratic programming solvers. This LSVM code can be used "as is" to solve classification problems with millions of points. For example, 2 million points in 10 dimensional input space were classified by a linear surface in 82 minutes on a Pentium III 500 MHz notebook with 384 megabytes of memory (and additional swap space), and in 7 minutes on a 250 MHz UltraSPARC II processor with 2 gigabytes of memory. Other standard classification test problems were also solved. Nonlinear kernel classification can also be solved by LSVM. Although it does not scale up to very large problems, it can handle any positive semidefinite kernel and is guaranteed to converge.
Core vector machines: Fast SVM training on very large data sets
- Journal of Machine Learning Research
, 2005
"... Standard SVM training has O(m 3) time and O(m 2) space complexities, where m is the training set size. It is thus computationally infeasible on very large data sets. By observing that practical SVM implementations only approximate the optimal solution by an iterative strategy, we scale up kernel met ..."
Abstract
-
Cited by 61 (11 self)
- Add to MetaCart
Standard SVM training has O(m 3) time and O(m 2) space complexities, where m is the training set size. It is thus computationally infeasible on very large data sets. By observing that practical SVM implementations only approximate the optimal solution by an iterative strategy, we scale up kernel methods by exploiting such “approximateness ” in this paper. We first show that many kernel methods can be equivalently formulated as minimum enclosing ball (MEB) problems in computational geometry. Then, by adopting an efficient approximate MEB algorithm, we obtain provably approximately optimal solutions with the idea of core sets. Our proposed Core Vector Machine (CVM) algorithm can be used with nonlinear kernels and has a time complexity that is linear in m and a space complexity that is independent of m. Experiments on large toy and realworld data sets demonstrate that the CVM is as accurate as existing SVM implementations, but is much faster and can handle much larger data sets than existing scale-up methods. For example, CVM with the Gaussian kernel produces superior results on the KDDCUP-99 intrusion detection data, which has about five million training patterns, in only 1.4 seconds on a 3.2GHz Pentium–4 PC.
Training a support vector machine in the primal
- Neural Computation
, 2007
"... Most literature on Support Vector Machines (SVMs) concentrate on the dual optimization problem. In this paper, we would like to point out that the primal problem can also be solved efficiently, both for linear and non-linear SVMs, and that there is no reason for ignoring this possibilty. On the cont ..."
Abstract
-
Cited by 47 (5 self)
- Add to MetaCart
Most literature on Support Vector Machines (SVMs) concentrate on the dual optimization problem. In this paper, we would like to point out that the primal problem can also be solved efficiently, both for linear and non-linear SVMs, and that there is no reason for ignoring this possibilty. On the contrary, from the primal point of view new families of algorithms for large scale SVM training can be investigated.
Regularized Least-Squares Classification
"... We consider the solution of binary classification problems via Tikhonov regularization in a Reproducing Kernel Hilbert Space using the square loss, and denote the resulting algorithm Regularized Least-Squares Classification (RLSC). We sketch ..."
Abstract
-
Cited by 40 (1 self)
- Add to MetaCart
We consider the solution of binary classification problems via Tikhonov regularization in a Reproducing Kernel Hilbert Space using the square loss, and denote the resulting algorithm Regularized Least-Squares Classification (RLSC). We sketch
Classifying Large Data Sets Using SVM with Hierarchical Clusters
- in Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
, 2003
"... Support vector machine (SVM) has been a promising method for classification and regression analysis because of its solid mathematical foundation which conveys several salient properties that other methods do not provide. However, despite the prominent properties of SVM, it is not as favored for larg ..."
Abstract
-
Cited by 38 (2 self)
- Add to MetaCart
Support vector machine (SVM) has been a promising method for classification and regression analysis because of its solid mathematical foundation which conveys several salient properties that other methods do not provide. However, despite the prominent properties of SVM, it is not as favored for large-scale data mining as for pattern recognition or machine learning because the training complexity of SVM is highly dependent on the size of a data set. Many real-world data mining applications involve millions or billions of data records where even multiple scans of the entire data are too expensive to perform. This paper presents a new method, Clustering-Based SVM (CB-SVM), which is specifically designed for handling very large data sets. CB-SVM applies a hierarchical micro-clustering algorithm that scans the entire data set only once to provide an SVM with high quality samples that carry the statistical summaries of the data such that the summaries maximize the benefit of learning the SVM. CB-SVM tries to generate the best SVM boundary for very large data sets given limited amount of resources. Our experiments on synthetic and real data sets show that CB-SVM is highly scalable for very large data sets while also generating high classification accuracy.
Building Support Vector Machines with Reduced Classifier Complexity
- JOURNAL OF MACHINE LEARNING RESEARCH
, 2006
"... Support vector machines (SVMs), though accurate, are not preferred in applications requiring great classification speed, due to the number of support vectors being large. To overcome this problem we devise a primal method with the following properties: (1) it decouples the idea of basis functions ..."
Abstract
-
Cited by 38 (1 self)
- Add to MetaCart
Support vector machines (SVMs), though accurate, are not preferred in applications requiring great classification speed, due to the number of support vectors being large. To overcome this problem we devise a primal method with the following properties: (1) it decouples the idea of basis functions from the concept of support vectors; (2) it greedily finds a set of kernel basis functions of a specified maximum size (d max ) to approximate the SVM primal cost function well; (3) it is efficient and roughly scales as O(nd max ) where n is the number of training examples; and, (4) the number of basis functions it requires to achieve an accuracy close to the SVM accuracy is usually far less than the number of SVM support vectors.
Efficient Kernel Machines Using the Improved Fast Gauss Transform
- Advances in Neural Information Processing Systems 17
, 2004
"... The computation required for kernel machines with N training samples is O(N ). Such computational complexity is significant even for moderate size problems and is prohibitive for large datasets. We present an approximation technique based on the improved fast Gauss transform to reduce the com ..."
Abstract
-
Cited by 33 (5 self)
- Add to MetaCart
The computation required for kernel machines with N training samples is O(N ). Such computational complexity is significant even for moderate size problems and is prohibitive for large datasets. We present an approximation technique based on the improved fast Gauss transform to reduce the computation to O(N). We also give an error bound for the approximation, and provide experimental results on the UCI datasets.
A Feature Selection Newton Method for Support Vector Machine Classification
- Computational Optimization and Applications
, 2002
"... A fast Newton method, that suppresses input space features, is proposed for a linear programming formulation of support vector machine classifiers. The proposed stand-alone method can handle classification problems in very high dimensional spaces, such as 28,032 dimensions, and generates a classifie ..."
Abstract
-
Cited by 31 (2 self)
- Add to MetaCart
A fast Newton method, that suppresses input space features, is proposed for a linear programming formulation of support vector machine classifiers. The proposed stand-alone method can handle classification problems in very high dimensional spaces, such as 28,032 dimensions, and generates a classifier that depends on very few input features, such as 7 out of the original 28,032. The method can also handle problems with a large number of data points and requires no specialized linear programming packages but merely a linear equation solver. For nonlinear kernel classifiers, the method utilizes a minimal number of kernel functions in the classifier that it gener- ates.
A Study on Reduced Support Vector Machines
- IEEE TRANSACTIONS ON NEURAL NETWORKS
, 2003
"... Recently the Reduced Support Vector Machine (RSVM) was proposed as an alternate of the standard SVM. Motivated by resolving the difficulty on handling large data sets using SVM with nonlinear kernels, it preselects a subset of data as support vectors and solves a smaller optimization problem. How ..."
Abstract
-
Cited by 30 (5 self)
- Add to MetaCart
Recently the Reduced Support Vector Machine (RSVM) was proposed as an alternate of the standard SVM. Motivated by resolving the difficulty on handling large data sets using SVM with nonlinear kernels, it preselects a subset of data as support vectors and solves a smaller optimization problem. However, several issues of its practical use have not been fully discussed yet. For example, we do not know if it possesses comparable generalization ability as the standard SVM. In addition, we would like to see for how large problems RSVM outperforms SVM on training time. In this paper we show that the RSVM formulation is already in a form of linear SVM and discuss four RSVM implementations. Experiments indicate that in general the test accuracy of RSVM are a little lower than that of the standard SVM. In addition, for problems with up to tens of thousands of data, if the percentage of support vectors is not high, existing implementations for SVM is quite competitive on the training time. Thus, from this empirical study, RSVM will be mainly useful for either larger problems or those with many support vectors. Experiments in this paper also serve as comparisons of (1) different implementations for linear SVM; and (2) standard SVM using linear and quadratic cost functions.

