Results 1 - 10
of
42
Everything Old Is New Again: A Fresh Look at Historical Approaches
- in Machine Learning. PhD thesis, MIT
, 2002
"... 2 Everything Old Is New Again: A Fresh Look at Historical ..."
Abstract
-
Cited by 68 (5 self)
- Add to MetaCart
2 Everything Old Is New Again: A Fresh Look at Historical
Support Vector Machines: Hype or Hallelujah?
- SIGKDD Explorations
, 2003
"... Support Vector Machines (SVMs) and related kernel methods have become increasingly popular tools for data mining tasks such as classification, regression, and novelty detection. The goal of this tutorial is to provide an intuitive explanation of SVMs from a geometric perspective. The classification ..."
Abstract
-
Cited by 65 (0 self)
- Add to MetaCart
Support Vector Machines (SVMs) and related kernel methods have become increasingly popular tools for data mining tasks such as classification, regression, and novelty detection. The goal of this tutorial is to provide an intuitive explanation of SVMs from a geometric perspective. The classification problem is used to investigate the basic concepts behind SVMs and to examine their strengths and weaknesses from a data mining perspective. While this overview is not comprehensive, it does provide resources for those interested in further exploring SVMs.
Core vector machines: Fast SVM training on very large data sets
- Journal of Machine Learning Research
, 2005
"... Standard SVM training has O(m 3) time and O(m 2) space complexities, where m is the training set size. It is thus computationally infeasible on very large data sets. By observing that practical SVM implementations only approximate the optimal solution by an iterative strategy, we scale up kernel met ..."
Abstract
-
Cited by 61 (11 self)
- Add to MetaCart
Standard SVM training has O(m 3) time and O(m 2) space complexities, where m is the training set size. It is thus computationally infeasible on very large data sets. By observing that practical SVM implementations only approximate the optimal solution by an iterative strategy, we scale up kernel methods by exploiting such “approximateness ” in this paper. We first show that many kernel methods can be equivalently formulated as minimum enclosing ball (MEB) problems in computational geometry. Then, by adopting an efficient approximate MEB algorithm, we obtain provably approximately optimal solutions with the idea of core sets. Our proposed Core Vector Machine (CVM) algorithm can be used with nonlinear kernels and has a time complexity that is linear in m and a space complexity that is independent of m. Experiments on large toy and realworld data sets demonstrate that the CVM is as accurate as existing SVM implementations, but is much faster and can handle much larger data sets than existing scale-up methods. For example, CVM with the Gaussian kernel produces superior results on the KDDCUP-99 intrusion detection data, which has about five million training patterns, in only 1.4 seconds on a 3.2GHz Pentium–4 PC.
A New Approximate Maximal Margin Classification Algorithm
- JOURNAL OF MACHINE LEARNING RESEARCH
, 2001
"... A new incremental learning algorithm is described which approximates the maximal margin hyperplane w.r.t. norm p 2 for a set of linearly separable data. Our algorithm, called alma p (Approximate Large Margin algorithm w.r.t. norm p), takes O (p 1) 2 2 corrections to separate the data wi ..."
Abstract
-
Cited by 60 (5 self)
- Add to MetaCart
A new incremental learning algorithm is described which approximates the maximal margin hyperplane w.r.t. norm p 2 for a set of linearly separable data. Our algorithm, called alma p (Approximate Large Margin algorithm w.r.t. norm p), takes O (p 1) 2 2 corrections to separate the data with p-norm margin larger than (1 ) , where is the (normalized) p-norm margin of the data. alma p avoids quadratic (or higher-order) programming methods. It is very easy to implement and is as fast as on-line algorithms, such as Rosenblatt's Perceptron algorithm. We performed extensive experiments on both real-world and artificial datasets. We compared alma 2 (i.e., alma p with p = 2) to standard Support vector Machines (SVM) and to two incremental algorithms: the Perceptron algorithm and Li and Long's ROMMA. The accuracy levels achieved by alma 2 are superior to those achieved by the Perceptron algorithm and ROMMA, but slightly inferior to SVM's. On the other hand, alma 2 is quite faster and easier to implement than standard SVM training algorithms. When learning sparse target vectors, alma p with p > 2 largely outperforms Perceptron-like algorithms, such as alma 2 .
The Relaxed Online Maximum Margin Algorithm
- Machine Learning
, 2000
"... We describe a new incremental algorithm for training linear threshold functions: the Relaxed Online Maximum Margin Algorithm, or ROMMA. ROMMA can be viewed as an approximation to the algorithm that repeatedly chooses the hyperplane that classifies previously seen examples correctly with the maximum ..."
Abstract
-
Cited by 55 (1 self)
- Add to MetaCart
We describe a new incremental algorithm for training linear threshold functions: the Relaxed Online Maximum Margin Algorithm, or ROMMA. ROMMA can be viewed as an approximation to the algorithm that repeatedly chooses the hyperplane that classifies previously seen examples correctly with the maximum margin. It is known that such a maximum-margin hypothesis can be computed by minimizing the length of the weight vector subject to a number of linear constraints. ROMMA works by maintaining a relatively simple relaxation of these constraints that can be eciently updated. We prove a mistake bound for ROMMA that is the same as that proved for the perceptron algorithm. Our analysis implies that the more computationally intensive maximum-margin algorithm also satis es this mistake bound; this is the rst worst-case performance guarantee for this algorithm. We describe some experiments using ROMMA and a variant that updates its hypothesis more aggressively as batch algorithms to recognize handwr...
Duality and Geometry in SVM Classifiers
- In Proc. 17th International Conf. on Machine Learning
, 2000
"... We develop an intuitive geometric interpretation of the standard support vector machine (SVM) for classification of both linearly separable and inseparable data and provide a rigorous derivation of the concepts behind the geometry. For the separable case finding the maximum margin between the ..."
Abstract
-
Cited by 45 (4 self)
- Add to MetaCart
We develop an intuitive geometric interpretation of the standard support vector machine (SVM) for classification of both linearly separable and inseparable data and provide a rigorous derivation of the concepts behind the geometry. For the separable case finding the maximum margin between the two sets is equivalent to finding the closest points in the smallest convex sets that contain each class (the convex hulls). We now extend this argument to the inseparable case by using a reduced convex hull reduced away from outliers. We prove that solving the reduced convex hull formulation is exactly equivalent to solving the standard inseparable SVM for appropriate choices of parameters. Some additional advantages of the new formulation are that the e#ect of the choice of parameters becomes geometrically clear and that the formulation may be solved by fast nearest point algorithms. By changing norms these arguments hold for both the standard 2-norm and 1-norm SVM. 1. Int...
Improvements to the SMO algorithm for SVM regression
- IEEE Trans. Neural Netw
, 2000
"... Abstract—This paper points out an important source of inefficiency ..."
Abstract
-
Cited by 35 (3 self)
- Add to MetaCart
Abstract—This paper points out an important source of inefficiency
Training ν-Support Vector Classifiers: Theory and Algorithms
"... The ν-support vector machine (ν-SVM) for classification proposed by Schölkopf et al. has the advantage of using a parameter ν on controlling the number of support vectors. In this paper, we investigate the relation between ν-SVM and C-SVM in detail. We show that in general they are two different p ..."
Abstract
-
Cited by 27 (8 self)
- Add to MetaCart
The ν-support vector machine (ν-SVM) for classification proposed by Schölkopf et al. has the advantage of using a parameter ν on controlling the number of support vectors. In this paper, we investigate the relation between ν-SVM and C-SVM in detail. We show that in general they are two different problems with the same optimal solution set. Hence we may expect that many numerical aspects on solving them are similar. However, comparing to regular C-SVM, its formulation is more complicated so up to now there are no effective methods for solving large-scale ν-SVM. We propose a decomposition method for ν-SVM which is competitive with existing methods for C-SVM. We also discuss the behavior of ν-SVM by some numerical experiments.
Support Vector Machine Multiuser Receiver for DS-CDMA Signals in Multipath Channels
- IEEE Trans. Neural Networks
, 2000
"... The problem of constructing an adaptive multiuser detector (MUD) is considered for direct-sequence code-division multiple-access (DS-CDMA) signals transmitted through multipath channels. The emerging learning technique, called support vector machines (SVMs), is proposed as a method of obtaining a no ..."
Abstract
-
Cited by 23 (11 self)
- Add to MetaCart
The problem of constructing an adaptive multiuser detector (MUD) is considered for direct-sequence code-division multiple-access (DS-CDMA) signals transmitted through multipath channels. The emerging learning technique, called support vector machines (SVMs), is proposed as a method of obtaining a nonlinear MUD from a relatively small training data block. Computer simulation is used to study this SVM MUD, and the results show that it can closely match the performance of the optimal Bayesian one-shot detector. Comparisons with an adaptive radial basis function (RBF) MUD trained by an unsupervised clustering algorithm are discussed.
Training Support Vector Machine using Adaptive Clustering
- in Proc. of the 4th SIAM International Conference on Data Mining, Lake Buena
, 2004
"... Training support vector machines involves a huge optimization problem and many specially designed algorithms have been proposed. In this paper, we proposed an algorithm called ClusterSVM that accelerates the training process by exploiting the distributional properties of the training data, that is, ..."
Abstract
-
Cited by 19 (2 self)
- Add to MetaCart
Training support vector machines involves a huge optimization problem and many specially designed algorithms have been proposed. In this paper, we proposed an algorithm called ClusterSVM that accelerates the training process by exploiting the distributional properties of the training data, that is, the natural clustering of the training data and the overall layout of these clusters relative to the decision boundary of support vector machines. The proposed algorithm first partitions the training data into several pair-wise disjoint clusters. Then, the representatives of these clusters are used to train an initial support vector machine, based on which we can approximately identify the support vectors and non-support vectors. After replacing the cluster containing only non-support vectors with its representative, the number of training data can be significantly reduced, thereby speeding up the training process. The proposed ClusterSVM has been tested against the popular training algorithm SMO on both the artificial data and the real data, and a significant speedup was observed. The complexity of ClusterSVM scales with the square of the number of support vectors and, after a further improvement, it is expected that it will scale with square of the number of non-boundary support vectors.

