Results 1 - 4 of 4
Proximal support vector machine classifiers
- Proceedings KDD-2001: Knowledge Discovery and Data Mining, 2001
Cited by 81 (11 self)
A new approach to support vector machine (SVM) classification is proposed wherein each of two data sets is proximal to one of two distinct planes that are not parallel to each other. Each plane is generated such that it is closest to one of the two data sets and as far as possible from the other data set. Each of the two nonparallel proximal planes is obtained by a single MATLAB command as the eigenvector corresponding to a smallest eigenvalue of a generalized eigenvalue problem. Classification by proximity to two distinct nonlinear surfaces generated by a nonlinear kernel also leads to two simple generalized eigenvalue problems. The effectiveness of the proposed method is demonstrated by tests on simple examples as well as on a number of public data sets. These examples show the advantages of the proposed approach in both computation time and test set correctness. Index Terms: support vector machines, proximal classification, generalized eigenvalues.
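The eigenvector construction described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes each plane minimizes a ratio of squared distances (its own data set in the numerator, the other set in the denominator), and the function names `proximal_plane` and `classify`, the regularization parameter `delta`, and the choice to regularize both matrices are assumptions made here for the example.

```python
import numpy as np

def proximal_plane(A, B, delta=1e-3):
    """Return (w, gamma) for a plane x.w = gamma that lies close to the
    rows of A and far from the rows of B (hypothetical helper)."""
    # Augment each data matrix with a column of -1 so that
    # [A, -e] @ [w; gamma] equals A @ w - gamma.
    Ae = np.hstack([A, -np.ones((A.shape[0], 1))])
    Be = np.hstack([B, -np.ones((B.shape[0], 1))])
    # Small ridge term (an assumption) keeps both matrices invertible.
    G = Ae.T @ Ae + delta * np.eye(Ae.shape[1])
    H = Be.T @ Be + delta * np.eye(Be.shape[1])
    # Generalized problem G z = lambda H z, solved here as H^{-1} G z = lambda z.
    vals, vecs = np.linalg.eig(np.linalg.solve(H, G))
    z = np.real(vecs[:, np.argmin(np.real(vals))])
    return z[:-1], z[-1]

def classify(x, plane_a, plane_b):
    """Return +1 if x is nearer to plane_a, otherwise -1."""
    def dist(plane):
        w, gamma = plane
        return abs(x @ w - gamma) / np.linalg.norm(w)
    return 1 if dist(plane_a) <= dist(plane_b) else -1

# Two "crossed" data sets that no single separating plane handles well.
A = np.array([[1.0, 1.0], [2.0, 2.0], [-1.0, -1.0], [-2.0, -2.0]])
B = np.array([[1.0, -1.0], [2.0, -2.0], [-1.0, 1.0], [-2.0, 2.0]])
plane_a = proximal_plane(A, B)
plane_b = proximal_plane(B, A)
label = classify(np.array([3.0, 3.0]), plane_a, plane_b)
```

A new point is assigned to whichever class's plane it lies closer to, which is why two nonparallel planes can handle crossed data of this kind.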
Data Selection for Support Vector Machine Classifiers
- In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2000
Cited by 23 (4 self)
The problem of extracting a minimal number of data points from a large dataset, in order to generate a support vector machine (SVM) classifier, is formulated as a concave minimization problem and solved by a finite number of linear programs. This minimal set of data points, which is the smallest number of support vectors that completely characterize a separating plane classifier, is considerably smaller than that required by a standard 1-norm support vector machine with or without feature selection. The proposed approach also incorporates a feature selection procedure that results in a minimal number of input features used by the classifier. Tenfold cross validation gives as good or better test results using the proposed minimal support vector machine (MSVM) classifier based on the smaller set of data points compared to a standard 1-norm support vector machine classifier. The reduction in data points used by an MSVM classifier over those used by a 1-norm SVM classifier averaged 66% on seven public datasets and was as high as 81%. This makes MSVM a useful incremental classification tool which maintains only a small fraction of a large dataset before merging and processing it with new incoming data.
Massive Support Vector Regression
- Data Mining Institute, Computer Sciences Department, University of Wisconsin, 1999
Cited by 8 (4 self)
The problem of tolerant data fitting by a nonlinear surface, induced by a kernel-based support vector machine [19], is formulated as a linear program with fewer variables than other linear programming formulations [17]. A generalization of the linear programming chunking algorithm [1] for arbitrary kernels [10] is implemented for solving problems with very large datasets wherein chunking is performed on both data points and problem variables. The proposed approach tolerates a small error, which is adjusted parametrically, while fitting the given data. This leads to improved fitting of noisy data as demonstrated computationally. Comparative numerical results indicate an average time reduction as high as 26.0%, with a maximal time reduction of 79.7%. Additionally, linear programs with as many as 16,000 data points and more than a billion nonzero matrix elements are solved.
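The notion of tolerating a small fitting error can be illustrated with the standard epsilon-insensitive error measure, in which residuals no larger than a tolerance are ignored. This is a sketch only: it assumes the paper's tolerance behaves like this common measure, and the names `tolerant_error` and `eps` are hypothetical and not taken from the paper.

```python
def tolerant_error(y_true, y_pred, eps):
    """Sum of absolute residuals, counting only the part of each
    residual that exceeds the tolerance eps (hypothetical helper)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        residual = abs(t - p)
        # Residuals within the tolerance contribute nothing.
        total += max(residual - eps, 0.0)
    return total

# The first and third residuals fall inside the tolerance and are ignored;
# only the excess of the second residual over eps is counted.
error = tolerant_error([1.0, 2.0, 3.0], [1.05, 2.5, 3.0], eps=0.1)
```

Raising `eps` makes the fit less sensitive to small noise at the cost of ignoring more of the deviation, which is the trade-off the parametric adjustment controls.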
A Comprehensive Overview of Basic Clustering Algorithms
Cited by 7 (0 self)
This paper attempts to cover the main algorithms used for clustering, with a brief and simple description of each. For each algorithm, I have selected the most common version to represent the entire family. Advantages and drawbacks are discussed for each case, and the general idea of possible sub-variations is presented.

