Results 1–10 of 259
Gene selection for cancer classification using support vector machines
 Machine Learning
"... Abstract. DNA microarrays now permit scientists to screen thousands of genes simultaneously and determine whether those genes are active, hyperactive or silent in normal or cancerous tissue. Because these new microarray devices generate bewildering amounts of raw data, new analytical methods must ..."
Abstract

Cited by 1075 (25 self)
 Add to MetaCart
Abstract. DNA microarrays now permit scientists to screen thousands of genes simultaneously and determine whether those genes are active, hyperactive or silent in normal or cancerous tissue. Because these new microarray devices generate bewildering amounts of raw data, new analytical methods must be developed to determine whether cancer tissues have distinctive signatures of gene expression relative to normal tissues or other types of cancer tissues. In this paper, we address the problem of selecting a small subset of genes from broad patterns of gene expression data recorded on DNA microarrays. Using available training examples from cancer and normal patients, we build a classifier suitable for genetic diagnosis, as well as drug discovery. Previous attempts to address this problem select genes with correlation techniques. We propose a new method of gene selection utilizing Support Vector Machine methods based on Recursive Feature Elimination (RFE). We demonstrate experimentally that the genes selected by our techniques yield better classification performance and are biologically relevant to cancer. In contrast with the baseline method, our method eliminates gene redundancy automatically and yields better and more compact gene subsets. In patients with leukemia our method discovered 2 genes that yield zero leave-one-out error, while 64 genes are necessary for the baseline method to get the best result (one leave-one-out error). In the colon cancer database, using only 4 genes our method is 98% accurate, while the baseline method is only 86% accurate.
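The RFE scheme this abstract describes is simple to prototype. Below is a minimal, self-contained illustration, not the authors' implementation: a toy linear classifier trained by subgradient descent on the hinge loss stands in for a full SVM solver, and one feature is eliminated per round by smallest squared weight. The data and all parameter values are invented for the example.

```python
import random

def train_linear_svm(X, y, epochs=200, lam=0.01, lr=0.05):
    # Toy linear max-margin classifier: subgradient descent on the
    # regularized hinge loss. A stand-in for a real SVM solver.
    d = len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(w[j] * xi[j] for j in range(d))
            for j in range(d):
                g = lam * w[j] - (yi * xi[j] if margin < 1 else 0.0)
                w[j] -= lr * g
    return w

def svm_rfe(X, y, n_keep):
    # Recursive Feature Elimination: retrain, then drop the feature
    # whose squared weight is smallest, until n_keep features remain.
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        Xa = [[row[j] for j in active] for row in X]
        w = train_linear_svm(Xa, y)
        worst = min(range(len(active)), key=lambda j: w[j] ** 2)
        del active[worst]
    return active

# Synthetic data: feature 0 tracks the label, features 1-4 are noise.
random.seed(0)
y = [random.choice([-1, 1]) for _ in range(40)]
X = [[yi + random.gauss(0, 0.1)] + [random.uniform(-1, 1) for _ in range(4)]
     for yi in y]
selected = svm_rfe(X, y, 2)
print(selected)  # the informative feature 0 should survive
```

On this toy data the informative feature survives every elimination round, which is the behavior the paper exploits at much larger scale on microarray data.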
Multiple kernel learning, conic duality, and the SMO algorithm
 In Proceedings of the 21st International Conference on Machine Learning (ICML)
, 2004
"... While classical kernelbased classifiers are based on a single kernel, in practice it is often desirable to base classifiers on combinations of multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for the support vector machine (SVM), and showed that the optimiz ..."
Abstract

Cited by 450 (31 self)
 Add to MetaCart
While classical kernel-based classifiers are based on a single kernel, in practice it is often desirable to base classifiers on combinations of multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for the support vector machine (SVM), and showed that the optimization of the coefficients of such a combination reduces to a convex optimization problem known as a quadratically-constrained quadratic program (QCQP). Unfortunately, current convex optimization toolboxes can solve this problem only for a small number of kernels and a small number of data points; moreover, the sequential minimal optimization (SMO) techniques that are essential in large-scale implementations of the SVM cannot be applied because the cost function is non-differentiable. We propose a novel dual formulation of the QCQP as a second-order cone programming problem, and show how to exploit the technique of Moreau-Yosida regularization to yield a formulation to which SMO techniques can be applied. We present experimental results showing that our SMO-based algorithm is significantly more efficient than the general-purpose interior-point methods available in current optimization toolboxes.
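The paper's starting point, a conic (nonnegative) combination of kernel matrices, is easy to illustrate. The sketch below, with made-up data and a naive Cholesky-based PSD check, simply verifies that such a combination of two valid kernels is itself a valid (positive semidefinite) kernel; it does not touch the QCQP or SMO machinery.

```python
import math

def linear_kernel_matrix(X):
    return [[sum(a * b for a, b in zip(x, z)) for z in X] for x in X]

def gaussian_kernel_matrix(X, gamma=0.5):
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))
             for z in X] for x in X]

def conic_combination(kernels, mu):
    # K = sum_k mu_k K_k with mu_k >= 0; a conic combination of PSD
    # kernel matrices is itself a PSD kernel matrix.
    n = len(kernels[0])
    return [[sum(m * K[i][j] for m, K in zip(mu, kernels))
             for j in range(n)] for i in range(n)]

def is_psd(K, tol=1e-9):
    # Cholesky-style factorization: succeeds iff K is (numerically) PSD.
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = K[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s < -tol:
                    return False
                L[i][i] = math.sqrt(max(s, 0.0))
            else:
                L[i][j] = s / L[j][j] if L[j][j] > tol else 0.0
    return True

X = [[0.0, 1.0], [1.0, 0.0], [0.5, -0.5], [-1.0, 0.3]]
K = conic_combination(
    [linear_kernel_matrix(X), gaussian_kernel_matrix(X)], mu=[0.3, 0.7])
print(is_psd(K))  # True
```

Learning the weights `mu` themselves is exactly the QCQP the paper reformulates; the coefficients above are fixed by hand purely for illustration.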
Feature selection for SVMs
 Advances in Neural Information Processing Systems 13
, 2000
"... We introduce a method of feature selection for Support Vector Machines. The method is based upon finding those features which minimize bounds on the leaveoneout error. This search can be efficiently performed via gradient descent. The resulting algorithms are shown to be superior to some standard ..."
Abstract

Cited by 277 (17 self)
 Add to MetaCart
(Show Context)
We introduce a method of feature selection for Support Vector Machines. The method is based upon finding those features which minimize bounds on the leave-one-out error. This search can be performed efficiently via gradient descent. The resulting algorithms are shown to be superior to some standard feature selection algorithms on both toy data and real-life problems of face recognition, pedestrian detection and analyzing DNA microarray data.
Multicategory Support Vector Machines: theory and application to the classification of microarray data and satellite radiance data
 Journal of the American Statistical Association
, 2004
"... Twocategory support vector machines (SVM) have been very popular in the machine learning community for classi � cation problems. Solving multicategory problems by a series of binary classi � ers is quite common in the SVM paradigm; however, this approach may fail under various circumstances. We pro ..."
Abstract

Cited by 261 (25 self)
 Add to MetaCart
Two-category support vector machines (SVM) have been very popular in the machine learning community for classification problems. Solving multicategory problems by a series of binary classifiers is quite common in the SVM paradigm; however, this approach may fail under various circumstances. We propose the multicategory support vector machine (MSVM), which extends the binary SVM to the multicategory case and has good theoretical properties. The proposed method provides a unifying framework when there are either equal or unequal misclassification costs. As a tuning criterion for the MSVM, an approximate leave-one-out cross-validation function, called Generalized Approximate Cross Validation, is derived, analogous to the binary case. The effectiveness of the MSVM is demonstrated through applications to cancer classification using microarray data and cloud classification with satellite radiance profiles.
Semisupervised support vector machines
 In Proc. NIPS
, 1998
"... We introduce a semisupervised support vector machine (S3yM) method. Given a training set of labeled data and a working set of unlabeled data, S3YM constructs a support vector machine using both the training and working sets. We use S3YM to solve the transduction problem using overall risk minimiza ..."
Abstract

Cited by 221 (7 self)
 Add to MetaCart
(Show Context)
We introduce a semi-supervised support vector machine (S3VM) method. Given a training set of labeled data and a working set of unlabeled data, S3VM constructs a support vector machine using both the training and working sets. We use S3VM to solve the transduction problem via the overall risk minimization (ORM) approach posed by Vapnik. The transduction problem is to estimate the value of a classification function at the given points in the working set. This contrasts with the standard inductive learning problem of estimating the classification function at all possible values and then using the fixed function to deduce the classes of the working set data. We propose a general S3VM model that minimizes both the misclassification error and the function capacity based on all the available data. We show how the S3VM model for 1-norm linear support vector machines can be converted to a mixed-integer program and then solved exactly using integer programming. Results of S3VM and the standard 1-norm support vector machine approach are compared on ten data sets. Our computational results support the statistical learning theory results showing that incorporating working data improves generalization when insufficient training information is available. In every case, S3VM either improved or showed no significant difference in generalization compared to the traditional approach.
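The mixed-integer character of the model can be seen in a toy sketch: with a tiny working set, the binary label variables can be enumerated by brute force rather than handed to an integer-programming solver. The classifier below is a crude hinge-loss stand-in for the paper's 1-norm formulation, and the 1-D data points are invented.

```python
import itertools

def train(X, y, epochs=300, lam=0.01, lr=0.05):
    # Toy linear classifier: subgradient descent on the regularized
    # hinge loss -- a crude stand-in for the paper's 1-norm SVM.
    d = len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(w[j] * xi[j] for j in range(d))
            for j in range(d):
                g = lam * w[j] - (yi * xi[j] if margin < 1 else 0.0)
                w[j] -= lr * g
    return w

def objective(w, X, y, lam=0.01):
    hinge = sum(max(0.0, 1 - yi * sum(wj * xj for wj, xj in zip(w, xi)))
                for xi, yi in zip(X, y))
    return hinge + lam * sum(wj * wj for wj in w)

def s3vm_enumerate(X_lab, y_lab, X_unl):
    # The mixed-integer aspect, solved by brute force: try every +/-1
    # labeling of the working set and keep the one whose trained
    # classifier attains the lowest overall objective.
    best_obj, best_labels = None, None
    for labels in itertools.product([-1, 1], repeat=len(X_unl)):
        X, y = X_lab + X_unl, y_lab + list(labels)
        w = train(X, y)
        obj = objective(w, X, y)
        if best_obj is None or obj < best_obj:
            best_obj, best_labels = obj, labels
    return best_labels

# One labeled point per class; three unlabeled working-set points.
labels = s3vm_enumerate([[-2.0], [2.0]], [-1, 1], [[-1.5], [1.7], [2.5]])
print(labels)  # (-1, 1, 1)
```

Enumeration costs 2^m trainings for m unlabeled points, which is why the paper resorts to an integer-programming formulation rather than brute force.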
1-norm Support Vector Machines
 Neural Information Processing Systems
, 2003
"... The standard 2norm SVM is known for its good performance in twoclass classification. In this paper, we consider the 1norm SVM. We argue that the 1norm SVM may have some advantage over the standard 2norm SVM, especially when there are redundant noise features. We also propose an efficient alg ..."
Abstract

Cited by 171 (14 self)
 Add to MetaCart
(Show Context)
The standard 2-norm SVM is known for its good performance in two-class classification. In this paper, we consider the 1-norm SVM. We argue that the 1-norm SVM may have some advantage over the standard 2-norm SVM, especially when there are redundant noise features. We also propose an efficient algorithm that computes the whole solution path of the 1-norm SVM, thereby facilitating adaptive selection of the tuning parameter for the 1-norm SVM.
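The claimed advantage under redundant noise features can be illustrated with a small sketch. This is neither the paper's LP formulation nor its solution-path algorithm: it trains a linear classifier on the hinge loss with an l1 penalty handled by a soft-thresholding (proximal) step, on invented data, to show the characteristic shrinkage of noise-feature weights.

```python
import random

def soft_threshold(v, t):
    # Proximal operator of the l1 penalty: shrink v toward zero by t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def train_l1_svm(X, y, epochs=200, lam=0.05, lr=0.05):
    # Proximal subgradient descent on the hinge loss: take a hinge-loss
    # step, then apply l1 shrinkage, which drives useless weights to zero.
    d = len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(w[j] * xi[j] for j in range(d))
            for j in range(d):
                g = -(yi * xi[j]) if margin < 1 else 0.0
                w[j] = soft_threshold(w[j] - lr * g, lr * lam)
    return w

# Feature 0 carries the label; features 1-3 are redundant noise.
random.seed(1)
y = [random.choice([-1, 1]) for _ in range(50)]
X = [[yi + random.gauss(0, 0.1)] + [random.uniform(-1, 1) for _ in range(3)]
     for yi in y]
w = train_l1_svm(X, y)
print(w)  # the weight on feature 0 dominates; noise weights shrink toward 0
```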
RSVM: Reduced support vector machines
 Data Mining Institute, Computer Sciences Department, University of Wisconsin
, 2001
"... Abstract An algorithm is proposed which generates a nonlinear kernelbased separating surface that requires as little as 1 % of a large dataset for its explicit evaluation. To generate this nonlinear surface, the entire dataset is used as a constraint in an optimization problem with very few variabl ..."
Abstract

Cited by 160 (19 self)
 Add to MetaCart
(Show Context)
An algorithm is proposed which generates a nonlinear kernel-based separating surface that requires as little as 1% of a large dataset for its explicit evaluation. To generate this nonlinear surface, the entire dataset is used as a constraint in an optimization problem with very few variables corresponding to the 1%
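The core device in RSVM is replacing the full n × n kernel matrix with a thin rectangular one evaluated only against a small random subset of the data. Below is a minimal sketch of that matrix construction (Gaussian kernel, made-up data, a 5% subset rather than the paper's 1%; the optimization over the reduced variables is omitted):

```python
import math
import random

def gaussian_kernel(x, z, gamma=0.5):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def reduced_kernel_matrix(A, m, gamma=0.5, seed=0):
    # K(A, A_bar): each of the n rows is evaluated only against a random
    # subset A_bar of m points, giving an n x m matrix instead of n x n.
    rng = random.Random(seed)
    A_bar = rng.sample(A, m)
    return [[gaussian_kernel(x, z, gamma) for z in A_bar] for x in A]

rng = random.Random(42)
A = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
K = reduced_kernel_matrix(A, m=10)  # 10/200 = 5% of the data
print(len(K), len(K[0]))  # 200 10
```

The separating surface then only ever references the m subset points, which is what makes its explicit evaluation cheap for large n.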
An introduction to boosting and leveraging
 Advanced Lectures on Machine Learning, LNCS
, 2003
"... ..."
(Show Context)
Variable Selection Using SVMbased Criteria
, 2003
"... We propose new methods to evaluate variable subset relevance with a view to variable selection. ..."
Abstract

Cited by 121 (3 self)
 Add to MetaCart
We propose new methods to evaluate variable subset relevance with a view to variable selection.