Results 1 - 7 of 7
LIBSVM: a Library for Support Vector Machines
2001
"... LIBSVM is a library for support vector machines (SVM). Its goal is to help users can easily use SVM as a tool. In this document, we present all its implementation details. 1 ..."
Cited by 3412 (62 self)
Abstract:
LIBSVM is a library for support vector machines (SVM). Its goal is to help users easily use SVM as a tool. In this document, we present all of its implementation details.
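A minimal usage sketch follows for readers who want to try LIBSVM-style training; it uses scikit-learn's SVC, which wraps the LIBSVM solver internally, and the toy data and hyperparameter values are our own illustrative assumptions, not values from the paper.

# Minimal sketch: training a C-SVC with an RBF kernel via scikit-learn's
# SVC, which wraps LIBSVM internally. Data and hyperparameters are
# illustrative assumptions.
from sklearn.svm import SVC

X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]  # toy inputs
y = [0, 0, 1, 1]                                      # toy labels
clf = SVC(C=1.0, kernel="rbf", gamma="scale")
clf.fit(X, y)
print(clf.predict([[0.9, 0.8]]))  # the trained model labels a new point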
A Comparison of Methods for Multiclass Support Vector Machines
IEEE TRANS. NEURAL NETWORKS, 2002
"... Support vector machines (SVMs) were originally designed for binary classification. How to effectively extend it for multiclass classification is still an ongoing research issue. Several methods have been proposed where typically we construct a multiclass classifier by combining several binary class ..."
Cited by 562 (15 self)
Abstract:
Support vector machines (SVMs) were originally designed for binary classification. How to effectively extend them for multiclass classification is still an ongoing research issue. Several methods have been proposed in which a multiclass classifier is typically constructed by combining several binary classifiers. Some authors have also proposed methods that consider all classes at once. Because it is computationally more expensive to solve multiclass problems, comparisons of these methods on large-scale problems have not been seriously conducted. Especially for methods that solve the multiclass SVM in one step, a much larger optimization problem is required, so experiments have so far been limited to small data sets. In this paper we give decomposition implementations for two such "all-together" methods. We then compare their performance with three methods based on binary classification: "one-against-all," "one-against-one," and directed acyclic graph SVM (DAGSVM). Our experiments indicate that the "one-against-one" and DAG methods are more suitable for practical use than the other methods. Results also show that for large problems, methods that consider all data at once generally need fewer support vectors.
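As a concrete illustration of the "one-against-one" strategy compared in the paper, the sketch below trains one binary SVM per unordered pair of classes and predicts by majority vote; the helper names and data handling are our own hypothetical scaffolding.

from itertools import combinations
from collections import Counter
import numpy as np
from sklearn.svm import SVC

def train_one_against_one(X, y):
    """Train one binary SVM for every unordered pair of classes."""
    X, y = np.asarray(X), np.asarray(y)
    models = {}
    for a, b in combinations(np.unique(y), 2):
        mask = (y == a) | (y == b)           # keep only the two classes
        models[(a, b)] = SVC(kernel="linear").fit(X[mask], y[mask])
    return models

def predict_one_against_one(models, x):
    """Each pairwise classifier votes; the majority class wins."""
    votes = Counter(m.predict([x])[0] for m in models.values())
    return votes.most_common(1)[0][0]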
A New Approximate Maximal Margin Classification Algorithm
JOURNAL OF MACHINE LEARNING RESEARCH, 2001
"... A new incremental learning algorithm is described which approximates the maximal margin hyperplane w.r.t. norm p 2 for a set of linearly separable data. Our algorithm, called alma p (Approximate Large Margin algorithm w.r.t. norm p), takes O (p 1) 2 2 corrections to separate the data wi ..."
Cited by 87 (6 self)
Abstract:
A new incremental learning algorithm is described which approximates the maximal margin hyperplane w.r.t. norm p ≥ 2 for a set of linearly separable data. Our algorithm, called ALMA_p (Approximate Large Margin algorithm w.r.t. norm p), takes O((p−1)/(α²γ²)) corrections to separate the data with p-norm margin larger than (1−α)γ, where γ is the (normalized) p-norm margin of the data. ALMA_p avoids quadratic (or higher-order) programming methods. It is very easy to implement and is as fast as online algorithms, such as Rosenblatt's Perceptron algorithm. We performed extensive experiments on both real-world and artificial datasets. We compared ALMA_2 (i.e., ALMA_p with p = 2) to standard Support Vector Machines (SVM) and to two incremental algorithms: the Perceptron algorithm and Li and Long's ROMMA. The accuracy levels achieved by ALMA_2 are superior to those achieved by the Perceptron algorithm and ROMMA, but slightly inferior to SVM's. On the other hand, ALMA_2 is considerably faster and easier to implement than standard SVM training algorithms. When learning sparse target vectors, ALMA_p with p > 2 largely outperforms Perceptron-like algorithms, such as ALMA_2.
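The following is a schematic rendering of the ALMA_2 update as we read it from the paper: a perceptron-style correction with a shrinking learning rate, followed by projection onto the unit ball. The constants B = 1/α and C = √2 and the fixed epoch count are simplifying assumptions, so treat this as a sketch rather than the authors' reference implementation.

import numpy as np

def alma2(X, y, alpha=0.5, epochs=10):
    """Schematic ALMA_2 (p = 2): correct whenever the current margin
    falls below a threshold that shrinks with the correction count k."""
    w = np.zeros(X.shape[1])
    k = 1                                   # correction counter
    B, C = 1.0 / alpha, np.sqrt(2.0)        # assumed constants
    for _ in range(epochs):
        for x, label in zip(X, y):          # label in {-1, +1}
            x = x / np.linalg.norm(x)       # normalize the instance
            if label * np.dot(w, x) <= (1 - alpha) * (B / np.sqrt(k)):
                w = w + (C / np.sqrt(k)) * label * x  # perceptron-style step
                w = w / max(1.0, np.linalg.norm(w))   # project onto unit ball
                k += 1
    return w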
The Relaxed Online Maximum Margin Algorithm
Machine Learning, 2000
"... We describe a new incremental algorithm for training linear threshold functions: the Relaxed Online Maximum Margin Algorithm, or ROMMA. ROMMA can be viewed as an approximation to the algorithm that repeatedly chooses the hyperplane that classifies previously seen examples correctly with the maximum ..."
Cited by 73 (1 self)
Abstract:
We describe a new incremental algorithm for training linear threshold functions: the Relaxed Online Maximum Margin Algorithm, or ROMMA. ROMMA can be viewed as an approximation to the algorithm that repeatedly chooses the hyperplane that classifies previously seen examples correctly with the maximum margin. It is known that such a maximum-margin hypothesis can be computed by minimizing the length of the weight vector subject to a number of linear constraints. ROMMA works by maintaining a relatively simple relaxation of these constraints that can be efficiently updated. We prove a mistake bound for ROMMA that is the same as that proved for the perceptron algorithm. Our analysis implies that the more computationally intensive maximum-margin algorithm also satisfies this mistake bound; this is the first worst-case performance guarantee for this algorithm. We describe some experiments using ROMMA and a variant that updates its hypothesis more aggressively as batch algorithms to recognize handwr...
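The relaxed update has a closed form: the new weight vector is the minimum-norm point satisfying just two constraints (the half-space defined by the old hypothesis, plus the current example). The sketch below implements that update under our reading of Li and Long's paper; the epoch loop and initialization are our own simplifications.

import numpy as np

def romma_update(w, x, y):
    """One ROMMA correction: replace w by c*w + d*y*x, the minimum-norm
    vector satisfying the two relaxed constraints."""
    wx, ww, xx = np.dot(w, x), np.dot(w, w), np.dot(x, x)
    denom = ww * xx - wx ** 2
    if denom == 0:                      # w parallel to x: perceptron fallback
        return w + y * x
    c = (ww * xx - y * wx) / denom
    d = (ww * (y - wx)) / denom
    return c * w + d * y * x

def romma_train(X, y, epochs=5):
    """Mistake-driven ROMMA pass over the data (non-aggressive variant)."""
    X, y = np.asarray(X, float), np.asarray(y, float)
    w = (y[0] / np.dot(X[0], X[0])) * X[0]   # first example sets w
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * np.dot(w, xi) <= 0:      # update only on mistakes
                w = romma_update(w, xi, yi)
    return w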
Classifying G-protein coupled receptors with support vector machines
Bioinformatics, 2001
"... Motivation: The enormous amount of protein sequence data uncovered by genome research has increased the demand for computer software that can automate the recognition of new proteins. We discuss the relative merits of various automated methods for recognizing Gprotein coupled receptors (GPCRs), a ..."
Cited by 69 (3 self)
Abstract:
Motivation: The enormous amount of protein sequence data uncovered by genome research has increased the demand for computer software that can automate the recognition of new proteins. We discuss the relative merits of various automated methods for recognizing G-protein coupled receptors (GPCRs), a superfamily of cell membrane proteins. GPCRs are found in a wide range of organisms and are central to a cellular signalling network that regulates many basic physiological processes. They are the focus of a significant amount of current pharmaceutical research because they play a key role in many diseases. However, their tertiary structures remain largely unsolved. The methods described in this paper use only primary sequence information to make their predictions. We compare a simple nearest neighbor approach (BLAST), methods based on multiple alignments generated by a statistical profile hidden Markov model, and methods, including support vector machines, that transform protein sequences into fixed-length feature vectors. Results: The last is the most computationally expensive method, but our experiments show that, for those interested in annotation-quality classification, the results are worth the effort. In two-fold cross-validation experiments testing recognition of GPCR subfamilies that bind a specific ligand (such as a histamine molecule), the errors per sequence at the minimum error point (MEP) were 13.7% for multiclass SVMs, 17.1% for our SVMtree method of hierarchical multiclass SVM classification, 25.5% for BLAST, 30% for profile HMMs, and 49% for classification based on nearest neighbor feature vectors (kernNN). The percentage of true positives recognized before the first false positive was 65% for both SVM methods, 13% for BLAST, 5% for profile HMMs and 4% ...
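To make the fixed-length feature-vector idea concrete, the sketch below maps sequences to 2-mer composition vectors and trains an SVM on them; the paper's actual feature maps differ, and the sequences and labels here are made up for illustration.

from itertools import product
import numpy as np
from sklearn.svm import SVC

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
KMERS = {"".join(p): i for i, p in enumerate(product(AMINO_ACIDS, repeat=2))}

def kmer_vector(seq):
    """Map a protein sequence to a fixed-length 2-mer count vector."""
    v = np.zeros(len(KMERS))
    for i in range(len(seq) - 1):
        v[KMERS[seq[i:i + 2]]] += 1
    return v / max(1, len(seq) - 1)          # normalize by number of 2-mers

# Hypothetical training set: sequences labeled GPCR (1) or non-GPCR (0).
seqs = ["MKTAYIAKQR", "GAVLIMCFYW", "MNDESKRHQT", "PGASTCVLIM"]
labels = [1, 0, 1, 0]
X = np.array([kmer_vector(s) for s in seqs])
clf = SVC(kernel="linear").fit(X, labels)
print(clf.predict([kmer_vector("MKTAYLAKQR")]))  # classify a new sequence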
Ensembles of nested dichotomies for multiclass problems
In Proc. 21st International Conference on Machine Learning, 2004
"... Nested dichotomies are a standard statistical technique for tackling certain polytomous classification problems with logistic regression. They can be represented as binary trees that recursively split a multiclass classification task into a system of dichotomies and provide a statistically sound wa ..."
Cited by 20 (3 self)
Abstract:
Nested dichotomies are a standard statistical technique for tackling certain polytomous classification problems with logistic regression. They can be represented as binary trees that recursively split a multiclass classification task into a system of dichotomies and provide a statistically sound way of applying two-class learning algorithms to multiclass problems (assuming these algorithms generate class probability estimates). However, there are usually many candidate trees for a given problem, and in the standard approach the choice of a particular tree is based on domain knowledge that may not be available in practice. An alternative is to treat every system of nested dichotomies as equally likely and to form an ensemble classifier based on this assumption. We show that this approach produces more accurate classifications than applying C4.5 and logistic regression directly to multiclass problems. Our results also show that ensembles of nested dichotomies produce more accurate classifiers than pairwise classification if both techniques are used with C4.5, and comparable results for logistic regression. Compared to error-correcting output codes, they are preferable if logistic regression is used, and comparable in the case of C4.5. An additional benefit is that they generate class probability estimates. Consequently they appear to be a good general-purpose method for applying binary classifiers to multiclass problems.
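The sketch below builds a single random nested dichotomy of the kind the ensemble averages over: the class set is split in two recursively, a two-class logistic regression is trained at each internal node, and class probabilities are products of branch probabilities along each root-to-leaf path. The tree encoding and helper names are our own hypothetical scaffolding; an ensemble would average class_probs over several random trees.

import random
import numpy as np
from sklearn.linear_model import LogisticRegression

def build_dichotomy(X, y, classes):
    """Recursively split classes into a random nested dichotomy, with a
    two-class model at each internal node."""
    if len(classes) == 1:
        return classes[0]                    # leaf: a single class
    shuffled = random.sample(list(classes), len(classes))
    left, right = shuffled[:len(shuffled) // 2], shuffled[len(shuffled) // 2:]
    mask = np.isin(y, list(classes))         # examples of this subtree only
    side = np.isin(y[mask], right).astype(int)
    model = LogisticRegression(max_iter=1000).fit(X[mask], side)
    return (model, build_dichotomy(X, y, left), build_dichotomy(X, y, right))

def class_probs(node, x, p=1.0, out=None):
    """Class probability estimates: multiply branch probabilities along
    each root-to-leaf path of the dichotomy tree."""
    out = {} if out is None else out
    if not isinstance(node, tuple):
        out[node] = p                        # leaf: accumulated probability
        return out
    model, left, right = node
    p_right = model.predict_proba([x])[0][1]
    class_probs(left, x, p * (1 - p_right), out)
    class_probs(right, x, p * p_right, out)
    return out

# Hypothetical usage: tree = build_dichotomy(X, y, list(np.unique(y)))
#                     probs = class_probs(tree, X[0])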
Predicting Nearly as Well as the Best Pruning of a Planar Decision Graph
Theoretical Computer Science, 2000
"... We design ecient online algorithms that predict nearly as well as the best pruning of a planar decision graph. We assume that the graph has no cycles. As in the previous work on decision trees, we implicitly maintain one weight for each of the prunings (exponentially many). The method works for a l ..."
Cited by 10 (1 self)
Abstract:
We design efficient online algorithms that predict nearly as well as the best pruning of a planar decision graph. We assume that the graph has no cycles. As in the previous work on decision trees, we implicitly maintain one weight for each of the prunings (exponentially many). The method works for a large class of algorithms that update their weights multiplicatively. It can also be used to design algorithms that predict nearly as well as the best convex combination of prunings.

1 Introduction
Decision trees are widely used in Machine Learning. Frequently a large tree is produced initially and then this tree is pruned for the purpose of obtaining a better predictor. A pruning is produced by deleting some nodes and with them all their successors. Although there are exponentially many prunings, a recent method developed in coding theory [WST95] and machine learning [Bun92] makes it possible to (implicitly) maintain one weight per pruning. In particular, Helmbold and Schapire [HS97] use this m...
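The "one weight per pruning" bookkeeping from the decision-tree case, which this paper extends to planar decision graphs, can be sketched with a short recursion: the total exponentiated loss summed over all prunings of a subtree is the weight of cutting at the node plus the product of the children's totals. The tree encoding below is our own hypothetical structure, shown for the tree case only; the planar-graph extension requires more machinery.

import math

# A node is (loss_if_pruned_here, children); children == [] means a leaf.
def total_weight(node, eta=0.5):
    """Sum exp(-eta * loss(P)) over ALL prunings P of this subtree,
    without enumerating them: either cut here, or recurse into children."""
    loss_here, children = node
    w_cut = math.exp(-eta * loss_here)       # the pruning that cuts here
    if not children:
        return w_cut                         # a true leaf has no other option
    w_recurse = 1.0
    for child in children:
        w_recurse *= total_weight(child, eta)  # sub-prunings are independent
    return w_cut + w_recurse

# Hypothetical 3-node tree: root loss 2, leaf losses 1 and 3. Its two
# prunings give e^(-2*eta) + e^(-1*eta) * e^(-3*eta).
print(total_weight((2.0, [(1.0, []), (3.0, [])])))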