Probability Estimates for Multi-class Classification by Pairwise Coupling
- Journal of Machine Learning Research
, 2003
Cited by 114 (1 self)
Pairwise coupling is a popular multi-class classification method that combines the pairwise comparisons for every pair of classes. This paper presents two approaches for obtaining class probabilities. Both methods can be reduced to linear systems and are easy to implement.
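A minimal sketch of how pairwise estimates might be coupled through a linear system. It assumes one particular least-squares objective, minimizing the sum over i and j != i of (r[j][i]*p[i] - r[i][j]*p[j])^2 subject to sum(p) = 1; the abstract does not state the objectives of its two approaches, so this should not be read as either of them. The names `couple_pairwise` and `solve_linear` are hypothetical, and non-negativity of `p` is not enforced explicitly.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for c in range(col, n + 1):
                m[row][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def couple_pairwise(r):
    """Couple pairwise estimates into class probabilities.

    r[i][j] estimates P(class i | class i or class j), with
    r[i][j] + r[j][i] = 1 for i != j; diagonal entries are ignored.
    """
    k = len(r)
    q = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i == j:
                q[i][i] = sum(r[s][i] ** 2 for s in range(k) if s != i)
            else:
                q[i][j] = -r[j][i] * r[i][j]
    # Optimality conditions of the constrained problem:
    # [Q e; e^T 0] [p; b] = [0; 1], where b is the Lagrange multiplier.
    a = [q[i] + [1.0] for i in range(k)] + [[1.0] * k + [0.0]]
    rhs = [0.0] * k + [1.0]
    return solve_linear(a, rhs)[:k]
```

When the pairwise estimates are exactly consistent with some probability vector, that is, r[i][j] = p[i] / (p[i] + p[j]), the objective is zero at that vector and the sketch recovers it.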
Working set selection using the second order information for training SVM
- Journal of Machine Learning Research
, 2005
Cited by 94 (6 self)
Working set selection is an important step in decomposition methods for training support vector machines (SVMs). This paper develops a new technique for working set selection in SMO-type decomposition methods. It uses second order information to achieve fast convergence. Theoretical properties such as linear convergence are established. Experiments demonstrate that the proposed method is faster than existing selection methods using first order information.
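A rough sketch of what a second-order working set selection rule for an SMO-type solver can look like. It assumes the standard C-SVM dual, min 0.5 * alpha^T Q alpha - e^T alpha with Q[i][j] = y[i]*y[j]*K[i][j], and picks the second index by the estimated decrease of a two-variable subproblem. The abstract gives no formulas, so the rule below, the function name `select_working_set`, and the guard value `tau` are assumptions made for illustration.

```python
def select_working_set(alpha, y, grad, K, C, tau=1e-12):
    """Pick an index pair (i, j) for the next two-variable subproblem.

    alpha: current dual variables, y: labels in {+1, -1},
    grad: gradient of the dual objective at alpha, K: kernel matrix.
    Returns None when no violating pair is found.
    """
    n = len(alpha)

    def in_up(t):
        return (y[t] == 1 and alpha[t] < C) or (y[t] == -1 and alpha[t] > 0)

    def in_low(t):
        return (y[t] == 1 and alpha[t] > 0) or (y[t] == -1 and alpha[t] < C)

    up = [t for t in range(n) if in_up(t)]
    if not up:
        return None
    # First index: the usual first-order choice (maximal violation).
    i = max(up, key=lambda t: -y[t] * grad[t])
    g_max = -y[i] * grad[i]

    # Second index: minimize -b^2 / a, an estimate of the objective
    # decrease obtained by optimizing over the pair (i, t).
    best_j, best_val = None, 0.0
    for t in range(n):
        if not in_low(t):
            continue
        b = g_max + y[t] * grad[t]
        if b <= 0:
            continue  # (i, t) is not a violating pair
        a = K[i][i] + K[t][t] - 2.0 * K[i][t]
        if a <= 0:
            a = tau  # guard for non-PSD kernels
        val = -(b * b) / a
        if val < best_val:
            best_j, best_val = t, val
    if best_j is None:
        return None
    return i, best_j
```

With a linear kernel and alpha = 0, every candidate for the second index has the same first-order violation, and the curvature term `a` breaks the tie in favor of the closest opposite-class point.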
A Study on Sigmoid Kernels for SVM and the Training of non-PSD Kernels by SMO-type Methods
, 2003
Cited by 41 (4 self)
The sigmoid kernel was quite popular for support vector machines due to its origin in neural networks. However, because its kernel matrix may not be positive semidefinite (PSD), it is not widely used and its behavior is unknown. In this paper, we analyze such non-PSD kernels from the point of view of separability. By investigating parameters in different ranges, we show that for some parameters the kernel matrix is conditionally positive definite (CPD), a property which explains its practical viability. Experiments are given to illustrate our analysis. Finally, we discuss how to solve the non-convex dual problems by SMO-type decomposition methods. Suitable modifications for any symmetric non-PSD kernel matrix are proposed with convergence proofs.
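To make the non-PSD point concrete, the sketch below builds a sigmoid kernel matrix, assuming the common form tanh(a * <x, z> + r), and looks for a violation of two necessary conditions for positive semidefiniteness: non-negative diagonal entries and non-negative 2x2 principal minors. The function names and the parameter names `a` and `r` are illustrative assumptions; a return value of `None` does not prove that the matrix is PSD.

```python
import math


def sigmoid_kernel(x, z, a, r):
    """Sigmoid kernel value tanh(a * <x, z> + r)."""
    return math.tanh(a * sum(xi * zi for xi, zi in zip(x, z)) + r)


def find_psd_violation(points, a, r):
    """Return an index pair that certifies the kernel matrix is not PSD.

    (i, i) means a negative diagonal entry; (i, j) means the 2x2 principal
    minor on rows and columns i, j is negative. Returns None if neither
    necessary condition is violated.
    """
    n = len(points)
    K = [[sigmoid_kernel(p, q, a, r) for q in points] for p in points]
    for i in range(n):
        if K[i][i] < 0:
            return (i, i)
        for j in range(i + 1, n):
            if K[i][i] * K[j][j] - K[i][j] ** 2 < 0:
                return (i, j)
    return None
```

For the two unit vectors (1, 0) and (0, 1) with a = 1 and r = -1, the diagonal entries are tanh(0) = 0 while the off-diagonal entry is tanh(-1), so the 2x2 determinant is negative and the matrix cannot be PSD.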
Combining SVMs with Various Feature Selection Strategies
- Taiwan University
, 2005
Cited by 33 (0 self)
Feature selection is an important issue in many research areas. Reasons for selecting important features include reducing the learning time and improving the accuracy. This thesis investigates the performance of combining support vector machines (SVMs) with various feature selection strategies. The first part of the thesis mainly describes existing feature selection methods and our experience using those methods in a competition. The second part studies further feature selection strategies using the SVM.
A Study on Reduced Support Vector Machines
- IEEE Transactions on Neural Networks
, 2003
Cited by 30 (5 self)
Recently the Reduced Support Vector Machine (RSVM) was proposed as an alternative to the standard SVM. Motivated by the difficulty of handling large data sets using SVM with nonlinear kernels, it preselects a subset of data as support vectors and solves a smaller optimization problem. However, several issues concerning its practical use have not yet been fully discussed. For example, we do not know whether its generalization ability is comparable to that of the standard SVM. In addition, we would like to see how large a problem must be before RSVM outperforms SVM on training time. In this paper we show that the RSVM formulation is already in the form of a linear SVM and discuss four RSVM implementations. Experiments indicate that in general the test accuracy of RSVM is a little lower than that of the standard SVM. In addition, for problems with up to tens of thousands of data points, if the percentage of support vectors is not high, existing implementations of SVM are quite competitive on training time. Thus, this empirical study suggests that RSVM will be mainly useful for either larger problems or those with many support vectors. Experiments in this paper also serve as comparisons of (1) different implementations of linear SVM; and (2) standard SVM using linear and quadratic cost functions.
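The reduced-kernel idea can be illustrated with a short sketch: pick a small subset of the training points and represent every point by its kernel values against that subset, so that any linear SVM solver can then be trained on the resulting n-by-m feature matrix. Random selection of the subset, the RBF kernel, and the names `rbf` and `reduced_kernel_features` are assumptions for illustration; the abstract does not say how the subset is chosen.

```python
import math
import random


def rbf(x, z, gamma):
    """RBF kernel value exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def reduced_kernel_features(data, subset_size, gamma, seed=0):
    """Map each point to its kernel values against a random subset.

    Returns (subset_idx, features), where features[i][k] is the kernel
    value between data[i] and data[subset_idx[k]]. The matrix has
    len(data) rows but only subset_size columns, instead of the full
    len(data) x len(data) kernel matrix.
    """
    rng = random.Random(seed)
    subset_idx = sorted(rng.sample(range(len(data)), subset_size))
    features = [[rbf(x, data[j], gamma) for j in subset_idx] for x in data]
    return subset_idx, features
```

With n training points and m preselected points, storage drops from n^2 kernel entries to n*m, which is the source of the potential saving on large data sets.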
Generalized Bradley-Terry Models and Multi-class Probability Estimates
- Journal of Machine Learning Research
Cited by 15 (1 self)
The Bradley-Terry model for obtaining individual skill from paired comparisons has been popular in many areas. In machine learning, this model is related to multi-class probability estimates by coupling all pairwise classification results. Error correcting output codes (ECOC) are a general framework for decomposing a multi-class problem into several binary problems. To obtain probability estimates under this framework, this paper introduces a generalized Bradley-Terry model in which paired individual comparisons are extended to paired team comparisons. We propose a simple algorithm with convergence proofs to solve the model and obtain individual skill. Experiments on synthetic and real data demonstrate that the algorithm is useful for obtaining multi-class probability estimates. Moreover, we discuss four extensions of the proposed model: 1) weighted individual skill, 2) home-field advantage, 3) ties, and 4) comparisons with more than two teams. Keywords: Bradley-Terry model, Probability estimates, Error correcting output codes, Support Vector Machines
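For orientation, here is a sketch of the classical Bradley-Terry model with individual (not team) comparisons, fitted by the well-known fixed-point iteration p_i <- w_i / sum_j n_ij / (p_i + p_j) followed by normalization. This is the textbook special case, not the generalized algorithm the paper proposes; the function name `bradley_terry` and its defaults are assumptions.

```python
def bradley_terry(wins, iters=10000, tol=1e-12):
    """Estimate skills from pairwise win counts.

    wins[i][j] is the number of times i beat j. Returns skills p that
    sum to 1, under the model P(i beats j) = p[i] / (p[i] + p[j]).
    Assumes every individual has at least one win and the comparison
    graph is connected.
    """
    k = len(wins)
    p = [1.0 / k] * k
    for _ in range(iters):
        new_p = []
        for i in range(k):
            w_i = sum(wins[i][j] for j in range(k) if j != i)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(k)
                if j != i
            )
            new_p.append(w_i / denom)
        total = sum(new_p)
        new_p = [v / total for v in new_p]
        converged = max(abs(a - b) for a, b in zip(p, new_p)) < tol
        p = new_p
        if converged:
            break
    return p
```

In the multi-class setting sketched in the abstract, the "individuals" would be classes and the "wins" would come from pairwise classifier outputs; the team extension replaces each individual by a set of classes.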
A tutorial on ν-Support Vector Machines
- Applied Stochastic Models in Business and Industry
, 2005
Cited by 11 (0 self)
We briefly describe the main ideas of statistical learning theory, support vector machines (SVMs), and kernel feature spaces. We place particular emphasis on a description of the so-called ν-SVM, including details of the algorithm and its implementation, theoretical results, and practical applications.
Simple probabilistic predictions for support vector regression
, 2004
Cited by 5 (1 self)
Support vector regression (SVR) has been popular in the past decade, but it provides only an estimated target value instead of predictive probability intervals. Many works have addressed this issue, but sometimes the SVR formula must be modified. This paper presents a rather simple and direct approach to constructing such intervals. We assume that the conditional distribution of the target value depends on its input only through the predicted value, and propose to model this distribution by simple functions. Experiments show that the proposed approach gives predictive intervals whose coverage is competitive with that of Bayesian SVR methods.
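One way such an interval might be built is sketched below. It assumes that prediction errors follow a zero-mean Laplace distribution whose scale is estimated from held-out residuals (for example from cross-validation); the abstract only says "simple functions", so the Laplace choice, the function name `laplace_interval`, and its parameters are assumptions.

```python
import math


def laplace_interval(residuals, prediction, coverage=0.95):
    """Symmetric predictive interval around an SVR prediction.

    residuals: held-out errors (target minus prediction).
    For a Laplace density (1 / (2 * sigma)) * exp(-|z| / sigma), the
    maximum-likelihood scale is the mean absolute residual, and
    P(|z| <= s) = 1 - exp(-s / sigma).
    """
    sigma = sum(abs(r) for r in residuals) / len(residuals)
    half_width = -sigma * math.log(1.0 - coverage)
    return prediction - half_width, prediction + half_width
```

Because the interval width depends only on the residual scale, the same half-width is reused for every test point, which reflects the assumption that the error distribution depends on the input only through the predicted value.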
A Practical Guide to Support Vector Classification
by Chih-Wei Hsu, Chih-Chung Chang, and Chih-Jen Lin
, 2003
"... Support vector machine (SVM) is a popular technique for classification. ..."
Decomposition Methods for . . .
In this paper, we show that decomposition methods with alpha seeding are extremely useful for solving a sequence of linear SVMs with more data than attributes. This strategy is motivated by Keerthi and Lin (2003), who proved that for an SVM with data that are not linearly separable, once C is large enough, the dual solutions are at the same face. We explain why a direct use of decomposition methods for linear SVMs is sometimes very slow and then analyze why alpha seeding is much more effective for linear than for nonlinear SVMs. We also conduct comparisons with other methods which are efficient for linear SVMs, and demonstrate the effectiveness of alpha seeding techniques in model selection.

