## Working set selection using the second order information for training SVM (2005)

Venue: Journal of Machine Learning Research

Citations: 158 (10 self)

### BibTeX

@ARTICLE{Fan05workingset,

author = {Rong-en Fan and Pai-hsuen Chen and Chih-jen Lin},

title = {Working set selection using the second order information for training SVM},

journal = {JOURNAL OF MACHINE LEARNING RESEARCH},

year = {2005},

volume = {6},

pages = {1889--1918}

}

### Abstract

Working set selection is an important step in decomposition methods for training support vector machines (SVMs). This paper develops a new technique for working set selection in SMO-type decomposition methods. It uses second order information to achieve fast convergence. Theoretical properties such as linear convergence are established. Experiments demonstrate that the proposed method is faster than existing selection methods using first order information.
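The selection idea summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `select_working_set`, its arguments, and the default `tau` are hypothetical, and the rule shown (pick `i` by maximal first order violation, then pick `j` by the estimated second order decrease `-b**2 / a`) is the commonly described form of such a rule, assumed here rather than quoted from this page. It assumes a dense kernel matrix `K`, labels `y` in {1, -1}, box constraint `C`, and the current gradient `grad` of the dual objective.

```python
import numpy as np

def select_working_set(K, y, alpha, grad, C, tau=1e-12):
    """Return a pair (i, j), or None if no violating pair exists.

    Hypothetical sketch of a second order working set selection rule.
    """
    # I_up / I_low: indices whose alpha can still move in a feasible direction.
    up = ((y == 1) & (alpha < C)) | ((y == -1) & (alpha > 0))
    low = ((y == 1) & (alpha > 0)) | ((y == -1) & (alpha < C))
    if not up.any() or not low.any():
        return None

    # First order quantity -y_t * grad_t used to measure violation.
    score = -y * grad

    # First index: maximal violation among I_up (first order rule).
    i = int(np.argmax(np.where(up, score, -np.inf)))

    # Candidates for j: in I_low and forming a violating pair with i.
    b = score[i] - score
    cand = low & (b > 0)
    if not cand.any():
        return None

    # Curvature a_it = K_ii + K_tt - 2 K_it; replaced by tau when not positive
    # (assumed safeguard for non-positive-definite kernels).
    a = K[i, i] + np.diag(K) - 2.0 * K[i, :]
    a = np.where(a > 0, a, tau)

    # Estimated objective decrease for each candidate pair; choose the smallest.
    gain = np.where(cand, -(b ** 2) / a, np.inf)
    j = int(np.argmin(gain))
    return i, j
```

In this sketch, two candidates with the same first order violation are separated by their curvature term, which is the extra information a purely first order rule ignores.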

### Citations

3436 | LIBSVM: A Library for Support Vector Machines, 2001. Software available at www.csie.ntu.edu.tw/~cjlin/libsvm - Chang, Lin |

2868 | UCI Repository of Machine Learning Databases - Merz, Murphy - 1996
Citation Context: ... 1994). We select space_ga and cadata from StatLib (http://lib.stat.cmu.edu/datasets). The data sets image, diabetes, covtype, breast-cancer, and abalone are from the UCI machine learning repository (Blake and Merz, 1998). Problems a1a and a9a are compiled in Platt (1998) from the UCI “adult” data set. Problems w1a and w8a are also from Platt (1998). The tree data set was originally used in Bailey et al. (1993). The ... |

2171 | Support-vector networks - Cortes, Vapnik - 1995
Citation Context: ...order information. Keywords: support vector machines, decomposition methods, sequential minimal optimization, working set selection. 1. Introduction. Support vector machines (SVMs) (Boser et al., 1992; Cortes and Vapnik, 1995) are a useful classification method. Given instances $x_i$, $i = 1, \dots, l$ with labels $y_i \in \{1, -1\}$, the main task in training SVMs is to solve the following quadratic optimization problem: min f(α) = α... |

1441 | Making Large-Scale SVM Learning Practical - Joachims - 1999
Citation Context: ..., and $K(x_i, x_j)$ is the kernel function. The matrix $Q$ is usually fully dense and may be too large to be stored. Decomposition methods are designed to handle such difficulties (e.g., Osuna et al., 1997; Joachims, 1998; Platt, 1998; Chang and Lin, 2001). Unlike most optimization methods which update the whole vector $\alpha$ in each step of an iterative process, the decomposition method modifies only a subset of $\alpha$ per ite... |

1291 | A training algorithm for optimal margin classifiers - Boser, Guyon, et al.
Citation Context: ...methods using first order information. Keywords: support vector machines, decomposition methods, sequential minimal optimization, working set selection. 1. Introduction. Support vector machines (SVMs) (Boser et al., 1992; Cortes and Vapnik, 1995) are a useful classification method. Given instances $x_i$, $i = 1, \dots, l$ with labels $y_i \in \{1, -1\}$, the main task in training SVMs is to solve the following quadratic optimizat... |

651 | UCI repository of machine learning databases. URL http://www.ics.uci.edu/~mlearn/MLRepository.html - Newman, Hettich, et al. - 1998
Citation Context: ... 1994). We select space_ga and cadata from StatLib (http://lib.stat.cmu.edu/datasets). The data sets image, diabetes, covtype, breast-cancer, and abalone are from the UCI machine learning repository (Newman et al., 1998). Problems a1a and a9a are compiled in Platt (1998) from the UCI “adult” data set. Problems w1a and w8a are also from Platt (1998). The tree data set was originally used in Bailey et al. (1993). The ... |

565 | Training Support Vector Machines: An Application to Face Detection - Osuna, Freund, et al. - 1997
Citation Context: ...h $Q_{ij} = y_i y_j K(x_i, x_j)$, and $K(x_i, x_j)$ is the kernel function. The matrix $Q$ is usually fully dense and may be too large to be stored. Decomposition methods are designed to handle such difficulties (e.g., Osuna et al., 1997; Joachims, 1998; Platt, 1998; Chang and Lin, 2001). Unlike most optimization methods which update the whole vector $\alpha$ in each step of an iterative process, the decomposition method modifies only a sub... |

325 | New support vector algorithms - Schölkopf, Smola, et al. - 2000 |

191 | Improvements to Platt’s SMO algorithm for SVM classifier design, Neural Computation 13 - Keerthi, Shevade, et al. - 2001 |

109 | On the Convergence of the Decomposition Method for Support Vector Machines - Lin - 2001 |

20 | A study on SMO-type decomposition methods for support vector machines - Chen, Fan, et al.
Citation Context: ...if $K_{ii} + K_{jj} - 2K_{ij} < 0$. In this situation, (2) may possess multiple local minima. Moreover, there are difficulties in proving the convergence of the decomposition methods (Palagi and Sciandrone, 2005; Chen et al., 2006). Thus, Chen et al. (2006) proposed adding an additional term to (2)’s objective function if $a_{ij} \equiv K_{ii} + K_{jj} - 2K_{ij} \le 0$: $\min_{\alpha_i,\alpha_j} \frac{1}{2} \begin{bmatrix} \alpha_i & \alpha_j \end{bmatrix} \begin{bmatrix} Q_{ii} & Q_{ij} \\ Q_{ij} & Q_{jj} \end{bmatrix} \begin{bmatrix} \alpha_i \\ \alpha_j \end{bmatrix} + (-e_B + Q_{BN}\alpha_N^k)^T$ ... |

17 | Polynomial-time decomposition algorithms for support vector machines - Hush, Scovel - 2003 |

13 | Data available at http://www.ncc.up.pt/liacc/ML/statlog/datasets.html - Michie, Spiegelhalter, et al. - 1994
Citation Context: ...urther confirmed by using four large (more than 30,000 instances) classification problems. Data statistics are in Tables 1 and 3. Problems german.numer and australian are from the Statlog collection (Michie et al., 1994). We select space_ga and cadata from StatLib (http://lib.stat.cmu.edu/datasets). The data sets image, diabetes, covtype, breast-cancer, and abalone are from the UCI machine learning repository (Newma... |

10 | IJCNN 2001 neural network competition - Prokhorov |

8 | On the convergence of a modified version of SVMlight algorithm - Palagi, Sciandrone
Citation Context: ...a concave objective function if $K_{ii} + K_{jj} - 2K_{ij} < 0$. In this situation, (2) may possess multiple local minima. Moreover, there are difficulties in proving the convergence of the decomposition methods (Palagi and Sciandrone, 2005; Chen et al., 2006). Thus, Chen et al. (2006) proposed adding an additional term to (2)’s objective function if $a_{ij} \equiv K_{ii} + K_{jj} - 2K_{ij} \le 0$: $\min_{\alpha_i,\alpha_j} \frac{1}{2} \begin{bmatrix} \alpha_i & \alpha_j \end{bmatrix} \begin{bmatrix} Q_{ii} & Q_{ij} \\ Q_{ij} & Q_{jj} \end{bmatrix} \begin{bmatrix} \alpha_i \\ \alpha_j \end{bmatrix} + (-e_B + Q_{BN}$... |

5 | A new method to select working sets for faster training for support vector machines - Lai, Mani, et al. - 2003
Citation Context: ...ally leads to faster convergence. Now (1) is a quadratic programming problem, so second order information directly relates to the decrease of the objective function. There are several attempts (e.g., Lai et al., 2003a,b) to find working sets based on the reduction of the objective value, but these selection methods are only heuristics without convergence proofs. Moreover, as such techniques cost more than existin... |

1 | Working Set Selection for Training SVMs - Keerthi, Shevade, et al. |

1 | Increasing the step of the Newtonian decomposition method for support vector machines - Lai, Mani, et al. |