## Semi-supervised support vector machines (1998)

### Download Links

- [papers.nips.cc]
- [www.cs.columbia.edu]
- [www1.cs.columbia.edu]
- [www.cs.wustl.edu]
### Other Repositories/Bibliography

- DBLP

Venue: In Proc. NIPS

Citations: 223 (6 self)

### Citations

13212 | Statistical Learning Theory
- Vapnik
- 1998
Citation Context: ...to assign class labels to the working set such that the "best" support vector machine (SVM) is constructed. If the working set is empty the method becomes the standard SVM approach to classification [20, 9, 8]. If the training set is empty, then the method becomes a form of unsupervised learning. Semi-supervised learning occurs when both training and working sets are nonempty. Semi-supervised learning for ...

3701 | Support-vector networks
- Cortes, Vapnik
- 1995
Citation Context: ...to assign class labels to the working set such that the "best" support vector machine (SVM) is constructed. If the working set is empty the method becomes the standard SVM approach to classification [20, 9, 8]. If the training set is empty, then the method becomes a form of unsupervised learning. Semi-supervised learning occurs when both training and working sets are nonempty. Semi-supervised learning for ...

3387 | A tutorial on support vector machines for pattern recognition
- Burges
- 1998
Citation Context: ...to assign class labels to the working set such that the "best" support vector machine (SVM) is constructed. If the working set is empty the method becomes the standard SVM approach to classification [20, 9, 8]. If the training set is empty, then the method becomes a form of unsupervised learning. Semi-supervised learning occurs when both training and working sets are nonempty. Semi-supervised learning for ...

963 | Estimation of Dependencies Based on Empirical Data
- Vapnik
- 1982
Citation Context: ...yield improvements when the training sets are small or when there is a significant deviation between the training and working set subsamples of the total population. Indeed, the theoretical results in [19] support these hypotheses. In Section 2, we briefly review the standard SVM model for structural risk minimization. According to the principles of structural risk minimization, SVM minimize both the e...
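
The structural risk minimization principle invoked in this excerpt trades empirical error against model capacity; a standard VC-theory statement of the tradeoff (a textbook sketch, not a formula taken from this page) is:

```latex
% With probability at least 1 - \delta over \ell training samples,
% for a hypothesis class of VC dimension h:
R(f) \;\le\; R_{\mathrm{emp}}(f)
  \;+\; \sqrt{\frac{h\bigl(\ln(2\ell/h) + 1\bigr) - \ln(\delta/4)}{\ell}}
```

Minimizing the right-hand side motivates controlling both the training error and the capacity term, which for SVMs is done through margin maximization.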

854 | UCI repository of machine learning databases
- Murphy, Aha
- 1995
Citation Context: ...Note that a much larger and clearer separation margin is found. These computational solutions are identical to those presented in [19]. We also tested S3VM on ten real-world data sets (eight from [14] and the bright and dim galaxy sets from [15]). There have been many algorithms applied successfully to these problems without incorporating working set information. Thus it was not clear a priori that ...

527 | AMPL: A Modeling Language for Mathematical Programming
- Fourer, Gay, et al.
- 1993
Citation Context: ...be found using CPLEX or other commercial mixed integer programming codes [10] provided computer resources are sufficient for the problem size. Using the mathematical programming modeling language AMPL [11], we were able to express the problem in thirty lines of code plus a data file and solve it using CPLEX. 4 S3VM and Overall Risk Minimization. An integer S3VM can be used to solve the Overall Risk Mini...
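
The integer program described in this excerpt (choose working-set labels, then fit the best SVM) can be illustrated by brute force on a toy 1-D problem. This is a hedged sketch, not the paper's AMPL model: the data points, grid resolution, and C below are invented, and a coarse grid search over (w, b) stands in for the linear-programming solver.

```python
from itertools import product

# Labeled training points (x, y) and unlabeled working-set points.
# All numbers are illustrative, not from the paper.
train = [(-2.0, -1), (2.0, 1)]
work = [-1.8, 1.9]
C = 10.0  # penalty on hinge losses

def objective(w, b, data):
    # 1-norm SVM objective in one dimension: |w| + C * total hinge loss.
    hinge = sum(max(0.0, 1.0 - y * (w * x - b)) for x, y in data)
    return abs(w) + C * hinge

def fit(data):
    # Coarse grid search over (w, b) stands in for the LP solver.
    return min(
        objective(i / 10, j / 10, data)
        for i in range(-30, 31)
        for j in range(-30, 31)
    )

# Enumerate every labeling of the working set: the integer part of S3VM.
best_labels, best_obj = None, float("inf")
for labels in product([-1, 1], repeat=len(work)):
    obj = fit(train + list(zip(work, labels)))
    if obj < best_obj:
        best_labels, best_obj = labels, obj

print(best_labels)  # → (-1, 1): each working point is assigned to the nearer class
```

Real instances replace the grid search with a linear program and the enumeration with branch-and-bound (as CPLEX does), since the number of labelings grows as 2^k in the working-set size.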

461 | Sequential minimal optimization: A fast algorithm for training support vector machines - Platt - 1998

261 | Feature selection via concave minimization and support vector machines
- Bradley, Mangasarian
- 1998
Citation Context: ...extended to handle nonlinear discrimination using kernel functions [8, 12]. Empirical comparisons of the approaches have not found any significant difference in generalization between the formulations [5, 7, 3, 12]. 3 Semi-supervised support vector machines. To formulate the S3VM, we start with either SVM formulation, (4) or (5), and then add two constraints for each point in the working set. One constraint c...
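
The two constraints per working-set point described here are commonly written with a binary decision variable $d_j$ and a large constant $M$; a sketch of the resulting mixed-integer program (notation assumed, kept consistent with formulations (4) and (5) quoted in the other excerpts) is:

```latex
\min_{w,b,\eta,\xi,z,d}\;
  C\Bigl(\sum_{i=1}^{\ell}\eta_i + \sum_{j=\ell+1}^{\ell+k}(\xi_j + z_j)\Bigr) + \|w\|_1
\quad\text{s.t.}\quad
  y_i\,[w\cdot x_i - b] + \eta_i \ge 1,\;\; \eta_i \ge 0,\;\; i = 1,\dots,\ell
\qquad\quad
  w\cdot x_j - b + \xi_j + M(1 - d_j) \ge 1,\;\; \xi_j \ge 0
\qquad\quad
  -(w\cdot x_j - b) + z_j + M d_j \ge 1,\;\; z_j \ge 0,\;\; d_j \in \{0,1\},\;\; j = \ell+1,\dots,\ell+k
```

When $d_j = 1$ the point is treated as class 1 and only the first working-set constraint binds; when $d_j = 0$ only the second does, so exactly one misclassification penalty ($\xi_j$ or $z_j$) is active per point.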

239 | Robust linear programming discrimination of two linearly inseparable sets
- Bennett, Mangasarian
- 1992
Citation Context: ...(RLP) approach to SVM is identical to GOP except the margin term is changed from the 2-norm $\|w\|_2$ to the 1-norm, $\|w\|_1 = \sum_{j=1}^{n} |w_j|$. The problem becomes the following robust linear program (RLP) [2, 7, 1]: $\min_{w,b,s,\eta}\ C\sum_{i=1}^{\ell}\eta_i + \sum_{j=1}^{n} s_j$ s.t. $y_i[w\cdot x_i - b] + \eta_i \ge 1,\ \eta_i \ge 0,\ i = 1,\dots,\ell$; $-s_j \le w_j \le s_j,\ j = 1,\dots,n$. (5) The RLP formulation is a useful variation of SVM with some nice ...

222 | Support Vector Machines: Training and Applications - Osuna - 1998

125 | Theory of Pattern Recognition
- Vapnik, Chervonenkis
- 1974
Citation Context: ...$y_i[w\cdot x_i - b] + \eta_i \ge 1,\ \eta_i \ge 0,\ i = 1,\dots,\ell$ (4) where C > 0 is a fixed penalty parameter. The capacity control provided by the margin maximization is imperative to achieve good generalization [21, 19]. The Robust Linear Programming (RLP) approach to SVM is identical to GOP except the margin term is changed from the 2-norm $\|w\|_2$ to the 1-norm, $\|w\|_1 = \sum_{j=1}^{n} |w_j|$. The problem becomes the followi...

61 | Massive data discrimination via linear support vector machines - Bradley, Mangasarian - 2000

50 | Partially supervised clustering for image segmentation
- Bensaid, Hall, et al.
- 1996
Citation Context: ...learning for problems with small training sets and large working sets is a form of semi-supervised clustering. There are successful semi-supervised algorithms for k-means and fuzzy c-means clustering [4, 18]. Clustering is a potential application for S3VM as well. When the training set is large relative to the working set, S3VM can be viewed as a method for solving the transduction problem according ...
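
The semi-supervised clustering idea referenced here can be sketched with a minimal seed-initialized k-means in plain Python. This illustrates the general technique only; it is not the algorithm of [4] or [18], and all data and names below are invented.

```python
import math

def _mean(pts):
    # Coordinate-wise mean of a list of 2-D points.
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

def _dist(a, b):
    # Euclidean distance between two 2-D points.
    return math.hypot(a[0] - b[0], a[1] - b[1])

def seeded_kmeans(points, seeds, iters=20):
    """Semi-supervised k-means: labeled seed points fix the initial centroids
    and keep their labels throughout; unlabeled points are re-assigned."""
    labels = sorted(seeds)
    cents = {c: _mean(seeds[c]) for c in labels}
    assign = {}
    for _ in range(iters):
        # Assignment step: each unlabeled point joins the nearest centroid.
        assign = {p: min(labels, key=lambda c: _dist(p, cents[c])) for p in points}
        # Update step: centroids recomputed from seeds plus assigned points.
        for c in labels:
            members = seeds[c] + [p for p in points if assign[p] == c]
            cents[c] = _mean(members)
    return assign, cents

# Toy data: two well-separated clusters, one labeled seed per class.
seeds = {0: [(0.0, 0.0)], 1: [(10.0, 10.0)]}
points = [(0.5, 0.2), (0.1, 0.8), (9.5, 9.9), (10.2, 9.4)]
assign, cents = seeded_kmeans(points, seeds)
print(assign)  # → {(0.5, 0.2): 0, (0.1, 0.8): 0, (9.5, 9.9): 1, (10.2, 9.4): 1}
```

A transductive S3VM plays the analogous role when a separating hyperplane, rather than cluster centroids, is the model of interest.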

46 | Automated star/galaxy discrimination with neural networks
- Odewahn, Stockwell, et al.
- 1992
Citation Context: ...ion margin is found. These computational solutions are identical to those presented in [19]. We also tested S3VM on ten real-world data sets (eight from [14] and the bright and dim galaxy sets from [15]). There have been many algorithms applied successfully to these problems without incorporating working set information. Thus it was not clear a priori that S3VM would improve generalization on these ...

28 | Geometry in Learning
- Bennett, Bredensteiner
- 1996
Citation Context: ...(RLP) approach to SVM is identical to GOP except the margin term is changed from the 2-norm $\|w\|_2$ to the 1-norm, $\|w\|_1 = \sum_{j=1}^{n} |w_j|$. The problem becomes the following robust linear program (RLP) [2, 7, 1]: $\min_{w,b,s,\eta}\ C\sum_{i=1}^{\ell}\eta_i + \sum_{j=1}^{n} s_j$ s.t. $y_i[w\cdot x_i - b] + \eta_i \ge 1,\ \eta_i \ge 0,\ i = 1,\dots,\ell$; $-s_j \le w_j \le s_j,\ j = 1,\dots,n$. (5) The RLP formulation is a useful variation of SVM with some nice ...

25 | Parsimonious least norm approximation
- Bradley, Mangasarian, et al.
- 1997
Citation Context: ...$-s_j \le w_j \le s_j,\ j = 1,\dots,n$. (5) The RLP formulation is a useful variation of SVM with some nice characteristics. The 1-norm weight reduction still provides capacity control. The results in [13] can be used to show that minimizing $\|w\|_1$ corresponds to maximizing the separation margin using the infinity norm. Statistical learning theory could potentially be extended to incorporate alternative...
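
The 1-norm/infinity-norm correspondence mentioned here is the usual dual-norm relationship (a standard fact, stated for reference):

```latex
\|w\|_1 \;=\; \max_{\|u\|_\infty \le 1} w \cdot u
\qquad\Longrightarrow\qquad
\text{the separation margin measured in } \|\cdot\|_\infty
\text{ is } \frac{2}{\|w\|_1}
```

So minimizing $\|w\|_1$ in the RLP objective maximizes the margin as measured in the infinity norm, just as minimizing $\|w\|_2$ maximizes the Euclidean margin.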

18 | Feature minimization within decision trees
- Bredensteiner, Bennett
- 1998
Citation Context: ...(RLP) approach to SVM is identical to GOP except the margin term is changed from the 2-norm $\|w\|_2$ to the 1-norm, $\|w\|_1 = \sum_{j=1}^{n} |w_j|$. The problem becomes the following robust linear program (RLP) [2, 7, 1]: $\min_{w,b,s,\eta}\ C\sum_{i=1}^{\ell}\eta_i + \sum_{j=1}^{n} s_j$ s.t. $y_i[w\cdot x_i - b] + \eta_i \ge 1,\ \eta_i \ge 0,\ i = 1,\dots,\ell$; $-s_j \le w_j \le s_j,\ j = 1,\dots,n$. (5) The RLP formulation is a useful variation of SVM with some nice ...

15 | On Support Vector to Decision Trees for Database Marketing
- Bennett
- 1998
Citation Context: ...extended to handle nonlinear discrimination using kernel functions [8, 12]. Empirical comparisons of the approaches have not found any significant difference in generalization between the formulations [5, 7, 3, 12]. 3 Semi-supervised support vector machines. To formulate the S3VM, we start with either SVM formulation, (4) or (5), and then add two constraints for each point in the working set. One constraint c...

9 | ...programming support vector machines for pattern classification and regression estimation; and the SR algorithm: improving speed and tightness
- Frieß, Harrison, et al.
- 1998
Citation Context: ...r benefit of RLP over GOP is that it can be solved using linear programming instead of quadratic programming. Both approaches can be extended to handle nonlinear discrimination using kernel functions [8, 12]. Empirical comparisons of the approaches have not found any significant difference in generalization between the formulations [5, 7, 3, 12]. 3 Semi-supervised support vector machines. To formulate the ...

4 | Tumor volume measurements using supervised and semi-supervised MRI segmentation
- Vaidyanathan, Velthuizen, et al.
- 1994
Citation Context: ...learning for problems with small training sets and large working sets is a form of semi-supervised clustering. There are successful semi-supervised algorithms for k-means and fuzzy c-means clustering [4, 18]. Clustering is a potential application for S3VM as well. When the training set is large relative to the working set, S3VM can be viewed as a method for solving the transduction problem according ...