Results 1–10 of 12
Semi-Supervised Support Vector Machines
 Advances in Neural Information Processing Systems
, 1998
"... We introduce a semisupervised support vector machine (S 3 VM) method. Given a training set of labeled data and a working set of unlabeled data, S 3 VM constructs a support vector machine using both the training and working sets. We use S 3 VM to solve the transduction problem using overall risk min ..."
Abstract

Cited by 173 (7 self)
 Add to MetaCart
We introduce a semi-supervised support vector machine (S3VM) method. Given a training set of labeled data and a working set of unlabeled data, S3VM constructs a support vector machine using both the training and working sets. We use S3VM to solve the transduction problem via the overall risk minimization (ORM) principle posed by Vapnik. The transduction problem is to estimate the value of a classification function at the given points in the working set. This contrasts with the standard inductive learning problem of estimating the classification function at all possible values and then using the fixed function to deduce the classes of the working set data. We propose a general S3VM model that minimizes both the misclassification error and the function capacity based on all the available data. We show how the S3VM model for 1-norm linear support vector machines can be converted to a mixed-integer program and then solved exactly using integer programming. Results of S3VM and the standard 1-norm support vector machine approach are compared on eleven data sets. Our computational results support the statistical learning theory results showing that incorporating working data improves generalization when insufficient training information is available. In every case, S3VM either improved or showed no significant difference in generalization compared to the traditional approach.
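The exact mixed-integer solve described in the abstract can be mimicked at toy scale by brute-force enumeration: try every ±1 labeling of the working set, fit a regularized hinge-loss classifier to the combined data, and keep the labeling with the lowest objective. The sketch below is an illustrative assumption, not the paper's formulation; the subgradient trainer, its hyperparameters, and the function names are all hypothetical.

```python
import itertools
import numpy as np

def svm_objective(w, b, X, y, lam):
    # regularized hinge loss: lam * ||w||^2 + sum of margin violations
    margins = 1 - y * (X @ w + b)
    return lam * (w @ w) + np.maximum(margins, 0).sum()

def train_linear_svm(X, y, lam=0.01, steps=500, lr=0.05):
    # plain subgradient descent on the regularized hinge loss
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(steps):
        m = 1 - y * (X @ w + b)
        act = m > 0  # points violating the margin
        w -= lr * (2 * lam * w - (y[act, None] * X[act]).sum(axis=0))
        b -= lr * (-y[act].sum())
    return w, b

def s3vm_enumerate(X_l, y_l, X_u, lam=0.01):
    # enumerate all ±1 labelings of the working set (feasible only for tiny sets)
    best = None
    for labels in itertools.product([-1.0, 1.0], repeat=len(X_u)):
        y_u = np.array(labels)
        X = np.vstack([X_l, X_u]); y = np.concatenate([y_l, y_u])
        w, b = train_linear_svm(X, y, lam)
        obj = svm_objective(w, b, X, y, lam)
        if best is None or obj < best[0]:
            best = (obj, w, b, y_u)
    return best
```

Enumeration stands in here for the integer-programming machinery: both search the discrete labelings of the working set for the minimum of the same kind of objective, but only the latter scales beyond a handful of unlabeled points.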
Support Vector Machines: Hype or Hallelujah?
 SIGKDD Explorations
, 2003
"... Support Vector Machines (SVMs) and related kernel methods have become increasingly popular tools for data mining tasks such as classification, regression, and novelty detection. The goal of this tutorial is to provide an intuitive explanation of SVMs from a geometric perspective. The classification ..."
Abstract

Cited by 81 (0 self)
 Add to MetaCart
Support Vector Machines (SVMs) and related kernel methods have become increasingly popular tools for data mining tasks such as classification, regression, and novelty detection. The goal of this tutorial is to provide an intuitive explanation of SVMs from a geometric perspective. The classification problem is used to investigate the basic concepts behind SVMs and to examine their strengths and weaknesses from a data mining perspective. While this overview is not comprehensive, it does provide resources for those interested in further exploring SVMs.
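The geometric perspective the tutorial advocates centers on the margin: the smallest signed distance from any training point to the separating hyperplane. A minimal numpy sketch of that quantity (function name and data are illustrative):

```python
import numpy as np

def geometric_margin(w, b, X, y):
    # signed distance of each point (x_i, y_i) to the hyperplane w.x + b = 0,
    # scaled by the label; the margin is the minimum over the data
    return np.min(y * (X @ w + b) / np.linalg.norm(w))
```

An SVM chooses the (w, b) that maximizes this value over all hyperplanes separating the two classes.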
Enlarging the Margins in Perceptron Decision Trees
, 2000
"... Capacity control in perceptron decision trees is typically performed by controlling their size. We prove that other quantities can be as relevant to reduce their flexibility and combat overfitting. In particular, we provide an upper bound on the generalization error which depends both on the size of ..."
Abstract

Cited by 22 (3 self)
 Add to MetaCart
Capacity control in perceptron decision trees is typically performed by controlling their size. We prove that other quantities can be as relevant for reducing their flexibility and combating overfitting. In particular, we provide an upper bound on the generalization error which depends both on the size of the tree and on the margin of the decision nodes. Enlarging the margin in perceptron decision trees therefore reduces the upper bound on generalization error. Based on this analysis, we introduce three new algorithms, which can induce large-margin perceptron decision trees. To assess the effect of the large-margin bias, OC1 (Journal of Artificial Intelligence Research, 1994, 2, 1–32) of Murthy, Kasif, and Salzberg, a well-known system for inducing perceptron decision trees, is used as the baseline algorithm. An extensive experimental study on real-world data showed that all three new algorithms perform better than, or at least not significantly worse than, OC1 on every dataset but one. On every dataset, OC1 performed worse than the best margin-based method.
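One cheap way to enlarge the margin at a perceptron decision node, once a projection direction w has been fixed, is to recentre the threshold midway between the closest projected points of the two classes. This toy helper is a hypothetical illustration of that idea, not one of the paper's three algorithms, and it assumes the positive-class projections lie above the negative-class ones:

```python
import numpy as np

def enlarge_margin_1d(proj, y):
    # proj: 1-D projections w.x of the points reaching this node; y: labels ±1.
    # Place the split threshold midway between the lowest positive projection
    # and the highest negative projection, maximizing the margin for this w.
    lo = proj[y == 1].min()
    hi = proj[y == -1].max()
    return (lo + hi) / 2.0
```

Any threshold in the open interval separates the node's data; the midpoint is the choice that leaves the largest gap on both sides.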
Data Discrimination via Nonlinear Generalized Support Vector Machines
 Complementarity: Applications, Algorithms and Extensions
, 1999
"... The main purpose of this paper is to show that new formulations of support vector machines can generate nonlinear separating surfaces which can discriminate between elements of a given set better than a linear surface. The principal approach used is that of generalized support vector machines (GSVMs ..."
Abstract

Cited by 13 (8 self)
 Add to MetaCart
The main purpose of this paper is to show that new formulations of support vector machines can generate nonlinear separating surfaces which discriminate between elements of a given set better than a linear surface. The principal approach used is that of generalized support vector machines (GSVMs), which employ possibly indefinite kernels [17]. The GSVM training procedure is carried out either by the simple successive overrelaxation (SOR) iterative method [18] or by linear programming. This novel combination of powerful support vector machines [24, 5] with the highly effective SOR computational algorithm [15, 16, 14] or with linear programming allows us to use a nonlinear surface to discriminate between elements of a dataset that belong to one of two categories. Numerical results on a number of datasets show improved testing-set correctness, by as much as a factor of two, when comparing the nonlinear GSVM surface to a linear separating surface.
1 Introduction
A very simple convex qu...
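The SOR idea can be sketched as a projected Gauss-Seidel sweep over the dual variables of a kernel SVM with the bias absorbed into the kernel. This is a simplified stand-in inspired by the SOR training cited above, not the paper's exact algorithm; the RBF kernel, the (K + 1) bias trick, and all parameter values are illustrative assumptions.

```python
import numpy as np

def rbf(X, Z, gamma=1.0):
    # Gaussian kernel matrix between the rows of X and Z
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def sor_svm(X, y, C=10.0, gamma=1.0, omega=1.0, sweeps=200):
    # Dual with the bias folded into the kernel (K + 1), so the only
    # constraints are the box bounds 0 <= a_i <= C.
    Q = (y[:, None] * y[None, :]) * (rbf(X, X, gamma) + 1.0)
    a = np.zeros(len(y))
    for _ in range(sweeps):
        for i in range(len(y)):
            # SOR step on coordinate i, then project back onto [0, C]
            g = Q[i] @ a - 1.0
            a[i] = np.clip(a[i] - omega * g / Q[i, i], 0.0, C)
    return a

def predict(a, X, y, Xnew, gamma=1.0):
    # decision value sums a_i * y_i * (K(x_i, x) + 1) over the training points
    return np.sign((a * y) @ (rbf(X, Xnew, gamma) + 1.0))
```

With omega = 1 this reduces to projected Gauss-Seidel; omega in (0, 2) gives over- or under-relaxed sweeps, and each update touches a single dual variable, which is what makes the method attractive for large problems.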
Optimization Methods In Massive Datasets
"... We describe the role of generalized support vector machines in separating massive and complex data using arbitrary nonlinear kernels. Feature selection that improves generalization is implemented via an effective procedure that utilizes a polyhedral norm or a concave function minimization. Massive d ..."
Abstract

Cited by 9 (1 self)
 Add to MetaCart
We describe the role of generalized support vector machines in separating massive and complex data using arbitrary nonlinear kernels. Feature selection that improves generalization is implemented via an effective procedure that utilizes a polyhedral norm or concave function minimization. Massive data is separated using a linear programming chunking algorithm as well as a successive overrelaxation algorithm, each of which is capable of processing data with millions of points.
1. INTRODUCTION
We address here the problem of classifying data in n-dimensional real (Euclidean) space R^n into one of two disjoint finite point sets (i.e. classes). The support vector machine (SVM) approach to classification [57, 2, 25, 58, 13, 54, 55] attempts to separate points belonging to two given sets in R^n by a nonlinear surface, often only implicitly defined by a kernel function. Since the nonlinear surface in R^n is typically linear in its parameters, it can be represented as a linear func...
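The chunking strategy can be sketched as: train on one chunk at a time, keep only the points on or inside the margin (the support-vector candidates), and carry them into the next chunk. The sketch below is a hedged illustration using a hinge-loss subgradient trainer in place of the paper's linear programming solver; the function names, retention rule, and hyperparameters are assumptions.

```python
import numpy as np

def train_hinge(X, y, lam=0.01, steps=300, lr=0.1):
    # simple subgradient descent on the regularized hinge loss
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(steps):
        m = 1 - y * (X @ w + b)
        act = m > 0
        w -= lr * (2 * lam * w - (y[act, None] * X[act]).sum(axis=0))
        b -= lr * (-y[act].sum())
    return w, b

def chunk_train(X, y, chunk=4, lam=0.01):
    # process the data chunk by chunk, retaining only margin points
    keep_X = np.empty((0, X.shape[1])); keep_y = np.empty(0)
    for s in range(0, len(y), chunk):
        cx = np.vstack([keep_X, X[s:s + chunk]])
        cy = np.concatenate([keep_y, y[s:s + chunk]])
        w, b = train_hinge(cx, cy, lam)
        on_margin = cy * (cx @ w + b) <= 1 + 1e-6
        keep_X, keep_y = cx[on_margin], cy[on_margin]
    return w, b
```

The working set stays roughly the size of the support set rather than the full dataset, which is what lets chunking schemes of this shape scale to millions of points.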
Modeling Languages and Condor: Metacomputing for Optimization
, 1998
"... A generic framework for utilizing the computational resources provided by a metacomputer to concurrently solve several optimization problems generated by a modeling language is postulated. A mechanism using the Condor resource manager and the AMPL and GAMS languages is developed and applied to a ..."
Abstract

Cited by 7 (5 self)
 Add to MetaCart
A generic framework for utilizing the computational resources provided by a metacomputer to concurrently solve several optimization problems generated by a modeling language is postulated. A mechanism using the Condor resource manager and the AMPL and GAMS languages is developed and applied to a technique for solving a mixed-integer programming formulation of the feature selection problem. Due to the method's computational requirements, the ability to perform optimizations in parallel is necessary to obtain results within a reasonable amount of time. We provide details about our simple, easy-to-use tool and implementation so that other modelers whose applications generate many independent mathematical programs can take advantage of it to significantly reduce solution time.
1 Introduction
One branch of the machine learning community, those researching supervised learning [3, 18, 19], attempts to construct a process based upon historical data for the purpose of forec...
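The core pattern, farming many independent optimization solves out to a pool of workers, can be sketched in a few lines of standard-library Python. This is only an analogy to the Condor/AMPL mechanism: the toy one-dimensional "solver" and all names here are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def solve_instance(c):
    # toy stand-in for one independent mathematical program:
    # minimize (x - c)^2 by bisection on the sign of the derivative 2(x - c)
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 2 * (mid - c) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def solve_all(params, workers=4):
    # dispatch the independent instances concurrently, one per parameter set
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(solve_instance, params))
```

Because the instances share no state, the speedup is limited only by the number of available workers, which is exactly the situation a resource manager like Condor exploits.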
Application of Support Vector Machines In Bioinformatics
, 2002
"... Recently a new learning method called support vector machines (SVM) has shown comparable or better results than neural networks on some applications. In this thesis we exploit the possibility of using SVM for three important issues of bioinformatics: the prediction of protein secondary structure, mu ..."
Abstract

Cited by 6 (0 self)
 Add to MetaCart
Recently a new learning method called the support vector machine (SVM) has shown results comparable to or better than neural networks on some applications. In this thesis we explore the possibility of using SVMs for three important problems in bioinformatics: the prediction of protein secondary structure, multi-class protein fold recognition, and the prediction of human signal peptide cleavage sites. Using similar data, we demonstrate that SVMs can easily achieve accuracy comparable to neural networks. Applying SVMs to more bioinformatics applications is therefore a promising direction for future work.
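Tasks like protein fold recognition are multi-class, while a basic SVM is binary; a common bridge is the one-vs-rest scheme: one binary classifier per class, predicting the class whose classifier scores highest. The sketch below is a generic illustration of that scheme, not the thesis's method; the hinge-loss trainer and its settings are assumptions.

```python
import numpy as np

def train_binary(X, y, lam=0.01, steps=400, lr=0.1):
    # subgradient descent on the regularized hinge loss for labels ±1
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(steps):
        m = 1 - y * (X @ w + b)
        act = m > 0
        w -= lr * (2 * lam * w - (y[act, None] * X[act]).sum(axis=0))
        b -= lr * (-y[act].sum())
    return w, b

def one_vs_rest(X, labels):
    # one binary SVM per class: that class vs everything else
    classes = sorted(set(labels))
    y_arr = np.array(labels)
    models = {c: train_binary(X, np.where(y_arr == c, 1.0, -1.0))
              for c in classes}
    def predict(Xn):
        # pick the class whose classifier gives the highest decision value
        scores = np.stack([Xn @ w + b for _, (w, b) in sorted(models.items())],
                          axis=1)
        return [classes[i] for i in scores.argmax(axis=1)]
    return predict
```

One-vs-one voting is the usual alternative; one-vs-rest keeps the number of classifiers linear in the number of classes.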
Food Bytes: Intelligent Systems in the Food Industry
 British Food Journal
, 2002
"... Computers have transformed the design of everything from cars to coffee cups. Now the food industry faces the same revolution, with intelligent computer models being used in the design, production and marketing of food products. The combined market capitalisation of the world’s biggest food, cosmeti ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
Computers have transformed the design of everything from cars to coffee cups. Now the food industry faces the same revolution, with intelligent computer models being used in the design, production and marketing of food products. The combined market capitalisation of the world’s biggest food, cosmetics, tobacco, clothing and consumer electronics companies is $2 trillion, forming 16% of the world’s 500 richest companies (Financial Times Survey 1999). Many of these “fast-moving consumer goods” companies now apply intelligent computer models to the design, production and marketing of their products. Manufacturers aim to develop and produce high volumes of these commodities with minimum costs, maximum consumer appeal and, of course, maximum profits. Products have limited lifetimes, following the fashions of the consumer-driven marketplace. With food and drink, little is known about many of the underlying characteristics and processes: why do some apples taste better than others? How “crunchy” is the perfect apple? Product development and marketing must therefore be rapid, flexible, and use raw data alongside existing expert knowledge. Intelligent systems such as neural networks, fuzzy logic and genetic algorithms mimic human skills such as the ability to learn from incomplete information, to adapt to changing circumstances, to explain their decisions and to cope with novel situations. These systems are being used to tackle a growing range of problems, from credit card fraud detection and stock market prediction to medical diagnosis and weather forecasting. This paper introduces intelligent systems and highlights their use in all aspects of the food and drink industry, from ingredient selection, through product design and manufacture, to packaging design and marketing.
Reduction techniques for training support vector machines
, 2002
"... Recently two kinds of reduction techniques which aimed at saving training time for SVM problems with nonlinear kernels were proposed. Instead of solving the standard SVM formulation, these methods explicitly alter the SVM formulation, and solutions for them are used to classify data. The first appro ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
Recently two kinds of reduction techniques aimed at saving training time for SVM problems with nonlinear kernels were proposed. Instead of solving the standard SVM formulation, these methods explicitly alter the SVM formulation, and their solutions are used to classify data. The first approach, the reduced support vector machine (RSVM) [21], preselects a subset of the data as support vectors and solves a smaller optimization problem. The second approach [11] uses incomplete Cholesky factorization (ICF) to obtain a low-rank approximation of the kernel matrix, yielding an easier optimization problem. We find that several issues concerning their practical use have not yet been fully discussed. For example, we do not know whether they possess generalization ability comparable to the standard SVM. In addition, we would like to see how large a problem must be before they outperform the standard SVM in training time. In this thesis we show that the formulation of each technique is already in the form of a linear SVM and discuss several suitable implementations. Experiments indicate that in general the test accuracy of both techniques is slightly lower than that of the standard SVM. In addition, for problems with up to tens of thousands of data points, if the percentage of support vectors is not high, existing SVM implementations are quite competitive in training time. Thus, the two techniques will mainly be useful either for larger problems or for those with many support vectors. Experiments in this thesis also serve as comparisons of (1) different implementations of linear SVMs; (2) the standard SVM using linear and quadratic cost functions; and (3) two ICF algorithms for positive definite dense matrices.
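The low-rank idea behind the second approach can be sketched with a pivoted incomplete Cholesky: build a tall factor G with K ≈ G Gᵀ, greedily pivoting on the largest remaining diagonal entry. This is a generic sketch of the technique, not the thesis's specific ICF variant, and the function name and tolerance are assumptions.

```python
import numpy as np

def pivoted_ichol(K, rank, tol=1e-8):
    # Greedy pivoted incomplete Cholesky of a PSD matrix K:
    # returns G (n x rank) with K approximately equal to G @ G.T.
    n = K.shape[0]
    d = np.diag(K).astype(float).copy()  # remaining diagonal (residual trace)
    G = np.zeros((n, rank))
    piv = []
    for j in range(rank):
        i = int(np.argmax(d))
        if d[i] < tol:          # residual exhausted: K was lower rank
            return G[:, :j], piv
        piv.append(i)
        G[:, j] = (K[:, i] - G @ G[i]) / np.sqrt(d[i])
        d -= G[:, j] ** 2       # update residual diagonal
    return G, piv
```

Replacing the n x n kernel matrix with the n x rank factor G is what turns the dense kernel problem into one the thesis can treat as a linear SVM.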