Results 1–10 of 14
Semi-Supervised Support Vector Machines
 Advances in Neural Information Processing Systems
, 1998
Abstract

Cited by 173 (7 self)
We introduce a semi-supervised support vector machine (S3VM) method. Given a training set of labeled data and a working set of unlabeled data, S3VM constructs a support vector machine using both the training and working sets. We use S3VM to solve the transduction problem using the overall risk minimization (ORM) principle posed by Vapnik. The transduction problem is to estimate the value of a classification function at the given points in the working set. This contrasts with the standard inductive learning problem of estimating the classification function at all possible values and then using the fixed function to deduce the classes of the working set data. We propose a general S3VM model that minimizes both the misclassification error and the function capacity based on all the available data. We show how the S3VM model for 1-norm linear support vector machines can be converted to a mixed-integer program and then solved exactly using integer programming. Results of S3VM and the standard 1-norm support vector machine approach are compared on eleven data sets. Our computational results support the statistical learning theory results showing that incorporating working data improves generalization when insufficient training information is available. In every case, S3VM either improved or showed no significant difference in generalization compared to the traditional approach.
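The objective the abstract describes, minimizing misclassification error plus function capacity over labeled and unlabeled data, can be sketched numerically. The following is a minimal illustration, not the paper's exact mixed-integer program: each unlabeled point contributes the smaller of the two hinge losses obtained by tentatively labeling it +1 or -1, and the 1-norm of the weight vector plays the capacity-control role. All data and parameter names are illustrative.

```python
def hinge(margin):
    """Standard hinge loss max(0, 1 - margin)."""
    return max(0.0, 1.0 - margin)

def s3vm_objective(w, b, labeled, unlabeled, C=1.0):
    """w: weight vector, b: offset, labeled: list of (x, y) with y in {-1, +1},
    unlabeled: list of x. The 1-norm of w controls capacity."""
    capacity = sum(abs(wj) for wj in w)                      # ||w||_1
    def f(x):
        return sum(wj * xj for wj, xj in zip(w, x)) - b
    err_labeled = sum(hinge(y * f(x)) for x, y in labeled)
    # Each unlabeled (working-set) point takes the better of its two
    # tentative labels -- the transductive part of the model.
    err_unlabeled = sum(min(hinge(f(x)), hinge(-f(x))) for x in unlabeled)
    return capacity + C * (err_labeled + err_unlabeled)

# Toy usage: two labeled points separated along the first coordinate and
# one unlabeled point that w = (1, 0), b = 0 classifies with margin 2.
obj = s3vm_objective(w=[1.0, 0.0], b=0.0,
                     labeled=[([2.0, 0.0], +1), ([-2.0, 0.0], -1)],
                     unlabeled=[[2.0, 1.0]])
print(obj)  # 1.0: capacity is 1, all hinge losses vanish at margin >= 1
```

In the paper this nonconvex "min of two losses" term is what motivates the mixed-integer reformulation: a binary variable per unlabeled point selects its label.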
Feature Selection via Mathematical Programming
, 1997
Abstract

Cited by 59 (22 self)
The problem of discriminating between two finite point sets in n-dimensional feature space by a separating plane that utilizes as few of the features as possible is formulated as a mathematical program with a parametric objective function and linear constraints. The step function that appears in the objective function can be approximated by a sigmoid or by a concave exponential on the nonnegative real line, or it can be treated exactly by considering the equivalent linear program with equilibrium constraints (LPEC). Computational tests of these three approaches on publicly available real-world databases have been carried out and compared with an adaptation of the optimal brain damage (OBD) method for reducing neural network complexity. One feature selection algorithm via concave minimization (FSV) reduced cross-validation error on a cancer prognosis database by 35.4% while reducing problem features from 32 to 4. Feature selection is an important problem in machine learning [18, 15, 1...
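The concave-exponential smoothing the abstract mentions can be sketched directly: the number of features a plane w·x = γ uses is the sum of step(|w_j|), and on the nonnegative line the step is approximated by 1 − exp(−αt). Here α is an illustrative smoothing parameter, not a value from the paper.

```python
import math

def step(t):
    """Exact step function on t >= 0."""
    return 1.0 if t > 0 else 0.0

def concave_exp(t, alpha=5.0):
    """Concave approximation of the step on t >= 0;
    approaches the step as alpha grows."""
    return 1.0 - math.exp(-alpha * t)

w = [0.0, 0.0, 1.3, -0.7]          # hypothetical plane using 2 of 4 features
exact = sum(step(abs(wj)) for wj in w)
approx = sum(concave_exp(abs(wj)) for wj in w)
print(exact)    # 2.0 features actually used
print(approx)   # close to 2 for moderate alpha
```

The smooth surrogate is what makes the feature-count term amenable to the successive-linearization machinery used throughout these papers, while the LPEC route treats the step exactly at higher computational cost.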
Mathematical Programming for Data Mining: Formulations and Challenges
 INFORMS Journal on Computing
, 1998
Abstract

Cited by 47 (0 self)
This paper is intended to serve as an overview of a rapidly emerging research and applications area. In addition to providing a general overview and motivating the importance of data mining problems within the area of knowledge discovery in databases, our aim is to list some of the pressing research challenges and outline opportunities for contributions by the optimization research communities. Towards these goals, we include formulations of the basic categories of data mining methods as optimization problems. We also provide examples of successful mathematical programming approaches to some data mining problems.
Keywords: data analysis, data mining, mathematical programming methods, challenges for massive data sets, classification, clustering, prediction, optimization.
To appear: INFORMS Journal on Computing, special issue on Data Mining, A. Basu and B. Golden (guest editors). Also appears as Mathematical Programming Technical Report 9801, Computer Sciences Department, University of Wi...
Machine Learning via Polyhedral Concave Minimization
, 1996
Abstract

Cited by 27 (12 self)
Two fundamental problems of machine learning, misclassification minimization [10, 24, 18] and feature selection [25, 29, 14], are formulated as the minimization of a concave function on a polyhedral set. Other formulations of these problems utilize linear programs with equilibrium constraints [18, 1, 4, 3], which are generally intractable. In contrast, for the proposed concave minimization formulation, a successive linearization algorithm without stepsize terminates after a maximum average of 7 linear programs on problems with as many as 4192 points in 14-dimensional space. The algorithm terminates at a stationary point or a global solution to the problem. Preliminary numerical results indicate that the proposed approach is quite effective and more efficient than other approaches.
1 Introduction
We shall consider the following two fundamental problems of machine learning: Problem 1.1 Misclassification Minimization [24, 18] Given two finite point sets A and B in the n-dimensional real s...
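The successive linearization idea can be sketched in a few lines: linearize the concave objective at the current iterate, minimize the resulting linear function over the polyhedral set, and repeat until the iterate is a fixed point. For illustration only, the feasible set below is a box (so each LP has the closed-form bound-picking solution) and the concave objective f(x) = −Σ x_j² is a stand-in, not one from the paper.

```python
def sla_minimize(grad, lo, hi, x0, max_iters=50):
    """Successive linearization over the box [lo, hi]^n: minimize
    grad(x^k) . x at each step; stop at a fixed point, which is a
    stationary point of the concave program."""
    x = list(x0)
    for _ in range(max_iters):
        g = grad(x)
        # LP vertex solution on a box: take hi where the linearized
        # coefficient is negative, lo where it is positive.
        x_new = [hi if gj < 0 else lo for gj in g]
        if x_new == x:
            break
        x = x_new
    return x

f = lambda x: -sum(xj * xj for xj in x)       # concave objective (illustrative)
grad = lambda x: [-2.0 * xj for xj in x]
x_star = sla_minimize(grad, lo=0.0, hi=1.0, x0=[0.5, 0.2])
print(x_star, f(x_star))  # [1.0, 1.0] -2.0
```

Because a concave function over a polytope attains its minimum at a vertex, each linearized LP moves to a vertex and the iterates cannot cycle, which is what makes the "maximum average of 7 linear programs" figure plausible.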
Mathematical Programming in Data Mining
 Data Mining and Knowledge Discovery
, 1996
Abstract

Cited by 26 (3 self)
Mathematical programming approaches to three fundamental problems will be described: feature selection, clustering and robust representation. The feature selection problem considered is that of discriminating between two sets while recognizing irrelevant and redundant features and suppressing them. This creates a lean model that often generalizes better to new unseen data. Computational results on real data confirm improved generalization of leaner models. Clustering is exemplified by the unsupervised learning of patterns and clusters that may exist in a given database and is a useful tool for knowledge discovery in databases (KDD). A mathematical programming formulation of this problem is proposed that is theoretically justifiable and computationally implementable in a finite number of steps. A resulting k-Median Algorithm is utilized to discover very useful survival curves for breast cancer patients from a medical database. Robust representation is concerned with minimizing trained m...
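The k-Median algorithm the abstract names alternates two finite steps: assign each point to its nearest center in the 1-norm, then move each center to the coordinatewise median of its assigned points. Neither step increases the objective, so the loop terminates in finitely many iterations. A minimal sketch with illustrative data:

```python
def median(vals):
    s = sorted(vals)
    return s[len(s) // 2]          # any median works; take the upper one

def dist1(a, b):
    """1-norm distance between two points."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

def k_median(points, centers, iters=20):
    for _ in range(iters):
        # Assignment step: nearest center in the 1-norm.
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda j: dist1(p, centers[j]))
            clusters[j].append(p)
        # Update step: coordinatewise median of each cluster.
        new_centers = [
            [median([p[d] for p in cl]) for d in range(len(cl[0]))] if cl else c
            for cl, c in zip(clusters, centers)
        ]
        if new_centers == centers:  # fixed point: done
            break
        centers = new_centers
    return centers

pts = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [9.0, 9.0], [9.0, 10.0], [10.0, 9.0]]
centers_out = k_median(pts, centers=[[0.0, 0.0], [9.0, 9.0]])
print(centers_out)  # the two cluster medians
```

Using medians and the 1-norm, rather than means and the 2-norm, is what gives the formulation its linear-programming character and its robustness to outliers.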
Optimization Approaches to Semi-Supervised Learning
, 2000
Abstract

Cited by 14 (1 self)
We examine mathematical models for semi-supervised support vector machines (S3VM). Given a training set of labeled data and a working set of unlabeled data, S3VM constructs a support vector machine using both the training and working sets. We use S3VM to solve the transductive inference problem posed by Vapnik. In transduction, the task is to estimate the value of a classification function at the given points in the working set. This contrasts with inductive inference, which estimates the classification function at all possible values. We propose a general S3VM model that minimizes both the misclassification error and the function capacity based on all the available data. Depending on how poorly estimated unlabeled data are penalized, different mathematical models result. We examine several practical algorithms for solving these models. The first approach utilizes the S3VM model for 1-norm linear support vector machines converted to a mixed-integer program (MIP). A global solution of the ...
Data Discrimination via Nonlinear Generalized Support Vector Machines
 Complementarity: Applications, Algorithms and Extensions
, 1999
Abstract

Cited by 13 (8 self)
The main purpose of this paper is to show that new formulations of support vector machines can generate nonlinear separating surfaces which can discriminate between elements of a given set better than a linear surface. The principal approach used is that of generalized support vector machines (GSVMs), which employ possibly indefinite kernels [17]. The GSVM training procedure is carried out by either the simple successive overrelaxation (SOR) [18] iterative method or by linear programming. This novel combination of powerful support vector machines [24, 5] with the highly effective SOR computational algorithm [15, 16, 14] or with linear programming allows us to use a nonlinear surface to discriminate between elements of a dataset that belong to one of two categories. Numerical results on a number of datasets show improved testing set correctness, by as much as a factor of two, when comparing the nonlinear GSVM surface to a linear separating surface.
1 Introduction
A very simple convex qu...
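The nonlinear separating surface a GSVM produces has the kernel form f(x) = Σ_i u_i K(x, x_i) − γ: nonlinear in x but linear in the coefficients u_i, which is what lets SOR or a linear program do the training. The sketch below only evaluates such a surface; the Gaussian kernel, data points, and hand-picked coefficients are all illustrative stand-ins for trained values.

```python
import math

def gaussian_kernel(x, z, mu=1.0):
    """Gaussian kernel exp(-mu * ||x - z||^2); GSVMs also admit
    indefinite kernels, which this example does not exercise."""
    return math.exp(-mu * sum((xi - zi) ** 2 for xi, zi in zip(x, z)))

def decision(x, train_pts, u, gamma, kernel=gaussian_kernel):
    """Kernel decision surface f(x) = sum_i u_i K(x, x_i) - gamma."""
    return sum(ui * kernel(x, xi) for ui, xi in zip(u, train_pts)) - gamma

train = [[0.0, 0.0], [2.0, 2.0]]   # one representative point per class
u = [1.0, -1.0]                    # hand-picked, not trained, coefficients
label = 1 if decision([0.1, 0.0], train, u, gamma=0.0) > 0 else -1
print(label)  # the test point near [0, 0] lands on the positive side
```

Swapping the kernel changes the shape of the surface in input space without changing the optimization problem over u and γ, which is the flexibility the abstract is pointing at.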
Optimization Methods In Massive Datasets
Abstract

Cited by 9 (1 self)
We describe the role of generalized support vector machines in separating massive and complex data using arbitrary nonlinear kernels. Feature selection that improves generalization is implemented via an effective procedure that utilizes a polyhedral norm or a concave function minimization. Massive data is separated using a linear programming chunking algorithm as well as a successive overrelaxation algorithm, each of which is capable of processing data with millions of points.
1. INTRODUCTION
We address here the problem of classifying data in n-dimensional real (Euclidean) space R^n into one of two disjoint finite point sets (i.e. classes). The support vector machine (SVM) approach to classification [57, 2, 25, 58, 13, 54, 55] attempts to separate points belonging to two given sets in R^n by a nonlinear surface, often only implicitly defined by a kernel function. Since the nonlinear surface in R^n is typically linear in its parameters, it can be represented as a linear func...
Modeling Languages and Condor: Metacomputing for Optimization
, 1998
Abstract

Cited by 7 (5 self)
A generic framework for utilizing the computational resources provided by a metacomputer to concurrently solve several optimization problems generated by a modeling language is postulated. A mechanism using the Condor resource manager and the AMPL and GAMS languages is developed and applied to a technique for solving a mixed-integer programming formulation of the feature selection problem. Due to the method's computational requirements, the ability to perform optimizations in parallel is necessary in order to obtain results within a reasonable amount of time. We provide details about our simple and easy-to-use tool and implementation so that other modelers with applications generating many independent mathematical programs can take advantage of it to significantly reduce solution time.
1 Introduction
One branch of the machine learning community, those researching supervised learning [3, 18, 19], attempts to construct a process based upon historical data for the purpose of forec...
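The pattern the abstract describes, many independent optimization problems farmed out to whatever compute is available, can be sketched at toy scale with a thread pool standing in for the Condor-managed metacomputer and a trivial grid search standing in for each AMPL/GAMS-generated subproblem. Everything here is an illustrative stand-in, not the paper's tooling.

```python
from concurrent.futures import ThreadPoolExecutor

def solve(target):
    """Stand-in solver for one independent subproblem: minimize
    (x - target)^2 over a coarse 1-D grid."""
    grid = [i / 100.0 for i in range(-500, 501)]
    return min(grid, key=lambda x: (x - target) ** 2)

# One independent subproblem per target; no subproblem depends on another,
# which is exactly what makes the concurrent dispatch safe.
targets = [0.25, -1.5, 3.0, 0.0]
with ThreadPoolExecutor(max_workers=4) as pool:
    solutions = list(pool.map(solve, targets))
print(solutions)  # [0.25, -1.5, 3.0, 0.0]
```

The framework's value is precisely this independence: because the feature-selection MIP instances share no state, throughput scales with however many machines the resource manager can scavenge.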