Results 1-10 of 55
Learning the Kernel Matrix with Semi-Definite Programming
2002
Cited by 548 (25 self)
Kernel-based learning algorithms work by embedding the data into a Euclidean space, and then searching for linear relations among the embedded data points. The embedding is performed implicitly, by specifying the inner products between each pair of points in the embedding space. This information is contained in the so-called kernel matrix, a symmetric and positive definite matrix that encodes the relative positions of all points. Specifying this matrix amounts to specifying the geometry of the embedding space and inducing a notion of similarity in the input space, both of which are classical model selection problems in machine learning. In this paper we show how the kernel matrix can be learned from data via semidefinite programming (SDP) techniques. When applied ...
Multiple kernel learning, conic duality, and the SMO algorithm
In Proceedings of the 21st International Conference on Machine Learning (ICML), 2004
Cited by 278 (32 self)
While classical kernel-based classifiers are based on a single kernel, in practice it is often desirable to base classifiers on combinations of multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for the support vector machine (SVM), and showed that the optimization of the coefficients of such a combination reduces to a convex optimization problem known as a quadratically-constrained quadratic program (QCQP). Unfortunately, current convex optimization toolboxes can solve this problem only for a small number of kernels and a small number of data points; moreover, the sequential minimal optimization (SMO) techniques that are essential in large-scale implementations of the SVM cannot be applied because the cost function is non-differentiable. We propose a novel dual formulation of the QCQP as a second-order cone programming problem, and show how to exploit the technique of Moreau-Yosida regularization to yield a formulation to which SMO techniques can be applied. We present experimental results that show that our SMO-based algorithm is significantly more efficient than the general-purpose interior point methods available in current optimization toolboxes.
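To make the phrase "conic combinations of kernel matrices" concrete, here is a minimal sketch in plain Python. It is not the authors' algorithm: the two base kernels, the sample points, the weights, and every function name are assumptions chosen for illustration. It only shows that a combination with nonnegative weights stays symmetric and that its quadratic form stays nonnegative on random test vectors, which is a sanity check rather than a proof of positive semidefiniteness.

```python
import math
import random


def linear_kernel(x, y):
    """Inner product of two points (hypothetical base kernel 1)."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel (hypothetical base kernel 2); gamma is assumed."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gram_matrix(kernel, points):
    """Kernel matrix: entry (i, j) is kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]


def conic_combination(matrices, weights):
    """Weighted sum of kernel matrices with nonnegative weights."""
    if any(w < 0 for w in weights):
        raise ValueError("a conic combination needs nonnegative weights")
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for w, m in zip(weights, matrices)) for j in range(n)]
        for i in range(n)
    ]


def quadratic_form(matrix, z):
    """Compute z^T K z, which is >= 0 for every z when K is PSD."""
    n = len(z)
    return sum(z[i] * matrix[i][j] * z[j] for i in range(n) for j in range(n))


# Illustrative data and weights (assumptions, not taken from the paper).
points = [(0.0, 1.0), (1.0, 0.5), (-1.0, 2.0), (0.3, -0.7)]
K = conic_combination(
    [gram_matrix(linear_kernel, points), gram_matrix(rbf_kernel, points)],
    [0.3, 0.7],
)

# Sanity check on random vectors with a fixed seed.
rng = random.Random(0)
min_form = min(
    quadratic_form(K, [rng.uniform(-1.0, 1.0) for _ in points])
    for _ in range(1000)
)
```

In the multiple kernel learning setting described above, the weights would be optimization variables rather than fixed constants; this sketch fixes them only to show the object being optimized over.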
A robust minimax approach to classification
Journal of Machine Learning Research, 2002
Cited by 61 (7 self)
When constructing a classifier, the probability of correct classification of future data points should be maximized. We consider a binary classification problem where the mean and covariance matrix of each class are assumed to be known. No further assumptions are made with respect to the class-conditional distributions. Misclassification probabilities are then controlled in a worst-case setting: that is, under all possible choices of class-conditional densities with given mean and covariance matrix, we minimize the worst-case (maximum) probability of misclassification of future data points. For a linear decision boundary, this desideratum is translated in a very direct way into a (convex) second order cone optimization problem, with complexity similar to a support vector machine problem. The minimax problem can be interpreted geometrically as minimizing the maximum of the Mahalanobis distances to the two classes. We address the issue of robustness with respect to estimation errors (in the means and covariances of the ...
KNITRO: An integrated package for nonlinear optimization
Large-Scale Nonlinear Optimization, 35–59, 2006
Cited by 38 (3 self)
This paper describes Knitro 5.0, a C-package for nonlinear optimization that combines complementary approaches to nonlinear optimization to achieve robust performance over a wide range of application requirements. The package is designed for solving large-scale, smooth nonlinear programming problems, and it is also effective for the following special cases: unconstrained optimization, nonlinear systems of equations, least squares, and linear and quadratic programming. Various algorithmic options are available, including two interior methods and an active-set method. The package provides crossover techniques between algorithmic options as well as automatic selection of options and settings.
Extracting Shared Subspace for Multi-label Classification
Cited by 28 (1 self)
Multi-label problems arise in various domains such as multi-topic document categorization and protein function prediction. One natural way to deal with such problems is to construct a binary classifier for each label, resulting in a set of independent binary classification problems. Since the multiple labels share the same input space, and the semantics conveyed by different labels are usually correlated, it is essential to exploit the correlation information contained in different labels. In this paper, we consider a general framework for extracting shared structures in multi-label classification. In this framework, a common subspace is assumed to be shared among multiple labels. We show that the optimal solution to the proposed formulation can be obtained by solving a generalized eigenvalue problem, though the problem is non-convex. For high-dimensional problems, direct computation of the solution is expensive, and we develop an efficient algorithm for this case. One appealing feature of the proposed framework is that it includes several well-known algorithms as special cases, thus elucidating their intrinsic relationships. We have conducted extensive experiments on eleven multi-topic web page categorization tasks, and results demonstrate the effectiveness of the proposed formulation in comparison with several representative algorithms.
Multi-class Discriminant Kernel Learning via Convex Programming
Cited by 22 (0 self)
Regularized kernel discriminant analysis (RKDA) performs linear discriminant analysis in the feature space via the kernel trick. Its performance depends on the selection of kernels. In this paper, we consider the problem of multiple kernel learning (MKL) for RKDA, in which the optimal kernel matrix is obtained as a linear combination of pre-specified kernel matrices. We show that the kernel learning problem in RKDA can be formulated as convex programs. First, we show that this problem can be formulated as a semidefinite program (SDP). Based on the equivalence relationship between RKDA and least squares problems in the binary-class case, we propose a convex quadratically constrained quadratic programming (QCQP) formulation for kernel learning in RKDA. A semi-infinite linear programming (SILP) formulation is derived to further improve the efficiency. We extend these formulations to the multi-class case based on a key result established in this paper. That is, the multi-class RKDA kernel learning problem can be decomposed into a set of binary-class kernel learning problems which are constrained to share a common kernel. Based on this decomposition property, SDP formulations are proposed for the multi-class case. Furthermore, it leads naturally to QCQP and SILP formulations. As the performance of RKDA depends on the regularization parameter, we show that this parameter can also be optimized in a joint framework with the kernel. Extensive experiments have been conducted and analyzed, and connections to other algorithms are discussed.
Interior-Point Methods for Linear Optimization
2000
Cited by 18 (6 self)
Everyone with some background in mathematics knows how to solve a system of linear equalities, since it is the basic subject in linear algebra. In many practical problems, however, inequalities also play a role. For example, a budget usually may not be larger than some specified amount. In such situations one may end up with a system of linear relations that contains not only equalities but also inequalities. Solving such a system requires methods and theory that go beyond standard mathematical knowledge. Nevertheless, the topic has a rich history and is tightly related to the important topic of linear optimization, where the object is to find the optimal (minimal or maximal) value of a linear function subject to linear constraints on the variables; the constraints may be either equality or inequality constraints. From both a theoretical and a computational point of view, the two topics are equivalent. In this chapter we describe the ideas underlying a new class of solution methods...
Lean clause-sets: Generalizations of minimally unsatisfiable clause-sets
Discrete Applied Mathematics, 2000
Cited by 15 (8 self)
We study the problem of (efficiently) deleting those clauses from conjunctive normal forms (clause-sets) that cannot contribute to any proof of unsatisfiability. For that purpose we introduce the notion of an autarky system, associated with a canonical normal form for every clause-set by deleting superfluous clauses. Clause-sets where no clauses can be deleted are called lean, a natural generalization of minimally unsatisfiable clause-sets, opening the possibility for combinatorial approaches (and including also satisfiable instances). Three special examples for autarky systems are considered: general autarkies, linear autarkies (based on linear programming) and matching autarkies (based on matching theory). We give new characterizations of lean and linearly lean clause-sets by "universal linear programming problems," while matching lean clause-sets are characterized in terms of "deficiency," the difference between the number of clauses and the number of variables, and ...
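The "deficiency" mentioned above is simple enough to compute directly. The following is a small sketch under assumed conventions that do not come from the paper: literals are nonzero integers in the DIMACS style (a negative number is a negated variable), a clause is a collection of literals, and the function name `deficiency` is hypothetical.

```python
def deficiency(clauses):
    """Number of distinct clauses minus number of distinct variables."""
    # Treat the input as a set of clauses, so repeated clauses count once.
    clause_set = {frozenset(clause) for clause in clauses}
    # A variable is identified by the absolute value of its literals.
    variables = {abs(literal) for clause in clause_set for literal in clause}
    return len(clause_set) - len(variables)


# All four full clauses over the variables 1 and 2: an unsatisfiable clause-set.
full_two_variables = [[1, 2], [1, -2], [-1, 2], [-1, -2]]
d_full = deficiency(full_two_variables)  # 4 clauses - 2 variables = 2

# A single variable asserted both positively and negatively.
d_unit = deficiency([[1], [-1]])  # 2 clauses - 1 variable = 1
```

The sketch only computes the quantity; the characterization of matching lean clause-sets in terms of it is the paper's result and is not reproduced here.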
Robust classification with interval data
2003
Cited by 14 (0 self)
We consider a binary, linear classification problem in which the data points are assumed to be unknown, but bounded within given hyper-rectangles, i.e., the covariates are bounded within intervals explicitly given for each data point separately. We address the problem of designing a robust classifier in this setting by minimizing the worst-case value of a given loss function, over all possible choices of the data in these multi-dimensional intervals. We examine in detail the application of this methodology to three specific loss functions, arising in support vector machines, in logistic regression and in minimax probability machines. We show that in each case, the resulting problem is amenable to efficient interior-point algorithms for convex optimization. The methods tend to produce sparse classifiers, i.e., they induce many zero coefficients in the resulting weight vectors, and we provide some theoretical grounds for this property. After presenting possible extensions of this framework to handle label errors and other uncertainty models, we discuss in some detail our implementation, which exploits the potential sparsity, or a more general property referred to as regularity, of the input matrices.
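As an illustration of the worst-case idea for the hinge loss, the sketch below evaluates the smallest margin a linear classifier can attain when a data point may lie anywhere in a box. It is a sketch under assumptions, not the authors' implementation: the box is parameterized by a center and per-coordinate radii, and all function and variable names are hypothetical. For a linear score the worst case has a closed form, the nominal margin minus a radius-weighted sum of absolute weights, and the code checks that against brute-force enumeration of the box corners.

```python
from itertools import product


def worst_case_margin(w, b, center, radius, label):
    """Smallest value of label * (w.x + b) over the box center +/- radius."""
    nominal = label * (sum(wi * ci for wi, ci in zip(w, center)) + b)
    # Each coordinate can move against the classifier by at most radius_i,
    # costing |w_i| * radius_i; together this is a weighted l1 norm of w.
    penalty = sum(abs(wi) * ri for wi, ri in zip(w, radius))
    return nominal - penalty


def brute_force_margin(w, b, center, radius, label):
    """Same quantity by enumerating corners (a linear function attains its
    minimum over a box at a corner). Exponential in the dimension."""
    corners = product(*[(c - r, c + r) for c, r in zip(center, radius)])
    return min(
        label * (sum(wi * xi for wi, xi in zip(w, x)) + b) for x in corners
    )


def robust_hinge_loss(w, b, center, radius, label):
    """Hinge loss evaluated at the worst-case point of the box."""
    return max(0.0, 1.0 - worst_case_margin(w, b, center, radius, label))


# Illustrative numbers (assumptions, not data from the paper).
w, b = (2.0, -1.0, 0.0), 0.5
center, radius = (1.0, 0.0, 3.0), (0.1, 0.2, 5.0)

closed_form = worst_case_margin(w, b, center, radius, +1)
enumerated = brute_force_margin(w, b, center, radius, +1)
loss_if_negative = robust_hinge_loss(w, b, center, radius, -1)
```

Note that the third coordinate has a large radius but a zero weight, so it contributes nothing to the penalty. A penalty of this weighted l1 form is consistent with the tendency toward sparse weight vectors that the abstract reports.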
Interior Point Methods: Current Status And Future Directions
1997
Cited by 13 (0 self)
This article provides a synopsis of the major developments in interior point methods for mathematical programming in the last thirteen years, and discusses current and future research directions in interior point methods, with a brief selective guide to the research literature.
AMS Subject Classification: 90C, 90C05, 90C60
Keywords: Linear Programming, Newton's Method, Interior Point Methods, Barrier Method, Semidefinite Programming, Self-Concordance, Convex Programming, Condition Numbers
An earlier version of this article has previously appeared in OPTIMA, Mathematical Programming Society Newsletter No. 51, 1996.
Author affiliations: M.I.T. Sloan School of Management, Building E40-149A, Cambridge, MA 02139, USA, email: rfreund@mit.edu; The Institute of Statistical Mathematics, 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106, Japan, email: mizuno@ism.ac.jp
1 Introduction and Synopsis. The purpose of this article is twofold: to provide a synopsis of the major developments in ...