Results 1 - 10 of 98
A tutorial on support vector machines for pattern recognition
- Data Mining and Knowledge Discovery
, 1998
Cited by 1656 (11 self)
The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and non-separable data, working through a non-trivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, and while at present there exists no theory which shows that good generalization performance is guaranteed for SVMs, there are several arguments which support the observed high accuracy of SVMs, which we review. Results of some experiments which were inspired by these arguments are also presented. We give numerous examples and proofs of most of the key theorems. There is new material, and I hope that the reader will find that even old material is cast in a fresh light.
Making Large-Scale Support Vector Machine Learning Practical
, 1998
Cited by 345 (1 self)
Training a support vector machine (SVM) leads to a quadratic optimization problem with bound constraints and one linear equality constraint. Despite the fact that this type of problem is well understood, there are many issues to be considered in designing an SVM learner. In particular, for large learning tasks with many training examples, off-the-shelf optimization techniques for general quadratic programs quickly become intractable in their memory and time requirements. SVMlight is an implementation of an SVM learner which addresses the problem of large tasks. This chapter presents algorithmic and computational results developed for SVMlight V2.0, which make large-scale SVM training more practical. The results give guidelines for the application of SVMs to large domains.
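For orientation, the quadratic program mentioned in this abstract is usually written in the following standard soft-margin dual form. This is a sketch of the textbook formulation, not a quotation from the chapter: the symbols $\alpha_i$ (dual variables), $C$ (upper bound), $y_i \in \{-1, +1\}$ (labels), and $k$ (kernel) are assumptions introduced here for illustration.

```latex
\begin{aligned}
\min_{\alpha} \quad & \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i \alpha_j y_i y_j \, k(x_i, x_j) - \sum_{i=1}^{m} \alpha_i \\
\text{subject to} \quad & 0 \le \alpha_i \le C, \quad i = 1, \ldots, m \quad \text{(bound constraints)} \\
& \sum_{i=1}^{m} \alpha_i y_i = 0 \quad \text{(one linear equality constraint)}
\end{aligned}
```

The kernel matrix in the objective has $m^2$ entries, which is one reason general-purpose QP solvers run into memory limits on large training sets.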
A tutorial on support vector regression
, 2004
Cited by 309 (1 self)
In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from a SV perspective.
Sparse Greedy Matrix Approximation for Machine Learning
, 2000
Cited by 139 (9 self)
In kernel based methods such as Regularization Networks large datasets pose significant problems since the number of basis functions required for an optimal solution equals the number of samples. We present a sparse greedy approximation technique to construct a compressed representation of the design matrix. Experimental results are given and connections to Kernel-PCA, Sparse Kernel Feature Analysis, and Matching Pursuit are pointed out. 1. Introduction Many recent advances in machine learning such as Support Vector Machines [Vapnik, 1995], Regularization Networks [Girosi et al., 1995], or Gaussian Processes [Williams, 1998] are based on kernel methods. Given an m-sample $\{(x_1, y_1), \ldots, (x_m, y_m)\}$ of patterns $x_i \in X$ and target values $y_i \in Y$ these algorithms minimize the regularized risk functional $\min_{f \in H} R_{\mathrm{reg}}[f] = \frac{1}{m} \sum_{i=1}^{m} c(x_i, y_i, f(x_i)) + \frac{\lambda}{2} \|f\|_H^2$ (1). Here H denotes a reproducing kernel Hilbert space (RKHS) [Aronszajn, 1950],...
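As a rough illustration of the regularized risk functional (1), the sketch below evaluates it for a kernel expansion f = sum_j alpha_j k(x_j, .), for which the RKHS norm is alpha^T K alpha. Several choices here are assumptions made for the example and do not come from the paper: the squared loss standing in for the cost c, the Gaussian kernel, and the names `gaussian_kernel`, `regularized_risk`, `width`, and `lam` (the regularization constant).

```python
import numpy as np


def gaussian_kernel(X, Z, width=1.0):
    """Gram matrix with entries exp(-||x - z||^2 / (2 * width^2))."""
    # Pairwise squared distances via broadcasting: shape (len(X), len(Z)).
    sq_dists = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * width ** 2))


def regularized_risk(alpha, K, y, lam):
    """Regularized risk of f = sum_j alpha_j k(x_j, .), squared loss assumed."""
    predictions = K @ alpha                            # f(x_i) for each sample
    empirical_risk = np.mean((predictions - y) ** 2)   # (1/m) sum_i c(x_i, y_i, f(x_i))
    rkhs_norm_sq = alpha @ K @ alpha                   # ||f||_H^2 for a kernel expansion
    return empirical_risk + 0.5 * lam * rkhs_norm_sq


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 2))      # five toy patterns in two dimensions
    y = rng.normal(size=5)           # toy target values
    K = gaussian_kernel(X, X)
    alpha = np.zeros(5)              # the zero function
    print(regularized_risk(alpha, K, y, lam=0.1))
```

Note that the expansion uses one coefficient per training sample, which is the scaling problem the abstract refers to: the Gram matrix K grows quadratically with the number of samples.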
Cluster Reserves: A Mechanism for Resource Management in Cluster-based Network Servers
- In Proceedings of the ACM SIGMETRICS Conference
, 2000
Cited by 137 (4 self)
In network (e.g., Web) servers, it is often desirable to isolate the performance of different classes of requests from each other. That is, one seeks to achieve that a certain minimal proportion of server resources are available for a class of requests, independent of the load imposed by other requests. Recent work demonstrates how to achieve this performance isolation in servers consisting of a single, centralized node; however, achieving performance isolation in a distributed, cluster based server remains a problem. This paper introduces a new abstraction, the cluster reserve, which represents a resource principal in a cluster based network server. We present a design and evaluate a prototype implementation that extends existing techniques for performance isolation on a single node server to cluster based servers. In our design, the dynamic cluster-wide resource management problem is formulated as a constrained optimization problem, with the resource allocations on individual machin...
The Connection between Regularization Operators and Support Vector Kernels
, 1998
Cited by 119 (35 self)
In this paper a correspondence is derived between regularization operators used in Regularization Networks and Support Vector Kernels. We prove that the Green's Functions associated with regularization operators are suitable Support Vector Kernels with equivalent regularization properties. Moreover the paper provides an analysis of currently used Support Vector Kernels in the view of regularization theory and corresponding operators associated with the classes of both polynomial kernels and translation invariant kernels. The latter are also analyzed on periodical domains. As a by-product we show that a large number of Radial Basis Functions, namely conditionally positive definite functions, may be used as Support Vector kernels.
An Interior-Point Algorithm For Nonconvex Nonlinear Programming
- Computational Optimization and Applications
, 1997
Cited by 116 (12 self)
The paper describes an interior-point algorithm for nonconvex nonlinear programming which is a direct extension of interior-point methods for linear and quadratic programming. Major modifications include a merit function and an altered search direction to ensure that a descent direction for the merit function is obtained. Preliminary numerical testing indicates that the method is robust. Further, numerical comparisons with MINOS and LANCELOT show that the method is efficient, and has the promise of greatly reducing solution times on at least some classes of models.
The analysis of decomposition methods for support vector machines
- IEEE Transactions on Neural Networks
, 1999
Cited by 79 (17 self)
The decomposition method is currently one of the major methods for solving support vector machines. An important issue of this method is the selection of working sets. In this paper, through the design of decomposition methods for bound-constrained SVM formulations, we demonstrate that the working set selection is not a trivial task. Then, from the experimental analysis, we propose a simple selection of the working set which leads to faster convergence for difficult cases. Numerical experiments on different types of problems are conducted to demonstrate the viability of the proposed method.
On a Kernel-based Method for Pattern Recognition, Regression, Approximation, and Operator Inversion
, 1997
Cited by 67 (22 self)
We present a Kernel-based framework for Pattern Recognition, Regression Estimation, Function Approximation and multiple Operator Inversion. Previous approaches such as ridge-regression, Support Vector methods and regression by Smoothing Kernels are included as special cases. We will show connections between the cost-function and some properties up to now believed to apply to Support Vector Machines only. The optimal solution of all the problems described above can be found by solving a simple quadratic programming problem. The paper closes with a proof of the equivalence between Support Vector kernels and Green's functions of regularization operators.
Interior-point methods for nonconvex nonlinear programming: Filter methods and merit functions
- Computational Optimization and Applications
, 2002
Cited by 64 (5 self)
In this paper, we present global and local convergence results for an interior-point method for nonlinear programming and analyze the computational performance of its implementation. The algorithm uses an ℓ1 penalty approach to relax all constraints, to provide regularization, and to bound the Lagrange multipliers. The penalty problems are solved using a simplified version of Chen and Goldfarb’s strictly feasible interior-point method [12]. The global convergence of the algorithm is proved under mild assumptions, and local analysis shows that it converges Q-quadratically for a large class of problems. The proposed approach is the first to simultaneously have all of the following properties while solving a general nonconvex nonlinear programming problem: (1) the convergence analysis does not assume boundedness of dual iterates, (2) local convergence does not require the Linear Independence Constraint Qualification, (3) the solution of the penalty problem is shown to locally converge to optima that may not satisfy the Karush-Kuhn-Tucker conditions, and (4) the algorithm is applicable to mathematical programs with equilibrium constraints. Numerical testing on a set of general nonlinear programming problems, including degenerate problems and infeasible problems, confirms the theoretical results. We also provide comparisons to a highly efficient nonlinear solver and thoroughly analyze the effects of enforcing theoretical convergence guarantees on the computational performance of the algorithm.

