Results 1-10 of 19
A New Approximate Maximal Margin Classification Algorithm
 Journal of Machine Learning Research, 2001
Abstract

Cited by 87 (5 self)
A new incremental learning algorithm is described which approximates the maximal margin hyperplane w.r.t. norm p ≥ 2 for a set of linearly separable data. Our algorithm, called ALMA_p (Approximate Large Margin Algorithm w.r.t. norm p), takes O((p−1)/(α²γ²)) corrections to separate the data with p-norm margin larger than (1−α)γ, where γ is the (normalized) p-norm margin of the data. ALMA_p avoids quadratic (or higher-order) programming methods. It is very easy to implement and is as fast as on-line algorithms, such as Rosenblatt's Perceptron algorithm. We performed extensive experiments on both real-world and artificial datasets. We compared ALMA_2 (i.e., ALMA_p with p = 2) to standard Support Vector Machines (SVMs) and to two incremental algorithms: the Perceptron algorithm and Li and Long's ROMMA. The accuracy levels achieved by ALMA_2 are superior to those achieved by the Perceptron algorithm and ROMMA, but slightly inferior to SVMs'. On the other hand, ALMA_2 is much faster and easier to implement than standard SVM training algorithms. When learning sparse target vectors, ALMA_p with p > 2 largely outperforms Perceptron-like algorithms, such as ALMA_2.
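The on-line flavor of this family of algorithms can be illustrated with a small sketch. The following is a simplified margin perceptron in the spirit of ALMA_2, not the exact published update: the square-root schedules for the margin target and learning rate are assumptions chosen for illustration.

```python
import numpy as np

def margin_perceptron(X, y, alpha=0.5, epochs=50):
    """Simplified large-margin perceptron sketch (inspired by ALMA_2;
    not the exact published update -- the 1/sqrt(k) margin and learning
    rate schedules here are assumptions). Examples are unit-normalized."""
    X = np.asarray(X, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # unit-norm examples
    w = np.zeros(X.shape[1])
    k = 1  # correction counter drives the decaying schedules
    for _ in range(epochs):
        for x, t in zip(X, y):
            margin = (1 - alpha) / np.sqrt(k)    # shrinking margin target
            if t * np.dot(w, x) <= margin:       # margin violation -> correct
                w += (1.0 / np.sqrt(k)) * t * x  # decaying learning rate
                norm = np.linalg.norm(w)
                if norm > 1:                     # project back onto unit ball
                    w /= norm
                k += 1
    return w
```

On linearly separable toy data the resulting hyperplane separates the classes after a handful of corrections, which is the practical appeal the abstract describes.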
Constructing Boosting Algorithms from SVMs: An Application to One-class Classification
2002
Parsimonious Least Norm Approximation
1997
Abstract

Cited by 20 (7 self)
A theoretically justifiable, fast, finite successive linear approximation algorithm is proposed for obtaining a parsimonious solution to a corrupted linear system Ax = b + p, where the corruption p is due to noise or error in measurement. The proposed linear-programming-based algorithm finds a solution x by parametrically minimizing the number of nonzero elements in x and the error ‖Ax − b − p‖₁. Numerical tests on a signal-processing-based example indicate that the proposed method is comparable to a method that parametrically minimizes the 1-norm of the solution x and the error ‖Ax − b − p‖₁, and that both methods are superior, by orders of magnitude, to solutions obtained by least squares as well as by combinatorially choosing an optimal solution with a specific number of nonzero elements. Keywords: minimal cardinality, least norm approximation.
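The 1-norm error term above is what makes a linear-programming treatment possible. As a sketch of the generic LP reformulation only (not the paper's parametric parsimony procedure), assuming SciPy's `linprog` is available:

```python
import numpy as np
from scipy.optimize import linprog

def least_l1_residual(A, b):
    """Minimize ||Ax - b||_1 via the standard LP reformulation:
        min 1't  subject to  -t <= Ax - b <= t.
    A generic sketch of the LP machinery such methods build on,
    not the paper's parametric algorithm."""
    m, n = A.shape
    # decision variables: [x (n entries), t (m entries)]
    c = np.concatenate([np.zeros(n), np.ones(m)])
    # Ax - b <= t   ->   A x - t <=  b
    # b - Ax <= t   ->  -A x - t <= -b
    A_ub = np.block([[A, -np.eye(m)], [-A, -np.eye(m)]])
    b_ub = np.concatenate([b, -b])
    bounds = [(None, None)] * n + [(0, None)] * m  # x free, t >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:n]
```

For a consistent system the LP drives every residual variable t to zero and recovers an exact solution.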
The Analysis of a Simple k-Means Clustering Algorithm
2000
Abstract

Cited by 20 (1 self)
k-means clustering is a very popular clustering technique, which is used in numerous applications. Given a set of n data points in R^d and an integer k, the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. A popular heuristic for k-means clustering is Lloyd's algorithm. In this paper we present a simple and efficient implementation of Lloyd's k-means clustering algorithm, which we call the filtering algorithm. This algorithm is very easy to implement. It differs from most other approaches in that it precomputes a kd-tree data structure for the data points rather than the center points. We establish the practical efficiency of the filtering algorithm in two ways. First, we present a data-sensitive analysis of the algorithm's running time. Second, we have implemented the algorithm and performed a number of empirical studies, both on synthetically generated data and on real data from...
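For reference, the plain Lloyd's iteration that the filtering algorithm accelerates can be sketched as follows; this is the textbook baseline without the kd-tree machinery, with a naive all-pairs assignment step.

```python
import numpy as np

def lloyd_kmeans(X, k, iters=100, seed=0):
    """Plain Lloyd's algorithm (the brute-force baseline the filtering
    algorithm speeds up; no kd-tree here). Returns centers and labels."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assignment step: nearest center for every point
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # update step: move each center to the mean of its points
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):  # converged: assignments are stable
            break
        centers = new
    return centers, labels
```

The assignment step is the expensive part, O(nk) distance computations per iteration, which is exactly where the paper's kd-tree filtering pays off.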
Barrier Boosting
Abstract

Cited by 18 (7 self)
Boosting algorithms like AdaBoost and Arc-GV are iterative strategies to minimize a constrained objective function, and are equivalent to barrier algorithms.
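To make the iterative-minimization view concrete, here is a minimal textbook AdaBoost with threshold stumps. This is not the Barrier Boosting algorithm itself, only the familiar procedure the entry refers to; the exhaustive stump search is an illustrative simplification.

```python
import numpy as np

def adaboost_stumps(X, y, rounds=20):
    """Minimal textbook AdaBoost with threshold stumps, illustrating the
    iterative objective-minimization view of boosting (not Barrier
    Boosting itself)."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    n = len(y)
    w = np.full(n, 1.0 / n)              # example weights
    ensemble = []                        # (alpha, feature, thresh, sign)
    for _ in range(rounds):
        best = None
        # exhaustive search for the stump with lowest weighted error
        for j in range(X.shape[1]):
            for t in np.unique(X[:, j]):
                for s in (1, -1):
                    pred = s * np.where(X[:, j] <= t, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, t, s, pred)
        err, j, t, s, pred = best
        err = min(max(err, 1e-12), 1 - 1e-12)      # avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)      # stump weight
        w *= np.exp(-alpha * y * pred)             # focus on mistakes
        w /= w.sum()
        ensemble.append((alpha, j, t, s))
    return ensemble

def ada_predict(ensemble, X):
    X = np.asarray(X, dtype=float)
    score = sum(a * s * np.where(X[:, j] <= t, 1, -1)
                for a, j, t, s in ensemble)
    return np.sign(score)
```

Each round greedily adds the weak learner that most decreases the weighted (exponential-loss) objective, which is the coordinate-descent reading the entry alludes to.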
Subspace Information Criterion for Non-Quadratic Regularizers: Model Selection for Sparse Regressors
 IEEE Transactions on Neural Networks, 2002
Abstract

Cited by 9 (7 self)
Non-quadratic regularizers, in particular the ℓ1-norm regularizer, can yield sparse solutions that generalize well. In this work we propose the Generalized Subspace Information Criterion (GSIC), which allows one to predict the generalization error for this useful family of regularizers. We show that under some technical assumptions GSIC is an asymptotically unbiased estimator of the generalization error. GSIC is demonstrated to perform well in experiments with the ℓ1-norm regularizer, where we compare it with the Network Information Criterion and cross-validation in relatively large sample cases. However, in the small sample case, GSIC tends to fail to capture the optimal model due to its large variance. Therefore, a biased version of GSIC is also introduced, which achieves reliable model selection in the relevant and challenging scenario of high-dimensional data and few samples.
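A sketch of the sparse-regressor setting GSIC targets: iterative soft-thresholding (ISTA) for ℓ1-regularized least squares. This is generic background on how the ℓ1 penalty produces sparsity, not part of GSIC; the step size and iteration count are illustrative assumptions.

```python
import numpy as np

def lasso_ista(A, b, lam=0.1, iters=2000):
    """Iterative soft-thresholding (ISTA) for
        min_x 0.5*||Ax - b||^2 + lam*||x||_1.
    A generic sketch of the sparse ell-1 setting, not GSIC itself."""
    A = np.asarray(A, dtype=float)
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = A.T @ (A @ x - b)              # gradient of the smooth part
        z = x - g / L                      # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # shrink
    return x
```

The soft-thresholding step sets small coefficients exactly to zero, which is the sparsity the abstract says ℓ1 regularization delivers.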
Minimum-Support Solutions of Polyhedral Concave Programs
 Optimization, 1999
Abstract

Cited by 8 (1 self)
Motivated by the successful application of mathematical programming techniques to difficult machine learning problems, we seek solutions of concave minimization problems over polyhedral sets with a minimum number of nonzero components. We prove that if such problems have a solution, they have a vertex solution with a minimal number of nonzero components. This includes linear programs and general linear complementarity problems. A smooth concave exponential approximation to a step function solves the minimum-support problem exactly for a finite value of the smoothing parameter. A fast, finite, linear-programming-based iterative method terminates at a stationary point, which for many important real-world problems provides very useful answers. By utilizing the complementarity property of linear programs and linear complementarity problems, an upper bound on the number of nonzeros can be obtained by solving a single convex minimization problem on a polyhedral set.
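The smooth concave exponential surrogate for the step function can be demonstrated numerically: summing 1 − exp(−α|x_i|) over the components approximates the nonzero count ‖x‖₀, and the approximation tightens as α grows. The value of α below is an illustrative assumption, not the paper's choice.

```python
import numpy as np

def approx_support(x, alpha=25.0):
    """Smooth concave surrogate for the nonzero count ||x||_0:
    sum_i (1 - exp(-alpha*|x_i|)). As alpha -> infinity this approaches
    the exact count; alpha=25 is an illustrative choice."""
    x = np.asarray(x, dtype=float)
    return np.sum(1.0 - np.exp(-alpha * np.abs(x)))
```

For example, on a vector with three nonzeros the surrogate is already within 10⁻⁴ of the exact count 3 at α = 25.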
A genetic algorithm using hyper-quadtrees for low-dimensional k-means clustering
 IEEE Trans. on Pattern Analysis and Machine Intelligence, 2006
Abstract

Cited by 7 (1 self)
The k-means algorithm is widely used for clustering because of its computational efficiency. Given n points in d-dimensional space and the number of desired clusters k, k-means seeks a set of k cluster centers so as to minimize the sum of the squared Euclidean distances between each point and its nearest cluster center. However, the algorithm is very sensitive to the initial selection of centers and is likely to converge to partitions that are significantly inferior to the global optimum. We present a genetic algorithm (GA) for evolving centers in the k-means algorithm that simultaneously identifies good partitions for a range of values around a specified k. The set of centers is represented using a hyper-quadtree constructed on the data. This representation is exploited in our GA to generate an initial population of good centers and to support a novel crossover operation that selectively passes good subsets of neighboring centers from parents to offspring by swapping subtrees. Experimental results indicate that our GA finds the global optimum for data sets with known optima and finds good solutions for large simulated data sets. Index Terms: k-means algorithm, clustering, genetic algorithms, quadtrees, optimal partition, center selection.
SVM and Boosting: One Class
Abstract

Cited by 6 (1 self)
We show via an equivalence of mathematical programs that a Support Vector (SV) algorithm can be translated into an equivalent boosting-like algorithm and vice versa. We exemplify this translation procedure for a new algorithm, one-class Leveraging, starting from the one-class Support Vector Machine (1-SVM). This is a first step towards unsupervised learning in a boosting framework.
Approximating k-means-type clustering via semidefinite programming
2005
Abstract

Cited by 6 (1 self)
One of the fundamental clustering problems is to assign n points into k clusters based on the minimal sum-of-squares criterion (MSSC), which is known to be NP-hard. In this paper, by using matrix arguments, we first model MSSC as a so-called 0-1 semidefinite program (SDP). We show that our 0-1 SDP model provides a unified framework for several clustering approaches such as normalized k-cut and spectral clustering. Moreover, the 0-1 SDP model allows us to solve the underlying problem approximately via relaxed linear and semidefinite programming. Secondly, we consider the issue of how to extract a feasible solution of the original MSSC model from the approximate solution of the relaxed SDP problem. By using principal component analysis, we develop a rounding procedure to construct a feasible partitioning from a solution of the relaxed problem. In our rounding procedure, we need to solve a k-means clustering problem in ℜ^{k−1}, which can be solved in O(n^{k²(k−1)}) time. In the case of bi-clustering, the running time of our rounding procedure can be reduced to O(n log n). We show that our algorithm can provide a 2-approximate solution to the original problem. Promising numerical results based on our new method are reported.
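The matrix argument behind the 0-1 SDP model can be checked numerically: with the normalized assignment matrix Z (Z_ij = 1/sqrt(|C_j|) if point i is in cluster j, else 0), the within-cluster sum of squares equals Tr(XᵀX) − Tr(XᵀZZᵀX). A small sketch comparing both formulas (the variable names are ours, not the paper's):

```python
import numpy as np

def sse_direct(X, labels, k):
    """Within-cluster sum of squares computed the obvious way."""
    return sum(((X[labels == j] - X[labels == j].mean(axis=0)) ** 2).sum()
               for j in range(k) if np.any(labels == j))

def sse_matrix(X, labels, k):
    """Same objective via the normalized assignment matrix Z used in the
    0-1 SDP model: SSE = Tr(X'X) - Tr(X'ZZ'X), with Z_ij = 1/sqrt(|C_j|)."""
    n = len(labels)
    Z = np.zeros((n, k))
    for j in range(k):
        idx = labels == j
        if idx.any():
            Z[idx, j] = 1.0 / np.sqrt(idx.sum())
    M = X.T @ Z                      # column j holds sqrt(|C_j|) * mean_j
    return np.trace(X.T @ X) - np.trace(M @ M.T)
```

The two functions agree on any partition, which is the identity that lets MSSC be rewritten purely in terms of the matrix ZZᵀ and then relaxed.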