Results 1–10 of 115
Structured variable selection with sparsity-inducing norms
, 2009
"... We consider the empirical risk minimization problem for linear supervised learning, with regularization by structured sparsityinducing norms. These are defined as sums of Euclidean norms on certain subsets of variables, extending the usual ℓ1norm and the group ℓ1norm by allowing the subsets to ov ..."
Cited by 97 (15 self)

Abstract:
We consider the empirical risk minimization problem for linear supervised learning, with regularization by structured sparsity-inducing norms. These are defined as sums of Euclidean norms on certain subsets of variables, extending the usual ℓ1-norm and the group ℓ1-norm by allowing the subsets to overlap. This leads to a specific set of allowed nonzero patterns for the solutions of such problems. We first explore the relationship between the groups defining the norm and the resulting nonzero patterns, providing both forward and backward algorithms to go back and forth from groups to patterns. This allows the design of norms adapted to specific prior knowledge expressed in terms of nonzero patterns. We also present an efficient active set algorithm, and analyze the consistency of variable selection for least-squares linear regression in low- and high-dimensional settings.
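As a minimal illustration of the norm this abstract describes (not the paper's algorithms), the structured penalty can be evaluated directly as a sum of Euclidean norms over index subsets; singleton groups recover the ℓ1-norm, a disjoint partition recovers the group ℓ1-norm, and overlapping subsets are allowed. The function name is ours, for illustration only:

```python
import numpy as np

def structured_norm(w, groups):
    """Sum of Euclidean norms of w restricted to each (possibly overlapping)
    index subset. Singleton groups give the l1-norm; a disjoint partition
    gives the group l1-norm."""
    return sum(np.linalg.norm(w[list(g)]) for g in groups)

w = np.array([1.0, -2.0, 0.0, 3.0])
l1 = structured_norm(w, [[0], [1], [2], [3]])       # = |1|+|2|+|0|+|3| = 6
grp = structured_norm(w, [[0, 1], [2, 3]])          # = sqrt(5) + 3
ovl = structured_norm(w, [[0, 1], [1, 2, 3]])       # index 1 counted in both groups
```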
Approximate Minimum Enclosing Balls in High Dimensions Using Core-Sets
, 2003
"... this paper can be downloaded from http://www.compgeom.com/meb/. P. Kumar and J. Mitchell are partially supported by a grant from the National Science Foundation (CCR0098172) . J. Mitchell is also partially supported by grants from the Honda Fundamental Research Labs, Metron Aviation, NASAAmes Resear ..."
Cited by 34 (8 self)

Abstract:
this paper can be downloaded from http://www.compgeom.com/meb/. P. Kumar and J. Mitchell are partially supported by a grant from the National Science Foundation (CCR-0098172). J. Mitchell is also partially supported by grants from the Honda Fundamental Research Labs, Metron Aviation, NASA-Ames Research (NAG2-1325), and the US-Israel Binational Science Foundation. E. A. Yıldırım is partially supported by an NSF CAREER award (DMI-0237415).
Smoothing Proximal Gradient Method for General Structured Sparse Learning
"... We study the problem of learning high dimensional regression models regularized by a structuredsparsityinducing penalty that encodes prior structural information on either input or output sides. We consider two widely adopted types of such penalties as our motivating examples: 1) overlapping group ..."
Cited by 23 (5 self)

Abstract:
We study the problem of learning high-dimensional regression models regularized by a structured-sparsity-inducing penalty that encodes prior structural information on either the input or output side. We consider two widely adopted types of such penalties as our motivating examples: 1) the overlapping group lasso penalty, based on the ℓ1/ℓ2 mixed-norm penalty, and 2) the graph-guided fusion penalty. For both types of penalties, due to their non-separability, developing an efficient optimization method has remained a challenging problem. In this paper, we propose a general optimization approach, called the smoothing proximal gradient method, which can solve structured sparse regression problems with a smooth convex loss and a wide spectrum of structured-sparsity-inducing penalties. Our approach is based on a general smoothing technique of Nesterov [17]. It achieves a convergence rate faster than the standard first-order method, the subgradient method, and is much more scalable than the most widely used interior-point method. Numerical results are reported to demonstrate the efficiency and scalability of the proposed method.
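To make the smoothing idea concrete for the overlapping-group case, here is a sketch, under the standard Nesterov construction, of the gradient of the smoothed penalty Ω_μ(w) = Σ_g max_{‖a‖≤1} aᵀw_g − (μ/2)‖a‖²; the inner maximizer is the projection of w_g/μ onto the Euclidean unit ball. Names are illustrative, not from the paper's code:

```python
import numpy as np

def smoothed_group_penalty_grad(w, groups, mu):
    """Gradient of the Nesterov-smoothed sum of group norms.

    For each group g, the optimal dual variable is proj_{unit ball}(w_g / mu),
    and overlapping groups simply accumulate their contributions."""
    grad = np.zeros_like(w)
    for g in groups:
        idx = list(g)
        a = w[idx] / mu
        norm = np.linalg.norm(a)
        if norm > 1.0:
            a = a / norm          # project onto the Euclidean unit ball
        grad[idx] += a            # overlaps add up, no splitting needed
    return grad

w = np.array([3.0, 4.0])
g_smooth = smoothed_group_penalty_grad(w, [[0, 1]], mu=100.0)  # inside ball: w_g/mu
g_sharp = smoothed_group_penalty_grad(w, [[0, 1]], mu=1.0)     # projected: w_g/||w_g||
```

Because this gradient is Lipschitz (with constant growing as 1/μ), it can be fed to any accelerated gradient scheme, which is the source of the rate improvement over plain subgradient descent.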
Distance weighted discrimination
 Cornell University
"... High Dimension Low Sample Size statistical analysis is becoming increasingly important in a wide range of applied contexts. In such situations, it is seen that the popular Support Vector Machine suffers from “data piling ” at the margin, which can diminish generalizability. This leads naturally to t ..."
Cited by 22 (4 self)

Abstract:
High Dimension Low Sample Size statistical analysis is becoming increasingly important in a wide range of applied contexts. In such situations, the popular Support Vector Machine is seen to suffer from “data piling” at the margin, which can diminish generalizability. This leads naturally to the development of Distance Weighted Discrimination, which is based on Second-Order Cone Programming, a modern computationally intensive optimization method.
Computation of Minimum Volume Covering Ellipsoids
 Operations Research
, 2003
"... We present a practical algorithm for computing the minimum volume ndimensional ellipsoid that must contain m given points a 1 , . . . , am . This convex constrained problem arises in a variety of applied computational settings, particularly in data mining and robust statistics. Its structur ..."
Cited by 21 (0 self)

Abstract:
We present a practical algorithm for computing the minimum-volume n-dimensional ellipsoid that must contain m given points a1, ..., am. This convex constrained problem arises in a variety of applied computational settings, particularly in data mining and robust statistics. Its structure makes it particularly amenable to solution by interior-point methods, and it has been the subject of much theoretical complexity analysis. Here we focus on computation. We present a combined interior-point and active-set method for solving this problem. Our computational results demonstrate that our method solves very large problem instances (m = 30,000 and n = 30) to a high degree of accuracy in under 30 seconds on a personal computer.
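For contrast with the combined interior-point/active-set method described above, the classical baseline for this problem is Khachiyan's first-order iteration; a deliberately simple sketch (function name and tolerances are ours, for illustration):

```python
import numpy as np

def min_volume_ellipsoid(P, tol=1e-4, max_iter=10000):
    """Khachiyan-style iteration for the minimum-volume enclosing ellipsoid
    {x : (x - c)^T A (x - c) <= 1} of the points in the rows of P.

    This is the textbook first-order method; the paper's algorithm is a much
    faster alternative for large m and n."""
    m, n = P.shape
    Q = np.column_stack([P, np.ones(m)])           # lift to homogeneous coordinates
    u = np.full(m, 1.0 / m)                        # weights on the points
    for _ in range(max_iter):
        X = Q.T @ (u[:, None] * Q)
        M = np.einsum('ij,jk,ik->i', Q, np.linalg.inv(X), Q)
        j = np.argmax(M)
        step = (M[j] - n - 1) / ((n + 1) * (M[j] - 1))
        if step < tol:                             # all points nearly inside
            break
        u = (1 - step) * u                         # shift weight toward the
        u[j] += step                               # farthest point
    c = P.T @ u                                    # ellipsoid center
    A = np.linalg.inv(P.T @ (u[:, None] * P) - np.outer(c, c)) / n
    return A, c

P = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
A, c = min_volume_ellipsoid(P)                     # circle through the square's corners
```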
An implementable proximal point algorithmic framework for nuclear norm minimization
, 2010
"... The nuclear norm minimization problem is to find a matrix with the minimum nuclear norm subject to linear and second order cone constraints. Such a problem often arises from the convex relaxation of a rank minimization problem with noisy data, and arises in many fields of engineering and science. In ..."
Cited by 21 (3 self)

Abstract:
The nuclear norm minimization problem is to find a matrix with the minimum nuclear norm subject to linear and second-order cone constraints. Such a problem often arises from the convex relaxation of a rank minimization problem with noisy data, and arises in many fields of engineering and science. In this paper, we study inexact proximal point algorithms in the primal, dual, and primal-dual forms for solving nuclear norm minimization with linear equality and second-order cone constraints. We design efficient implementations of these algorithms and present comprehensive convergence results. In particular, we investigate the performance of our proposed algorithms in which the inner subproblems are approximately solved by the gradient projection method or the accelerated proximal gradient method. Our numerical results for solving randomly generated matrix completion problems and real matrix completion problems show that our algorithms perform favorably in comparison to several recently proposed state-of-the-art algorithms. Interestingly, our proposed algorithms are connected with other algorithms that have been studied in the literature. Key words: nuclear norm minimization, proximal point method, rank minimization, gradient projection method, accelerated proximal gradient method.
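The building block shared by the proximal-point and accelerated proximal gradient schemes mentioned above is the proximal map of the nuclear norm, which soft-thresholds the singular values; a minimal sketch (the function name is ours):

```python
import numpy as np

def nuclear_norm_prox(Y, tau):
    """Proximal operator of tau * ||.||_* :
    argmin_X 0.5 ||X - Y||_F^2 + tau ||X||_*,
    obtained by soft-thresholding the singular values of Y."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s = np.maximum(s - tau, 0.0)                   # shrink, then clip at zero
    return U @ np.diag(s) @ Vt

Y = np.diag([3.0, 1.0])
X = nuclear_norm_prox(Y, 2.0)   # singular values 3, 1 become 1, 0: rank drops
```

The zeroed singular values are what drive the iterates toward low rank, mirroring how the soft-thresholding prox of the ℓ1-norm drives vectors toward sparsity.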
ANGULAR SYNCHRONIZATION BY EIGENVECTORS AND SEMIDEFINITE PROGRAMMING: ANALYSIS AND APPLICATION TO CLASS AVERAGING IN CRYO-ELECTRON MICROSCOPY
, 2009
"... Abstract. The angular synchronization problem is to obtain an accurate estimation (up to a constant additive phase) for a set of unknown angles θ1,..., θn from m noisy measurements of their offsets θi − θj mod 2π. Of particular interest is angle recovery in the presence of many outlier measurements ..."
Cited by 20 (14 self)

Abstract:
The angular synchronization problem is to obtain an accurate estimation (up to a constant additive phase) of a set of unknown angles θ1, ..., θn from m noisy measurements of their offsets θi − θj mod 2π. Of particular interest is angle recovery in the presence of many outlier measurements that are uniformly distributed in [0, 2π) and carry no information on the true offsets. We introduce an efficient recovery algorithm for the unknown angles from the top eigenvector of a specially designed Hermitian matrix. The eigenvector method is extremely stable and succeeds even when the number of outliers is exceedingly large. For example, we successfully estimate n = 400 angles from a full set of m = (400 choose 2) offset measurements, of which 90% are outliers, in less than a second on a commercial laptop. We use random matrix theory to prove that the eigenvector method gives ...
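A small self-contained simulation of the eigenvector method as described, with sizes and outlier rate chosen for illustration (far milder than the paper's n = 400, 90%-outlier experiment):

```python
import numpy as np

rng = np.random.default_rng(0)
n, outlier_frac = 60, 0.3
theta = rng.uniform(0, 2 * np.pi, n)            # ground-truth angles (unknown)

# Offset measurements H_ij = exp(i(theta_i - theta_j)); a fraction are replaced
# by uniformly random phases that carry no information about the true offsets.
H = np.exp(1j * (theta[:, None] - theta[None, :]))
outlier = rng.random((n, n)) < outlier_frac
H = np.where(outlier, np.exp(1j * rng.uniform(0, 2 * np.pi, (n, n))), H)
H = np.triu(H, 1)                               # one measurement per pair
H = H + H.conj().T + np.eye(n)                  # Hermitian, unit diagonal

# Recover the angles from the phases of the top eigenvector.
eigvals, eigvecs = np.linalg.eigh(H)            # eigenvalues in ascending order
est = np.angle(eigvecs[:, -1])

# Quality up to the unrecoverable global phase: |<e^{i est}, e^{i theta}>| / n,
# which equals 1 for perfect recovery.
corr = abs(np.vdot(np.exp(1j * est), np.exp(1j * theta))) / n
```

The rank-one signal zz* (with z_k = e^{iθ_k}) has eigenvalue proportional to n, while the outlier noise only perturbs the spectrum on the order of √n, which is why the top eigenvector stays aligned with the truth even at high outlier rates.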
Three-Dimensional Structure Determination from Common Lines in Cryo-EM by Eigenvectors and Semidefinite Programming
"... Abstract. The cryoelectron microscopy reconstruction problem is to find the threedimensional (3D) structure of a macromolecule given noisy samples of its twodimensional projection images at unknown random directions. Present algorithms for finding an initial 3D structure model are based on the “a ..."
Cited by 17 (11 self)

Abstract:
The cryo-electron microscopy reconstruction problem is to find the three-dimensional (3D) structure of a macromolecule given noisy samples of its two-dimensional projection images at unknown random directions. Present algorithms for finding an initial 3D structure model are based on the “angular reconstitution” method, in which a coordinate system is established from three projections, and the orientation of the particle giving rise to each image is deduced from common lines among the images. However, reliable detection of common lines is difficult due to the low signal-to-noise ratio of the images. In this paper we describe two algorithms for finding the unknown imaging directions of all projections by minimizing global self-consistency errors. In the first algorithm, the minimizer is obtained by computing the three largest eigenvectors of a specially designed symmetric matrix derived from the common lines, while the second algorithm is based on semidefinite programming (SDP). Compared with existing algorithms, the advantages of our algorithms are fivefold: first, they accurately estimate all orientations at very low common-line detection rates; second, they are extremely fast, as they involve only the computation of a few top eigenvectors or a sparse SDP; third, they are non-sequential and use the information in all common lines at once; fourth, they are amenable to a rigorous mathematical analysis using spectral analysis and random matrix theory; and finally, the algorithms are optimal in the sense that they reach the information-theoretic Shannon bound up to a constant for an idealized probabilistic model.