Results 1 - 10 of 60
Stable principal component pursuit
In Proc. of International Symposium on Information Theory, 2010
"... We consider the problem of recovering a target matrix that is a superposition of low-rank and sparse components, from a small set of linear measurements. This problem arises in compressed sensing of structured high-dimensional signals such as videos and hyperspectral images, as well as in the analys ..."
Cited by 94 (3 self)
We consider the problem of recovering a target matrix that is a superposition of low-rank and sparse components from a small set of linear measurements. This problem arises in compressed sensing of structured high-dimensional signals such as videos and hyperspectral images, as well as in the analysis of transformation-invariant low-rank structure recovery. We analyze the performance of the natural convex heuristic for solving this problem, under the assumption that measurements are chosen uniformly at random. We prove that this heuristic exactly recovers the low-rank and sparse terms, provided the number of observations exceeds the number of intrinsic degrees of freedom of the component signals by a polylogarithmic factor. Our analysis introduces several ideas that may be of independent interest for the more general problem of compressed sensing and decomposing superpositions of multiple structured signals.
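As a rough illustration of the convex heuristic the abstract refers to (a nuclear-norm term for the low-rank part plus an ℓ1 term for the sparse part), the sketch below shows the two proximal building blocks and a toy alternating loop; it is not the paper's algorithm, and the matrix sizes, thresholds, and the parameter lam are illustrative assumptions.

```python
import numpy as np

def soft_threshold(x, tau):
    """Entrywise soft-thresholding: the prox of tau * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def singular_value_threshold(X, tau):
    """Prox of tau * ||.||_* : soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(soft_threshold(s, tau)) @ Vt

# Toy usage: split M into low-rank + sparse parts by alternating prox steps
# (an illustrative heuristic loop, not the solver analyzed in the paper).
rng = np.random.default_rng(0)
M = rng.standard_normal((50, 5)) @ rng.standard_normal((5, 50))
M[rng.random(M.shape) < 0.05] += 10.0          # sparse corruptions
L, S = np.zeros_like(M), np.zeros_like(M)
lam = 1.0 / np.sqrt(max(M.shape))
for _ in range(50):
    L = singular_value_threshold(M - S, 1.0)
    S = soft_threshold(M - L, lam)
```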
A class of submodular functions for document summarization
In the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT), 2011
"... We design a class of submodular functions meant for document summarization tasks. These functions each combine two terms, one which encourages the summary to be representative of the corpus, and the other which positively rewards diversity. Critically, our functions are monotone nondecreasing and su ..."
Cited by 92 (22 self)
We design a class of submodular functions meant for document summarization tasks. These functions each combine two terms, one which encourages the summary to be representative of the corpus, and the other which positively rewards diversity. Critically, our functions are monotone nondecreasing and submodular, which means that an efficient, scalable greedy optimization scheme has a constant-factor guarantee of optimality. When evaluated on DUC 2004-2007 corpora, we obtain results better than the existing state of the art in both generic and query-focused document summarization. Lastly, we show that several well-established methods for document summarization correspond, in fact, to submodular function optimization, adding further evidence that submodular functions are a natural fit for document summarization.
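The constant-factor guarantee mentioned above comes with the standard greedy scheme for monotone submodular maximization; the sketch below shows that generic scheme with a toy coverage-style objective, which stands in for (and is not) the summarization functions designed in the paper.

```python
def greedy_submodular(candidates, objective, budget):
    """Pick up to `budget` items, each time adding the best marginal gain."""
    summary = []
    for _ in range(budget):
        best, best_gain = None, 0.0
        for s in candidates:
            if s in summary:
                continue
            gain = objective(summary + [s]) - objective(summary)
            if gain > best_gain:
                best, best_gain = s, gain
        if best is None:                 # no remaining positive marginal gain
            break
        summary.append(best)
    return summary

# Toy monotone submodular objective: how many "concepts" the chosen sentences cover.
docs = {"s1": {"a", "b"}, "s2": {"b", "c"}, "s3": {"d"}}
def coverage(chosen):
    return float(len(set().union(*(docs[s] for s in chosen)))) if chosen else 0.0

print(greedy_submodular(list(docs), coverage, budget=2))   # ['s1', 's2']
```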
Proximal Methods for Hierarchical Sparse Coding
2010
"... Sparse coding consists in representing signals as sparse linear combinations of atoms selected from a dictionary. We consider an extension of this framework where the atoms are further assumed to be embedded in a tree. This is achieved using a recently introduced tree-structured sparse regularizatio ..."
Cited by 83 (18 self)
Sparse coding consists in representing signals as sparse linear combinations of atoms selected from a dictionary. We consider an extension of this framework where the atoms are further assumed to be embedded in a tree. This is achieved using a recently introduced tree-structured sparse regularization norm, which has proven useful in several applications. This norm leads to regularized problems that are difficult to optimize, and we propose in this paper efficient algorithms for solving them. More precisely, we show that the proximal operator associated with this norm is computable exactly via a dual approach that can be viewed as the composition of elementary proximal operators. Our procedure has complexity that is linear, or close to linear, in the number of atoms, and allows the use of accelerated gradient techniques to solve the tree-structured sparse approximation problem at the same computational cost as traditional approaches based on the ℓ1-norm. Our method is efficient and scales gracefully to millions of variables, which we illustrate in two types of applications: first, we consider fixed hierarchical dictionaries of wavelets to denoise natural images. Then, we apply our optimization tools in the context of dictionary learning, where the learned dictionary elements naturally organize themselves in a prespecified arborescent structure, leading to better performance in the reconstruction of natural image patches. When applied to text documents, our method learns hierarchies of topics, thus providing a competitive alternative to probabilistic topic models.
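A minimal sketch of the composition property described above, under the assumption of tree-structured groups with ℓ2 norms: the elementary group shrinkages are applied so that every group is processed before any group containing it (leaves up to the root). The groups, threshold, and input vector are toy values, not taken from the paper.

```python
import numpy as np

def group_shrink(v, idx, tau):
    """Prox of tau * ||v[idx]||_2, returned as a modified copy of v."""
    v = v.copy()
    norm = np.linalg.norm(v[idx])
    v[idx] = 0.0 if norm <= tau else v[idx] * (1.0 - tau / norm)
    return v

def hierarchical_prox(u, groups, tau):
    """`groups`: index arrays ordered children-before-parents (leaves to root)."""
    w = u.copy()
    for idx in groups:
        w = group_shrink(w, idx, tau)
    return w

# Toy tree over 3 variables: a leaf group {2} nested inside the root group {0, 1, 2}.
u = np.array([4.0, 0.0, 3.0])
groups = [np.array([2]), np.array([0, 1, 2])]
print(hierarchical_prox(u, groups, tau=1.0))
```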
Structured Sparsity through Convex Optimization
2012
"... Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. While naturally cast as a combinatorial optimization problem, variable or feature selection admits a convex relaxation through the regularization by the ℓ1-norm. In this paper, we consider sit ..."
Cited by 47 (6 self)
Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. While naturally cast as a combinatorial optimization problem, variable or feature selection admits a convex relaxation through regularization by the ℓ1-norm. In this paper, we consider situations where we are not only interested in sparsity, but where some structural prior knowledge is available as well. We show that the ℓ1-norm can then be extended to structured norms built on either disjoint or overlapping groups of variables, leading to a flexible framework that can deal with various structures. We present applications to unsupervised learning, for structured sparse principal component analysis and hierarchical dictionary learning, and to supervised learning in the context of nonlinear variable selection.
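As a small concrete example of the structured norms described above, the sketch below evaluates a sum of (possibly overlapping) group ℓ2 norms, which reduces to the ℓ1-norm when every group is a singleton; the groups and weights are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def structured_norm(w, groups, weights=None):
    """Omega(w) = sum_g weight_g * ||w[g]||_2 over (possibly overlapping) groups."""
    weights = weights if weights is not None else [1.0] * len(groups)
    return sum(eta * np.linalg.norm(w[list(g)]) for g, eta in zip(groups, weights))

w = np.array([1.0, -2.0, 0.0, 0.5])
overlapping = [(0, 1), (1, 2, 3)]          # variable 1 belongs to both groups
print(structured_norm(w, overlapping))     # ||(1, -2)||_2 + ||(-2, 0, 0.5)||_2
```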
Convex and network flow optimization for structured sparsity
JMLR, 2011
"... We consider a class of learning problems regularized by a structured sparsity-inducing norm defined as the sum of ℓ2- or ℓ∞-norms over groups of variables. Whereas much effort has been put in developing fast optimization techniques when the groups are disjoint or embedded in a hierarchy, we address ..."
Cited by 35 (8 self)
We consider a class of learning problems regularized by a structured sparsity-inducing norm defined as the sum of ℓ2- or ℓ∞-norms over groups of variables. Whereas much effort has been put into developing fast optimization techniques when the groups are disjoint or embedded in a hierarchy, we address here the case of general overlapping groups. To this end, we present two different strategies: On the one hand, we show that the proximal operator associated with a sum of ℓ∞-norms can be computed exactly in polynomial time by solving a quadratic min-cost flow problem, allowing the use of accelerated proximal gradient methods. On the other hand, we use proximal splitting techniques and address an equivalent formulation with non-overlapping groups, but in higher dimension and with additional constraints. We propose efficient and scalable algorithms exploiting these two strategies, which are significantly faster than alternative approaches. We illustrate these methods with several problems such as CUR matrix factorization, multi-task learning of tree-structured dictionaries, background subtraction in video sequences, image denoising with wavelets, and topographic dictionary learning of natural image patches.
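Once a routine for the proximal operator of the structured penalty is available (the flow-based and splitting-based computations are the paper's contribution), it can be plugged into a standard accelerated proximal-gradient loop; the FISTA-style sketch below is generic, with grad_f, prox, and step as placeholders and an ℓ1 prox standing in for the structured one.

```python
import numpy as np

def fista(x0, grad_f, prox, step, n_iter=100):
    """Minimize f(x) + Omega(x) given the gradient of f and the prox of step * Omega."""
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(n_iter):
        x_next = prox(y - step * grad_f(y), step)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)
        x, t = x_next, t_next
    return x

# Toy usage: least squares + l1, with the l1 prox standing in for a structured prox.
A, b = np.random.randn(30, 10), np.random.randn(30)
grad_f = lambda x: A.T @ (A @ x - b)
prox_l1 = lambda v, s: np.sign(v) * np.maximum(np.abs(v) - 0.1 * s, 0.0)
x_hat = fista(np.zeros(10), grad_f, prox_l1, step=1.0 / np.linalg.norm(A, 2) ** 2)
```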
Efficient Minimization of Decomposable Submodular Functions
"... Many combinatorial problems arising in machine learning can be reduced to the problem of minimizing a submodular function. Submodular functions are a natural discrete analog of convex functions, and can be minimized in strongly polynomial time. Unfortunately, state-of-the-art algorithms for general ..."
Cited by 25 (3 self)
Many combinatorial problems arising in machine learning can be reduced to the problem of minimizing a submodular function. Submodular functions are a natural discrete analog of convex functions, and can be minimized in strongly polynomial time. Unfortunately, state-of-the-art algorithms for general submodular minimization are intractable for larger problems. In this paper, we introduce a novel subclass of submodular minimization problems that we call decomposable. Decomposable submodular functions are those that can be represented as sums of concave functions applied to modular functions. We develop an algorithm, SLG, that can efficiently minimize decomposable submodular functions with tens of thousands of variables. Our algorithm exploits recent results in smoothed convex minimization. We apply SLG to synthetic benchmarks and a joint classification-and-segmentation task, and show that it outperforms the state-of-the-art general purpose submodular minimization algorithms by several orders of magnitude.
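To make the term "decomposable" concrete, the sketch below evaluates a function of this form, a sum of concave functions applied to nonnegative modular functions; the choice of square root and the weight vectors are illustrative, and this is only the function evaluation, not the SLG minimization algorithm.

```python
import math

def decomposable_F(S, modular_weights, concave=math.sqrt):
    """F(S) = sum_j concave( sum_{i in S} w_j[i] ), with each w_j >= 0."""
    return sum(concave(sum(w[i] for i in S)) for w in modular_weights)

# Two nonnegative modular weight vectors over the ground set {0, 1, 2, 3}.
weights = [
    {0: 1.0, 1: 2.0, 2: 0.0, 3: 1.0},
    {0: 0.5, 1: 0.0, 2: 3.0, 3: 0.5},
]
print(decomposable_F({0, 1}, weights))   # sqrt(3.0) + sqrt(0.5)
```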
Efficient Sparse Modeling with Automatic Feature Grouping
"... The grouping of features is highly beneficial in learning with high-dimensional data. It reduces the variance in the estimation and improves the stability of feature selection, leading to improved generalization. Moreover, it can also help in data understanding and interpretation. OSCAR is a recent ..."
Cited by 20 (0 self)
The grouping of features is highly beneficial in learning with high-dimensional data. It reduces the variance in the estimation and improves the stability of feature selection, leading to improved generalization. Moreover, it can also help in data understanding and interpretation. OSCAR is a recent sparse modeling tool that achieves this by using an ℓ1-regularizer and a pairwise ℓ∞-regularizer. However, its optimization is computationally expensive. In this paper, we propose an efficient solver based on accelerated gradient methods. We show that its key projection step can be solved by a simple iterative group merging algorithm. It is highly efficient and reduces the empirical time complexity from O(d^3) ∼ O(d^5) for existing solvers to just O(d), where d is the number of features. Experimental results on toy and real-world data sets demonstrate that OSCAR is a competitive sparse modeling approach with the added ability of automatic feature grouping.
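For reference, the OSCAR regularizer combines an ℓ1 term with a pairwise ℓ∞ term; the sketch below only evaluates that penalty (sorting turns the O(d^2) pairwise sum into O(d log d)) and is not the paper's projection or group-merging solver. The values lam1 and lam2 are illustrative.

```python
import numpy as np

def oscar_penalty(w, lam1, lam2):
    """lam1 * ||w||_1 + lam2 * sum_{i<j} max(|w_i|, |w_j|)."""
    a = np.sort(np.abs(w))                     # ascending: a[k] is the max in k pairs
    pairwise = np.sum(np.arange(len(a)) * a)   # sum_k k * a[k]
    return lam1 * np.sum(a) + lam2 * pairwise

w = np.array([0.5, -2.0, 1.0])
print(oscar_penalty(w, lam1=1.0, lam2=0.1))    # 3.5 + 0.1 * 5.0
```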
Convex analysis and optimization with submodular functions: a tutorial
"... Set-functions appear in many areas of computer science and applied mathematics, such as machine learning [1, 2, 3, 4], computer vision [5, 6], operations research [7] or electrical networks [8]. Among these set-functions, submodular functions play an important role, similar to convex functions on ve ..."
Cited by 12 (5 self)
Set-functions appear in many areas of computer science and applied mathematics, such as machine learning [1, 2, 3, 4], computer vision [5, 6], operations research [7] or electrical networks [8]. Among these set-functions, submodular functions play an important role, similar to convex functions on vector spaces. In this tutorial, the theory of submodular ...
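The bridge between submodular set-functions and convex analysis that such a tutorial develops runs through the Lovász extension, which coincides with F on indicator vectors and is convex exactly when F is submodular. Below is a minimal sketch of its evaluation by the usual sorting formula; the cut function of a tiny graph is just an illustrative submodular F.

```python
import numpy as np

def lovasz_extension(F, x):
    """f(x) = sum_k x[j_k] * (F({j_1..j_k}) - F({j_1..j_{k-1}})), x sorted descending, F(empty) = 0."""
    order = np.argsort(-x)
    value, prefix, prev = 0.0, set(), F(set())
    for j in order:
        prefix.add(int(j))
        cur = F(prefix)
        value += x[j] * (cur - prev)
        prev = cur
    return value

# Illustrative submodular F: the cut function of a path graph on {0, 1, 2}.
edges = [(0, 1), (1, 2)]
F = lambda S: sum(1 for (u, v) in edges if (u in S) != (v in S))
print(lovasz_extension(F, np.array([0.2, 0.9, 0.4])))   # = |0.2-0.9| + |0.9-0.4| = 1.2
```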