Results 1–10 of 41
Fast and robust recursive algorithms for separable nonnegative matrix factorization
arXiv preprint arXiv:1208.1237, 2012
Robust Subspace Clustering
2013
Abstract
Subspace clustering refers to the task of finding a multi-subspace representation that best fits a collection of points taken from a high-dimensional space. This paper introduces an algorithm inspired by sparse subspace clustering (SSC) [17] to cluster noisy data, and develops some novel theory demonstrating its correctness. In particular, the theory uses ideas from geometric functional analysis to show that the algorithm can accurately recover the underlying subspaces under minimal requirements on their orientation, and on the number of samples per subspace. Synthetic as well as real data experiments complement our theoretical study, illustrating our approach and demonstrating its effectiveness.
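The self-expressiveness idea behind SSC can be illustrated in a few lines: each point is written as a sparse combination of the other points, and the nonzero coefficients should land on points from the same subspace. The sketch below is not the paper's algorithm, just a minimal toy (our own construction: a plain ISTA lasso solver and two orthogonal 2-D subspaces, which makes the separation exact):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
# Two orthogonal 2-D subspaces of R^5 (disjoint coordinate pairs), 10 points each.
X1 = np.zeros((n, 10)); X1[:2] = rng.standard_normal((2, 10))
X2 = np.zeros((n, 10)); X2[2:4] = rng.standard_normal((2, 10))
X = np.hstack([X1, X2])
X /= np.linalg.norm(X, axis=0)            # unit-norm columns

def soft(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_ista(A, b, lam=0.05, iters=500):
    """min_c 0.5*||A c - b||^2 + lam*||c||_1 via proximal gradient (ISTA)."""
    L = np.linalg.norm(A, 2) ** 2         # Lipschitz constant of the smooth part
    c = np.zeros(A.shape[1])
    for _ in range(iters):
        c = soft(c - A.T @ (A @ c - b) / L, lam / L)
    return c

# Sparse self-representation of point 0 in terms of all other points.
c = lasso_ista(np.delete(X, 0, axis=1), X[:, 0])
# Coefficient mass concentrates on the 9 remaining points of the same subspace.
same, other = np.abs(c[:9]).sum(), np.abs(c[9:]).sum()
print(same > other)
```

Stacking such coefficient vectors into a matrix C and clustering the affinity |C| + |C|ᵀ (e.g. spectrally) yields the full SSC pipeline.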
Fast conical hull algorithms for near-separable nonnegative matrix factorization
In ACM/IEEE Conference on Supercomputing, 2009
Abstract
The separability assumption (Donoho & Stodden, 2003; Arora et al., 2012a) turns nonnegative matrix factorization (NMF) into a tractable problem. Recently, a new class of provably correct NMF algorithms has emerged under this assumption. In this paper, we reformulate the separable NMF problem as that of finding the extreme rays of the conical hull of a finite set of vectors. From this geometric perspective, we derive new separable NMF algorithms that are highly scalable and empirically noise robust, and have several other favorable properties in relation to existing methods. A parallel implementation of our algorithm demonstrates high scalability on shared- and distributed-memory machines.
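As a rough illustration of the extreme-ray view, the greedy sketch below (in the spirit of the successive projection algorithm, not this paper's own method) repeatedly picks the column with the largest residual norm and projects it out; on exactly separable data the picked columns are the anchors:

```python
import numpy as np

def spa(M, r):
    """Greedily select r columns of M that generate the conical hull of the rest.
    Each step takes the column of largest residual norm, then projects all
    columns onto the orthogonal complement of the chosen direction."""
    R = M.astype(float).copy()
    anchors = []
    for _ in range(r):
        j = int(np.argmax(np.linalg.norm(R, axis=0)))
        anchors.append(j)
        u = R[:, j] / np.linalg.norm(R[:, j])
        R -= np.outer(u, u @ R)           # project the chosen direction out
    return anchors

rng = np.random.default_rng(1)
W = rng.random((6, 3))                    # anchor columns (extreme rays)
H = rng.random((3, 8))
H /= H.sum(axis=0)                        # convex combinations of the anchors
M = np.hstack([W, W @ H])                 # separable matrix: anchors are columns 0,1,2
print(sorted(spa(M, 3)))
```

On this noiseless separable instance the three anchor columns are recovered exactly; robust variants handle perturbed data.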
Robust near-separable nonnegative matrix factorization using linear optimization
Journal of Machine Learning Research, 2014
Topic Discovery through Data Dependent and Random Projections
Abstract
We present algorithms for topic modeling based on the geometry of cross-document word-frequency patterns. This perspective gains significance under the so-called separability condition, which requires the existence of novel words that are unique to each topic. We present a suite of highly efficient algorithms with provable guarantees, based on data-dependent and random projections, to identify novel words and associated topics. Our key insight here is that the maximum and minimum values of cross-document frequency patterns projected along any direction are associated with novel words. While our sample complexity bounds for topic recovery are similar to the state of the art, the computational complexity of our random projection scheme scales linearly with the number of documents and the number of words per document. We present several experiments on synthetic and real-world datasets to demonstrate the qualitative and quantitative merits of our scheme.
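The "extremes of random projections" insight can be checked on a toy separable model: after row normalization, the cross-document frequency pattern of any non-novel word is a convex combination of the novel-word patterns, so the argmax/argmin of a generic projection is always a novel word. A minimal sketch (our own construction, not the paper's full algorithm):

```python
import numpy as np

rng = np.random.default_rng(2)
n_topics, n_docs = 3, 40
Theta = rng.dirichlet(np.ones(n_topics), size=n_docs).T   # topic-document weights
# Rows 0..2 are novel words (each loads on exactly one topic);
# the remaining 7 words mix all topics.
B = np.vstack([np.eye(n_topics), rng.dirichlet(np.ones(n_topics), size=7)])
X = B @ Theta                                             # word-document frequencies
X /= X.sum(axis=1, keepdims=True)                         # normalize each word's row

extremes = set()
for _ in range(50):                                       # random projections
    d = rng.standard_normal(n_docs)
    p = X @ d
    extremes |= {int(np.argmax(p)), int(np.argmin(p))}
print(extremes <= {0, 1, 2})
```

Every projection extreme falls in the novel-word set {0, 1, 2}; collecting extremes over many directions identifies the novel words, after which the topics can be estimated.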
Robustness analysis of Hottopixx, a linear programming model for factoring nonnegative matrices
SIAM Journal on Matrix Analysis and Applications, 2013
Convexity in source separation: Models, geometry, and algorithms
Abstract
Source separation or demixing is the process of extracting multiple components entangled within a signal. Contemporary signal processing presents a host of difficult source separation problems, from interference cancellation to background subtraction, blind deconvolution, and even dictionary learning. Despite the recent progress in each of these applications, advances in high-throughput sensor technology place demixing algorithms under pressure to accommodate extremely high-dimensional signals, separate an ever larger number of sources, and cope with more sophisticated signal and mixing models. These difficulties are exacerbated by the need for real-time action in automated decision-making systems. Recent advances in convex optimization provide a simple framework for efficiently solving numerous difficult demixing problems. This article provides an overview of the emerging field, explains the theory that governs the underlying procedures, and surveys algorithms that solve them efficiently. We aim to equip practitioners with a toolkit for constructing their own demixing algorithms that work, as well as concrete intuition for why they work.
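A minimal instance of convex demixing: an observation that superposes a few spikes (sparse in the standard basis) and a smooth wave (sparse in a DCT basis) can be separated by ℓ1-regularized least squares over the concatenated dictionary. The sketch below solves it with plain proximal gradient (ISTA); all parameter choices are illustrative, not from the article:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 64
# Orthonormal DCT-II basis: columns are cosine atoms.
i, k = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
D = np.sqrt(2.0 / n) * np.cos(np.pi * (i + 0.5) * k / n)
D[:, 0] /= np.sqrt(2.0)

# Mixed observation: three spikes plus one smooth cosine.
s_true = np.zeros(n); s_true[[5, 20, 47]] = 5.0
c_true = np.zeros(n); c_true[4] = 10.0
z = s_true + D @ c_true

# Demix: min 0.5*||s + D c - z||^2 + lam*(||s||_1 + ||c||_1), solved by ISTA
# over the concatenated dictionary [I, D] (spectral norm^2 = 2, so step = 1/2).
lam, step = 0.1, 0.5
soft = lambda x, t: np.sign(x) * np.maximum(np.abs(x) - t, 0.0)
s, c = np.zeros(n), np.zeros(n)
for _ in range(500):
    r = s + D @ c - z                       # shared residual
    s = soft(s - step * r, step * lam)      # prox step on the spike component
    c = soft(c - step * (D.T @ r), step * lam)  # prox step on the DCT component

print(sorted(int(j) for j in np.argsort(np.abs(s))[-3:]))
```

Because spikes and cosine atoms are mutually incoherent, the three largest entries of the recovered sparse component sit exactly at the true spike locations.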
The why and how of nonnegative matrix factorization
In Regularization, Optimization, Kernels, and Support Vector Machines, Chapman & Hall/CRC, 2014
Convex relaxations of structured matrix factorizations
2013
Abstract
We consider the factorization of a rectangular matrix X into a positive linear combination of rank-one factors of the form uv⊤, where u and v belong to certain sets U and V that may encode specific structures regarding the factors, such as positivity or sparsity. In this paper, we show that computing the optimal decomposition is equivalent to computing a certain gauge function of X, and we provide a detailed analysis of these gauge functions and their polars. Since these gauge functions are typically hard to compute, we present semidefinite relaxations and several algorithms that may recover approximate decompositions with approximation guarantees. We illustrate our results with simulations on finding decompositions with elements in {0,1}. As side contributions, we present a detailed analysis of variational quadratic representations of norms, as well as a new iterative basis pursuit algorithm that can deal with inexact first-order oracles.
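One concrete instance of such a gauge is well known: when U and V are Euclidean unit spheres, the cheapest positive combination of rank-one factors uv⊤ costs exactly the nuclear norm of X, and the SVD attains it. A quick numerical check (our own illustration of this special case):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((5, 4))

# SVD gives X as a positive combination of rank-one factors with unit-norm u, v;
# the combination weights are the singular values.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(len(s)))

# The total weight equals the nuclear norm: the gauge value for spherical U, V.
print(np.allclose(X, X_rebuilt), np.isclose(s.sum(), np.linalg.norm(X, "nuc")))
```

Structured choices of U and V (e.g. nonnegative or sparse vectors) change the gauge and generally make it hard to compute, which is what motivates the relaxations in the paper.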
Spectral Methods for Supervised Topic Models
Abstract
Supervised topic models simultaneously model the latent topic structure of large collections of documents and a response variable associated with each document. Existing inference methods are based on either variational approximation or Monte Carlo sampling. This paper presents a novel spectral decomposition algorithm to recover the parameters of supervised latent Dirichlet allocation (sLDA) models. The Spectral-sLDA algorithm is provably correct and computationally efficient. We prove a sample complexity bound and subsequently derive a sufficient condition for the identifiability of sLDA. Thorough experiments on a diverse range of synthetic and real-world datasets verify the theory and demonstrate the practical effectiveness of the algorithm.
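A small sanity check of the moment structure that spectral methods of this kind rely on: in a topic model the population word co-occurrence matrix M2 = Σ_k w_k μ_k μ_kᵀ has rank equal to the number of topics, which is the low-rank structure that makes the parameters recoverable by spectral decomposition (toy construction, not the paper's full Spectral-sLDA algorithm):

```python
import numpy as np

rng = np.random.default_rng(4)
V, K = 20, 3                                   # vocabulary size, number of topics
mu = rng.dirichlet(np.ones(V), size=K).T       # topic-word distributions as columns
w = np.array([0.5, 0.3, 0.2])                  # topic mixing weights

# Population second-order moment: sum_k w_k * mu_k mu_k^T.
M2 = (mu * w) @ mu.T
print(np.linalg.matrix_rank(M2))               # equals K when topics are independent
```

Recovering the topics themselves additionally requires third-order moments (a tensor decomposition), plus the response moments for the supervised part.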