Algorithms and applications for approximate nonnegative matrix factorization
Computational Statistics and Data Analysis, 2006. Cited by 81 (6 self)
In this paper we discuss the development and use of low-rank approximate nonnegative matrix factorization (NMF) algorithms for feature extraction and identification in the fields of text mining and spectral data analysis. The evolution and convergence properties of hybrid methods based on both sparsity and smoothness constraints for the resulting nonnegative matrix factors are discussed. The interpretability of NMF outputs in specific contexts is discussed, along with opportunities for future work in the modification of NMF algorithms for large-scale and time-varying datasets. Key words: nonnegative matrix factorization, text mining, spectral data analysis, email surveillance, conjugate gradient, constrained least squares.
Projected gradient methods for non-negative matrix factorization
Neural Computation, 2007. Cited by 76 (1 self)
Non-negative matrix factorization (NMF) can be formulated as a minimization problem with bound constraints. Although bound-constrained optimization has been studied extensively in both theory and practice, so far no study has formally applied its techniques to NMF. In this paper, we propose two projected gradient methods for NMF, both of which exhibit strong optimization properties. We discuss efficient implementations and demonstrate that one of the proposed methods converges faster than the popular multiplicative update approach. A simple MATLAB code is also provided.
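To make the idea of a projected gradient step concrete, here is a minimal pure-Python sketch. It is not the paper's algorithm or its MATLAB code: the function names, the alternating scheme, and the simple backtracking rule (accept the first step that lowers the error) are all assumptions chosen for illustration.

```python
import random

def matmul(A, B):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def sq_error(V, W, H):
    """Objective 0.5 * ||V - W H||_F^2."""
    WH = matmul(W, H)
    return 0.5 * sum((v - p) ** 2 for rv, rp in zip(V, WH) for v, p in zip(rv, rp))

def grad_H(V, W, H):
    """Gradient of the objective with respect to H: W^T (W H - V)."""
    WH = matmul(W, H)
    R = [[p - v for p, v in zip(rp, rv)] for rp, rv in zip(WH, V)]
    return matmul(transpose(W), R)

def projected_step_H(V, W, H, step=1.0, shrink=0.5, max_tries=30):
    """One gradient step on H, projected onto H >= 0, with naive backtracking."""
    G = grad_H(V, W, H)
    base = sq_error(V, W, H)
    for _ in range(max_tries):
        # Projection onto the nonnegative orthant is an elementwise max with 0.
        H_new = [[max(0.0, h - step * g) for h, g in zip(rh, rg)]
                 for rh, rg in zip(H, G)]
        if sq_error(V, W, H_new) < base:
            return H_new
        step *= shrink
    return H  # no improving step found; keep the current iterate

def nmf_projected_gradient(V, rank, iters=200, seed=0):
    """Alternate projected steps on H and W (hypothetical driver)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        H = projected_step_H(V, W, H)
        # Update W via the transposed problem V^T ~ H^T W^T.
        Wt = projected_step_H(transpose(V), transpose(H), transpose(W))
        W = transpose(Wt)
    return W, H
```

Because a step is accepted only when it lowers the error, the objective never increases in this sketch, and the projection keeps both factors nonnegative.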
Generalized nonnegative matrix approximations with Bregman divergences
In: Neural Information Proc. Systems, 2005. Cited by 43 (4 self)
Nonnegative matrix approximation (NNMA) is a recent technique for dimensionality reduction and data analysis that yields a parts-based, sparse nonnegative representation for nonnegative input data. NNMA has found a wide variety of applications, including text analysis, document clustering, face/image recognition, language modeling, speech processing and many others. Despite these numerous applications, the algorithmic development for computing the NNMA factors has been relatively deficient. This paper makes algorithmic progress by modeling and solving (using multiplicative updates) new generalized NNMA problems that minimize Bregman divergences between the input matrix and its low-rank approximation. The multiplicative update formulae in the pioneering work by Lee and Seung [11] arise as a special case of our algorithms. In addition, the paper shows how to use penalty functions for incorporating constraints other than nonnegativity into the problem. Further, some interesting extensions to the use of “link” functions for modeling nonlinear relationships are also discussed.
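As a brief illustration of what "Bregman divergence" means here, the sketch below evaluates the standard definition D_phi(x, y) = phi(x) - phi(y) - phi'(y)(x - y) elementwise. The helper names and the example matrices are assumptions for illustration, not taken from the paper. The two generating functions shown give half the squared error and the generalized KL divergence.

```python
import math

def bregman(phi, dphi, x, y):
    """Scalar Bregman divergence D_phi(x, y) = phi(x) - phi(y) - phi'(y) * (x - y)."""
    return phi(x) - phi(y) - dphi(y) * (x - y)

def matrix_divergence(phi, dphi, V, A):
    """Sum of elementwise Bregman divergences between V and an approximation A."""
    return sum(bregman(phi, dphi, v, a)
               for row_v, row_a in zip(V, A)
               for v, a in zip(row_v, row_a))

# phi(x) = x^2 / 2 gives half the squared Frobenius error.
half_square = (lambda x: 0.5 * x * x, lambda x: x)
# phi(x) = x log x - x gives the generalized KL divergence (entries must be > 0).
neg_entropy = (lambda x: x * math.log(x) - x, lambda x: math.log(x))

# Hypothetical input matrix V and approximation A.
V = [[1.0, 2.0], [3.0, 4.0]]
A = [[1.5, 2.0], [2.0, 5.0]]

frob = matrix_divergence(*half_square, V, A)
kl = matrix_divergence(*neg_entropy, V, A)
```

Swapping the pair `(phi, dphi)` is the only change needed to move between objectives, which is the sense in which a single framework can cover several error measures.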
Fast Newton-type methods for the least squares nonnegative matrix approximation problem
Statistical Analysis and Data Mining, 2008. Cited by 22 (4 self)
Nonnegative Matrix Approximation is an effective matrix decomposition technique that has proven to be useful for a wide variety of applications ranging from document analysis and image processing to bioinformatics. There exist a few algorithms for nonnegative matrix approximation (NNMA), for example, Lee & Seung’s multiplicative updates, alternating least squares, and certain gradient descent based procedures. All of these procedures suffer from either slow convergence, numerical instabilities, or at worst, theoretical unsoundness. In this paper we present new and improved algorithms for the least-squares NNMA problem, which are not only theoretically well-founded, but also overcome many of the deficiencies of other methods. In particular, we use non-diagonal gradient scaling to obtain rapid convergence. Our methods provide numerical results superior to both Lee & Seung’s method as well as to the alternating least squares (ALS) heuristic, which is known to work well in some situations but has no theoretical guarantees (Berry et al. 2006). Our approach extends naturally to include regularization and box-constraints, without sacrificing convergence guarantees. We present experimental results on both synthetic and real-world datasets to demonstrate the superiority of our methods, in terms of better approximations as well as efficiency.
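For readers unfamiliar with the baseline named above, the following is a minimal pure-Python sketch of the commonly cited Lee & Seung multiplicative updates for the least-squares objective. It illustrates the baseline only, not the Newton-type methods of this paper; the function names, the small `eps` guard in the denominators, and the random initialization are assumptions.

```python
import random

def matmul(A, B):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frob_error(V, W, H):
    """Squared Frobenius error ||V - W H||_F^2."""
    WH = matmul(W, H)
    return sum((v - p) ** 2 for rv, rp in zip(V, WH) for v, p in zip(rv, rp))

def multiplicative_update(V, W, H, eps=1e-12):
    """One sweep: H <- H * (W^T V) / (W^T W H), then W <- W * (V H^T) / (W H H^T)."""
    Wt = transpose(W)
    num = matmul(Wt, V)
    den = matmul(matmul(Wt, W), H)
    H = [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
         for rh, rn, rd in zip(H, num, den)]
    Ht = transpose(H)
    num = matmul(V, Ht)
    den = matmul(W, matmul(H, Ht))
    W = [[w * n / (d + eps) for w, n, d in zip(rw, rn, rd)]
         for rw, rn, rd in zip(W, num, den)]
    return W, H

def nmf_multiplicative(V, rank, iters=100, seed=0):
    """Hypothetical driver: random nonnegative start, then repeated sweeps."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() for _ in range(m)] for _ in range(rank)]
    errors = [frob_error(V, W, H)]
    for _ in range(iters):
        W, H = multiplicative_update(V, W, H)
        errors.append(frob_error(V, W, H))
    return W, H, errors
```

The updates only multiply by nonnegative ratios, so nonnegativity is preserved without an explicit projection; the slow convergence the abstract refers to can be observed by inspecting the returned `errors` list.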
Enhanced line search: A novel method to accelerate Parafac
In: Eusipco’05, 2005. Cited by 16 (7 self)
Several modifications have been proposed to speed up the alternating least squares (ALS) method of fitting the PARAFAC model. The most widely used is line search, which extrapolates from linear trends in the parameter changes over prior iterations to estimate the parameter values that would be obtained after many additional ALS iterations. We propose some extensions of this approach that incorporate a more sophisticated extrapolation, using information on nonlinear trends in the parameters and changing all the parameter sets simultaneously. The new method, called “enhanced line search” (ELS), can be implemented at different levels of complexity, depending on how many different extrapolation parameters (for different modes) are jointly optimized during each iteration. We report some tests of the simplest parameter version, using simulated data. The performance of this lowest level of ELS depends on the nature of the convergence difficulty. It significantly outperforms standard LS when there is a “convergence bottleneck,” a situation where some modes have almost collinear factors but others do not, but is somewhat less effective in classic “swamp” situations where factors are highly collinear in all modes. This is illustrated by examples. To demonstrate how ELS can be adapted to different N-way decompositions, we also apply it to a four-way array to perform a blind identification of an under-determined mixture (UDM). Since analysis of this dataset happens to involve a serious convergence “bottleneck” (collinear factors in two of the four modes), it provides another example of a situation in which ELS dramatically outperforms standard line search. Key words. PARAFAC, alternating least squares (ALS), line search, enhanced line search (ELS), acceleration, swamps, bottlenecks, collinear factors, degeneracy. DOI: 10.1137/06065577
Nonnegative matrix approximation: algorithms and applications
2006. Cited by 11 (3 self)
Low-dimensional data representations are crucial to numerous applications in machine learning, statistics, and signal processing. Nonnegative matrix approximation (NNMA) is a method for dimensionality reduction that respects the nonnegativity of the input data while constructing a low-dimensional approximation. NNMA has been used in a multitude of applications, though without commensurate theoretical development. In this report we describe generic methods for minimizing generalized divergences between the input and its low-rank approximant. Some of our general methods are even extensible to arbitrary convex penalties. Our methods yield efficient multiplicative iterative schemes for solving the proposed problems. We also consider interesting extensions such as the use of penalty functions, non-linear relationships via “link” functions, weighted errors, and multi-factor approximations. We present some experiments as an illustration of our algorithms. For completeness, the report also includes a brief literature survey of the various algorithms and the applications of NNMA. Keywords: Nonnegative matrix factorization, weighted approximation, Bregman divergence, multiplicative
Fast Projection-Based Methods for the Least Squares Nonnegative Matrix Approximation Problem
2007. Cited by 4 (2 self)
Nonnegative matrix approximation (NNMA) is a popular matrix decomposition technique that has proven to be useful across a diverse variety of fields with applications ranging from document analysis and image processing to bioinformatics and signal processing. Over the years, several algorithms for NNMA have been proposed, e.g. Lee and Seung’s multiplicative updates, alternating least squares (ALS), and gradient descent-based procedures. However, most of these procedures suffer from either slow convergence, numerical instability, or at worst, serious theoretical drawbacks. In this paper, we develop a new and improved algorithmic framework for the least-squares NNMA problem, which is not only theoretically well-founded, but also overcomes many deficiencies of other methods. Our framework readily admits powerful optimization techniques and as concrete realizations we present implementations based on the Newton, BFGS and conjugate gradient methods. Our algorithms provide numerical results superior to both Lee and Seung’s method as well as to the alternating least squares heuristic, which was reported to work well in some situations but has no theoretical guarantees [1]. Our approach extends naturally to include regularization and box-constraints without sacrificing convergence guarantees. We present experimental results on both synthetic and real-world datasets that demonstrate the superiority of our methods, both in terms of better approximations as well as
Receptor Modeling of Ambient Particulate Matter Data Using Positive Matrix Factorization: Review of Existing Methods
Cited by 1 (1 self)
Methods for apportioning sources of ambient particulate matter (PM) using the positive matrix factorization (PMF) algorithm are reviewed. Numerous procedural decisions must be made and algorithmic parameters selected when analyzing PM data with PMF. However, few publications document enough of these details for readers to evaluate, reproduce, or compare results between different studies. For example, few studies document why some species were used and others not used in the modeling, how the number of factors was selected, or how much uncertainty exists in the solutions. More thorough documentation will aid the development of standard protocols for analyzing PM data with PMF and will reveal more clearly where research is needed to help future analysts select from the various possible procedures and parameters available in PMF. For example, research likely is needed to determine optimal approaches for handling data below detection limits, ways to apportion PM mass among sources identified by PMF, and ways to estimate uncertainties in the solution. The review closes with recommendations for documenting the methodological details of future PMF analyses.
Modeling multi-way data with linearly dependent loadings
A generalization/specialization of the PARAFAC model is developed that improves its properties when applied to multi-way problems involving linearly dependent factors. This model is called PARALIND (PARAllel profiles with LINear Dependences). Linear dependences can arise when the empirical sources of variation being modeled by factors are causally or logically linked during data generation, or circumstantially linked during data collection. For example, this can occur in a chemical context when end products are related to the precursor or in a psychological context when a single stimulus generates two incompatible feelings at once. For such cases, the most theoretically appropriate PARAFAC model has loading vectors that are linearly dependent in at least one mode, and when collinear, are nonunique in the others. However, standard PARAFAC analysis of fallible data will have neither of these features. Instead, latent linear dependences become high surface correlations and any latent nonuniqueness is replaced by a meaningless surface-level ‘unique orientation’ that optimally fits the particular random noise in that sample. To avoid these problems, any set of components that in theory should be rank deficient is re-expressed in PARALIND as a product of two matrices, one that explicitly represents their dependency relationships and another, with fewer columns, that captures their patterns of variation. To demonstrate the approach, we apply it first to fluorescence spectroscopy (excitation-emission matrices, EEM) data in which concentration values for two analytes covary exactly, and then to flow injection analysis (FIA) data in which subsets of columns are logically constrained to sum to a constant, but differently in
Integer Matrix factorization and its Application
2005
Matrix factorization has been of fundamental importance in modern science and technology. This work investigates the notion of factorization with entries restricted to integers or binaries, where the “integer” could be either the regular ordinal integers or just some nominal labels. Being discrete in nature, such a factorization or approximation cannot be accomplished by conventional techniques. Built upon a basic scheme of rank-one approximation, an approach that recursively splits (approximates) the underlying matrix into a sum of rank-one matrices with discrete entries is proposed. Various computational issues involved in this kind of factorization, which must take into account the metric being used for measurement, are addressed. The mechanism presented in this paper can handle multiple types of data. For application purposes, the discussion focuses mainly on binary-integer factorizations, but the notion is readily generalizable with slight modifications to, for example, integer-integer factorizations. Applications to cluster analysis and pattern discovery are demonstrated through some real-world data. Of particular interest is the result on the ordering of rank-one approximations which, in a remote sense, is the analogue for discrete data of the ordering of singular values for continuous data. As such, a truncated low-rank factorization (for discrete data) analogous to the truncated singular value decomposition is also obtained.

