Results 1–10 of 12
The why and how of nonnegative matrix factorization
In Regularization, Optimization, Kernels, and Support Vector Machines, Chapman & Hall/CRC, 2014
Efficient Distributed Topic Modeling with Provable Guarantees
Cited by 5 (4 self)
Abstract
Topic modeling for large-scale distributed web collections requires distributed techniques that account for both computational and communication costs. We consider topic modeling under the separability assumption and develop novel computationally efficient methods that provably achieve the statistical performance of the state-of-the-art centralized approaches while requiring insignificant communication between the distributed document collections. We achieve trade-offs between communication and computation without actually transmitting the documents. Our scheme is based on exploiting the geometry of the normalized word-word co-occurrence matrix and viewing each row of this matrix as a vector in a high-dimensional space. We relate the solid angle subtended by extreme points of the convex hull of these vectors to topic identities and construct distributed schemes to identify topics.
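The extreme-point view in this abstract can be illustrated with a small sketch. Under exact separability, the rows corresponding to novel words are the vertices of the convex hull of the data rows, and a generic successive-projection pass recovers them. This is only an illustration of the geometric idea, not the paper's distributed solid-angle scheme; the toy matrix `X` and the function name are invented for the example.

```python
import numpy as np

def extreme_points(X, k):
    # Successive projection: repeatedly pick the row of largest norm,
    # then project all rows onto the orthogonal complement of that row.
    # Under exact separability the picked rows are hull vertices.
    R = X.astype(float).copy()
    picked = []
    for _ in range(k):
        i = int(np.argmax(np.linalg.norm(R, axis=1)))
        picked.append(i)
        u = R[i] / np.linalg.norm(R[i])
        R = R - np.outer(R @ u, u)
    return picked

# Toy separable data: rows 0-2 are the extreme rows (novel words),
# rows 3-4 are convex combinations of them (non-novel words).
V = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
W = np.array([[0.5, 0.3, 0.2], [0.2, 0.2, 0.6]])
X = np.vstack([V, W @ V])
print(sorted(extreme_points(X, 3)))  # [0, 1, 2]
```

In the distributed setting described above, the interesting part is doing this selection without shipping the documents themselves; the sketch only shows the centralized geometric primitive.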
Necessary and sufficient conditions for novel word detection in separable topic models
In Advances in Neural Information Processing Systems (NIPS), Workshop on Topic Models: Computation, Application, Lake Tahoe, 2013
Ellipsoidal Rounding for Nonnegative Matrix Factorization Under Noisy Separability, 2013
Cited by 3 (0 self)
Abstract
We present a numerical algorithm for nonnegative matrix factorization (NMF) problems under noisy separability. An NMF problem under separability can be stated as one of finding all vertices of the convex hull of data points. This paper focuses on finding vectors as close to those vertices as possible when noise has been added to the data points. Our algorithm is designed to capture the shape of the convex hull of the data points by using its enclosing ellipsoid. We show that the algorithm has correctness and robustness properties from theoretical and practical perspectives: correctness means that if the data points contain no noise, the algorithm finds the vertices of their convex hull; robustness means that if the data points contain noise, the algorithm finds near-vertices. Finally, we apply the algorithm to document clustering and report experimental results.
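In the noiseless case described above, the problem literally reduces to enumerating hull vertices, which a standard computational-geometry routine can do for low-dimensional toy data. The snippet below uses SciPy's `ConvexHull` as a stand-in for that noiseless problem statement; it is not the paper's ellipsoidal-rounding algorithm, which targets the noisy, high-dimensional regime. The example points are invented.

```python
import numpy as np
from scipy.spatial import ConvexHull

# Five 2-D data points: three hull vertices plus two interior points
# that are convex combinations of the vertices.
pts = np.array([
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [0.3, 0.3],  # interior: 0.4*(0,0) + 0.3*(1,0) + 0.3*(0,1)
    [0.2, 0.1],  # interior
])
hull = ConvexHull(pts)
print(sorted(hull.vertices))  # indices of the vertex points: [0, 1, 2]
```

With noise added, the interior points can drift outside the true hull, which is exactly why the abstract distinguishes "vertices" from "near-vertices".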
On some provably correct cases of variational inference for topic models.
In NIPS, 2015
Cited by 1 (1 self)
Abstract
Variational inference is an efficient, popular heuristic used in the context of latent variable models. We provide the first analysis of instances where variational inference algorithms converge to the global optimum, in the setting of topic models. Our initializations are natural, one of them being used in LDA-c, the most popular implementation of variational inference. In addition to providing intuition into why this heuristic might work in practice, the multiplicative, rather than additive, nature of the variational inference updates forces us to use non-standard proof arguments, which we believe might be of general theoretical interest.
A Topic Modeling Approach to Rank Aggregation
Cited by 1 (1 self)
Abstract
We propose a new model for rank aggregation from pairwise comparisons that captures both ranking heterogeneity across users and ranking inconsistency for each user. We establish a formal statistical equivalence between the new model and topic models. We leverage recent advances in the topic modeling literature to develop an algorithm that can learn shared latent rankings with provable statistical and computational efficiency guarantees. The method is also shown to empirically outperform competing approaches on some semi-synthetic and real-world datasets.
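The claimed equivalence hinges on re-encoding comparison data in document form: each user's observed comparisons play the role of a document, each ordered outcome "i beats j" plays the role of a word, and the shared latent rankings play the role of topics. The paper's exact construction may differ; the token format and function name below are invented purely to illustrate the reduction.

```python
def comparisons_to_doc(comparisons):
    """Encode a user's pairwise comparisons as a bag of tokens.

    Each observed outcome "i beats j" becomes the token "i>j", so a
    user's comparison history can be handed to any bag-of-words topic
    model, with shared rankings acting as topics.
    """
    return [f"{i}>{j}" for (i, j) in comparisons]

# One user who preferred item 1 over 2, item 3 over 1, and item 2 over 3
# (an inconsistent cycle, which the mixture model is meant to tolerate).
print(comparisons_to_doc([(1, 2), (3, 1), (2, 3)]))  # ['1>2', '3>1', '2>3']
```

Under this encoding, inconsistent users are simply documents whose tokens are drawn from a mixture of several ranking "topics" rather than a single one.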
Successive Nonnegative Projection Algorithm for Robust Nonnegative Blind Source Separation
A Topic Modeling Approach to Ranking
Cited by 1 (1 self)
Abstract
We propose a topic modeling approach to the prediction of preferences in pairwise comparisons. We develop a new generative model for pairwise comparisons that accounts for multiple shared latent rankings that are prevalent in a population of users. This new model also captures inconsistent user behavior in a natural way. We show how the estimation of latent rankings in the new generative model can be formally reduced to the estimation of topics in a statistically equivalent topic modeling problem. We leverage recent advances in the topic modeling literature to develop an algorithm that can learn shared latent rankings with provable consistency as well as sample and computational complexity guarantees. We demonstrate that the new approach is empirically competitive with the current state-of-the-art approaches in predicting preferences on some semi-synthetic and real-world datasets.
Most Large Topic Models are Approximately Separable
Abstract
Separability has recently been leveraged as a key structural condition in topic models to develop asymptotically consistent algorithms with polynomial statistical and computational efficiency guarantees. Separability corresponds to the presence of at least one novel word for each topic. Empirical estimates of topic matrices for Latent Dirichlet Allocation models have been observed to be approximately separable. Separability may be a convenient structural property, but it appears to be too restrictive a condition. In this paper we explicitly demonstrate that separability is, in fact, an inevitable consequence of high-dimensionality. In particular, we prove that when the columns of the topic matrix are independently sampled from a Dirichlet distribution, the resulting topic matrix will be approximately separable with probability tending to one as the number of rows (vocabulary size) scales to infinity sufficiently faster than the number of columns (topics). This is based on combining concentration of measure results with properties of the Dirichlet distribution and union bounding arguments. Our proof techniques can be extended to other priors for general nonnegative matrices.
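The high-dimensionality effect described here is easy to observe numerically: sample the columns of a topic matrix independently from a sparse Dirichlet prior and check, for each topic, how close its most "novel" word comes to being exclusive to that topic. The score below is one simple proxy for approximate separability, chosen for this illustration rather than taken from the paper, and the parameter values are invented.

```python
import numpy as np

def min_novelty(W, K, alpha=0.01, seed=0):
    # W x K topic matrix whose columns are i.i.d. Dirichlet(alpha, ..., alpha) draws.
    rng = np.random.default_rng(seed)
    beta = rng.dirichlet(np.full(W, alpha), size=K).T
    # For each word, the fraction of its total mass owned by each topic;
    # a topic has an (approximately) novel word where this fraction is near 1.
    # The tiny constant guards against an all-zero word row after underflow.
    share = beta / (beta.sum(axis=1, keepdims=True) + 1e-300)
    # Worst topic's best word: 1.0 would mean every topic has an exactly novel word.
    return float(share.max(axis=0).min())

# The score tends toward 1 as the vocabulary W grows relative to the topic count K.
for W in (100, 1000, 10000):
    print(W, round(min_novelty(W, K=5), 4))
```

This only probes the phenomenon under one sparse prior; the paper's result is a proof that the effect holds with probability tending to one in the stated asymptotic regime.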
Learning Shared Rankings from Mixtures of Noisy Pairwise Comparisons
Abstract
We propose a novel model for rank aggregation from pairwise comparisons which accounts for a heterogeneous population of inconsistent users whose preferences are different mixtures of multiple shared ranking schemes. By connecting this problem to recent advances in the nonnegative matrix factorization (NMF) literature, we develop an algorithm that can learn the underlying shared rankings with provable statistical and computational efficiency guarantees. We validate the approach using semi-synthetic and real-world datasets.
Index Terms: Rank aggregation, nonnegative matrix factorization, extreme point finding, random projection