Results 1 - 5 of 5
Latent Dirichlet allocation
Journal of Machine Learning Research, 2003
Cited by 2350 (63 self)
We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model.
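The generative view in this abstract (each document is a mixture over topics, and each word is drawn from one topic) can be illustrated with a minimal sketch. The topic-word table `topic_word`, the Dirichlet parameter `alpha`, and the function names are hypothetical choices made for illustration; this shows only the sampling direction, not the variational inference or EM procedure the abstract mentions.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet by normalising independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topic_word, alpha, n_words, rng):
    """Generate one document under an LDA-style generative process."""
    theta = sample_dirichlet(alpha, rng)  # per-document topic proportions
    topics = list(range(len(topic_word)))
    vocab = list(range(len(topic_word[0])))
    assignments, words = [], []
    for _ in range(n_words):
        z = rng.choices(topics, weights=theta)[0]          # topic for this token
        w = rng.choices(vocab, weights=topic_word[z])[0]   # word from that topic
        assignments.append(z)
        words.append(w)
    return theta, assignments, words


# Assumed toy setup: 2 topics over a 4-word vocabulary (word ids 0..3).
rng = random.Random(0)
topic_word = [
    [0.5, 0.5, 0.0, 0.0],  # topic 0 only emits words 0 and 1
    [0.0, 0.0, 0.5, 0.5],  # topic 1 only emits words 2 and 3
]
theta, z, doc = generate_document(topic_word, [0.5, 0.5], 20, rng)
```

Here `theta` plays the role of the "topic probabilities" that the abstract describes as an explicit representation of a document.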
Hierarchical topic models and the nested Chinese restaurant process
Advances in Neural Information Processing Systems, 2004
Cited by 188 (25 self)
We address the problem of learning topic hierarchies from data. The model selection problem in this domain is daunting: which of the large collection of possible trees to use? We take a Bayesian approach, generating an appropriate prior via a distribution on partitions that we refer to as the nested Chinese restaurant process. This nonparametric prior allows arbitrarily large branching factors and readily accommodates growing data collections. We build a hierarchical topic model by combining this prior with a likelihood that is based on a hierarchical variant of latent Dirichlet allocation. We illustrate our approach on simulated data and with an application to the modeling of NIPS abstracts.
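As a rough illustration of how a nested Chinese restaurant process can place a prior over trees, the sketch below draws root-to-leaf paths by running an ordinary Chinese restaurant process at every node: an existing child is chosen with probability proportional to how many earlier paths passed through it, and a new child with probability proportional to a concentration parameter. The dictionary-based tree, the parameter name `gamma`, and the convention that `depth` counts levels below the root are assumptions made for this example, not details taken from the paper.

```python
import random


def crp_choice(counts, gamma, rng):
    """Return an existing index (weight = its count) or len(counts) for a new one (weight = gamma)."""
    weights = counts + [gamma]
    return rng.choices(range(len(weights)), weights=weights)[0]


def sample_path(tree, depth, gamma, rng):
    """Walk down from the root, choosing a child at each level by a CRP and updating counts."""
    node = tree
    path = []
    for _ in range(depth):
        children = node["children"]
        k = crp_choice([c["count"] for c in children], gamma, rng)
        if k == len(children):
            # A previously unseen branch: the tree grows as more data arrive.
            children.append({"count": 0, "children": []})
        children[k]["count"] += 1
        path.append(k)
        node = children[k]
    return path


rng = random.Random(1)
tree = {"count": 0, "children": []}
paths = [sample_path(tree, depth=3, gamma=1.0, rng=rng) for _ in range(10)]
```

Because a new child can be created at any node, the branching factor is not fixed in advance, which matches the abstract's remark that the prior allows arbitrarily large branching factors.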
Learning diverse rankings with multi-armed bandits
In Proceedings of the 25th ICML, 2008
Cited by 56 (4 self)
Algorithms for learning to rank Web documents usually assume a document’s relevance is independent of other documents. This leads to learned ranking functions that produce rankings with redundant results. In contrast, user studies have shown that diversity at high ranks is often preferred. We present two online learning algorithms that directly learn a diverse ranking of documents based on users’ clicking behavior. We show that these algorithms minimize abandonment, or alternatively, maximize the probability that a relevant document is found in the top k positions of a ranking. Moreover, one of our algorithms asymptotically achieves optimal worst-case performance even if users’ interests change.
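The abandonment objective can be made concrete with a small sketch. It assumes each user is represented by the set of documents they would find relevant, which is a simplification introduced here for illustration; the function name `abandonment` and the toy data are hypothetical, and the sketch does not implement either of the paper's learning algorithms.

```python
def abandonment(ranking, relevant_sets, k):
    """Fraction of users who find no relevant document in the top k positions."""
    top_k = set(ranking[:k])
    misses = sum(1 for relevant in relevant_sets if not (top_k & relevant))
    return misses / len(relevant_sets)


# Assumed toy population: four users with different interests.
users = [{"a"}, {"b"}, {"a", "c"}, {"d"}]

redundant = ["a", "c", "b", "d"]  # top 2 serve the same users twice
diverse = ["a", "b", "d", "c"]    # top 2 cover more distinct interests

redundant_rate = abandonment(redundant, users, k=2)
diverse_rate = abandonment(diverse, users, k=2)
```

In this toy case the redundant ranking leaves two of four users without a relevant result in the top two positions, while the diverse ranking leaves one.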
Information compression and retention in dynamical processes
2001
Cited by 2 (2 self)
We discuss some recent work on various constructions that accumulate or remove information within dynamical systems: tail fields, numeration systems and formal languages (especially of beta-shifts), and factor mappings between symbolic or tiling dynamical systems.
MAD-Bayes: MAP-based Asymptotic Derivations from Bayes
The classical mixture of Gaussians model is related to K-means via small-variance asymptotics: as the covariances of the Gaussians tend to zero, the negative log-likelihood of the mixture of Gaussians model approaches the K-means objective, and the EM algorithm approaches the K-means algorithm. Kulis & Jordan (2012) used this observation to obtain a novel K-means-like algorithm from a Gibbs sampler for the Dirichlet process (DP) mixture. We instead consider applying small-variance asymptotics directly to the posterior in Bayesian nonparametric models. This framework is independent of any specific Bayesian inference algorithm, and it has the major advantage that it generalizes immediately to a range of models beyond the DP mixture. To illustrate, we apply our framework to the feature learning setting, where the beta process and Indian buffet process provide an appropriate Bayesian nonparametric prior. We obtain a novel objective function that goes beyond clustering to learn (and penalize new) groupings for which we relax the mutual exclusivity and exhaustivity assumptions of clustering. We demonstrate several other algorithms, all of which are scalable and simple to implement. Empirical results demonstrate the benefits of the new framework. Proceedings of the 30th
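The small-variance limit in the opening sentence of this abstract can be written out as a short worked derivation. The notation is an assumption made for illustration: $N$ data points $x_n \in \mathbb{R}^D$, $K$ components with means $\mu_k$, fixed mixing weights $\pi_k > 0$, and a shared spherical covariance $\sigma^2 I$.

```latex
% E-step responsibilities of a spherical Gaussian mixture
r_{nk} = \frac{\pi_k \exp\!\left(-\lVert x_n - \mu_k \rVert^2 / 2\sigma^2\right)}
              {\sum_{j=1}^{K} \pi_j \exp\!\left(-\lVert x_n - \mu_j \rVert^2 / 2\sigma^2\right)}
\;\xrightarrow[\sigma^2 \to 0]{}\;
\begin{cases}
1 & \text{if } k = \arg\min_j \lVert x_n - \mu_j \rVert^2 \\
0 & \text{otherwise}
\end{cases}

% Scaled negative log-likelihood
-2\sigma^2 \log p(x_{1:N} \mid \mu, \pi, \sigma^2)
= \sum_{n=1}^{N} \left[ -2\sigma^2 \log \sum_{k=1}^{K} \pi_k
    \exp\!\left(-\frac{\lVert x_n - \mu_k \rVert^2}{2\sigma^2}\right)
    + \sigma^2 D \log(2\pi\sigma^2) \right]
\;\xrightarrow[\sigma^2 \to 0]{}\;
\sum_{n=1}^{N} \min_{k} \lVert x_n - \mu_k \rVert^2
```

The first limit assumes the nearest mean is unique. It turns the soft E-step into the hard assignment step of K-means, and the second limit recovers the K-means objective, since the term $\sigma^2 D \log(2\pi\sigma^2)$ vanishes as $\sigma^2 \to 0$.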