Latent Dirichlet Allocation (2001) [485 citations — 23 self]

Abstract:

We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms. We present empirical results on applications of this model to problems in text modeling, collaborative filtering, and text classification.
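The generative process the abstract describes can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. The toy vocabulary, the two topic-word distributions, the Dirichlet parameter `alpha`, and the function names `sample_dirichlet` and `generate_document` are all assumptions made for the example. It covers only the sampling of a document, not the variational inference and learning.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(alpha, topics, vocab, num_words, rng):
    """Generate one document as a mixture of topics.

    theta holds the per-document mixture proportions (the latent Dirichlet
    variable); each word gets its own topic draw from theta, then a word
    draw from that topic's distribution over the vocabulary.
    """
    theta = sample_dirichlet(alpha, rng)
    topic_ids = range(len(topics))
    words = []
    for _ in range(num_words):
        z = rng.choices(topic_ids, weights=theta)[0]  # topic for this position
        w = rng.choices(vocab, weights=topics[z])[0]  # word from that topic
        words.append(w)
    return theta, words


# Hypothetical toy setup: 4-word vocabulary, 2 topics (rows sum to 1).
vocab = ["gene", "cell", "vote", "law"]
topics = [
    [0.5, 0.5, 0.0, 0.0],  # assumed "biology" topic
    [0.0, 0.0, 0.5, 0.5],  # assumed "politics" topic
]
alpha = [0.5, 0.5]  # assumed symmetric Dirichlet parameter

rng = random.Random(0)
theta, words = generate_document(alpha, topics, vocab, num_words=8, rng=rng)
print("topic proportions:", theta)
print("document:", words)
```

Because `theta` is drawn afresh for every document, different documents mix the same topics in different proportions, which is what distinguishes this model from a mixture of unigrams, where each document is assigned a single topic.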

Citations

1636 Indexing by latent semantic analysis – Deerwester, Dumais, et al. - 1990
1439 Modern Information Retrieval – Baeza-Yates, Ribeiro-Neto - 1999
805 Making large-scale SVM learning practical – Joachims - 1999
606 Bayesian Data Analysis – Gelman, Carlin, et al. - 1995
495 Text classification from labeled and unlabeled documents using EM – Nigam, McCallum, et al. - 2000
494 Statistical methods for speech recognition – Jelinek - 1997
464 An introduction to variational methods for graphical models – Jordan, Ghahramani, et al. - 1999
357 Learning in Graphical Models – Jordan - 1998
305 Probabilistic latent semantic indexing – Hofmann - 1999
158 Latent semantic indexing: A probabilistic analysis – Papadimitriou, Tamaki, et al. - 1998
157 Using maximum entropy for text classification – Nigam, Lafferty, et al. - 1999
149 Overview of the first text retrieval conference (TREC-1) – Harman - 1992
125 Modeling annotated data – Blei, Jordan - 2003
98 A variational Bayesian framework for graphical models – Attias - 1999
69 Probabilistic models for unified collaborative and content-based recommendation in sparse-data environments – Popescul, Ungar, et al. - 2001
60 An experimental comparison of several clustering and initialization methods – Meila, Heckerman - 1998
55 Expectation-propagation for the generative aspect model – Minka, Lafferty - 2002
47 Improving multi-class text classification with naive Bayes – Rennie - 2001
45 Estimating a Dirichlet distribution – Minka - 2000
38 Parametric empirical Bayes inference: Theory and applications – Morris - 1983
31 A probabilistic approach to semantic representation – Griffiths, Steyvers - 2002
26 Approximate Bayesian Inference in Conditionally Independent Hierarchical Models (Parametric Empirical Bayes Models) – Kass, Steffey - 1989
16 Recent progress on de Finetti’s notions of exchangeability – Diaconis - 1988
5 Exchangeability and related topics. In École d'été de probabilités de Saint-Flour, XIII – Aldous - 1983
4 Bayesian methods for censored categorical data – Dickey, Jiang, et al. - 1987
3 Theory of probability. Vol – de Finetti - 1990
2 Caenorrhabditis genetic center bibliography – Avery - 2002