## Improving topic coherence with regularized topic models (2011)

Venue: In Proc. of NIPS

Citations: 20 (4 self)

### Citations

4358 | Latent Dirichlet allocation
- Blei, Ng, et al.
Citation Context ... regularizer (QUAD-REG) and (b) a convolved Dirichlet regularizer (CONV-REG). We start by introducing the standard notation in topic modeling and the baseline latent Dirichlet allocation method (LDA, [4, 9]). 3.1 Topic Modeling and LDA Topic models are a Bayesian version of probabilistic latent semantic analysis [11]. In standard LDA topic modeling each of D documents in the corpus is modeled as a discr...

1223 | Probabilistic latent semantic indexing
- Hofmann
- 1999
Citation Context ... notation in topic modeling and the baseline latent Dirichlet allocation method (LDA, [4, 9]). 3.1 Topic Modeling and LDA Topic models are a Bayesian version of probabilistic latent semantic analysis [11]. In standard LDA topic modeling each of D documents in the corpus is modeled as a discrete distribution over T latent topics, and each topic is a discrete distribution over the vocabulary of W words....

678 | Topic models
- Blei, Lafferty
- 2009
Citation Context ...te: φw|t ← (1 / (Nt + 2ν)) (Nwt + (2ν / φtᵀCφt) φw|t ∑_{i=1}^{W} Ciw φi|t). (8) We note that unlike other topic models in which a covariance or correlation structure is used (as in the correlated topic model, [3]) in the context of correlated priors for θt|d, our method does not require the inversion of C, which would be impractical for even modest vocabulary sizes. By using the update in Equation (8) we obt...
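The fixed-point update for φw|t described in this excerpt can be sketched in code. This is a minimal illustration of a QUAD-REG-style update assuming a symmetric word-similarity matrix C estimated from external data; the function name, toy counts, and ν value are illustrative, not taken from the paper:

```python
import numpy as np

def quad_reg_update(phi, N_wt, N_t, C, nu):
    """One fixed-point update of a topic's word distribution phi under a
    quadratic regularizer built from a symmetric word-similarity matrix C:
      phi_w <- (N_wt + (2*nu / phi^T C phi) * phi_w * (C phi)_w) / (N_t + 2*nu)
    Only products with C are needed; C is never inverted."""
    Cphi = C @ phi                      # (C phi)_w = sum_i C_iw phi_i (C symmetric)
    scale = 2.0 * nu / (phi @ Cphi)     # scalar phi^T C phi
    phi_new = (N_wt + scale * phi * Cphi) / (N_t + 2.0 * nu)
    return phi_new / phi_new.sum()      # already sums to 1 up to rounding

# Toy 3-word vocabulary in which words 0 and 1 co-occur strongly.
C = np.array([[1.0, 0.9, 0.0],
              [0.9, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
N_wt = np.array([5.0, 1.0, 4.0])        # word counts for one topic
phi = N_wt / N_wt.sum()                 # unregularized estimate
phi = quad_reg_update(phi, N_wt, N_wt.sum(), C, nu=10.0)
```

In this toy run, word 1 gains probability mass relative to its raw count share because it co-occurs strongly with the dominant word 0, which is the smoothing effect the regularizer is meant to produce.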

244 | LDA-based document models for ad-hoc retrieval.
- Wei, Croft
- 2006
Citation Context ...ch, discover, and organize online content by automatically extracting semantic themes from collections of text documents. Learned topics can be useful in user interfaces for ad-hoc document retrieval [18]; driving faceted browsing [14]; clustering search results [19]; or improving display of search results by increasing result diversity [10]. When the text being modeled is plentiful, clear and well wr...

237 | Reading tea leaves: How humans interpret topic models
- Chang, Boyd-Graber, et al.
- 2009
Citation Context ...fit for use in user interfaces. However, topics are not always consistently coherent, and even with relatively well written text, one can learn topics that are a mix of concepts or hard to understand [1, 6]. This problem is exacerbated for content that is sparse or noisy, such as blog posts, tweets, or web search result snippets. Take for instance the task of learning categories in clustering search eng...

195 | Learning to cluster web search result.
- Zeng, He, et al.
- 2004
Citation Context ...acting semantic themes from collections of text documents. Learned topics can be useful in user interfaces for ad-hoc document retrieval [18]; driving faceted browsing [14]; clustering search results [19]; or improving display of search results by increasing result diversity [10]. When the text being modeled is plentiful, clear and well written (e.g. large collections of abstracts from scientific lite...

188 | Probabilistic topic models
- Steyvers, Griffiths
- 2006
Citation Context ... regularizer (QUAD-REG) and (b) a convolved Dirichlet regularizer (CONV-REG). We start by introducing the standard notation in topic modeling and the baseline latent Dirichlet allocation method (LDA, [4, 9]). 3.1 Topic Modeling and LDA Topic models are a Bayesian version of probabilistic latent semantic analysis [11]. In standard LDA topic modeling each of D documents in the corpus is modeled as a discr...

109 | Rethinking LDA: why priors matter
- Wallach, Mimno, et al.
- 2009
Citation Context ...ears that their method relies on interactive feedback from the user or on the careful selection of words within an ontological concept. The effect of structured priors in LDA has been investigated by [17] who showed that learning hierarchical Dirichlet priors over the document-topic distribution can provide better performance than using a symmetric prior. Our work is motivated by the fact that priors ...

102 | Topic modeling with network regularization
- Mei, Cai, et al.
- 2008
Citation Context ...r priors for φw|t to those studied by [3] would be unfeasible as they would require the inverse of a W × W covariance matrix. Network structures associated with a collection of documents are used in [12] in order to “smooth” the topic distributions of the PLSA model [11]. Our methods are different in that they do not require the collection under study to have an associated network structure as we aim...

85 | Automatic evaluation of topic coherence
- Newman, Lau, et al.
- 2010
Citation Context ...ce more coherent and interpretable topics. Our work is predicated on recent evidence that a pointwise mutual information-based score (PMI-Score) is highly correlated with human-judged topic coherence [15, 16]. We develop two Bayesian regularization formulations that are designed to improve PMI-Score. We experiment with five search result datasets from 7M Blog posts, four search result datasets from 1M New...
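The PMI-Score referenced here averages pointwise mutual information over pairs of a topic's top words, estimated from co-occurrence statistics in an external corpus. A minimal sketch using document co-occurrence counts; the function name, smoothing constant, and tiny corpus are illustrative:

```python
from itertools import combinations
from math import log

def pmi_score(top_words, docs, eps=1e-12):
    """Average PMI over all pairs of top_words; `docs` is a list of
    token sets standing in for an external reference corpus."""
    n = len(docs)
    df = {w: sum(1 for d in docs if w in d) for w in top_words}
    pair_scores = []
    for w1, w2 in combinations(top_words, 2):
        df12 = sum(1 for d in docs if w1 in d and w2 in d)
        # PMI(w1, w2) = log P(w1, w2) / (P(w1) P(w2)), with eps smoothing
        pmi = log((df12 / n + eps) / ((df[w1] / n) * (df[w2] / n) + eps))
        pair_scores.append(pmi)
    return sum(pair_scores) / len(pair_scores)

docs = [{"topic", "model", "word"}, {"topic", "model"}, {"word", "vector"}]
print(round(pmi_score(["topic", "model"], docs), 3))  # → 0.405
```

Word pairs that rarely co-occur drag the score toward or below zero, which is what lets the average act as a proxy for human-judged coherence.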

82 | A survey of Web clustering engines
- Carpineto
- 2009
Citation Context ...of learning categories in clustering search engine results. A few searches with Carrot2, Yippee, or WebClust quickly demonstrate that consistently learning meaningful topic facets is a difficult task [5]. Our goal in this paper is to improve the coherence, interpretability and ultimate usability of learned topics. To achieve this we propose QUAD-REG and CONV-REG, two new methods for regularizing topi...

80 | Optimizing semantic coherence in topic models
- Mimno, Wallach, et al.
- 2011
Citation Context ...larizing topic models on small or noisy collections. Additionally, their work is focused on regularizing the document-topic distributions instead of the word-topic distributions. Finally, the work in [13], contemporary to ours, also addresses the problem of improving the quality of topic models. However, our approach focuses on exploiting the knowledge provided by external data given the noisy and/or ...

56 | Incorporating domain knowledge into topic modeling via Dirichlet forest priors
- Andrzejewski, Zhu, et al.
- 2009
Citation Context ... are not interested in constraining the learned topics to those in the external data but rather in improving the topics in small or noisy collections by means of regularization. Along a similar vein, [2] incorporate domain knowledge into topic models by encouraging some word pairs to have similar probability within a topic. Their method, as ours, is based on replacing the standard Dirichlet prior ove...

47 | Probabilistic topic models. In Latent Semantic Analysis: A Road to Meaning. Lawrence Erlbaum
- Steyvers, Griffiths
- 2007
Citation Context ...of an individual topic, or for a topic model of T topics (in that case PMI-Score will refer to the average of T PMI-Scores). This PMI-Score – and the idea of using external data to measure it – forms the foundation of our idea for regularization. 3 Regularized Topic Models In this section we describe our approach to regularization in topic models by proposing two different methods: (a) a quadratic regularizer (QUAD-REG) and (b) a convolved Dirichlet regularizer (CONV-REG). We start by introducing the standard notation in topic modeling and the baseline latent Dirichlet allocation method (LDA, [4, 9]). 3.1 Topic Modeling and LDA Topic models are a Bayesian version of probabilistic latent semantic analysis [11]. In standard LDA topic modeling each of D documents in the corpus is modeled as a discrete distribution over T latent topics, and each topic is a discrete distribution over the vocabulary of W words. For document d, the distribution over topics, θt|d, is drawn from a Dirichlet distribution Dir[α]. Likewise, each distribution over words, φw|t, is drawn from a Dirichlet distribution, Dir[β]. For the ith token in a document, a topic assignment, zid, is drawn from θt|d and the word, xid,...
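The generative process described in this excerpt can be sketched as forward sampling. A toy illustration of the standard LDA generative story (not the paper's inference code); dimensions and hyperparameter values are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_corpus(D, T, W, n_tokens, alpha=0.1, beta=0.01):
    """Sample a toy corpus from the LDA generative process:
    phi_t ~ Dir(beta) per topic, theta_d ~ Dir(alpha) per document,
    then per token z ~ Discrete(theta_d) and x ~ Discrete(phi_z)."""
    phi = rng.dirichlet([beta] * W, size=T)   # T topic-word distributions
    docs = []
    for _ in range(D):
        theta = rng.dirichlet([alpha] * T)    # document-topic distribution
        z = rng.choice(T, size=n_tokens, p=theta)      # topic per token
        x = [int(rng.choice(W, p=phi[t])) for t in z]  # observed word per token
        docs.append(x)
    return docs, phi

docs, phi = generate_corpus(D=5, T=3, W=20, n_tokens=30)
```

Inference (e.g. collapsed Gibbs sampling or variational Bayes) runs this story in reverse, recovering φ and θ from the observed words alone.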

31 | Organizing the OCA: learning faceted subjects from a library of digital books
- Mimno, McCallum
- 2007
Citation Context ...ne content by automatically extracting semantic themes from collections of text documents. Learned topics can be useful in user interfaces for ad-hoc document retrieval [18]; driving faceted browsing [14]; clustering search results [19]; or improving display of search results by increasing result diversity [10]. When the text being modeled is plentiful, clear and well written (e.g. large collections o...

28 | Modeling documents by combining semantic concepts with unsupervised statistical learning
- Chemudugunta, Holloway, et al.
- 2008
Citation Context ...omparison human evaluations and the results can be seen in Figure 4. 6 Related Work Several authors have investigated the use of domain knowledge from external sources in topic modeling. For example, [7, 8] propose a method for combining topic models with ontological knowledge to tag web pages. They constrain the topics in an LDA-based model to be amongst those in the given ontology. [20] also use stati...

26 | Evaluating topic models for digital libraries
- Newman, Noh, et al.
- 2010
Citation Context ...ce more coherent and interpretable topics. Our work is predicated on recent evidence that a pointwise mutual information-based score (PMI-Score) is highly correlated with human-judged topic coherence [15, 16]. We develop two Bayesian regularization formulations that are designed to improve PMI-Score. We experiment with five search result datasets from 7M Blog posts, four search result datasets from 1M New...

25 | Topic significance ranking of LDA generative models.
- AlSumait, Barbara, et al.
- 2009
Citation Context ...fit for use in user interfaces. However, topics are not always consistently coherent, and even with relatively well written text, one can learn topics that are a mix of concepts or hard to understand [1, 6]. This problem is exacerbated for content that is sparse or noisy, such as blog posts, tweets, or web search result snippets. Take for instance the task of learning categories in clustering search eng...

11 | Combining concept hierarchies and statistical topic models.
- Chemudugunta, Smyth, et al.
- 2008
Citation Context ...omparison human evaluations and the results can be seen in Figure 4. 6 Related Work Several authors have investigated the use of domain knowledge from external sources in topic modeling. For example, [7, 8] propose a method for combining topic models with ontological knowledge to tag web pages. They constrain the topics in an LDA-based model to be amongst those in the given ontology. [20] also use stati...

8 | Probabilistic latent maximal marginal relevance.
- Guo, Sanner
- 2010
Citation Context ...n be useful in user interfaces for ad-hoc document retrieval [18]; driving faceted browsing [14]; clustering search results [19]; or improving display of search results by increasing result diversity [10]. When the text being modeled is plentiful, clear and well written (e.g. large collections of abstracts from scientific literature), learned topics are usually coherent, easily understood, and fit for...

1 | Query classification based on regularized correlated topic model.
- Zhai, Guo, et al.
- 2009
Citation Context ...For example, [7, 8] propose a method for combining topic models with ontological knowledge to tag web pages. They constrain the topics in an LDA-based model to be amongst those in the given ontology. [20] also use statistical topic models with a predefined set of topics to address the task of query classification. Our goal is different to theirs in that we are not interested in constraining the learne...

1 | Topic modeling with network regularization
- Mei, Cai, et al.
- 2008
Citation Context ...ime, and LDA as more coherent only 39% of the time. These results are statistically significant at 5% level of significance when performing a paired t-test on the total values across all datasets. Note that the two bars corresponding to each dataset do not add up to 100% as the remaining mass corresponds to “...Can’t decide...” responses. topic proportions (θt|d). In our approach, considering similar priors for φw|t to those studied by [3] would be unfeasible as they would require the inverse of a W × W covariance matrix. Network structures associated with a collection of documents are used in [12] in order to “smooth” the topic distributions of the PLSA model [11]. Our methods are different in that they do not require the collection under study to have an associated network structure as we aim at addressing the different problem of regularizing topic models on small or noisy collections. Additionally, their work is focused on regularizing the document-topic distributions instead of the word-topic distributions. Finally, the work in [13], contemporary to ours, also addresses the problem of improving the quality of topic models. However, our approach focuses on exploiting the knowledge p...