Results 1 - 10 of 12
Machine reading tea leaves: Automatically evaluating topic coherence and topic model quality
- In Proc. of the Europ. Chap. of the Assoc. for Computational Linguistics, 2014
"... Topic models based on latent Dirichlet al-location and related methods are used in a range of user-focused tasks including doc-ument navigation and trend analysis, but evaluation of the intrinsic quality of the topic model and topics remains an open research area. In this work, we explore the two ta ..."
Abstract
-
Cited by 9 (0 self)
- Add to MetaCart
Topic models based on latent Dirichlet allocation and related methods are used in a range of user-focused tasks including document navigation and trend analysis, but evaluation of the intrinsic quality of the topic model and topics remains an open research area. In this work, we explore the two tasks of automatic evaluation of single topics and automatic evaluation of whole topic models, and provide recommendations on the best strategy for performing the two tasks, in addition to providing an open-source toolkit for topic and topic model evaluation.
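Automatic coherence measures of the kind this paper evaluates typically score a topic's top words by their co-occurrence statistics in a reference corpus. As a minimal sketch (our illustration, not the paper's toolkit; the function name and count structures are assumptions), an NPMI-based topic coherence score could look like this:

```python
from itertools import combinations
from math import log

def npmi_coherence(topic_words, doc_freq, co_doc_freq, n_docs):
    """Mean pairwise NPMI over a topic's top words.

    doc_freq: word -> number of documents containing the word
    co_doc_freq: frozenset({w1, w2}) -> number of documents containing both
    (These names and count structures are illustrative assumptions.)
    """
    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1 = doc_freq[w1] / n_docs
        p2 = doc_freq[w2] / n_docs
        p12 = co_doc_freq.get(frozenset((w1, w2)), 0) / n_docs
        if p12 == 0:
            scores.append(-1.0)  # words never co-occur: minimum NPMI
        else:
            pmi = log(p12 / (p1 * p2))
            scores.append(pmi / -log(p12))  # normalise PMI into [-1, 1]
    return sum(scores) / len(scores)
```

Higher averages indicate topics whose top words co-occur more often than chance would predict, which is the intuition behind the automatic coherence measures compared in the paper.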
Automatic Labelling of Topic Models Learned from Twitter by Summarisation
"... Latent topics derived by topic models such as Latent Dirichlet Allocation (LDA) are the result of hidden thematic structures which provide further insights into the data. The automatic labelling of such topics derived from social media poses however new challenges since topics may characterise novel ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
(Show Context)
Latent topics derived by topic models such as Latent Dirichlet Allocation (LDA) are the result of hidden thematic structures which provide further insights into the data. The automatic labelling of such topics derived from social media, however, poses new challenges since topics may characterise novel events happening in the real world. Existing automatic topic labelling approaches which depend on external knowledge sources become less applicable here since relevant articles/concepts of the extracted topics may not exist in external sources. In this paper we propose to address the problem of automatic labelling of latent topics learned from Twitter as a summarisation problem. We introduce a framework which applies summarisation algorithms to generate topic labels. These algorithms are independent of external sources and rely only on the identification of dominant terms in documents related to the latent topic. We compare the efficiency of existing state-of-the-art summarisation algorithms. Our results suggest that summarisation algorithms generate better topic labels which capture event-related context compared to the top-n terms returned by LDA.
Evaluating topic representations for exploring document collections
- Journal of the Association for Information Science and Technology
, 2015
"... Topic models have been shown to be a useful way of representing the content of large document collections, for example, via visualization interfaces (topic browsers). These systems enable users to explore collections by way of latent topics. A standard way to represent a topic is using a term list; ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
(Show Context)
Topic models have been shown to be a useful way of representing the content of large document collections, for example, via visualization interfaces (topic browsers). These systems enable users to explore collections by way of latent topics. A standard way to represent a topic is using a term list; that is, the top-n words with highest conditional probability within the topic. Other topic representations, such as textual and image labels, have also been proposed. However, there has been no comparison of these alternative representations. In this article, we compare 3 different topic representations in a document retrieval task. Participants were asked to retrieve relevant documents based on predefined queries within a fixed time limit, with topics presented in one of the following modalities: (a) lists of terms, (b) textual phrase labels, and (c) image labels. Results show that textual labels are easier for users to interpret than term lists and image labels. Moreover, the precision of retrieved documents for textual and image labels is comparable to the precision achieved by representing topics using term lists, demonstrating that labeling methods are an effective alternative topic representation.
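The "term list" baseline described above is simply the n words with highest conditional probability within the topic. A minimal sketch, assuming the topic is given as a word-to-probability mapping (the function name is ours):

```python
def top_n_terms(topic_word_probs, n=10):
    """Term-list representation: the n words with highest
    probability under the topic's word distribution."""
    ranked = sorted(topic_word_probs.items(), key=lambda kv: kv[1], reverse=True)
    return [word for word, _ in ranked[:n]]
```

For example, `top_n_terms({"topic": 0.5, "model": 0.3, "word": 0.2}, n=2)` returns `["topic", "model"]`; the labelling methods compared in the article replace such lists with textual phrases or images.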
Measuring the Similarity between Automatically Generated Topics
"... Previous approaches to the problem of measuring similarity between automati-cally generated topics have been based on comparison of the topics ’ word probability distributions. This paper presents alterna-tive approaches, including ones based on distributional semantics and knowledge-based measures, ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
Previous approaches to the problem of measuring similarity between automatically generated topics have been based on comparison of the topics' word probability distributions. This paper presents alternative approaches, including ones based on distributional semantics and knowledge-based measures, evaluated by comparison with human judgements. The best performing methods provide reliable estimates of topic similarity comparable with human performance and should be used in preference to the word probability distribution measures used previously.
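The word-probability baselines this paper argues against are typically information-theoretic comparisons of the two topics' distributions. A minimal sketch of one common such baseline, Jensen-Shannon divergence (our illustration, not necessarily the exact measure used in the prior work):

```python
from math import log2

def js_divergence(p, q):
    """Jensen-Shannon divergence (in bits) between two topic-word
    distributions given as word -> probability dicts, each summing to 1."""
    vocab = set(p) | set(q)
    # m is the midpoint distribution over the union vocabulary
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}

    def kl(a):  # KL(a || m); terms with a(w) == 0 contribute nothing
        return sum(a[w] * log2(a[w] / m[w]) for w in a if a[w] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)
```

Identical topics score 0 and topics with disjoint vocabularies score 1, so 1 - JSD can serve as a similarity score; the paper's point is that such distribution-level measures track human similarity judgements less well than distributional-semantic and knowledge-based alternatives.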
Capturing Semantically Meaningful Word Dependencies with an Admixture of Poisson MRFs
"... We develop a fast algorithm for the Admixture of Poisson MRFs (APM) topic model [1] and propose a novel metric to directly evaluate this model. The APM topic model recently introduced by Inouye et al. [1] is the first topic model that allows for word dependencies within each topic unlike in previous ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
We develop a fast algorithm for the Admixture of Poisson MRFs (APM) topic model [1] and propose a novel metric to directly evaluate this model. The APM topic model recently introduced by Inouye et al. [1] is the first topic model that allows for word dependencies within each topic, unlike previous topic models such as LDA that assume independence between words within a topic. Research in both the semantic coherence of topic models [2, 3, 4, 5] and measures of model fitness [6] provides strong support that explicitly modeling word dependencies, as in APM, could be both semantically meaningful and essential for appropriately modeling real text data. Though APM shows significant promise for providing a better topic model, it has high computational complexity because O(p^2) parameters must be estimated, where p is the number of words ([1] could only provide results for datasets with p = 200). In light of this, we develop a parallel alternating Newton-like algorithm for training the APM model that can handle p = 10^4, an important step towards scaling to large datasets. In addition, Inouye et al. [1] only provided tentative and inconclusive results on the utility of APM. Thus, motivated by simple intuitions and previous evaluations of topic models, we propose a novel evaluation metric based on human evocation scores between word pairs (i.e. how much one word "brings to mind" another word [7]). We provide compelling quantitative and qualitative results on the BNC corpus that demonstrate the superiority of APM over previous topic models for identifying semantically meaningful word dependencies. (MATLAB code available at:
Fixed-Length Poisson MRF: Adding Dependencies to the Multinomial
"... Abstract We propose a novel distribution that generalizes the Multinomial distribution to enable dependencies between dimensions. Our novel distribution is based on the parametric form of the Poisson MRF model ..."
Abstract
- Add to MetaCart
(Show Context)
We propose a novel distribution that generalizes the Multinomial distribution to enable dependencies between dimensions. Our novel distribution is based on the parametric form of the Poisson MRF model.
D3.2.1 Clustering models for discovery of regional and demographic variation
- PU TrendMiner Consortium
, 2013
Abstract
This document is part of the TrendMiner research project (No. 287863), partially funded by the FP7-ICT Programme.
unknown title
"... rs aig elin ch ed ity o prop d b terms of both their coherence and associated generality, using a combination of existing and new mea-e disco ent co et alloc metho of the terms used to describe a particular topic, despite the obser-vation that evaluation methods such as perplexity are often not corr ..."
Abstract
- Add to MetaCart
rs aig elin ch ed ity o prop d b terms of both their coherence and associated generality, using a combination of existing and new mea-e disco ent co et alloc metho of the terms used to describe a particular topic, despite the obser-vation that evaluation methods such as perplexity are often not correlated with human judgements of topic quality (Chang, Boyd-Graber, Gerrish, Wang, & Blei, 2009). However, a number of measures have been proposed in recent years for the measurement iptors can b listic mod example, using the top N highest-ranked terms from an NM basis vector. In our previous work, we generated topics usin LDA and NMF with two particular corpora, where a qua analysis of the corresponding termdescriptors found themost read-ily-interpretable topics to be discovered by NMF (O’Callaghan, Greene, Conway, Carthy, & Cunningham, 2013). An example of the issues we encountered can be illustrated with the following topics thatwere discovered by LDA andNMF for the same value of kwithin a corpus of online news articles (described in further detail in
Summarizing topical contents from PubMed documents using a thematic analysis
"... Improving the search and browsing ex-perience in PubMedr is a key compo-nent in helping users detect information of interest. In particular, when explor-ing a novel field, it is important to pro-vide a comprehensive view for a specific subject. One solution for providing this panoramic picture is to ..."
Abstract
- Add to MetaCart
(Show Context)
Improving the search and browsing experience in PubMed® is a key component in helping users detect information of interest. In particular, when exploring a novel field, it is important to provide a comprehensive view of a specific subject. One solution for providing this panoramic picture is to find sub-topics from a set of documents. We propose a method that finds sub-topics, which we refer to as themes, and computes representative titles based on the set of documents in each theme. The method combines a thematic clustering algorithm and the Pool Adjacent Violators algorithm to induce significant themes. Then, for each theme, a title is computed using PubMed document titles and theme-dependent term scores. We tested our system on five disease sets from OMIM® and evaluated the results based on normalized point-wise mutual information and MeSH® terms. For both performance measures, the proposed approach outperformed LDA. The quality of theme titles was also evaluated by comparing them with manually created titles.
Text, Topics, and Turkers: A Consensus Measure for Statistical Topics
"... Topic modeling is an important tool in social media anal-ysis, allowing researchers to quickly understand large text corpora by investigating the topics underlying them. One of the fundamental problems of topic models lies in how to assess the quality of the topics from the perspective of human inte ..."
Abstract
- Add to MetaCart
(Show Context)
Topic modeling is an important tool in social media analysis, allowing researchers to quickly understand large text corpora by investigating the topics underlying them. One of the fundamental problems of topic models lies in how to assess the quality of the topics from the perspective of human interpretability. How well can humans understand the meaning of topics generated by statistical topic modeling algorithms? In this work we advance the study of this question by introducing Topic Consensus: a new measure that calculates the quality of a topic by investigating its consensus with some known topics underlying the data. We view the quality of the topics from three perspectives: 1) topic interpretability, 2) how documents relate to the underlying topics, and 3) how interpretable the topics are when the corpus has an underlying categorization. We provide insights into how well the results of Mechanical Turk match automated methods for calculating topic quality. The probability distribution of the words in a topic best fits the Topic Coherence measure, in terms of both correlation and identifying the best topics.