Representing Topics Labels for Exploring Digital Libraries
Topic models have been shown to be a useful way of representing the content of large document collections, for example via visualisation interfaces (topic browsers). These systems enable users to explore collections by way of latent topics. A standard way to represent a topic is using a set of keywords, i.e. the top-n words with highest marginal probability within the topic. However, alternative topic representations have been proposed, including textual and image labels. In this paper, we compare different topic representations, i.e. sets of topic words, textual phrases and images, in a document retrieval task. We asked participants to retrieve relevant documents based on pre-defined queries within a fixed time limit, presenting topics in one of the following modalities: (1) sets of keywords, (2) textual labels, and (3) image labels. Our results show that textual labels are easier for users to interpret than keywords and image labels. Moreover, the precision of retrieved documents for textual and image labels is comparable to the precision achieved by representing topics using sets of keywords, demonstrating that labelling methods are an effective alternative topic representation.
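The keyword representation described in the abstract (the top-n words with highest probability within a topic) can be illustrated with a minimal Python sketch. The function name `top_n_keywords`, the toy vocabulary and the probability values are all hypothetical and are not taken from the paper; a real topic model would supply one such probability vector per topic.

```python
def top_n_keywords(topic_word_probs, vocabulary, n=10):
    """Return the n vocabulary words with the highest probability in one topic.

    topic_word_probs and vocabulary are parallel sequences: the i-th
    probability belongs to the i-th word. Both are illustrative inputs.
    """
    # Pair each word with its probability and sort by probability, highest first.
    ranked = sorted(
        zip(vocabulary, topic_word_probs),
        key=lambda pair: pair[1],
        reverse=True,
    )
    # Keep only the words of the n best-ranked pairs.
    return [word for word, _ in ranked[:n]]


# Toy example: a single topic over a five-word vocabulary (made-up numbers).
vocabulary = ["library", "music", "archive", "guitar", "catalogue"]
topic = [0.40, 0.05, 0.30, 0.05, 0.20]
keywords = top_n_keywords(topic, vocabulary, n=3)
```

Textual or image labels, as compared in the paper, would replace this keyword list with a single phrase or picture per topic; the sketch covers only the keyword baseline.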
leipzig.de Andreas Both
Quantifying the coherence of a set of statements is a long-standing problem with many potential applications that has attracted researchers from different sciences. The special case of measuring coherence of topics has been recently studied to remedy the problem that topic models give no guarantee on the interpretability of their output. Several benchmark datasets were produced that record human judgements of the interpretability of topics. We are the first to propose a framework that allows existing word-based coherence measures, as well as new ones, to be constructed by combining elementary components. We conduct a systematic search of the space of coherence measures using all publicly available topic relevance data for the evaluation. Our results show that new combinations of components outperform existing measures with respect to correlation to human ratings. Finally, we outline how our results can be transferred to further applications in the context of text mining, information retrieval and the world wide web.
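To make the idea of assembling a word-based coherence measure from elementary components more concrete, here is a small Python sketch. The abstract does not name the components, so the decomposition used here (word-pair segmentation, document-frequency probabilities, an NPMI score per pair, and a mean over pairs) is an assumption chosen for illustration, as are all function names. It should not be read as the framework proposed in the paper.

```python
import math
from itertools import combinations
from statistics import mean


def segment_pairs(words):
    """Assumed segmentation step: every unordered pair of topic words."""
    return list(combinations(words, 2))


def doc_probability(words, documents):
    """Assumed probability estimate: share of documents containing all words."""
    hits = sum(1 for doc in documents if all(w in doc for w in words))
    return hits / len(documents)


def npmi(w1, w2, documents):
    """Normalised pointwise mutual information of two words, in [-1, 1]."""
    p1 = doc_probability([w1], documents)
    p2 = doc_probability([w2], documents)
    p12 = doc_probability([w1, w2], documents)
    if p12 == 0.0:
        return -1.0  # the words never co-occur: lower bound of NPMI
    if p12 == 1.0:
        return 0.0  # convention for the undefined 0/0 case
    return math.log(p12 / (p1 * p2)) / -math.log(p12)


def coherence(words, documents, confirm=npmi, aggregate=mean):
    """Combine the pieces: score each word pair, then aggregate the scores."""
    return aggregate(confirm(a, b, documents) for a, b in segment_pairs(words))


# Toy corpus: each document is a set of words (made-up data).
documents = [{"topic", "model"}, {"topic", "model"}, {"cat", "dog"}, {"cat", "dog"}]
coherent = coherence(["topic", "model"], documents)
incoherent = coherence(["topic", "cat"], documents)
```

Because `confirm` and `aggregate` are plain parameters, swapping in a different pair score or a different aggregation (for example `statistics.median`) yields a different measure without touching the rest of the code, which is the kind of recombination the abstract alludes to. On the toy corpus, the word pair that always co-occurs scores higher than the pair that never does.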