Results 1–10 of 66
Dynamic topic models
In ICML, 2006
Abstract

Cited by 588 (23 self)
Scientists need new tools to explore and browse large collections of scholarly literature. Thanks to organizations such as JSTOR, which scan and index the original bound archives of many journals, modern scientists can search digital libraries spanning hundreds of years. A scientist, suddenly ...
Topics in semantic representation
Psychological Review, 2007
Abstract

Cited by 157 (14 self)
Accounts of language processing have suggested that it requires retrieving concepts from memory in response to an ongoing stream of information. This can be facilitated by inferring the gist of a sentence, conversation, or document, and using that gist to predict related concepts and disambiguate words. This paper analyzes the abstract computational problem underlying the extraction and use of gist, formulating this problem as a rational statistical inference. This leads us to a novel approach to semantic representation in which word meanings are represented in terms of a set of probabilistic topics. The topic model performs well in predicting word association and the effects of semantic association and ambiguity on a variety of language processing and memory tasks. It also provides a foundation for developing more richly structured statistical models of language, as the generative process assumed in the topic model can easily be extended to incorporate other kinds of semantic and syntactic structure. Many aspects of perception and cognition can be understood by considering the computational problem that is addressed by a particular human capacity (Anderson, 1990; Marr, 1982). Perceptual capacities such as identifying shape from shading (Freeman, 1994), motion perception ...
A collapsed variational Bayesian inference algorithm for latent Dirichlet allocation
 In NIPS
Abstract

Cited by 111 (8 self)
Latent Dirichlet allocation (LDA) is a Bayesian network that has recently gained much popularity in applications ranging from document modeling to computer vision. Due to the large-scale nature of these applications, current inference procedures like variational Bayes and Gibbs sampling have been found lacking. In this paper we propose the collapsed variational Bayesian inference algorithm for LDA, and show that it is computationally efficient, easy to implement and significantly more accurate than standard variational Bayesian inference for LDA.
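The collapsed variational update at the heart of this line of work can be illustrated with a small sketch. This is not the paper's exact algorithm (which uses a second-order Taylor approximation of the collapsed posterior); it is the simpler zeroth-order variant commonly called CVB0, written in plain Python with illustrative names (`cvb0`, `gamma`), batch-updating expected counts once per sweep:

```python
import random

def cvb0(docs, K, V, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Zeroth-order collapsed variational Bayes (CVB0) for LDA, batch-style.

    docs: list of documents, each a list of word ids in [0, V).
    Returns per-token topic responsibilities gamma[d][n][k].
    """
    rng = random.Random(seed)
    # Random soft initialization of per-token responsibilities.
    gamma = [[[rng.random() for _ in range(K)] for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for n in range(len(doc)):
            s = sum(gamma[d][n])
            gamma[d][n] = [g / s for g in gamma[d][n]]
    for _ in range(iters):
        # Expected counts under the current responsibilities.
        ndk = [[sum(gamma[d][n][k] for n in range(len(doc))) for k in range(K)]
               for d, doc in enumerate(docs)]
        nkw = [[0.0] * V for _ in range(K)]
        nk = [0.0] * K
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                for k in range(K):
                    nkw[k][w] += gamma[d][n][k]
                    nk[k] += gamma[d][n][k]
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                new = []
                for k in range(K):
                    g = gamma[d][n][k]
                    # Subtract the token's own contribution from each count,
                    # then apply the collapsed-Gibbs-style conditional.
                    new.append((ndk[d][k] - g + alpha)
                               * (nkw[k][w] - g + beta)
                               / (nk[k] - g + V * beta))
                s = sum(new)
                gamma[d][n] = [x / s for x in new]
    return gamma
```

The update mirrors the collapsed Gibbs conditional, but with soft responsibilities in place of hard topic assignments, which is what makes the scheme deterministic and easy to monitor for convergence.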
Continuous Time Dynamic Topic Models
Abstract

Cited by 81 (7 self)
In this paper, we develop the continuous time dynamic topic model (cDTM). The cDTM is a dynamic topic model that uses Brownian motion to model the latent topics through a sequential collection of documents, where a “topic” is a pattern of word use that we expect to evolve over the course of the collection. We derive an efficient variational approximate inference algorithm that takes advantage of the sparsity of observations in text, a property that lets us easily handle many time points. In contrast to the cDTM, the original discrete-time dynamic topic model (dDTM) requires that time be discretized. Moreover, the complexity of variational inference for the dDTM grows quickly as time granularity increases, a drawback which limits fine-grained discretization. We demonstrate the cDTM on two news corpora, reporting both predictive perplexity and the novel task of time stamp prediction.
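The Brownian-motion assumption is simple to simulate: between documents separated by an arbitrary time gap, a topic's natural parameters take a Gaussian step with variance proportional to the gap, and a softmax maps them back to a distribution over words. The sketch below is a toy forward simulation under that assumption, not the paper's variational inference algorithm; the names `drift` and `to_word_dist` are illustrative:

```python
import math
import random

def drift(beta, dt, variance, rng):
    """One Brownian-motion step for a topic's natural parameters:
    beta_t | beta_s ~ N(beta_s, variance * dt * I), where dt = t - s."""
    return [b + rng.gauss(0.0, math.sqrt(variance * dt)) for b in beta]

def to_word_dist(beta):
    """Softmax map from natural parameters to a distribution over words."""
    m = max(beta)  # subtract the max for numerical stability
    e = [math.exp(b - m) for b in beta]
    z = sum(e)
    return [x / z for x in e]

rng = random.Random(1)
beta = [0.0] * 5                 # toy 5-word vocabulary
for dt in (0.5, 2.0, 0.25):      # irregular gaps between documents
    beta = drift(beta, dt, variance=1.0, rng=rng)
    p = to_word_dist(beta)
    assert abs(sum(p) - 1.0) < 1e-9
```

Because the Gaussian step depends only on the elapsed gap `dt`, no discretization of the timeline is needed, which is the property the abstract contrasts with the dDTM.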
Multiway distributional clustering via pairwise interactions
In ICML, 2005
Abstract

Cited by 61 (10 self)
We present a novel unsupervised learning scheme that simultaneously clusters variables of several types (e.g., documents, words and authors) based on pairwise interactions between the types, as observed in co-occurrence data. In this scheme, multiple clustering systems are generated aiming at maximizing an objective function that measures multiple pairwise mutual information between cluster variables. To implement this idea, we propose an algorithm that interleaves top-down clustering of some variables and bottom-up clustering of the other variables, with a local optimization correction routine. Focusing on document clustering we present an extensive empirical study of two-way, three-way and four-way applications of our scheme using six real-world datasets including the 20 Newsgroups (20NG) and the Enron email collection. Our multiway distributional clustering (MDC) algorithms consistently and significantly outperform previous state-of-the-art information theoretic clustering algorithms.
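The pairwise mutual information that the objective sums can be computed directly from a joint count table over two cluster variables. A minimal sketch (the function name `mutual_information` is ours, not the paper's):

```python
from math import log2

def mutual_information(joint):
    """I(X;Y) in bits, from a joint count table: joint[i][j] counts
    co-occurrences of cluster i of one variable with cluster j of the other."""
    total = sum(sum(row) for row in joint)
    px = [sum(row) / total for row in joint]
    py = [sum(joint[i][j] for i in range(len(joint))) / total
          for j in range(len(joint[0]))]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, c in enumerate(row):
            if c:  # zero cells contribute nothing
                pxy = c / total
                mi += pxy * log2(pxy / (px[i] * py[j]))
    return mi

# Perfectly aligned clusterings: I(X;Y) = H(X) = 1 bit.
assert abs(mutual_information([[5, 0], [0, 5]]) - 1.0) < 1e-9
# Independent clusterings: I(X;Y) = 0.
assert abs(mutual_information([[2, 2], [2, 2]])) < 1e-9
```

In the multiway setting, one such term is computed for each interacting pair of cluster variables and the terms are summed into a single objective.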
Collapsed Variational Inference for HDP
Abstract

Cited by 54 (1 self)
A wide variety of Dirichlet-multinomial ‘topic’ models have found interesting applications in recent years. While Gibbs sampling remains an important method of inference in such models, variational techniques have certain advantages such as easy assessment of convergence, easy optimization without the need to maintain detailed balance, a bound on the marginal likelihood, and sidestepping of issues with topic identifiability. The most accurate variational technique thus far, namely collapsed variational latent Dirichlet allocation, did not deal with model selection nor did it include inference for hyperparameters. We address both issues by generalizing the technique, obtaining the first variational algorithm to deal with the hierarchical Dirichlet process and to deal with hyperparameters of Dirichlet variables. Experiments show a significant improvement in accuracy.
Discrete Component Analysis
In Subspace, Latent Structure and Feature Selection Techniques, 2006
Abstract

Cited by 36 (5 self)
This article presents a unified theory for analysis of components in discrete data, and compares the methods with techniques such as independent component analysis, non-negative matrix factorisation and latent Dirichlet allocation. The main families of algorithms discussed are a variational approximation, Gibbs sampling, and Rao-Blackwellised Gibbs sampling. Applications are presented for voting records from the United States Senate for 2003, and for the Reuters-21578 newswire collection.
The discrete basis problem
2005
Abstract

Cited by 34 (12 self)
We consider the Discrete Basis Problem, which can be described as follows: given a collection of Boolean vectors, find a collection of k Boolean basis vectors such that the original vectors can be represented using disjunctions of these basis vectors. We show that the decision version of this problem is NP-complete and that the optimization version cannot be approximated within any finite ratio. We also study two variations of this problem, where the Boolean basis vectors must be mutually orthogonal. We show that the other variation is closely related to the well-known Metric k-median Problem in Boolean space. To solve these problems, two algorithms will be presented. One is designed for the variations mentioned above, and it is solely based on solving the k-median problem, while the other is a heuristic intended to solve the general Discrete Basis Problem. We will also study the results of extensive experiments made with these two algorithms with both synthetic and real-world data. The results are twofold: with the synthetic data, the algorithms did rather well, but with the real-world data the results were not as good.
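The disjunction condition in the problem statement is easy to state in code. The brute-force checker below (exponential in k, for illustration only; it is neither of the authors' two algorithms) verifies that every data vector is the component-wise OR of some subset of the basis vectors:

```python
from itertools import combinations

def or_all(vectors, m):
    """Component-wise OR (disjunction) of Boolean vectors of length m."""
    out = [0] * m
    for v in vectors:
        out = [a | b for a, b in zip(out, v)]
    return out

def covers(data, basis):
    """True iff every data vector equals the OR of some subset of basis."""
    for d in data:
        ok = False
        for r in range(len(basis) + 1):
            for subset in combinations(basis, r):
                if or_all(list(subset), len(d)) == d:
                    ok = True
                    break
            if ok:
                break
        if not ok:
            return False
    return True

# Toy instance: three data vectors generated exactly from two basis vectors.
b1 = [1, 1, 0, 0]
b2 = [0, 0, 1, 1]
data = [b1, b2, or_all([b1, b2], 4)]
assert covers(data, [b1, b2])
```

Using OR rather than addition is what distinguishes this problem from ordinary matrix factorization: overlapping basis vectors do not "add up", which is also why standard linear-algebraic approximation guarantees do not carry over.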
Semi-supervised sequence modeling with syntactic topic models
In AAAI-05, The Twentieth National Conference on Artificial Intelligence, 2005
Abstract

Cited by 27 (2 self)
Although there has been significant previous work on semi-supervised learning for classification, there has been relatively little in sequence modeling. This paper presents an approach that leverages recent work in manifold learning on sequences to discover word clusters from language data, including both syntactic classes and semantic topics. From unlabeled data we form a smooth, low-dimensional feature space, where each word token is projected based on its underlying role as a function or content word. We then use this projection as additional input features to a linear-chain conditional random field trained on limited labeled training data. On standard part-of-speech tagging and Chinese word segmentation data sets we show as much as 14% error reduction due to the unlabeled data, and also statistically significant improvements over a related semi-supervised sequence tagging method due to Miller et al.