Results 1-10 of 134
Latent Dirichlet allocation
Journal of Machine Learning Research, 2003
Cited by 2634 (66 self)
We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model.
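The generative view described in this abstract can be illustrated with a small sketch. It assumes the usual LDA story (per-document topic proportions drawn from a Dirichlet, then a topic and a word drawn for each position); the vocabulary, the two topics, and the helper names `sample_dirichlet` and `generate_document` are hypothetical and chosen only for illustration. It covers generation only, not the variational inference or EM estimation the abstract mentions.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(alpha, topics, vocab, n_words, rng):
    """Generate one document: proportions theta, then (topic, word) per position."""
    theta = sample_dirichlet(alpha, rng)  # document-level topic proportions
    assignments, words = [], []
    for _ in range(n_words):
        # Pick a topic for this word position according to theta.
        z = rng.choices(range(len(topics)), weights=theta)[0]
        # Pick a word from the chosen topic's word distribution.
        w = rng.choices(vocab, weights=topics[z])[0]
        assignments.append(z)
        words.append(w)
    return theta, assignments, words

rng = random.Random(0)
vocab = ["gene", "dna", "ball", "team"]          # hypothetical vocabulary
topics = [
    [0.5, 0.5, 0.0, 0.0],                        # hypothetical "biology" topic
    [0.0, 0.0, 0.5, 0.5],                        # hypothetical "sports" topic
]
theta, z, doc = generate_document([0.5, 0.5], topics, vocab, 8, rng)
print(theta, doc)
```

Because each document gets its own `theta`, two documents generated from the same topics can mix them in very different proportions, which is what distinguishes this setup from a single-topic mixture of unigrams.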
Modeling annotated data
In Proc. of the 26th Intl. ACM SIGIR Conference, 2003
Cited by 344 (11 self)
We consider the problem of modeling annotated data: data with multiple types where the instance of one type (such as a caption) serves as a description of the other type (such as an image). We describe three hierarchical probabilistic mixture models that are aimed at such data, culminating in the Corr-LDA model, a latent variable model that is effective at modeling the joint distribution of both types and the conditional distribution of the annotation given the primary type. We take an empirical Bayes approach to finding parameter estimates and conduct experiments in held-out likelihood, automatic annotation, and text-based image retrieval using the Corel database of images and captions.
Prior distributions for variance parameters in hierarchical models
Bayesian Analysis, 2006
Cited by 179 (13 self)
Various noninformative prior distributions have been suggested for scale parameters in hierarchical models. We construct a new folded-noncentral-t family of conditionally conjugate priors for hierarchical standard deviation parameters, and then consider noninformative and weakly informative priors in this family. We use an example to illustrate serious problems with the inverse-gamma family of "noninformative" prior distributions. We suggest instead to use a uniform prior on the hierarchical standard deviation, using the half-t family when the number of groups is small and in other settings where a weakly informative prior is desired.
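As a rough illustration of the half-t family mentioned in this abstract, the sketch below draws from a half-t distribution by taking the absolute value of a scaled Student-t variate. The degrees of freedom (`nu=1`, i.e. a half-Cauchy) and the scale of 5 are assumptions chosen for the example, not values taken from the paper, and `sample_half_t` is a hypothetical helper name.

```python
import math
import random
import statistics

def sample_half_t(nu, scale, rng):
    """Draw |scale * T|, where T is Student-t with nu degrees of freedom."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square with nu degrees of freedom
    return abs(scale * z / math.sqrt(chi2 / nu))

rng = random.Random(1)
# nu=1 gives a half-Cauchy; the scale value is an illustrative assumption.
draws = [sample_half_t(nu=1, scale=5.0, rng=rng) for _ in range(20000)]
median = statistics.median(draws)
print(round(median, 2))
```

For the half-Cauchy case the population median equals the scale parameter, so the sample median gives a quick sanity check on the draws while the heavy right tail still allows large standard deviations.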
A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics
2005
How Many Iterations in the Gibbs Sampler?
In Bayesian Statistics 4, 1992
Cited by 116 (6 self)
When the Gibbs sampler is used to estimate posterior distributions (Gelfand and Smith, 1990), the question of how many iterations are required is central to its implementation. When interest focuses on quantiles of functionals of the posterior distribution, we describe an easily-implemented method for determining the total number of iterations required, and also the number of initial iterations that should be discarded to allow for "burn-in". The method uses only the Gibbs iterates themselves, and does not, for example, require external specification of characteristics of the posterior density. Here the method is described for the situation where one long run is generated, but it can also be easily applied if there are several runs from different starting points. It also applies more generally to Markov chain Monte Carlo schemes other than the Gibbs sampler. It can also be used when several quantiles are to be estimated, when the quantities of interest are probabilities rath...
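To make the notions of a Gibbs run and discarded "burn-in" iterations concrete, here is a minimal sketch of a Gibbs sampler for a standard bivariate normal with correlation `rho`. It is not the diagnostic method of the paper: the burn-in length of 1000 and the run length are fixed by hand as assumptions, whereas the paper is about choosing such numbers from the iterates themselves. The function name and the deliberately poor starting point are also hypothetical.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_iter, burn_in, rng, start=(10.0, -10.0)):
    """Alternate draws from x|y and y|x; keep only iterations after burn-in."""
    x, y = start
    sd = math.sqrt(1.0 - rho * rho)  # conditional standard deviation
    kept = []
    for i in range(n_iter):
        x = rng.gauss(rho * y, sd)   # x | y ~ N(rho * y, 1 - rho^2)
        y = rng.gauss(rho * x, sd)   # y | x ~ N(rho * x, 1 - rho^2)
        if i >= burn_in:
            kept.append((x, y))
    return kept

rng = random.Random(42)
samples = gibbs_bivariate_normal(rho=0.8, n_iter=21000, burn_in=1000, rng=rng)
mean_x = sum(x for x, _ in samples) / len(samples)
mean_xy = sum(x * y for x, y in samples) / len(samples)  # estimates rho here
print(round(mean_x, 3), round(mean_xy, 3))
```

Successive iterates are correlated, so the retained draws carry less information than the same number of independent draws; that dependence is the reason the required run length is a nontrivial question.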
An overview of the logic and rationale of hierarchical linear models
Journal of Management, 1997
Modeling Multilevel Data Structures
American Journal of Political Science, 1997
Cited by 52 (0 self)
Although integrating multiple levels of data into an analysis can often yield better inferences about the phenomenon under study, traditional methodologies used to combine multiple levels of data are problematic. In this paper, we discuss several methodologies under the rubric of multilevel analysis. Multilevel methods, we argue, provide researchers, particularly researchers using comparative data, substantial leverage in overcoming the typical problems associated with either ignoring multiple levels of data, or problems associated with combining lower-level and higher-level data (including overcoming implicit assumptions of fixed and constant effects). The paper discusses several variants of the multilevel model and provides an application of individual-level support for European integration using comparative political data from Western Europe.
Testing for Convergence Clubs in Income per-capita: A Predictive Density Approach
International Economic Review, 1999
Cited by 52 (2 self)
The paper proposes a technique to jointly test for groupings of unknown size in the cross sectional dimension of a panel and estimate the parameters of each group, and applies it to identifying convergence clubs in income per-capita. The approach uses the predictive density of the data, conditional on the parameters of the model. The steady state distribution of European regional data clusters around four poles of attraction with different economic features. The distribution of income per-capita of OECD countries has two poles of attraction and each group has clearly identifiable economic characteristics. JEL Classification No.: C11, D90, O47. Key words: Heterogeneities, Panel Data, Predictive Density, Income Inequality. I would like to thank three anonymous referees, the editor of this journal, Bruce Hansen, Hashem Pesaran, Russell Cooper, Christopher Croux, Albert Marcet and the participants at seminars at Universitat Pompeu Fabra, the University of Southampton, Universite de Paris IM...
Spatially-adaptive penalties for spline fitting
Australian and New Zealand Journal of Statistics, 2000
Cited by 39 (6 self)
We study spline fitting with a roughness penalty that adapts to spatial heterogeneity in the regression function. Our estimates are pth degree piecewise polynomials with p − 1 continuous derivatives. A large and fixed number of knots is used and smoothing is achieved by putting a quadratic penalty on the jumps of the pth derivative at the knots. To be spatially adaptive, the logarithm of the penalty is itself a linear spline but with relatively few knots and with values at the knots chosen to minimize GCV. This locally-adaptive spline estimator is compared with other spline estimators in the literature such as cubic smoothing splines and knot-selection techniques for least-squares regression. Our estimator can be interpreted as an empirical Bayes estimate for a prior allowing spatial heterogeneity. In cases of spatially heterogeneous regression functions,
Bayesian estimation of a multilevel IRT model using Gibbs sampling
Psychometrika, 2001
Cited by 36 (6 self)
In this article, a two-level regression model is imposed on the ability parameters in an item response theory (IRT) model. The advantage of using latent rather than observed scores as dependent variables of a multilevel model is that it offers the possibility of separating the influence of item difficulty and ability level and modeling response variation and measurement error. Another advantage is that, contrary to observed scores, latent scores are test-independent, which offers the possibility of using results from different tests in one analysis where the parameters of the IRT model and the multilevel model can be concurrently estimated. The two-parameter normal ogive model is used for the IRT measurement model. It will be shown that the parameters of the two-parameter normal ogive model and the multilevel model can be estimated in a Bayesian framework using Gibbs sampling. Examples using simulated and real data are given.