Results 1–10 of 17
On Bayesian analysis of mixtures with an unknown number of components
 Journal of the Royal Statistical Society, Series B
, 1997
Model-Based Clustering, Discriminant Analysis, and Density Estimation
 Journal of the American Statistical Association
, 2000
"... Cluster analysis is the automated search for groups of related observations in a data set. Most clustering done in practice is based largely on heuristic but intuitively reasonable procedures and most clustering methods available in commercial software are also of this type. However, there is little ..."
Abstract

Cited by 260 (24 self)
Cluster analysis is the automated search for groups of related observations in a data set. Most clustering done in practice is based largely on heuristic but intuitively reasonable procedures, and most clustering methods available in commercial software are also of this type. However, there is little systematic guidance associated with these methods for solving important practical questions that arise in cluster analysis, such as "How many clusters are there?", "Which clustering method should be used?" and "How should outliers be handled?". We outline a general methodology for model-based clustering that provides a principled statistical approach to these issues. We also show that this can be useful for other problems in multivariate analysis, such as discriminant analysis and multivariate density estimation. We give examples from medical diagnosis, minefield detection, cluster recovery from noisy data, and spatial density estimation. Finally, we mention limitations of the methodology, a...
Dealing with label switching in mixture models
 Journal of the Royal Statistical Society, Series B
, 2000
"... In a Bayesian analysis of finite mixture models, parameter estimation and clustering are sometimes less straightforward than might be expected. In particular, the common practice of estimating parameters by their posterior mean, and summarising joint posterior distributions by marginal distributions ..."
Abstract

Cited by 109 (0 self)
In a Bayesian analysis of finite mixture models, parameter estimation and clustering are sometimes less straightforward than might be expected. In particular, the common practice of estimating parameters by their posterior mean, and summarising joint posterior distributions by marginal distributions, often leads to nonsensical answers. This is due to the so-called “label-switching” problem, which is caused by symmetry in the likelihood of the model parameters. A frequent response to this problem is to remove the symmetry using artificial identifiability constraints. We demonstrate that this fails in general to solve the problem, and describe an alternative class of approaches, relabelling algorithms, which arise from attempting to minimise the posterior expected loss under a class of loss functions. We describe in detail one particularly simple and general relabelling algorithm, and illustrate its success in dealing with the label-switching problem on two examples.
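To see why naive posterior means fail and how relabelling helps, consider a toy version of the problem: MCMC draws of two component means whose labels randomly switch between iterations. The sketch below relabels each draw by the permutation that best matches a running reference under squared-error loss; this is only a simplified illustration of the relabelling idea, not the paper's algorithm, which uses a more careful loss and iterates to convergence.

```python
import numpy as np
from itertools import permutations

def relabel(draws):
    """Permute each draw's component labels to best match the running mean
    of the already-relabelled draws (squared-error loss)."""
    out = [draws[0].copy()]
    for d in draws[1:]:
        ref = np.mean(out, axis=0)
        perm = min(permutations(range(d.shape[0])),
                   key=lambda p: float(((d[list(p)] - ref) ** 2).sum()))
        out.append(d[list(perm)])
    return np.array(out)

rng = np.random.default_rng(0)
true_means = np.array([-2.0, 2.0])
draws = true_means + 0.1 * rng.standard_normal((500, 2))
flip = rng.random(500) < 0.5                  # simulate label switching in the sampler
draws[flip] = draws[flip][:, ::-1]
naive = draws.mean(axis=0)                    # both coordinates collapse toward 0
fixed = np.sort(relabel(draws).mean(axis=0))  # recovers means near -2 and +2
```

The two-component setup and switching mechanism are fabricated for illustration; with more components, enumerating permutations should be replaced by an assignment solver.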
Accounting for Burstiness in Topic Models
"... Many different topic models have been used successfully for a variety of applications. However, even state-of-the-art topic models suffer from the important flaw that they do not capture the tendency of words to appear in bursts; it is a fundamental property of language that if a word is used once i ..."
Abstract

Cited by 11 (0 self)
Many different topic models have been used successfully for a variety of applications. However, even state-of-the-art topic models suffer from the important flaw that they do not capture the tendency of words to appear in bursts; it is a fundamental property of language that if a word is used once in a document, it is more likely to be used again. We introduce a topic model that uses Dirichlet compound multinomial (DCM) distributions to model this burstiness phenomenon. On both text and non-text datasets, the new model achieves better held-out likelihood than standard latent Dirichlet allocation (LDA). It is straightforward to incorporate the DCM extension into topic models that are more complex than LDA.
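The burstiness effect the abstract describes can be checked directly from the DCM likelihood. The sketch below compares a "bursty" count vector against an evenly spread one, under a multinomial and under a DCM with matching mean; the specific counts and parameter values are illustrative, not from the paper.

```python
from math import lgamma, log

def log_multinomial(x, theta):
    """Log-probability of count vector x under a multinomial with probabilities theta."""
    n = sum(x)
    coeff = lgamma(n + 1) - sum(lgamma(c + 1) for c in x)
    return coeff + sum(c * log(t) for c, t in zip(x, theta) if c > 0)

def log_dcm(x, alpha):
    """Log-probability of x under the Dirichlet compound multinomial (Polya) distribution."""
    n, A = sum(x), sum(alpha)
    coeff = lgamma(n + 1) - sum(lgamma(c + 1) for c in x)
    return (coeff + lgamma(A) - lgamma(A + n)
            + sum(lgamma(a + c) - lgamma(a) for a, c in zip(alpha, x)))

bursty, spread = [4, 0, 0, 0], [1, 1, 1, 1]   # same vocabulary, same total count
alpha = [0.5] * 4                              # DCM parameter (illustrative)
theta = [0.25] * 4                             # multinomial with the same mean
# The multinomial prefers the evenly spread document, while the DCM gives
# the bursty document higher probability: reusing a word raises its weight.
```

This captures the abstract's point: under the DCM, a word's first occurrence makes further occurrences more likely, which a multinomial cannot express.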
Sparse Bayesian infinite factor models
"... We focus on sparse modeling of high-dimensional covariance matrices using Bayesian latent factor models. We propose a multiplicative gamma process shrinkage prior on the factor loadings which allows introduction of infinitely many factors, with the loadings increasingly shrunk toward zero as the col ..."
Abstract

Cited by 11 (7 self)
We focus on sparse modeling of high-dimensional covariance matrices using Bayesian latent factor models. We propose a multiplicative gamma process shrinkage prior on the factor loadings which allows introduction of infinitely many factors, with the loadings increasingly shrunk toward zero as the column index increases. We use our prior on a parameter-expanded loadings matrix to avoid the order dependence typical of factor analysis models, and develop a highly efficient Gibbs sampler that scales well as data dimensionality increases. The gain in efficiency is achieved by the joint conjugacy property of the proposed prior, which allows block updating of the loadings matrix. We propose an adaptive Gibbs sampler for automatically truncating the infinite loadings matrix through selection of the number of important factors. Theoretical results are provided on the support of the prior and truncation approximation bounds. A fast algorithm is proposed to produce approximate Bayes estimates. Latent factor regression methods are developed for prediction and variable selection in applications with high-dimensional correlated predictors. Operating characteristics are assessed through simulation studies, and the approach is applied to predict survival after chemotherapy from gene expression data.
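The column-wise shrinkage in the multiplicative gamma process can be seen by sampling from the prior: column precisions are cumulative products of gamma draws, so later columns of the loadings matrix concentrate near zero. The sketch below is a simplified prior draw; the local per-element precisions of the full prior are omitted, and the hyperparameter values are illustrative.

```python
import numpy as np

def sample_mgp_loadings(p, H, a1=2.0, a2=3.0, seed=0):
    """Draw a p x H loadings matrix from a simplified multiplicative gamma
    process prior: delta_1 ~ Gamma(a1, 1), delta_h ~ Gamma(a2, 1) for h >= 2,
    tau_h = prod_{l<=h} delta_l, and lambda_{jh} ~ N(0, 1/tau_h)."""
    rng = np.random.default_rng(seed)
    delta = np.concatenate([rng.gamma(a1, 1.0, 1), rng.gamma(a2, 1.0, H - 1)])
    tau = np.cumprod(delta)                            # precisions grow with column index
    lam = rng.standard_normal((p, H)) / np.sqrt(tau)   # later columns shrunk toward zero
    return lam, tau

lam, tau = sample_mgp_loadings(p=50, H=10)
early = np.abs(lam[:, :3]).mean()    # typical magnitude in the first columns
late = np.abs(lam[:, -3:]).mean()    # much smaller in the last columns
```

With a2 > 1 the precisions increase stochastically, which is what licenses truncating the "infinite" loadings matrix after a few important factors.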
An EM-like algorithm for semi- and nonparametric estimation in multivariate mixtures
, 2008
Transposable Regularized Covariance Models with an Application to Missing Data Imputation
, 2008
"... Missing data estimation is an important challenge with high-dimensional data arranged in the form of a matrix. Typically this data is transposable, meaning that either the rows, columns or both can be treated as features. To model transposable data, we present a modification of the matrix-variate no ..."
Abstract

Cited by 4 (0 self)
Missing data estimation is an important challenge with high-dimensional data arranged in the form of a matrix. Typically this data is transposable, meaning that the rows, the columns, or both can be treated as features. To model transposable data, we present a modification of the matrix-variate normal, the mean-restricted matrix-variate normal, in which the rows and columns each have a separate mean vector and covariance matrix. We extend regularized covariance models, which place an additive penalty on the inverse covariance matrix, to this distribution by placing separate penalties on the covariances of the rows and columns. These so-called transposable regularized covariance models allow for maximum likelihood estimation of the mean and nonsingular covariance matrices. Using these models, we formulate EM-type algorithms for missing data imputation in both the multivariate and transposable frameworks. Exploiting the structure of our transposable models, we present techniques enabling use of our models with high-dimensional data, and give a computationally feasible one-step approximation for imputation. Simulations and results on microarray data and the Netflix data show that these imputation techniques often outperform existing methods and offer a greater degree of flexibility.
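The multivariate (non-transposable) version of EM imputation the abstract mentions can be sketched quite compactly: fill each missing entry with its conditional mean under a multivariate normal, then re-estimate the mean and covariance, and repeat. This is a simplified sketch, assuming an unpenalized normal model; it drops the conditional-covariance correction of full EM and has none of the paper's transposable structure or penalties.

```python
import numpy as np

def em_impute(X, n_iter=50):
    """EM-style imputation of NaN entries under a multivariate normal model."""
    X = X.copy()
    miss = np.isnan(X)
    col_mean = np.nanmean(X, axis=0)
    X[miss] = np.take(col_mean, np.where(miss)[1])     # start from column-mean fill
    for _ in range(n_iter):
        mu = X.mean(axis=0)                            # M-step: refit mean and covariance
        S = np.cov(X, rowvar=False)
        for i in np.where(miss.any(axis=1))[0]:        # E-step: conditional means
            m = miss[i]
            S_mo = S[np.ix_(m, ~m)]
            S_oo = S[np.ix_(~m, ~m)]
            X[i, m] = mu[m] + S_mo @ np.linalg.solve(S_oo, X[i, ~m] - mu[~m])
    return X

# Fabricated demo: two strongly correlated columns, 40 hidden values in column 1.
rng = np.random.default_rng(0)
x = rng.standard_normal(200)
y = 2.0 * x + 0.1 * rng.standard_normal(200)
X_miss = np.column_stack([x, y])
X_miss[:40, 1] = np.nan
imputed = em_impute(X_miss)
rmse_em = np.sqrt(((imputed[:40, 1] - y[:40]) ** 2).mean())
rmse_mean = np.sqrt(((np.nanmean(X_miss[:, 1]) - y[:40]) ** 2).mean())
```

Because the imputation exploits the cross-column correlation, it beats plain column-mean filling by a wide margin on this toy example.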
Group Anomaly Detection using Flexible Genre Models
"... An important task in exploring and analyzing real-world data sets is to detect unusual and interesting phenomena. In this paper, we study the group anomaly detection problem. Unlike traditional anomaly detection research that focuses on data points, our goal is to discover anomalous aggregated behav ..."
Abstract

Cited by 1 (0 self)
An important task in exploring and analyzing real-world data sets is to detect unusual and interesting phenomena. In this paper, we study the group anomaly detection problem. Unlike traditional anomaly detection research that focuses on data points, our goal is to discover anomalous aggregated behaviors of groups of points. For this purpose, we propose the Flexible Genre Model (FGM). FGM is designed to characterize data groups at both the point level and the group level so as to detect various types of group anomalies. We evaluate the effectiveness of FGM on both synthetic and real data sets, including images and turbulence data, and show that it is superior to existing approaches in detecting group anomalies.
Strategies for getting . . .
, 2001
"... We compare simple strategies to get maximum likelihood parameter estimation in mixture models when using the EM algorithm. All considered strategies aim to initialise the EM algorithm well. They are based on random initialisation, using a Classification EM algorithm (CEM), a Stochastic E ..."
Abstract
We compare simple strategies to get maximum likelihood parameter estimates in mixture models when using the EM algorithm. All the strategies considered aim to initialise the EM algorithm well. They are based on random initialisation, using a Classification EM algorithm (CEM), a Stochastic EM algorithm (SEM), or previous short runs of EM itself. They are compared in the context of multivariate Gaussian mixtures on the basis of numerical experiments on both simulated and real data sets. The main conclusions of those numerical experiments are the following. The simple random initialisation, which is probably the most common way of initiating EM, is often outperformed by strategies using CEM, SEM, or short runs of EM before running EM; those strategies can therefore be preferred to random initialisation. Also, repeating runs of EM is generally profitable, since a single run of EM can often lead to suboptimal solutions. Otherwise, none of the strategies tested can be regarded as the best one, and it is difficult to characterise situations where a particular strategy can be expected to outperform the others. However, the strategy of initiating EM with repeated short runs of EM can be recommended. This strategy, which as far as we know had not been used before the present study, has some advantages: it is simple, performs well in many situations, presupposes no particular form of the mixture to be fitted to the data, and seems little sensitive to noisy data.
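The recommended "short runs of EM" strategy is easy to sketch: launch several randomly initialised runs for a handful of iterations, keep the one with the best log-likelihood, and continue only that one. The univariate mixture, counts, and iteration budgets below are illustrative; the comparison in the abstract is for multivariate Gaussian mixtures.

```python
import numpy as np

def em_step(x, w, mu, var):
    """One EM iteration for a univariate Gaussian mixture; also returns the
    log-likelihood of the parameters it started from."""
    dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    ll = np.log(dens.sum(axis=1)).sum()
    r = dens / dens.sum(axis=1, keepdims=True)
    nk = r.sum(axis=0)
    w = nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / nk
    var = np.maximum((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk, 1e-3)
    return w, mu, var, ll

def run_em(x, w, mu, var, n_iter):
    ll = -np.inf
    for _ in range(n_iter):
        w, mu, var, ll = em_step(x, w, mu, var)
    return w, mu, var, ll

def random_start(x, k, rng):
    return np.full(k, 1.0 / k), rng.choice(x, k, replace=False), np.full(k, x.var())

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-4, 1, 200), rng.normal(0, 1, 200), rng.normal(4, 1, 200)])
k, short_iters, n_starts = 3, 5, 10
# Short-runs phase: many cheap starts, keep the most promising one.
candidates = [run_em(x, *random_start(x, k, rng), short_iters) for _ in range(n_starts)]
best_short = max(candidates, key=lambda c: c[3])
# Long run continues from the best short run; EM's monotonicity guarantees
# the final log-likelihood is at least the selected short-run value.
w, mu, var, ll_final = run_em(x, best_short[0], best_short[1], best_short[2], 200)
```

A single randomly initialised long run replaces the whole candidates/best_short stage with one call to `run_em`, which is exactly the baseline the abstract reports as often suboptimal.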