On Bayesian analysis of mixtures with an unknown number of components
- INSTITUTE OF INTERNATIONAL ECONOMICS PROJECT ON INTERNATIONAL COMPETITION POLICY," COM/DAFFE/CLP/TD(94)42
, 1997
Model-Based Clustering, Discriminant Analysis, and Density Estimation
- Journal of the American Statistical Association
, 2000
Cited by 172 (23 self)
Cluster analysis is the automated search for groups of related observations in a data set. Most clustering done in practice is based largely on heuristic but intuitively reasonable procedures and most clustering methods available in commercial software are also of this type. However, there is little systematic guidance associated with these methods for solving important practical questions that arise in cluster analysis, such as "How many clusters are there?", "Which clustering method should be used?" and "How should outliers be handled?". We outline a general methodology for model-based clustering that provides a principled statistical approach to these issues. We also show that this can be useful for other problems in multivariate analysis, such as discriminant analysis and multivariate density estimation. We give examples from medical diagnosis, minefield detection, cluster recovery from noisy data, and spatial density estimation. Finally, we mention limitations of the methodology, a...
Dealing with label switching in mixture models
- Journal of the Royal Statistical Society, Series B
, 2000
Cited by 72 (0 self)
In a Bayesian analysis of finite mixture models, parameter estimation and clustering are sometimes less straightforward than might be expected. In particular, the common practice of estimating parameters by their posterior mean, and summarising joint posterior distributions by marginal distributions, often leads to nonsensical answers. This is due to the so-called “label switching” problem, which is caused by symmetry in the likelihood of the model parameters. A frequent response to this problem is to remove the symmetry using artificial identifiability constraints. We demonstrate that this fails in general to solve the problem, and describe an alternative class of approaches, relabelling algorithms, which arise from attempting to minimise the posterior expected loss under a class of loss functions. We describe in detail one particularly simple and general relabelling algorithm, and illustrate its success in dealing with the label switching problem on two examples.
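To make the label switching problem concrete, here is a minimal sketch of a relabelling step. It is not the algorithm from the paper: it assumes the MCMC output is just a list of component means per draw and uses a squared-error loss against a running reference, which is a deliberate simplification. The function names `relabel`, `column_means` and `relabel_iteratively` are hypothetical.

```python
from itertools import permutations


def column_means(draws):
    """Average each component's parameter across all draws."""
    n = len(draws)
    return [sum(d[j] for d in draws) / n for j in range(len(draws[0]))]


def relabel(draws, reference):
    """Permute the labels of each draw to minimise squared error to `reference`."""
    k = len(reference)
    aligned = []
    for draw in draws:
        best = min(
            permutations(range(k)),
            key=lambda p: sum((draw[p[j]] - reference[j]) ** 2 for j in range(k)),
        )
        aligned.append([draw[j] for j in best])
    return aligned


def relabel_iteratively(draws, max_iter=20):
    """Alternate between aligning draws and updating the reference estimate."""
    reference = list(draws[0])  # assumption: the first draw is a usable starting point
    aligned = [list(d) for d in draws]
    for _ in range(max_iter):
        aligned = relabel(draws, reference)
        new_reference = column_means(aligned)
        if new_reference == reference:  # no change, so the labelling is stable
            break
        reference = new_reference
    return aligned


# Toy draws in which the two labels swap between iterations.
draws = [[-2.0, 2.1], [2.0, -1.9], [-2.1, 1.9], [1.9, -2.0]]
naive = column_means(draws)                           # both averages end up near zero
relabelled = column_means(relabel_iteratively(draws))  # averages near -2 and 2
```

In the toy draws the naive posterior means of the two components both land near zero, which is the kind of nonsensical summary the abstract describes; after relabelling they separate again. Enumerating all permutations is only practical for a small number of components.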
Accounting for Burstiness in Topic Models
Cited by 8 (0 self)
Many different topic models have been used successfully for a variety of applications. However, even state-of-the-art topic models suffer from the important flaw that they do not capture the tendency of words to appear in bursts; it is a fundamental property of language that if a word is used once in a document, it is more likely to be used again. We introduce a topic model that uses Dirichlet compound multinomial (DCM) distributions to model this burstiness phenomenon. On both text and non-text datasets, the new model achieves better held-out likelihood than standard latent Dirichlet allocation (LDA). It is straightforward to incorporate the DCM extension into topic models that are more complex than LDA.
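The burstiness effect can be illustrated with the standard DCM (Dirichlet-multinomial) probability mass function alone, without any topic model around it. The sketch below is only that illustration, not the model from the paper; the two-word vocabulary and the parameter values are assumptions chosen to make the effect visible.

```python
import math


def log_multinomial_coefficient(counts):
    """log of n! / (x_1! ... x_W!) for a count vector."""
    n = sum(counts)
    return math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)


def dcm_log_prob(counts, alpha):
    """Log-probability of word counts under a Dirichlet compound multinomial."""
    n = sum(counts)
    a = sum(alpha)
    log_norm = math.lgamma(a) - math.lgamma(a + n)
    log_terms = sum(
        math.lgamma(x + al) - math.lgamma(al) for x, al in zip(counts, alpha)
    )
    return log_multinomial_coefficient(counts) + log_norm + log_terms


def multinomial_log_prob(counts, probs):
    """Log-probability of word counts under a plain multinomial."""
    return log_multinomial_coefficient(counts) + sum(
        x * math.log(p) for x, p in zip(counts, probs) if x > 0
    )


# Two-word vocabulary, document of length 2 (illustrative values only).
alpha = [0.1, 0.1]   # small alpha gives strongly bursty behaviour
probs = [0.5, 0.5]   # multinomial with the same expected word frequencies

dcm_repeat = math.exp(dcm_log_prob([2, 0], alpha))          # same word twice
dcm_mixed = math.exp(dcm_log_prob([1, 1], alpha))           # one of each word
mult_repeat = math.exp(multinomial_log_prob([2, 0], probs))
mult_mixed = math.exp(multinomial_log_prob([1, 1], probs))
```

With these values the DCM puts more mass on repeating a word than on using both words once, while the multinomial with the same mean frequencies does the opposite.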
Sparse Bayesian infinite factor models
Cited by 7 (4 self)
We focus on sparse modeling of high-dimensional covariance matrices using Bayesian latent factor models. We propose a multiplicative gamma process shrinkage prior on the factor loadings which allows introduction of infinitely many factors, with the loadings increasingly shrunk toward zero as the column index increases. We use our prior on a parameter expanded loadings matrix to avoid the order dependence typical in factor analysis models and develop a highly efficient Gibbs sampler that scales well as data dimensionality increases. The gain in efficiency is achieved by the joint conjugacy property of the proposed prior, which allows block updating of the loadings matrix. We propose an adaptive Gibbs sampler for automatically truncating the infinite loadings matrix through selection of the number of important factors. Theoretical results are provided on the support of the prior and truncation approximation bounds. A fast algorithm is proposed to produce approximate Bayes estimates. Latent factor regression methods are developed for prediction and variable selection in applications with high-dimensional correlated predictors. Operating characteristics are assessed through simulation studies and the approach is applied to predict survival after chemotherapy from gene expression data.
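The abstract does not give the form of the prior, so the following is a sketch under stated assumptions: the column precision is taken to be a cumulative product of independent gamma variables, with the first factor drawn from Gamma(a1, 1) and later factors from Gamma(a2, 1) with a2 > 1. The local (per-element) shrinkage terms are omitted, and the names `sample_column_precisions` and `sample_loadings` as well as the default values of `a1` and `a2` are hypothetical.

```python
import random


def sample_column_precisions(num_columns, a1=2.0, a2=3.0, rng=None):
    """Draw column precisions tau_h as a cumulative product of gamma variables."""
    rng = rng or random.Random()
    taus = []
    tau = 1.0
    for h in range(num_columns):
        shape = a1 if h == 0 else a2
        tau *= rng.gammavariate(shape, 1.0)  # shape, then scale = 1
        taus.append(tau)
    return taus


def sample_loadings(num_rows, taus, rng=None):
    """Draw a loadings matrix whose column h has variance 1 / tau_h."""
    rng = rng or random.Random()
    return [
        [rng.gauss(0.0, tau ** -0.5) for tau in taus]
        for _ in range(num_rows)
    ]


rng = random.Random(0)
taus = sample_column_precisions(10, rng=rng)
loadings = sample_loadings(5, taus, rng=rng)  # 5 x 10 matrix of loadings
```

Because each factor with shape a2 > 1 tends to exceed one, the precisions tend to grow with the column index, so later columns of the loadings matrix are typically shrunk more strongly toward zero. Any single draw need not be monotone.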
An EM-like algorithm for semi- and non-parametric estimation in multivariate mixtures
, 2008
Transposable Regularized Covariance Models with an Application to Missing Data Imputation
, 2008
Cited by 1 (0 self)
Missing data estimation is an important challenge with high-dimensional data arranged in the form of a matrix. Typically this data is transposable, meaning that either the rows, columns or both can be treated as features. To model transposable data, we present a modification of the matrix-variate normal, the mean-restricted matrix-variate normal, in which the rows and columns each have a separate mean vector and covariance matrix. We extend regularized covariance models, which place an additive penalty on the inverse covariance matrix, to this distribution, by placing separate penalties on the covariances of the rows and columns. These so-called transposable regularized covariance models allow for maximum likelihood estimation of the mean and non-singular covariance matrices. Using these models, we formulate EM-type algorithms for missing data imputation in both the multivariate and transposable frameworks. Exploiting the structure of our transposable models, we present techniques enabling use of our models with high-dimensional data and give a computationally feasible one-step approximation for imputation. Simulations and results on microarray data and the Netflix data show that these imputation techniques often outperform existing methods and offer a greater degree of flexibility.
Strategies for getting . . .
, 2001
We compare simple strategies for maximum likelihood parameter estimation in mixture models when using the EM algorithm. All the strategies considered aim to initiate the EM algorithm well. They are based on random initialisation, a Classification EM algorithm (CEM), a Stochastic EM algorithm (SEM) or previous short runs of EM itself. They are compared in the context of multivariate Gaussian mixtures on the basis of numerical experiments on both simulated and real data sets. The main conclusions of those numerical experiments are the following. Simple random initialisation, which is probably the most widely used way of initiating EM, is often outperformed by strategies using CEM, SEM or short runs of EM before running EM. Thus, those strategies can be preferred to the random initialisation strategy. Also, it appears that repeating runs of EM is generally profitable, since a single run of EM can often lead to suboptimal solutions. Otherwise, none of the strategies tested can be regarded as the best one, and it is difficult to characterize situations where a particular strategy can be expected to outperform the others. However, the strategy of initiating EM with repeated short runs of EM can be recommended. This strategy, which as far as we know was not used before the present study, has some advantages. It is simple, performs well in many situations without presupposing any particular form of the mixture to be fitted to the data, and seems fairly insensitive to noisy data.
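The short-runs strategy recommended above can be sketched for a one-dimensional Gaussian mixture. This is an illustration under assumptions, not the authors' procedure: the paper works with multivariate mixtures, and the number of starts, the iteration counts, the variance floor `MIN_SD` and all function names are choices made here for the example.

```python
import math
import random

MIN_SD = 1e-3  # assumed floor that keeps a component from collapsing onto one point


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def log_likelihood(data, params):
    weights, means, sds = params
    return sum(
        math.log(sum(w * normal_pdf(x, m, s)
                     for w, m, s in zip(weights, means, sds)) + 1e-300)
        for x in data
    )


def em_step(data, params):
    weights, means, sds = params
    # E-step: posterior probability of each component for each point.
    resp = []
    for x in data:
        dens = [w * normal_pdf(x, m, s) + 1e-300
                for w, m, s in zip(weights, means, sds)]
        total = sum(dens)
        resp.append([d / total for d in dens])
    # M-step: weighted updates of proportions, means and standard deviations.
    new_w, new_m, new_s = [], [], []
    for j in range(len(weights)):
        nj = sum(r[j] for r in resp)
        mj = sum(r[j] * x for r, x in zip(resp, data)) / nj
        vj = sum(r[j] * (x - mj) ** 2 for r, x in zip(resp, data)) / nj
        new_w.append(nj / len(data))
        new_m.append(mj)
        new_s.append(max(math.sqrt(vj), MIN_SD))
    return new_w, new_m, new_s


def run_em(data, params, iters):
    for _ in range(iters):
        params = em_step(data, params)
    return params


def random_start(data, k, rng):
    """Equal weights, means at k random data points, pooled standard deviation."""
    mean = sum(data) / len(data)
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))
    return [1.0 / k] * k, rng.sample(data, k), [max(sd, MIN_SD)] * k


def short_runs_em(data, k, n_starts=10, short_iters=5, long_iters=200, seed=0):
    """Run several short EM runs, keep the best one, then run EM from it at length."""
    rng = random.Random(seed)
    candidates = [run_em(data, random_start(data, k, rng), short_iters)
                  for _ in range(n_starts)]
    best = max(candidates, key=lambda p: log_likelihood(data, p))
    return run_em(data, best, long_iters)
```

A call such as `short_runs_em(data, k=2)` returns the fitted weights, means and standard deviations. Each short run costs only a few iterations, so the extra starts add little to the cost of the final long run.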
Group Anomaly Detection using Flexible Genre Models
An important task in exploring and analyzing real-world data sets is to detect unusual and interesting phenomena. In this paper, we study the group anomaly detection problem. Unlike traditional anomaly detection research that focuses on data points, our goal is to discover anomalous aggregated behaviors of groups of points. For this purpose, we propose the Flexible Genre Model (FGM). FGM is designed to characterize data groups at both the point level and the group level so as to detect various types of group anomalies. We evaluate the effectiveness of FGM on both synthetic and real data sets including images and turbulence data, and show that it is superior to existing approaches in detecting group anomalies.

