Results 1-10 of 97
On Bayesian analysis of mixtures with an unknown number of components
 INSTITUTE OF INTERNATIONAL ECONOMICS PROJECT ON INTERNATIONAL COMPETITION POLICY," COM/DAFFE/CLP/TD(94)42
, 1997
How many clusters? Which clustering method? Answers via model-based cluster analysis
 THE COMPUTER JOURNAL
, 1998
Unsupervised learning of finite mixture models
 IEEE Transactions on pattern analysis and machine intelligence
, 2002
Cited by 271 (20 self)
Abstract: This paper proposes an unsupervised algorithm for learning a finite mixture model from multivariate data. The adjective "unsupervised" is justified by two properties of the algorithm: 1) it is capable of selecting the number of components and 2) unlike the standard expectation-maximization (EM) algorithm, it does not require careful initialization. The proposed method also avoids another drawback of EM for mixture fitting: the possibility of convergence toward a singular estimate at the boundary of the parameter space. The novelty of our approach is that we do not use a model selection criterion to choose one among a set of pre-estimated candidate models; instead, we seamlessly integrate estimation and model selection in a single algorithm. Our technique can be applied to any type of parametric mixture model for which it is possible to write an EM algorithm; in this paper, we illustrate it with experiments involving Gaussian mixtures. These experiments testify for the good performance of our approach. Index Terms: Finite mixtures, unsupervised learning, model selection, minimum message length criterion, Bayesian methods, expectation-maximization algorithm, clustering.
Model-Based Clustering, Discriminant Analysis, and Density Estimation
 JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 2000
Cited by 270 (24 self)
Cluster analysis is the automated search for groups of related observations in a data set. Most clustering done in practice is based largely on heuristic but intuitively reasonable procedures and most clustering methods available in commercial software are also of this type. However, there is little systematic guidance associated with these methods for solving important practical questions that arise in cluster analysis, such as "How many clusters are there?", "Which clustering method should be used?" and "How should outliers be handled?". We outline a general methodology for model-based clustering that provides a principled statistical approach to these issues. We also show that this can be useful for other problems in multivariate analysis, such as discriminant analysis and multivariate density estimation. We give examples from medical diagnosis, minefield detection, cluster recovery from noisy data, and spatial density estimation. Finally, we mention limitations of the methodology, a...
Computational and Inferential Difficulties With Mixture Posterior Distributions
 Journal of the American Statistical Association
, 1999
Cited by 112 (12 self)
This paper deals with both exploration and interpretation problems related to posterior distributions for mixture models. The specification of mixture posterior distributions means that the presence of k! modes is known immediately. Standard Markov chain Monte Carlo techniques usually have difficulties with well-separated modes such as occur here; the Markov chain Monte Carlo sampler stays within a neighbourhood of a local mode and fails to visit other equally important modes. We show that exploration of these modes can be imposed on the Markov chain Monte Carlo sampler using tempered transitions based on Langevin algorithms. However, as the prior distribution does not distinguish between the different components, the posterior mixture distribution is symmetric and thus standard estimators such as posterior means cannot be used. Since this is also true for most nonsymmetric priors, we propose alternatives for Bayesian inference for permutation invariant posteriors, including a cluster...
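The k! modes mentioned in this abstract follow from the fact that a mixture likelihood does not change when the component labels are permuted. A minimal sketch of that invariance is given here for a three-component univariate Gaussian mixture. The data values, the parameter values, and the function name `mixture_loglik` are illustrative assumptions, not taken from the paper, and the sketch does not implement the tempered-transition sampler the paper proposes.

```python
import itertools
import math


def mixture_loglik(data, weights, means, sds):
    """Log-likelihood of a univariate Gaussian mixture (hypothetical helper)."""
    total = 0.0
    for x in data:
        # Mixture density at x: weighted sum of normal densities.
        density = sum(
            w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
            for w, m, s in zip(weights, means, sds)
        )
        total += math.log(density)
    return total


# Illustrative (made-up) data and parameters for k = 3 components.
data = [-2.1, -1.9, 0.1, 0.3, 4.8, 5.2]
weights = [0.2, 0.3, 0.5]
means = [-2.0, 0.0, 5.0]
sds = [0.5, 1.0, 1.5]

# Evaluate the log-likelihood under every relabelling of the components.
logliks = []
for perm in itertools.permutations(range(len(weights))):
    logliks.append(
        mixture_loglik(
            data,
            [weights[i] for i in perm],
            [means[i] for i in perm],
            [sds[i] for i in perm],
        )
    )

# All k! = 6 relabellings give the same value up to floating-point rounding.
all_equal = all(math.isclose(v, logliks[0], rel_tol=1e-9) for v in logliks)
print(len(logliks), all_equal)
```

Each of the six relabellings is a distinct point in parameter space with the same likelihood, so under a prior that treats the components exchangeably the posterior has the same symmetry. This is why a posterior mean computed over all modes is uninformative about any individual component.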
Posterior consistency of Dirichlet mixtures in density estimation
 Ann. Statist
, 1999
Cited by 66 (20 self)
A Dirichlet mixture of normal densities is a useful choice for a prior distribution on densities in the problem of Bayesian density estimation. In recent years, efficient Markov chain Monte Carlo methods for the computation of the posterior distribution have been developed. These methods have been applied to data arising from different fields of interest. The important issue of consistency was, however, left open. In this paper, we settle this issue in the affirmative.
Dirichlet Prior Sieves in Finite Normal Mixtures
 Statistica Sinica
, 2002
Cited by 40 (1 self)
Abstract: The use of a finite dimensional Dirichlet prior in the finite normal mixture model has the effect of acting like a Bayesian method of sieves. Posterior consistency is directly related to the dimension of the sieve and the choice of the Dirichlet parameters in the prior. We find that naive use of the popular uniform Dirichlet prior leads to an inconsistent posterior. However, a simple adjustment to the parameters in the prior induces a random probability measure that approximates the Dirichlet process and yields a posterior that is strongly consistent for the density and weakly consistent for the unknown mixing distribution. The dimension of the resulting sieve can be selected easily in practice and a simple and efficient Gibbs sampler can be used to sample the posterior of the mixing distribution. Key words and phrases: Bose-Einstein distribution, Dirichlet process, identification, method of sieves, random probability measure, relative entropy, weak convergence.
Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities
 Ann. Statist
, 2001
Cited by 35 (10 self)
We study the rates of convergence of the maximum likelihood estimator (MLE) and posterior distribution in density estimation problems, where the densities are location or location-scale mixtures of normal distributions with the scale parameter lying between two positive numbers. The true density is also assumed to lie in this class with the true mixing distribution either compactly supported or having sub-Gaussian tails. We obtain bounds for Hellinger bracketing entropies for this class, and from these bounds, we deduce the convergence rates of (sieve) MLEs in Hellinger distance. The rate turns out to be (log n)^κ/√n, where κ ≥ 1 is a constant that depends on the type of mixtures and the choice of the sieve. Next, we consider a Dirichlet mixture of normals as a prior on the unknown density. We estimate the prior probability of a certain Kullback-Leibler type neighborhood and then invoke a general theorem that computes the posterior convergence rate in terms of the growth rate of the Hellinger entropy and the concentration rate of the prior. The posterior distribution is also seen to converge at the rate (log n)^κ/√n, where κ now depends on the tail behavior of the base measure of the Dirichlet process.