Results 1-10 of 18
Dealing with label switching in mixture models
Journal of the Royal Statistical Society, Series B, 2000
Cited by 109 (0 self)
In a Bayesian analysis of finite mixture models, parameter estimation and clustering are sometimes less straightforward than might be expected. In particular, the common practice of estimating parameters by their posterior mean, and summarising joint posterior distributions by marginal distributions, often leads to nonsensical answers. This is due to the so-called “label-switching” problem, which is caused by symmetry in the likelihood of the model parameters. A frequent response to this problem is to remove the symmetry using artificial identifiability constraints. We demonstrate that this fails in general to solve the problem, and describe an alternative class of approaches, relabelling algorithms, which arise from attempting to minimise the posterior expected loss under a class of loss functions. We describe in detail one particularly simple and general relabelling algorithm, and illustrate its success in dealing with the label-switching problem on two examples.
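The effect described in this abstract can be illustrated with a small sketch. It is not the relabelling algorithm of the paper: it uses a swapped-in, single-pass simplification that permutes each draw to minimise squared distance to a fixed reference draw. The simulated draws, the function names `relabel_to_reference` and `column_means`, and the choice of reference are all assumptions made for illustration.

```python
import itertools
import random

def relabel_to_reference(draws, reference):
    """Permute each draw's component labels to best match `reference`.

    Assumption: squared distance to one fixed reference draw stands in
    for the loss function; the paper's own algorithm is not reproduced.
    """
    k = len(reference)
    relabelled = []
    for draw in draws:
        best = min(
            itertools.permutations(range(k)),
            key=lambda perm: sum(
                (draw[perm[j]] - reference[j]) ** 2 for j in range(k)
            ),
        )
        relabelled.append([draw[j] for j in best])
    return relabelled

def column_means(draws):
    """Average each labelled component over all draws."""
    n = len(draws)
    return [sum(d[j] for d in draws) / n for j in range(len(draws[0]))]

rng = random.Random(0)
# Simulated (hypothetical) sampler output for two component means near
# -2 and +2, with labels swapped at random to mimic label switching.
draws = []
for _ in range(2000):
    pair = [rng.gauss(-2.0, 0.3), rng.gauss(2.0, 0.3)]
    if rng.random() < 0.5:
        pair.reverse()
    draws.append(pair)

naive = column_means(draws)  # label-wise posterior means, no relabelling
fixed = column_means(relabel_to_reference(draws, reference=draws[0]))
print("naive means:", naive)
print("relabelled means:", fixed)
```

In this toy setting both entries of `naive` sit near 0, which describes neither component, while `fixed` recovers one value near -2 and one near 2, in whichever order the reference draw carries.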
Markov Chain Monte Carlo methods and the label switching problem in Bayesian mixture modelling
Statistical Science
Cited by 51 (4 self)
In the past ten years there has been a dramatic increase of interest in the Bayesian analysis of finite mixture models. This is primarily because of the emergence of Markov chain Monte Carlo (MCMC) methods. While MCMC provides a convenient way to draw inference from complicated statistical models, there are many, perhaps underappreciated, problems associated with the MCMC analysis of mixtures. The problems are mainly caused by the non-identifiability of the components under symmetric priors, which leads to so-called label switching in the MCMC output. This means that ergodic averages of component-specific quantities will be identical and thus useless for inference. We review the solutions to the label switching problem, such as artificial identifiability constraints, relabelling algorithms and label-invariant loss functions. We also review various MCMC sampling schemes that have been suggested for mixture models and discuss posterior sensitivity to prior specification.
Bayesian Model Assessment and Comparison Using Cross-Validation Predictive Densities
Neural Computation, 2002
Cited by 26 (10 self)
In this work, we discuss practical methods for the assessment, comparison, and selection of complex hierarchical Bayesian models. A natural way to assess the goodness of the model is to estimate its future predictive capability by estimating expected utilities. Instead of just making a point estimate, it is important to obtain the distribution of the expected utility estimate, as it describes the uncertainty in the estimate. The distributions of the expected utility estimates can also be used to compare models, for example, by computing the probability of one model having a better expected utility than some other model. We propose an approach using cross-validation predictive densities to obtain expected utility estimates and Bayesian bootstrap to obtain samples from their distributions. We also discuss the probabilistic assumptions made and properties of two practical cross-validation methods, importance sampling and k-fold cross-validation. As illustrative examples, we use MLP neural networks and Gaussian Processes (GP) with Markov chain Monte Carlo sampling in one toy problem and two challenging real-world problems.
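The comparison step in this abstract can be sketched roughly as follows. The per-observation utilities `u_a` and `u_b` are simulated placeholders rather than real cross-validation output, and the function name `bayesian_bootstrap_means` and the paired-difference setup are assumptions made for illustration. The only technique used is the Bayesian bootstrap: reweighting the observations with Dirichlet(1, ..., 1) weights.

```python
import random

def bayesian_bootstrap_means(values, n_draws, rng):
    """Draw samples of the weighted mean of `values` under Dirichlet(1,...,1) weights."""
    samples = []
    for _ in range(n_draws):
        # Normalised Exp(1) variates form a Dirichlet(1, ..., 1) weight vector.
        g = [rng.expovariate(1.0) for _ in values]
        total = sum(g)
        samples.append(sum(w * v for w, v in zip(g, values)) / total)
    return samples

rng = random.Random(1)
# Hypothetical per-observation utilities for two models, for example
# cross-validation log predictive densities (simulated here).
u_a = [rng.gauss(-1.0, 0.5) for _ in range(100)]
u_b = [u - 0.3 + rng.gauss(0.0, 0.3) for u in u_a]

# Paired differences keep the per-observation pairing between the models.
diffs = [a - b for a, b in zip(u_a, u_b)]
samples = bayesian_bootstrap_means(diffs, 2000, rng)

# Fraction of bootstrap samples in which model A has the higher expected utility.
prob_a_better = sum(s > 0.0 for s in samples) / len(samples)
print("P(model A better) is approximately", prob_a_better)
```

Working with paired differences is a design choice of this sketch; it avoids treating the two models' utilities as independent when they are computed on the same observations.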
Computing Normalizing Constants for Finite Mixture Models via Incremental Mixture Importance Sampling (IMIS)
2003
Cited by 14 (5 self)
We propose a method for approximating integrated likelihoods in finite mixture models. We formulate the model in terms of the unobserved group memberships, z, and make them the variables of integration. The integral is then evaluated using importance sampling over the z. We propose an adaptive importance sampling function which is itself a mixture, with two types of component distributions, one concentrated and one diffuse. The more concentrated type of component serves the usual purpose of an importance sampling function, sampling mostly group assignments of high posterior probability. The less concentrated type of component allows for the importance sampling function to explore the space in a controlled way to find other, unvisited assignments with high posterior probability. Components are added adaptively, one at a time, to cover areas of high posterior probability not well covered by the current importance sampling function. The method is called Incremental Mixture Importance Sampling (IMIS). IMIS is easy to implement and to monitor for convergence. It scales easily for higher dimensional
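The role of the two component types can be illustrated with a much simpler, static sketch. This is not IMIS: nothing is added adaptively, and the integration runs over a continuous variable rather than over group memberships z. The target density, the proposal parameters, and all function names below are assumptions chosen so that the true normalising constant is known in closed form.

```python
import math
import random

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def target_unnormalised(x):
    # Hypothetical target with two well-separated modes;
    # its true normalising constant is 2 * sqrt(2 * pi).
    return math.exp(-0.5 * x * x) + math.exp(-0.5 * (x - 6.0) ** 2)

def proposal_pdf(x):
    # Half the mass on a concentrated component, half on a diffuse one.
    return 0.5 * normal_pdf(x, 0.0, 1.0) + 0.5 * normal_pdf(x, 0.0, 5.0)

def proposal_draw(rng):
    # Pick a component with equal probability, then sample from it.
    sd = 1.0 if rng.random() < 0.5 else 5.0
    return rng.gauss(0.0, sd)

rng = random.Random(3)
n = 20000
weights = []
for _ in range(n):
    x = proposal_draw(rng)
    weights.append(target_unnormalised(x) / proposal_pdf(x))

# The average importance weight estimates the normalising constant.
z_hat = sum(weights) / n
z_true = 2.0 * math.sqrt(2.0 * math.pi)
print("estimate:", z_hat, "true value:", z_true)
```

The concentrated component alone would rarely reach the mode at 6; the diffuse component is what lets the estimate account for it, at the cost of some wasted draws.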
On Bayesian model assessment and choice using cross-validation predictive densities
2001
Cited by 7 (7 self)
We consider the problem of estimating the distribution of the expected utility of the Bayesian model (expected utility is also known as generalization error). We use the cross-validation predictive densities to compute the expected utilities. We demonstrate that in flexible nonlinear models having many parameters, the importance sampling approximated leave-one-out cross-validation (IS-LOO-CV) proposed by Gelfand et al. (1992) may not work. We discuss how the reliability of the importance sampling can be evaluated, and where there is reason to doubt that reliability, we suggest using predictive densities from the k-fold cross-validation (k-fold-CV). We also note that the k-fold-CV has to be used if data points have certain dependencies. As the k-fold-CV predictive densities are based on slightly smaller data sets than the full data set, we use a bias correction proposed by Burman (1989) when computing the expected utilities. In order to assess the reliability of the estimated expected utilities, we suggest a quick and generic approach based on the Bayesian bootstrap for obtaining samples from the distributions of the expected utilities. Our main goal is to estimate how good (in terms of application field) the predictive ability of the model is, but the distributions of the expected utilities can also be used for comparing different models. With the proposed method, it is easy to compute the probability that one method has better expected utility than some other method. If the predictive likelihood is used as a utility (instead
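A minimal sketch of importance-sampling leave-one-out on a toy model may help make the reliability issue concrete. The model (normal observations with known unit variance and a flat prior on the mean), the simulated data, and the function name `is_loo` are assumptions for illustration. The effective sample size of the importance weights is used here as one common reliability diagnostic; whether it matches the paper's exact procedure is not confirmed by the abstract.

```python
import math
import random

def normal_pdf(y, mean, sd):
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def is_loo(y_i, theta_draws):
    """Importance-sampling LOO predictive density for one observation.

    Weights are proportional to 1 / p(y_i | theta), so the estimate is the
    harmonic mean of the likelihood values over the posterior draws.
    Also returns the effective sample size of those weights.
    """
    ratios = [1.0 / normal_pdf(y_i, th, 1.0) for th in theta_draws]
    total = sum(ratios)
    loo_density = len(ratios) / total
    ess = total * total / sum(r * r for r in ratios)
    return loo_density, ess

rng = random.Random(2)
# Simulated data; the last point is a deliberate outlier.
y = [rng.gauss(0.0, 1.0) for _ in range(20)] + [6.0]
n = len(y)

# Toy model: y ~ Normal(theta, 1) with a flat prior, so the posterior
# for theta is Normal(mean(y), 1 / sqrt(n)).
post_mean, post_sd = sum(y) / n, 1.0 / math.sqrt(n)
theta = [rng.gauss(post_mean, post_sd) for _ in range(4000)]

results = [is_loo(y_i, theta) for y_i in y]
for y_i, (density, ess) in zip(y, results):
    print(f"y={y_i:6.2f}  loo_density={density:.4g}  ess={ess:.0f}")
```

A small effective sample size for an observation signals that a few posterior draws dominate its weights, which is the situation in which falling back to k-fold cross-validation is suggested.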
Bayesian Input Variable Selection Using Posterior Probabilities and Expected Utilities
2002
Cited by 6 (1 self)
We consider input variable selection in complex Bayesian hierarchical models. Our goal is to find a model with the smallest number of input variables having statistically or practically at least the same expected utility as the full model with all the available inputs. A good estimate for the expected utility can be computed using cross-validation predictive densities. In the case of input selection and a large number of input combinations, the computation of the cross-validation predictive densities for each model easily becomes computationally prohibitive. We propose to use the posterior probabilities obtained via variable dimension MCMC methods to identify potentially useful input combinations, for which the final model choice and assessment are done using the expected utilities.
Dealing With Multimodal Posteriors and Non-Identifiability in Mixture Models
1999
Cited by 3 (0 self)
In a Bayesian analysis of finite mixture models, the lack of identifiability of the parameters often leads to a posterior distribution which is highly multimodal and symmetric, making it difficult to interpret or summarize. A common approach to this problem is to make the parameters identifiable by imposing artificial constraints. We demonstrate that this may fail to solve the problem, and describe and illustrate an alternative solution which involves post-processing the results of a Markov Chain Monte Carlo (MCMC) scheme. Our method can be viewed either as a method of searching for a reasonable summary of the posterior distribution, or as a method of revising the prior distribution.
KEYWORDS: Bayesian, Classification, Clustering, Identifiability, MCMC, Mixture model, Multimodal posterior
1 Introduction
In this paper we consider problems which arise when taking a Bayesian approach to classification and clustering using mixture models. We consider the setting where we have observation...
Model-based Clustering of non-Gaussian Panel Data
Cited by 2 (1 self)
In this paper we propose a model-based method to cluster units within a panel. The underlying model is autoregressive and non-Gaussian, allowing for both skewness and fat tails, and the units are clustered according to their dynamic behaviour and equilibrium level. Inference is addressed from a Bayesian perspective and model comparison is conducted using the formal tool of Bayes factors. Particular attention is paid to prior elicitation and posterior propriety. We suggest priors that require little subjective input from the user and possess hierarchical structures that enhance the robustness of the inference. Two examples illustrate the methodology: one analyses economic growth of OECD countries and the second one investigates employment growth of Spanish manufacturing firms.