Results 1  10
of
27
Bayesian Analysis of Mixture Models with an Unknown Number of Components  an alternative to reversible jump methods
, 1998
"... Richardson and Green (1997) present a method of performing a Bayesian analysis of data from a finite mixture distribution with an unknown number of components. Their method is a Markov Chain Monte Carlo (MCMC) approach, which makes use of the "reversible jump" methodology described by Green (1995). ..."
Abstract

Cited by 62 (0 self)
 Add to MetaCart
Richardson and Green (1997) present a method of performing a Bayesian analysis of data from a finite mixture distribution with an unknown number of components. Their method is a Markov Chain Monte Carlo (MCMC) approach, which makes use of the "reversible jump" methodology described by Green (1995). We describe an alternative MCMC method which views the parameters of the model as a (marked) point process, extending methods suggested by Ripley (1977) to create a Markov birthdeath process with an appropriate stationary distribution. Our method is easy to implement, even in the case of data in more than one dimension, and we illustrate it on both univariate and bivariate data. Keywords: Bayesian analysis, Birthdeath process, Markov process, MCMC, Mixture model, Model Choice, Reversible Jump, Spatial point process 1 Introduction Finite mixture models are typically used to model data where each observation is assumed to have arisen from one of k groups, each group being suitably modelle...
Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities
 Ann. Statist
, 2001
"... We study the rates of convergence of the maximum likelihood estimator (MLE) and posterior distribution in density estimation problems, where the densities are location or locationscale mixtures of normal distributions with the scale parameter lying between two positive numbers. The true density is ..."
Abstract

Cited by 34 (10 self)
 Add to MetaCart
We study the rates of convergence of the maximum likelihood estimator (MLE) and posterior distribution in density estimation problems, where the densities are location or locationscale mixtures of normal distributions with the scale parameter lying between two positive numbers. The true density is also assumed to lie in this class with the true mixing distribution either compactly supported or having subGaussian tails. We obtain bounds for Hellinger bracketing entropies for this class, and from these bounds, we deduce the convergence rates of (sieve) MLEs in Hellinger distance. The rate turns out to be �log n � κ / √ n, where κ ≥ 1 is a constant that depends on the type of mixtures and the choice of the sieve. Next, we consider a Dirichlet mixture of normals as a prior on the unknown density. We estimate the prior probability of a certain KullbackLeibler type neighborhood and then invoke a general theorem that computes the posterior convergence rate in terms the growth rate of the Hellinger entropy and the concentration rate of the prior. The posterior distribution is also seen to converge at the rate �log n � κ / √ n in, where κ now depends on the tail behavior of the base measure of the Dirichlet process. 1. Introduction. A
An exploration of aspects of Bayesian multiple testing
 Journal of Statistical Planning and Inference
, 2005
"... There has been increased interest of late in the Bayesian approach to multiple testing (often called the multiple comparisons problem), motivated by the need to analyze DNA microarray data in which it is desired to learn which of potentially several thousand genes are activated by a particular stimu ..."
Abstract

Cited by 31 (6 self)
 Add to MetaCart
There has been increased interest of late in the Bayesian approach to multiple testing (often called the multiple comparisons problem), motivated by the need to analyze DNA microarray data in which it is desired to learn which of potentially several thousand genes are activated by a particular stimulus. We study the issue of prior specification for such multiple tests; computation of key posterior quantities; and useful ways to display these quantities. A decisiontheoretic approach is also considered.
Hypothesis Testing and Model Selection Via Posterior Simulation
 In Practical Markov Chain
, 1995
"... Introduction To motivate the methods described in this chapter, consider the following inference problem in astronomy (Soubiran, 1993). Until fairly recently, it has been believed that the Galaxy consists of two stellar populations, the disk and the halo. More recently, it has been hypothesized tha ..."
Abstract

Cited by 24 (1 self)
 Add to MetaCart
Introduction To motivate the methods described in this chapter, consider the following inference problem in astronomy (Soubiran, 1993). Until fairly recently, it has been believed that the Galaxy consists of two stellar populations, the disk and the halo. More recently, it has been hypothesized that there are in fact three stellar populations, the old (or thin) disk, the thick disk, and the halo, distinguished by their spatial distributions, their velocities, and their metallicities. These hypotheses have different implications for theories of the formation of the Galaxy. Some of the evidence for deciding whether there are two or three populations is shown in Figure 1, which shows radial and rotational velocities for n = 2; 370 stars. A natural model for this situation is a mixture model with J components, namely y i = J X j=1 ae j
On Fitting Mixture Models
, 1999
"... Consider the problem of fitting a finite Gaussian mixture, with an unknown number of components, to observed data. This paper proposes a new minimum description length (MDL) type criterion, termed MMDL (for mixture MDL), to select the number of components of the model. MMDL is based on the ident ..."
Abstract

Cited by 22 (4 self)
 Add to MetaCart
Consider the problem of fitting a finite Gaussian mixture, with an unknown number of components, to observed data. This paper proposes a new minimum description length (MDL) type criterion, termed MMDL (for mixture MDL), to select the number of components of the model. MMDL is based on the identification of an "equivalent sample size", for each component, which does not coincide with the full sample size. We also introduce an algorithm based on the standard expectationmaximization (EM) approach together with a new agglomerative step, called agglomerative EM (AEM). The experiments here reported have shown that MMDL outperforms existing criteria of comparable computational cost. The good behavior of AEM, namely its good robustness with respect to initialization, is also illustrated experimentally.
Rates Of Convergence For The Gaussian Mixture Sieve
 The Annals of Statistics
, 2000
"... Gaussian mixtures provide a convenient method of density estimation that lies somewhere between parametric models and kernel... ..."
Abstract

Cited by 20 (0 self)
 Add to MetaCart
Gaussian mixtures provide a convenient method of density estimation that lies somewhere between parametric models and kernel...
Learning hybrid Bayesian networks from data
, 1998
"... We illustrate two different methodologies for learning Hybrid Bayesian networks, that is, Bayesian networks containing both continuous and discrete variables, from data. The two methodologies differ in the way of handling continuous data when learning the Bayesian network structure. The first method ..."
Abstract

Cited by 11 (1 self)
 Add to MetaCart
We illustrate two different methodologies for learning Hybrid Bayesian networks, that is, Bayesian networks containing both continuous and discrete variables, from data. The two methodologies differ in the way of handling continuous data when learning the Bayesian network structure. The first methodology uses discretized data to learn the Bayesian network structure, and the original nondiscretized data for the parameterization of the learned structure. The second methodology uses nondiscretized data both to learn the Bayesian network structure and its parameterization. For the direct handling of continuous data, we propose the use of artificial neural networks as probability estimators, to be used as an integral part of the scoring metric defined to search the space of Bayesian network structures. With both methodologies, we assume the availability of a complete dataset, with no missing values or hidden variables. We report experimental results aimed at comparing the two methodologies. These results provide evidence that learning with discretized data presents advantages both in terms of efficiency and in terms of accuracy of the learned models over the alternative approach of using nondiscretized data.
Bayesian Time Series Classification
 Advances in Neural Processing Systems 14
, 2002
"... This paper proposes an approach to classification of adjacent segments of a time series as being either of K classes. We use a hierarchical model that consists of a feature extraction stage and a generative classifier which is built on top of these features. Such two stage approaches are often us ..."
Abstract

Cited by 10 (4 self)
 Add to MetaCart
This paper proposes an approach to classification of adjacent segments of a time series as being either of K classes. We use a hierarchical model that consists of a feature extraction stage and a generative classifier which is built on top of these features. Such two stage approaches are often used in signal and image processing. The novel part of our work is that we link these stages probabilistically by using a latent feature space. To use one joint model is a Bayesian requirement, which has the advantage to fuse information according to its certainty.
Bayesian Computational Approaches to Model Selection
, 2000
"... this paper was to provide a summary of the stateof theart theory on Bayesian model selection and the application of MCMC algorithms. It has been shown how applications of considerable complexity can be handled successfully within this framework. Several methods for dealing with the use of default, ..."
Abstract

Cited by 4 (1 self)
 Add to MetaCart
this paper was to provide a summary of the stateof theart theory on Bayesian model selection and the application of MCMC algorithms. It has been shown how applications of considerable complexity can be handled successfully within this framework. Several methods for dealing with the use of default, improper priors in the Bayesian model selection 506 Andrieu, Doucet et al. framework has been shown. Special care has been taken to pinpoint the subtleties of jumping from one parameter space to another, and in general, to show the construction of MCMC samplers in such scenarios. The focus in the paper was on the reversible jump MCMC algorithm as this is the most widely used of all existing methods; it is easy to use, flexible and has nice properties. Many references have been cited, with the emphasis being given to articles with signal processing applications. A Notation