Reversible jump Markov chain Monte Carlo computation and Bayesian model determination
Biometrika, 1995
Cited by 578 (18 self)
This article proposes a new framework for the construction of reversible Markov chain samplers that jump between parameter subspaces of differing dimensionality, which is flexible and entirely constructive. It should therefore have wide applicability in model determination problems. The methodology is illustrated with applications to multiple change-point analysis in one and two dimensions, and to a Bayesian comparison of binomial experiments. Some key words: Change-point analysis, Image segmentation, Jump diffusion, Markov chain Monte Carlo, Multiple binomial experiments, Multiple shrinkage, Step function, Voronoi tessellation.
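To make the idea of a dimension-changing move concrete, here is a minimal sketch of the log acceptance ratio for a single reversible jump between two toy models. Everything in it is an assumption made for illustration and is not taken from the article: model M1 fixes a normal mean at 0, model M2 has a free mean `mu` with a N(0, `prior_sd`^2) prior, the observation variance is 1, both models and both move types are equally likely a priori, and the jump sets `mu = u` for an auxiliary draw `u` ~ N(0, `proposal_sd`^2), so the Jacobian is 1.

```python
import math


def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2


def log_accept_birth(y, u, proposal_sd, prior_sd):
    """Log acceptance ratio for the jump M1 (mean 0) -> M2 (mean mu = u).

    Assumes equal prior model probabilities and equal move probabilities,
    so those terms cancel. The map (u) -> (mu) is the identity, so the
    Jacobian term is 1 and contributes 0 on the log scale.
    """
    log_lik_m2 = sum(log_normal_pdf(v, u, 1.0) for v in y)
    log_lik_m1 = sum(log_normal_pdf(v, 0.0, 1.0) for v in y)
    log_prior_mu = log_normal_pdf(u, 0.0, prior_sd)      # prior on the new parameter
    log_proposal = log_normal_pdf(u, 0.0, proposal_sd)   # density of the auxiliary draw
    return log_lik_m2 + log_prior_mu - log_lik_m1 - log_proposal


def log_accept_death(y, mu, proposal_sd, prior_sd):
    """Log acceptance ratio for the reverse jump M2 -> M1 (drop mu)."""
    return -log_accept_birth(y, mu, proposal_sd, prior_sd)
```

A move would then be accepted with probability `min(1, exp(log_ratio))`. Defining the death ratio as the exact reciprocal of the birth ratio is what keeps the pair of moves reversible in this toy setting.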
Modelling heterogeneity with and without the Dirichlet process
2001
Cited by 49 (3 self)
We investigate the relationships between Dirichlet process (DP) based models and allocation models for a variable number of components, based on exchangeable distributions. It is shown that the DP partition distribution is a limiting case of a Dirichlet-multinomial allocation model. Comparisons of posterior performance of DP and allocation models are made in the Bayesian paradigm and illustrated in the context of univariate mixture models. It is shown in particular that the unbalancedness of the allocation distribution, present in the prior DP model, persists a posteriori. Exploiting the model connections, a new MCMC sampler for general DP based models is introduced, which uses split/merge moves in a reversible jump framework. Performance of this new sampler relative to that of some traditional samplers for DP processes is then explored.
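The limiting relationship can be checked numerically through one summary of the partition distribution: the prior expected number of occupied groups. The sketch below is an illustration under stated assumptions, not the paper's derivation. It assumes a symmetric Dirichlet(`alpha / k`, ..., `alpha / k`) prior on `k` component weights for the allocation model, and compares the result with the standard DP formula (the sum of `alpha / (alpha + i)` over `i = 0, ..., n - 1`). The function names are hypothetical.

```python
def expected_clusters_dp(n, alpha):
    """Prior expected number of distinct groups among n items under a DP
    with concentration alpha: sum over i of alpha / (alpha + i)."""
    return sum(alpha / (alpha + i) for i in range(n))


def expected_occupied_dma(n, alpha, k):
    """Prior expected number of non-empty components among n items under a
    Dirichlet-multinomial allocation with k components and symmetric
    Dirichlet parameter delta = alpha / k (an assumed parameterisation)."""
    delta = alpha / k
    # Probability that one given component receives none of the n items,
    # built up sequentially from the Polya-urn predictive probabilities.
    p_empty = 1.0
    for i in range(n):
        p_empty *= ((k - 1) * delta + i) / (k * delta + i)
    # By symmetry every component is occupied with probability 1 - p_empty.
    return k * (1.0 - p_empty)
```

For fixed `n` and `alpha`, `expected_occupied_dma(n, alpha, k)` increases towards `expected_clusters_dp(n, alpha)` as `k` grows, which is consistent with the DP arising as the large-`k` limit.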
A Bayesian Model for Collaborative Filtering
2002
Cited by 15 (0 self)
Consider the general setup where a set of items have been partially rated by a set of judges, in the sense that not every item has been rated by every judge. For this setup, we propose a Bayesian approach for the problem of predicting the missing ratings from the observed ratings. This approach incorporates similarity by assuming the set of judges can be partitioned into groups which share the same ratings probability distribution. This leads to a predictive distribution of missing ratings based on the posterior distribution of the groupings and associated ratings probabilities. Markov chain Monte Carlo methods and a hybrid search algorithm are then used to obtain predictions of the missing ratings.
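One way to read the predictive step is as a Monte Carlo average: for each posterior draw of the grouping, look up the group the judge belongs to and that group's rating probabilities for the item, then average over draws. The sketch below assumes this reading. The data layout (`groups`, `rating_probs`) and the function name are hypothetical and are not taken from the paper.

```python
def predictive_rating_distribution(draws, judge, item):
    """Average the rating probabilities for (judge, item) over posterior draws.

    Each draw is assumed to be a dict with:
      "groups":       list mapping judge index -> group index
      "rating_probs": rating_probs[group][item] -> list of probabilities,
                      one per possible rating value
    """
    first = draws[0]
    n_ratings = len(first["rating_probs"][first["groups"][judge]][item])
    totals = [0.0] * n_ratings
    for draw in draws:
        group = draw["groups"][judge]              # judge's group in this draw
        probs = draw["rating_probs"][group][item]  # that group's rating distribution
        for rating, p in enumerate(probs):
            totals[rating] += p
    return [t / len(draws) for t in totals]


# Two made-up draws: the judge sits in group 0 in the first, group 1 in the second.
example_draws = [
    {"groups": [0], "rating_probs": [[[0.8, 0.2]], [[0.4, 0.6]]]},
    {"groups": [1], "rating_probs": [[[0.8, 0.2]], [[0.4, 0.6]]]},
]
example_prediction = predictive_rating_distribution(example_draws, judge=0, item=0)
```

With these made-up draws the prediction is an equal-weight mixture of the two group distributions, which reflects the uncertainty about which group the judge belongs to.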
Bayesian Tests And Model Diagnostics In Conditionally Independent Hierarchical Models
Journal of the American Statistical Association, 1994
Cited by 14 (1 self)
Consider the conditionally independent hierarchical model (CIHM) where observations $y_i$ are independently distributed from $f(y_i \mid \theta_i)$, the parameters $\theta_i$ are independently distributed from distributions $g(\theta \mid \lambda)$, and the hyperparameters $\lambda$ are distributed according to a distribution $h(\lambda)$. The posterior distribution of all parameters of the CIHM can be efficiently simulated by Markov chain Monte Carlo (MCMC) algorithms. Although these simulation algorithms have facilitated the application of CIHMs, they generally have not addressed the problem of computing quantities useful in model selection. This paper explores how MCMC simulation algorithms and other related computational algorithms can be used to compute Bayes factors that are useful in criticizing a particular CIHM. In the case where the CIHM models a belief that the parameters are exchangeable or lie on a regression surface, the Bayes factor can measure the consistency of the data with the structural prior belief. Bayes factors can ...
Multivariate mixtures of normals with unknown number of components
Statist. Comp., 2006
Cited by 9 (0 self)
We present full Bayesian analysis of finite mixtures of multivariate normals with unknown number of components. We adopt reversible jump Markov chain Monte Carlo and we construct, in a manner similar to that of Richardson and Green (1997), split and merge moves that produce good mixing of the Markov chains. The split moves are constructed on the space of eigenvectors and eigenvalues of the current covariance matrix so that the proposed covariance matrices are positive definite. Our proposed methodology has applications in classification and discrimination as well as heterogeneity modelling. We test our algorithm with real and simulated data.
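The reason for working with eigenvalues and eigenvectors can be seen in a small two-dimensional sketch: any matrix rebuilt from a rotation and strictly positive eigenvalues is positive definite by construction. The code below is a simplified illustration under stated assumptions, not the authors' split proposal. The scaling factors, the rotation perturbation, and all function names are hypothetical, and the moment-matching and Jacobian terms that a full reversible jump move needs are omitted.

```python
import math
import random


def decompose(a, b, c):
    """Eigen-decomposition of the symmetric 2x2 matrix [[a, b], [b, c]].
    Returns (angle, larger eigenvalue, smaller eigenvalue)."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    angle = 0.5 * math.atan2(2.0 * b, a - c)
    return angle, mean + radius, mean - radius


def compose(angle, eig1, eig2):
    """Rebuild [[a, b], [b, c]] = R diag(eig1, eig2) R^T for a rotation R."""
    cs, sn = math.cos(angle), math.sin(angle)
    a = cs * cs * eig1 + sn * sn * eig2
    b = cs * sn * (eig1 - eig2)
    c = sn * sn * eig1 + cs * cs * eig2
    return a, b, c


def is_positive_definite(a, b, c):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    return a > 0.0 and a * c - b * b > 0.0


def split_covariance(a, b, c, rng):
    """Propose two covariance matrices from one (hypothetical scheme).

    Each eigenvalue is shared between the two children by a factor in
    (0.05, 0.95), and the eigenvectors are rotated by a small random angle
    in opposite directions. Positive eigenvalues guarantee that both
    children are positive definite.
    """
    angle, eig1, eig2 = decompose(a, b, c)
    beta1 = rng.uniform(0.05, 0.95)
    beta2 = rng.uniform(0.05, 0.95)
    delta = rng.uniform(-0.3, 0.3)
    child_a = compose(angle + delta, beta1 * eig1, beta2 * eig2)
    child_b = compose(angle - delta, (1.0 - beta1) * eig1, (1.0 - beta2) * eig2)
    return child_a, child_b
```

Proposing entries of the covariance matrix directly would need a separate validity check and could be rejected often, whereas this parameterisation never leaves the space of valid covariance matrices.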
A note on the Dirichlet process prior in Bayesian nonparametric inference with partial exchangeability
Statist. Prob. Letters, 1997
Cited by 6 (1 self)
We consider Bayesian nonparametric inference for continuous-valued partially exchangeable data, when the partition of the observations into groups is unknown. This includes change-point problems and mixture models. As the prior, we consider a mixture of products of Dirichlet processes. We show that the discreteness of the Dirichlet process can have a large effect on inference (posterior distributions and Bayes factors), leading to conclusions that can be different from those that result from a reasonable parametric model. When the observed data are all distinct, the effect of the prior on the posterior is to favor more evenly balanced partitions, and its effect on Bayes factors is to favor more groups. In a hierarchical model with a Dirichlet process as the second-stage prior, the prior can also have a large effect on inference, but in the opposite direction, towards more unbalanced partitions. © 1997 Elsevier Science B.V.
Bayesian analysis of extreme values by mixture modeling
Extremes, 2003
Cited by 6 (1 self)
Modeling of extreme values in the presence of heterogeneity is still a relatively unexplored area. We consider losses pertaining to several related categories. For each category, we view exceedances over a given threshold as generated by a Poisson process whose intensity is regulated by a specific location, shape and scale parameter. Using a Bayesian approach, we develop a hierarchical mixture prior, with an unknown number of components, for each of the above parameters. Computations are performed using Reversible Jump MCMC. Our model accounts for possible grouping effects and takes advantage of the similarity across categories, both for estimation and prediction purposes. Some guidance on the specification of the prior distribution is provided, together with an assessment of inferential robustness. The method is illustrated throughout using a data set on large claims against a well-known insurance company over a 15-year period.
Bayesian Analysis of Factorial Experiments By Mixture Modelling
2000
Cited by 5 (1 self)
this paper we try our hands at it. One version of the classical theory of factorial experiments, going back to Fisher and further developed by Kempthorne (1955), completely avoids distributional assumptions, assuming only additivity, and uses randomisation to derive the standard tests of hypotheses about treatment effects. Here, we are interested in the more familiar classical approach via linear modelling and normal distribution theory. The corresponding Bayesian analysis has been developed mainly in the pioneering works of Box & Tiao (1973) and Lindley & Smith (1972). Box & Tiao (1973, Chapter 6) discuss Bayesian analysis of cross classified designs, including fixed, random and mixed effects models. They point out that in a Bayesian approach the appropriate inference procedure for fixed and random effects "depends upon the nature of the prior distribution used to represent the behavior of the factors". They also show (Chapter 7) that shrinkage estimates of specific effects may result when a random effects model is assumed. Lindley & Smith (1972) use a hierarchically structured linear model built on multivariate normal components (special cases of the model are considered by Lindley, 1972 and Smith, 1973), with the focus on estimation of treatment effects. These are authoritative and attractive approaches, albeit with modest compromises to the Bayesian paradigm -- in respect of the estimation of the variance components -- necessitated by the computational limitations of the time. Nevertheless, the inference is almost entirely estimative: questions about the indistinguishability of factor levels, or more general hypotheses about contrasts, are answered indirectly through their joint posterior distribution, e.g. by checking whether the hypothesis falls in a highest poster...
Relaxing the Local Independence Assumption for Quantitative Learning in Acyclic Directed Graphical Models through Hierarchical Partition Models
Proceedings of Artificial Intelligence and Statistics ’99, 1999
Cited by 5 (0 self)
The simplest method proposed by Spiegelhalter and Lauritzen (1990) to perform quantitative learning in ADG presents a potential weakness: the local independence assumption. We propose to alleviate this problem through the use of Hierarchical Partition Models. Our approach is compared with the previous one from an interpretative and predictive point of view.
1 INTRODUCTION
Spiegelhalter and Lauritzen (1990) (S-L) proposed a Bayesian model for Acyclic Directed Graphical Models (ADG) (also known as Bayesian Networks) that has become somewhat standard in the burgeoning literature on learning discrete graphical models. The basic idea is to treat the conditional probabilities of the random variables at each vertex in the graph as unknowns and associate a prior distribution on each one (the conditioning in each case is on the random variables associated with the parent vertices in the graph). The simplest approach of S-L introduces strong assumptions on the unknown conditional probabilities ...
Clustering Using Objective Functions and Stochastic Search
2007
Cited by 5 (2 self)
Summary. A new approach to clustering multivariate data, based on a multilevel linear mixed model, is proposed. A key feature of the model is that observations from the same cluster are correlated, because they share cluster-specific random effects. The inclusion of cluster-specific random effects allows parsimonious departure from an assumed base model for cluster mean profiles. This departure is captured statistically via the posterior expectation, or best linear unbiased predictor. One of the parameters in the model is the true underlying partition of the data, and the posterior distribution of this parameter, which is known up to a normalizing constant, is used to cluster the data. The problem of finding partitions with high posterior probability is not amenable to deterministic methods such as the EM algorithm. Thus, we propose a stochastic search algorithm that is driven by a Markov chain that is a mixture of two Metropolis–Hastings algorithms—one that makes small scale changes to individual objects and another that performs large scale moves involving entire clusters. The methodology proposed is fundamentally different from the well-known finite mixture model approach to clustering, which does not explicitly include the partition as a parameter, and involves an independent and identically distributed structure.

