Results 1-10 of 16
Estimating the integrated likelihood via posterior simulation using the harmonic mean identity
 Bayesian Statistics
, 2007
Cited by 24 (2 self)
The integrated likelihood (also called the marginal likelihood or the normalizing constant) is a central quantity in Bayesian model selection and model averaging. It is defined as the integral over the parameter space of the likelihood times the prior density. The Bayes factor for model comparison and Bayesian testing is a ratio of integrated likelihoods, and the model weights in Bayesian model averaging are proportional to the integrated likelihoods. We consider the estimation of the integrated likelihood from posterior simulation output, aiming at a generic method that uses only the likelihoods from the posterior simulation iterations. The key is the harmonic mean identity, which says that the reciprocal of the integrated likelihood is equal to the posterior harmonic mean of the likelihood. The simplest estimator based on the identity is thus the harmonic mean of the likelihoods. While this is an unbiased and simulation-consistent estimator, its reciprocal can have infinite variance and so it is unstable in general. We describe two methods for stabilizing the harmonic mean estimator. In the first one, the parameter space is reduced in such a way that the modified estimator involves a harmonic mean of heavier-tailed densities, thus resulting in a finite variance estimator. The resulting ...
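The raw estimator described in this abstract can be sketched in a few lines. This is an illustrative sketch only, computed in log space with the log-sum-exp trick to avoid underflow; as the abstract notes, the unstabilized estimator can have infinite variance, so this is not a recommendation to use it as-is.

```python
import math

def log_harmonic_mean_estimator(log_likelihoods):
    """Estimate log p(y) from posterior draws via the harmonic mean identity:
    1 / p(y) = E_posterior[ 1 / L(theta) ].
    The estimate is the negated log of the mean of the inverse likelihoods,
    computed with log-sum-exp for numerical stability."""
    n = len(log_likelihoods)
    neg = [-ll for ll in log_likelihoods]          # log of 1/L_i
    m = max(neg)
    log_mean_inv = m + math.log(sum(math.exp(x - m) for x in neg) / n)
    return -log_mean_inv                           # log of the estimate of p(y)

# Degenerate check: constant log-likelihood draws return that constant.
print(log_harmonic_mean_estimator([-10.0, -10.0, -10.0]))  # -10.0
```

With likelihood draws 0.5 and 0.25, the inverse likelihoods are 2 and 4, their mean is 3, and the estimate is 1/3, which the function returns on the log scale.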
Functional clustering by Bayesian wavelet methods
 Journal of the Royal Statistical Society B
, 2006
Cited by 19 (0 self)
Summary. We propose a nonparametric Bayes wavelet model for clustering of functional data. The wavelet-based methodology is aimed at the resolution of generic global and local features during clustering and is suitable for clustering high dimensional data. Based on the Dirichlet process, the nonparametric Bayes model extends the scope of traditional Bayes wavelet methods to functional clustering and allows the elicitation of prior belief about the regularity of the functions and the number of clusters by suitably mixing the Dirichlet processes. Posterior inference is carried out by Gibbs sampling with conjugate priors, which makes the computation straightforward. We use simulated as well as real data sets to illustrate the suitability of the approach over other alternatives.
NONPARAMETRIC FUNCTIONAL DATA ANALYSIS THROUGH BAYESIAN DENSITY ESTIMATION
, 2007
Cited by 12 (4 self)
In many modern experimental settings, observations are obtained in the form of functions, and interest focuses on inferences on a collection of such functions. Some examples are conductivity-temperature-depth (CTD) data in oceanography, dose-response models in epidemiology and time-course microarray experiments in biology and medicine. In this paper we propose a hierarchical model that allows us to simultaneously estimate multiple curves nonparametrically by using dependent Dirichlet Process mixtures of Gaussians to characterize the joint distribution of predictors and outcomes. Function estimates are then induced through the conditional distribution of the outcome given the predictors. The resulting approach allows for flexible estimation and clustering, while borrowing information across curves. We also show that the function estimates we obtain are consistent on the space of integrable functions. As an illustration, we consider an application to the analysis of CTD data in the north Atlantic.
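The induced-conditional step ("function estimates through the conditional distribution of the outcome given the predictors") can be sketched with a toy finite mixture of Gaussian product kernels. The fixed weights and one-dimensional Gaussian kernels here are stand-ins chosen for illustration, not the dependent Dirichlet process mixture of the paper:

```python
import math

def normal_pdf(z, m, s):
    """Univariate Gaussian density."""
    return math.exp(-0.5 * ((z - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def conditional_density(y, x, components):
    """Conditional f(y | x) induced by a joint mixture of product kernels
    w * N(x; mx, sx) * N(y; my, sy): reweight each component by how well it
    explains x, then mix the y-kernels with the reweighted probabilities."""
    lik_x = [w * normal_pdf(x, mx, sx) for (w, mx, sx, my, sy) in components]
    total = sum(lik_x)
    return sum((l / total) * normal_pdf(y, my, sy)
               for l, (_, _, _, my, sy) in zip(lik_x, components))

# Hypothetical two-component joint mixture: (weight, mx, sx, my, sy)
comps = [(0.5, -2.0, 1.0, 0.0, 1.0), (0.5, 2.0, 1.0, 5.0, 1.0)]
print(conditional_density(0.0, 2.0, comps))
```

Near x = 2 the second component dominates, so the conditional estimate is pulled toward its outcome kernel centered at 5; this is the sense in which the joint mixture induces predictor-dependent curve estimates.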
Nonparametric Bayes conditional distribution modeling with variable selection
 Journal of the American Statistical Association
, 2009
Cited by 11 (7 self)
This article considers methodology for flexibly characterizing the relationship between a response and multiple predictors. Goals are (1) to estimate the conditional response distribution addressing the distributional changes across the predictor space, and (2) to identify important predictors for the response distribution change both within local regions and globally. We first introduce the probit stick-breaking process (PSBP) as a prior for an uncountable collection of predictor-dependent random probability measures and propose a PSBP mixture (PSBPM) of normal regressions for modeling the conditional distributions. A global variable selection structure is incorporated to discard unimportant predictors, while allowing estimation of posterior inclusion probabilities. Local variable selection is conducted relying on the conditional distribution estimates at different predictor points. An efficient stochastic search sampling algorithm is proposed for posterior computation. The methods are illustrated through simulation and applied to an epidemiologic study.
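The probit stick-breaking construction named in this abstract turns a sequence of latent Gaussian scores into mixture weights: V_h = Phi(alpha_h) and pi_h = V_h * prod_{l<h} (1 - V_l). A minimal sketch, assuming a fixed truncation and scalar scores (in the PSBP the alphas would be predictor-dependent processes):

```python
import math

def probit_stick_breaking_weights(alphas):
    """Map latent scores to mixture weights via probit stick-breaking.
    Each stick length is V_h = Phi(alpha_h); the weight is the stick
    length times the mass remaining after the earlier sticks. The last
    component absorbs the leftover mass so the weights sum to one."""
    phi = lambda a: 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))  # standard normal CDF
    weights, remaining = [], 1.0
    for a in alphas[:-1]:
        v = phi(a)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)  # truncation: final stick takes the rest
    return weights

w = probit_stick_breaking_weights([0.3, -0.1, 1.2, 0.0])
print(w, sum(w))  # four nonnegative weights summing to 1 (up to rounding)
```

Making each alpha a function of predictors x yields the predictor-dependent random probability measures the abstract describes.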
Nonparametric Bayes applications to biostatistics
 In Bayesian Nonparametrics: Principles and Practice
, 2010
Cited by 9 (0 self)
Biomedical research has clearly evolved at a dramatic rate in the past decade, with improvements in technology leading to a fundamental shift in the way in which data are collected and analyzed. Before this paradigm shift, studies were most commonly designed to be simple and to focus on relationships among a few variables of primary interest. For example, in ...
Fast Bayesian Inference in Dirichlet Process Mixture Models
, 2008
Cited by 5 (0 self)
There has been increasing interest in applying Bayesian nonparametric methods in large samples and high dimensions. As Markov chain Monte Carlo (MCMC) algorithms are often infeasible, there is a pressing need for much faster algorithms. This article proposes a fast approach for inference in Dirichlet process mixture (DPM) models. Viewing the partitioning of subjects into clusters as a model selection problem, we propose a sequential greedy search algorithm for selecting the partition. Then, when conjugate priors are chosen, the resulting posterior conditional on the selected partition is available in closed form. This approach allows testing of parametric models versus nonparametric alternatives based on Bayes factors. We evaluate the approach using simulation studies and compare it with four other fast nonparametric methods in the literature. We apply the proposed approach to three datasets including one from a large epidemiologic study. Matlab codes for the simulation and data analyses using the proposed approach are available online in the supplemental materials.
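The sequential greedy search over partitions can be sketched as a single pass that allocates each subject to whichever existing cluster, or a new one, scores best. The scoring function below is a hypothetical stand-in for the conjugate marginal-likelihood gain used in the article, and the one-pass, no-revisit scheme is a simplification:

```python
def greedy_partition(data, score):
    """Allocate observations sequentially: each point joins the existing
    cluster (or opens a new one) that maximizes score(members, x)."""
    clusters = []
    for x in data:
        # Candidate scores: join each existing cluster, or open a new one.
        options = [score(c, x) for c in clusters] + [score([], x)]
        best = max(range(len(options)), key=options.__getitem__)
        if best == len(clusters):
            clusters.append([x])        # open a new cluster
        else:
            clusters[best].append(x)    # join an existing cluster
    return clusters

def toy_score(members, x):
    """Hypothetical score: prefer clusters whose mean is near x;
    a fixed penalty stands in for the cost of a new cluster."""
    if not members:
        return -1.0
    mu = sum(members) / len(members)
    return -abs(mu - x)

print(greedy_partition([0.0, 0.1, 5.0, 5.2], toy_score))  # [[0.0, 0.1], [5.0, 5.2]]
```

With conjugate priors, replacing `toy_score` by the closed-form marginal likelihood of a cluster makes each allocation decision exact given the partition so far, which is what makes the search fast.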
Variable Selection in Nonparametric Random Effects Models
Cited by 2 (2 self)
In analyzing longitudinal or clustered data with a mixed effects model (Laird and Ware, 1982), one may be concerned about violations of normality. Such violations can potentially impact subset selection for the fixed and random effects components of the model, inferences on the heterogeneity structure, and the accuracy of predictions. This article focuses on Bayesian methods for subset selection in nonparametric random effects models in which one is uncertain about the predictors to be included and the distribution of their random effects. We characterize the unknown distribution of the individual-specific regression coefficients using a weighted sum of Dirichlet process (DP)-distributed latent variables. By using carefully chosen mixture priors for coefficients in the base distributions of the component DPs, we allow fixed and random effects to be effectively dropped out of the model. A stochastic search Gibbs sampler is developed for posterior computation, and the methods are illustrated using simulated data and real data from a multi-laboratory bioassay study.
Particle Learning for General Mixtures
Cited by 2 (1 self)
This paper develops efficient sequential learning methods for the estimation of general mixture models. The approach is distinguished from alternative particle filtering methods in two major ways. First, each iteration begins by resampling particles according to posterior predictive probability, leading to a more efficient set for propagation. Second, each particle tracks only the state of sufficient information for latent mixture components, thus leading to reduced dimensional inference. In addition, we describe how the approach will apply to more general mixture models of current interest in the literature; it is hoped that this will inspire a greater number of researchers to adopt sequential Monte Carlo methods for fitting their sophisticated mixture based models. Finally, we show that this particle learning approach leads to straightforward tools for marginal likelihood calculation and posterior cluster allocation. Specific versions of the algorithm are derived for standard density estimation applications based on both finite mixture models and Dirichlet process mixture models, as well as for the less common settings of latent feature selection through an Indian Buffet process and dependent distribution tracking through a probit stick-breaking model. Three simulation examples are presented: density estimation and model selection for a finite mixture model; a simulation study for Dirichlet process density estimation with as many as 12,500 observations of 25-dimensional data; and an example of nonparametric mixture regression that requires learning truncated approximations to the infinite random mixing distribution.
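The resample-then-propagate ordering the abstract highlights can be shown as a skeleton in which the model-specific pieces are supplied by the caller. Here `predictive` (the posterior predictive weight of an observation under a particle) and `propagate` (the update of a particle's sufficient information) are stand-ins, not the paper's concrete algorithm:

```python
import random

def resample_propagate(particles, predictive, propagate, y):
    """One step of a resample-propagate particle update: resample particles
    with probability proportional to their posterior predictive of y, then
    propagate each resampled particle's tracked state with y."""
    weights = [predictive(p, y) for p in particles]
    total = sum(weights)
    probs = [w / total for w in weights]
    resampled = random.choices(particles, weights=probs, k=len(particles))
    return [propagate(p, y) for p in resampled]

# Toy run: particles are lists of seen values; flat predictive weights.
out = resample_propagate([[1], [2]],
                         lambda p, y: 1.0,          # stand-in predictive
                         lambda p, y: p + [y],      # stand-in propagation
                         5)
print(out)  # two particles, each now ending in 5
```

Resampling before propagation means the particles carried forward are the ones that already explain the new observation well, which is the efficiency gain the abstract claims over propagate-then-reweight filters.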
Analysis of Additive Instrumental Variable Models
, 2004
Cited by 2 (1 self)
We apply Bayesian methods to a model involving a binary endogenous regressor and an instrumental variable in which the functional forms of some of the covariates in both the treatment assignment and outcome distributions are unknown. Continuous and binary response variables are considered. Under the assumption that the functional form is additive in the covariates, we develop efficient Markov chain Monte Carlo based approaches for summarizing the posterior distribution and for comparing various alternative models via marginal likelihoods and Bayes factors. We show in a simulation experiment that the methods are capable of recovering the unknown functions and are sensitive neither to the sample size nor to the degree of endogeneity as measured by the correlation between the errors in the treatment and response equations. In the binary response case, however, estimation of the average treatment effect requires larger sample sizes, especially when the degree of endogeneity is high.
Journal of the Royal Statistical Society
Abstract. Our first focus is prediction of a categorical response variable using features that lie on a known manifold. For example, the manifold may correspond to the surface of a hypersphere. We propose a general kernel mixture model for the joint distribution of the response and predictors, with the kernel expressed in product form and dependence induced through the unknown mixing measure. We provide simple sufficient conditions for large support and weak and strong posterior consistency in estimating both the joint distribution of the response and predictors and the conditional distribution of the response. Focusing on a Dirichlet process prior for the mixing measure, these conditions hold using von Mises-Fisher kernels when the manifold is the unit hypersphere. In this case, Bayesian methods are developed for efficient posterior computation using an exact block Gibbs sampler. Next we develop Bayesian nonparametric methods for testing whether there is a difference in distributions between groups of observations on the manifold having unknown densities. We prove consistency of the Bayes factor and develop efficient computational methods for its calculation. The proposed classification and testing methods are evaluated using simulation examples and applied to spherical data applications.
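The von Mises-Fisher kernel named here has a closed-form normalizing constant on the unit sphere in R^3, kappa / (4*pi*sinh(kappa)), which makes a small sketch possible without special functions. This is an illustrative density evaluation with fixed, hypothetical mixture components, not the paper's Dirichlet process model:

```python
import math

def vmf_density_sphere(x, mu, kappa):
    """von Mises-Fisher density on the unit sphere in R^3:
    f(x; mu, kappa) = kappa / (4*pi*sinh(kappa)) * exp(kappa * <mu, x>).
    x and mu are unit vectors; kappa > 0 is the concentration."""
    dot = sum(a * b for a, b in zip(x, mu))
    const = kappa / (4.0 * math.pi * math.sinh(kappa))
    return const * math.exp(kappa * dot)

def mixture_density(x, components):
    """Finite kernel mixture: sum_h w_h * vMF(x; mu_h, kappa_h)."""
    return sum(w * vmf_density_sphere(x, mu, k) for w, mu, k in components)

north = (0.0, 0.0, 1.0)
south = (0.0, 0.0, -1.0)
# Density is highest at the mean direction and lowest at the antipode.
print(vmf_density_sphere(north, north, 2.0), vmf_density_sphere(south, north, 2.0))
```

In the model of the abstract the weights and atoms (mu_h, kappa_h) would be drawn from a Dirichlet process rather than fixed, and the kernel would be multiplied by a kernel for the categorical response to form the product kernel.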