Hidden Markov models and disease mapping
- Journal of the American Statistical Association
, 2001
Cited by 34 (3 self)
We present new methodology to extend Hidden Markov models to the spatial domain, and use this class of models to analyse spatial heterogeneity of count data on a rare phenomenon. This situation occurs commonly in many domains of application, particularly in disease mapping. We assume that the counts follow a Poisson model at the lowest level of the hierarchy, and introduce a finite mixture model for the Poisson rates at the next level. The novelty lies in the model for allocation to the mixture components, which follows a spatially correlated process, the Potts model, and in treating the number of components of the spatial mixture as unknown. Inference is performed in a Bayesian framework using reversible jump MCMC. The model introduced can be viewed as a Bayesian semiparametric approach to specifying flexible spatial distribution in hierarchical models. Performance of the model and comparison with an alternative well-known Markov random field specification for the Poisson rates are demonstrated on synthetic data sets. We show that our allocation model avoids the problem of oversmoothing in cases where the underlying rates exhibit discontinuities, while giving equally good results in cases of smooth gradient-like or highly autocorrelated rates. The methodology is illustrated on an epidemiological application to data on a rare cancer in France.
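The two lowest levels of the hierarchy described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the expected counts, the interaction parameter name psi, the four-area neighbourhood and the two fixed rates are all assumptions made for illustration, the number of components is held fixed, and the Potts normalising constant is omitted (so values are comparable only at a fixed psi).

```python
import math

def potts_log_prior(z, neighbours, psi):
    """Unnormalised Potts log-prior: psi times the number of
    neighbouring pairs that share the same component label."""
    return psi * sum(1 for i, j in neighbours if z[i] == z[j])

def poisson_log_lik(y, expected, z, rates):
    """Poisson log-likelihood with mean rates[z[i]] * expected[i] in area i."""
    total = 0.0
    for yi, ei, zi in zip(y, expected, z):
        mu = rates[zi] * ei
        total += yi * math.log(mu) - mu - math.lgamma(yi + 1)
    return total

# Hypothetical data: four areas on a line, 0-1-2-3, with a jump in risk.
neighbours = [(0, 1), (1, 2), (2, 3)]
y = [1, 0, 9, 11]                  # observed counts (made up)
expected = [2.0, 2.0, 2.0, 2.0]    # expected counts (assumed known)
rates = [0.5, 5.0]                 # two mixture components (assumed fixed)

def log_post(z, psi=0.8):
    """Unnormalised log-posterior of an allocation z at fixed rates and psi."""
    return potts_log_prior(z, neighbours, psi) + poisson_log_lik(y, expected, z, rates)

sharp = log_post([0, 0, 1, 1])     # allocation with a discontinuity
flat = log_post([0, 0, 0, 0])      # fully smooth allocation
print(sharp > flat)
```

In this toy setting the allocation with a break between areas 1 and 2 scores higher than the fully smooth one, even though the Potts term rewards agreement between neighbours.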
Bayesian mixed membership models for soft clustering and classification
- Classification—The Ubiquitous Challenge
, 2005
Cited by 14 (6 self)
work was presented in a plenary lecture at the 28th Annual Conference of the German Classification Society, Dortmund. We are indebted to John Lafferty for his collaboration on the analysis of the PNAS data which we report here, and to Adrian Raftery and Christian Robert for helpful discussions on selecting K. Erosheva’s work was supported by NIH grants 1 RO1 AG023141-01 and R01 CA94212-01, Fienberg’s work was supported by NIH grant 1 RO1 AG023141-01 and by the Centre de Recherche en Economie et Statistique of the Institut National de la Statistique et
Bias-Corrected Bootstrap and Model Uncertainty
Cited by 8 (2 self)
The bootstrap has become a popular method for exploring model (structure) uncertainty. Our experiments with artificial and real-world data demonstrate that the graphs learned from bootstrap samples can be severely biased towards overly complex graphical models.
Generalized structured additive regression based on Bayesian P-splines
- Comput. Statist. Data Anal
, 2006
Cited by 6 (1 self)
Generalized additive models (GAM) for modeling nonlinear effects of continuous covariates are now well-established tools for the applied statistician. A Bayesian version of GAMs and extensions to generalized structured additive regression (STAR) are developed. One- or two-dimensional P-splines are used as the main building block. Inference relies on Markov chain Monte Carlo (MCMC) simulation techniques, and is either based on iteratively weighted least squares (IWLS) proposals or on latent utility representations of (multi)categorical regression models. The approach covers the most common univariate response distributions, e.g. the binomial, Poisson or gamma distribution, as well as multicategorical responses. For the first time, Bayesian semiparametric inference for the widely used multinomial logit model is presented. Two applications on the forest health status of trees and a space-time analysis of health insurance data demonstrate the potential of the approach for realistic modeling of complex problems. Software for the methodology is provided within the public domain package BayesX. Key words: geoadditive models, IWLS proposals, multicategorical response, structured additive predictors, surface smoothing
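As background on the P-spline building block, a P-spline pairs a B-spline basis with a penalty on differences of adjacent basis coefficients; in Bayesian versions this penalty usually corresponds to a random-walk prior on the coefficients. The sketch below shows only the penalty term, under the assumption of a second-order difference; the function names are hypothetical and this is not BayesX code.

```python
def difference_matrix(n_coef, order=2):
    """Return the order-th difference matrix for n_coef coefficients,
    built by repeatedly differencing the rows of the identity matrix."""
    D = [[1.0 if i == j else 0.0 for j in range(n_coef)] for i in range(n_coef)]
    for _ in range(order):
        D = [[D[i + 1][j] - D[i][j] for j in range(n_coef)]
             for i in range(len(D) - 1)]
    return D

def roughness_penalty(beta, order=2):
    """Sum of squared order-th differences of the coefficient vector beta."""
    D = difference_matrix(len(beta), order)
    return sum(sum(d * b for d, b in zip(row, beta)) ** 2 for row in D)

linear = roughness_penalty([1.0, 2.0, 3.0, 4.0, 5.0])   # no curvature
spike = roughness_penalty([0.0, 0.0, 1.0, 0.0, 0.0])    # one isolated bump
print(linear, spike)
```

A second-order penalty leaves linear coefficient sequences unpenalised and charges only for curvature, which is why the first value is zero and the second is not.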
Fitting Genetic Models Using Markov Chain Monte Carlo Algorithms with BUGS
- Twin Research and Human Genetics 9
, 2006
Cited by 4 (2 self)
Maximum likelihood estimation techniques are widely used in twin and family studies, but soon reach computational boundaries when applied to highly complex models (e.g., models including gene-by-environment interaction and gene–environment correlation, item response theory measurement models, repeated measures, longitudinal structures, extended pedigrees). Markov Chain Monte Carlo (MCMC) algorithms are very well suited to fit complex models with hierarchically structured data. This article introduces the key concepts of Bayesian inference and MCMC parameter estimation and provides a number of scripts describing relatively simple models to be estimated by the freely obtainable BUGS software. In addition, inference using BUGS is illustrated using a
Modeling transport mode decisions using hierarchical binary spatial regression models with cluster effects
- Statistical Modelling
, 2007
Cited by 3 (2 self)
This work is motivated by a mobility study conducted in the city of Munich, Germany. The variable of interest is a binary response, which indicates whether public transport has been utilized or not. One of the central questions is to identify areas of low/high utilization of public transport after adjusting for explanatory factors such as trip, individual and household attributes. For the spatial effects a modification of a class of Markov Random Fields (MRF) models with proper joint distributions introduced by Pettitt et al. (2002) is developed. It contains the intrinsic MRF in the limit and allows for efficient Markov Chain Monte Carlo (MCMC) algorithms. Further cluster effects using group and individual approaches are taken into consideration. The first one models heterogeneity between clusters, while the second one models heterogeneity within clusters. A naive approach to include individual cluster effects results in an unidentifiable model. It is shown how a re-parametrization gives identifiable parameters. This provides a new approach for modeling heterogeneity within clusters. Finally the proposed model classes are applied to the mobility study. Key words: binary regression, spatial effects, group and individual cluster effects, MCMC, transport mode decisions
Geological Survey
- Options for USGS and NSF Programs, Menlo
, 1976
Cited by 3 (0 self)
Any use of trade, product, or firm names in this publication is for descriptive purposes only and does not imply endorsement by the U.S. Government. Although this report is in the public domain, permission must be secured from the individual copyright owners to reproduce any copyrighted materials contained within this report. Suggested citation: Moran, M.J., 2006, Occurrence and implications of selected chlorinated solvents in ground water and source water in the United States and in drinking water in 12 Northeast and Mid-Atlantic States, 1993–2002: U.S. Geological Survey Scientific Investigations Report 2005–5268, 70 p. Foreword: The U.S. Geological Survey (USGS) is committed to serve the Nation with accurate and timely scientific information that helps enhance and protect the overall quality of life, and facilitates effective management of water, biological, energy, and mineral resources. Information on the quality of the Nation’s water resources is of critical interest to the USGS because it is so integrally linked to the long-term availability of water that is clean and safe for drinking and recreation and that is suitable for industry, irrigation, and habitat for fish and wildlife. Escalating population growth and
Issues in claims reserving and credibility: a semiparametric Approach with Mixed Models
, 2006
Cited by 1 (1 self)
Verrall (1996) and England & Verrall (2001) first considered the use of smoothing methods in the context of claims reserving. They applied two smoothing procedures in a likelihood-based way, namely the locally weighted regression smoother (‘loess’) and the cubic smoothing spline smoother. Using the statistical methodology of semiparametric regression and its connection with mixed models (see e.g. Ruppert et al., 2003), this paper revisits smoothing models for loss reserving and credibility. Apart from the flexibility inherent to all semiparametric methods, advantages of the semiparametric approach developed here are threefold. Firstly, a Bayesian implementation of these smoothing models is relatively straightforward and allows simulation from the full predictive distribution of quantities of interest. Since the main interest of actuaries lies in prediction, this is a major advantage. Secondly, because the constructed models have an interpretation as (generalized) linear mixed models ((G)LMMs), standard statistical theory and software for (G)LMMs can be used. Thirdly, more complicated data sets, dealing for example with quarterly development in a reserving context, heavy tails, semicontinuous data, or extensive longitudinal data, can be modelled within this framework. Throughout this article, data examples illustrate these different aspects. Several comments are included regarding model specification, estimation and selection.
Bayes Estimate and Inference for Entropy and Information Index of Fit
Cited by 1 (0 self)
Kullback-Leibler information is widely used for developing indices of distributional fit. The most celebrated of such indices is Akaike’s AIC, which is derived as an estimate of the minimum Kullback-Leibler information between the unknown data-generating distribution and a parametric model. In the derivation of AIC, the entropy of the data-generating distribution is bypassed because it is free from the parameters. Consequently, the AIC type measures provide criteria for model comparison purposes only, and do not provide diagnostic information about the model fit. A nonparametric estimate of entropy of the data-generating distribution is needed for assessing the model fit. Several entropy estimates are available and have been used for frequentist inference about information fit indices. A few entropy-based fit indices have been suggested for Bayesian inference. This paper develops a class of entropy estimates and provides a procedure for Bayesian inference on the entropy and a fit index. For the continuous case, we define a quantized entropy that approximates and converges to the entropy integral. The quantized entropy includes some well-known measures of sample entropy and the existing Bayes entropy estimates as its special cases. For inference about the fit, we use the candidate model as the expected distribution in the Dirichlet process prior and derive the posterior mean of the quantized entropy as the Bayes estimate. The maximum entropy characterization of the candidate model is then used to derive the prior and posterior distributions for the Kullback-Leibler information index of fit. The consistency of the proposed Bayes estimates for the entropy and for the information index is shown. As by-products, the procedure also produces priors and posteriors for the model parameters and the moments.
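The general idea of approximating an entropy integral by quantization can be illustrated with a plain histogram plug-in estimate, which sums -p_i log(p_i / width_i) over bins. This is a generic textbook estimator chosen for illustration, not the quantized entropy defined in the paper; the function name and the evenly spaced test data are assumptions.

```python
import math

def histogram_entropy(data, edges):
    """Plug-in estimate of differential entropy from binned data:
    minus the sum over bins of p * log(p / bin_width)."""
    n = len(data)
    h = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Bins are half-open [lo, hi), except the last one, which is closed.
        count = sum(1 for x in data
                    if lo <= x < hi or (hi == edges[-1] and x == hi))
        if count:
            p = count / n
            h -= p * math.log(p / (hi - lo))
    return h

# Evenly spaced points standing in for a uniform sample on [0, 2].
data = [(2 * i + 1) / 100 for i in range(100)]
edges = [k / 5 for k in range(11)]
print(histogram_entropy(data, edges), math.log(2))
```

For a uniform density on an interval of length 2 the entropy integral equals log 2, and the binned estimate reproduces that value for this evenly spread data.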
Variable Selection in Nonparametric Random Effects Models
Cited by 1 (1 self)
In analyzing longitudinal or clustered data with a mixed effects model (Laird and Ware, 1982), one may be concerned about violations of normality. Such violations can potentially impact subset selection for the fixed and random effects components of the model, inferences on the heterogeneity structure, and the accuracy of predictions. This article focuses on Bayesian methods for subset selection in nonparametric random effects models in which one is uncertain about the predictors to be included and the distribution of their random effects. We characterize the unknown distribution of the individual-specific regression coefficients using a weighted sum of Dirichlet process (DP)-distributed latent variables. By using carefully chosen mixture priors for coefficients in the base distributions of the component DPs, we allow fixed and random effects to be effectively dropped out of the model. A stochastic search Gibbs sampler is developed for posterior computation, and the methods are illustrated using simulated data and real data from a multi-laboratory bioassay study.
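For readers unfamiliar with Dirichlet process priors, a draw from a DP can be represented by stick-breaking weights attached to atoms sampled from the base distribution. The sketch below generates truncated stick-breaking weights only. It is a standard construction shown for illustration, not the stochastic search Gibbs sampler of the paper; the concentration parameter alpha, the truncation level and the function name are assumptions.

```python
import random

def stick_breaking(alpha, n_atoms, rng):
    """Truncated stick-breaking weights for a DP with concentration alpha.
    Returns the first n_atoms weights and the leftover stick length."""
    weights = []
    remaining = 1.0
    for _ in range(n_atoms):
        v = rng.betavariate(1.0, alpha)   # fraction of the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights, remaining

rng = random.Random(0)                    # fixed seed for reproducibility
weights, leftover = stick_breaking(alpha=1.0, n_atoms=20, rng=rng)
print(sum(weights) + leftover)
```

Smaller values of alpha concentrate the mass on the first few atoms, while larger values spread it over many atoms.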

