Results 1–10 of 16
A Simulation-Intensive Approach for Checking Hierarchical Models
TEST, 1998
Abstract

Cited by 10 (0 self)
Recent computational advances have made it feasible to fit hierarchical models in a wide range of serious applications. If one entertains a collection of such models for a given data set, the problems of model adequacy and model choice arise. We focus on the former. While model checking usually addresses the entire model specification, model failures can occur at each hierarchical stage. Such failures include outliers, mean structure errors, dispersion misspecification, and inappropriate exchangeabilities. We propose another approach which is entirely simulation based. It only requires the model specification and that, for a given data set, one be able to simulate draws from the posterior under the model. By replicating a posterior of interest using data obtained under the model we can "see" the extent of variability in such a posterior. Then, we can compare the posterior obtained under the observed data with this medley of posterior replicates to ascertain whether the former is in agreement with them and, accordingly, whether it is plausible that the observed data came from the proposed model.
Simulation-Based Model Checking for Hierarchical Models
Test, 1995
Abstract

Cited by 8 (4 self)
Recent computational advances have made it feasible to fit hierarchical models in a wide range of serious applications. If one entertains a collection of such models for a given data set, the problems of model adequacy and model choice arise. We focus on the former. While model checking usually addresses the entire model specification, model failures can occur at each hierarchical stage. Such failures include outliers, mean structure errors, dispersion misspecification, and inappropriate exchangeabilities. We propose another approach which is entirely simulation based. It only requires the model specification and that, for a given data set, one be able to simulate draws from the posterior under the model. By replicating a posterior of interest using data obtained under the model we can "see" the extent of variability in such a posterior. Then, we can compare the posterior obtained under the observed data with this medley of posterior replicates to ascertain whether the former is in agreement with them and, accordingly, whether it is plausible that the observed data came from the proposed model. This suggests the large-scale use of Monte Carlo tests, each focusing on a potential model failure. It thus suggests the possibility of examining not only the overall adequacy of the hierarchical model but, using suitable posteriors, the adequacy of each stage. This raises the question of when individual stages are separable and checkable, which we explore in some detail. Finally, we develop this strategy in the context of generalized linear mixed models and offer a simulation study to demonstrate its capabilities.
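The replicate-based Monte Carlo test described in this abstract can be sketched in a few lines. Everything below is an illustrative assumption, not the paper's own example: a unit-variance normal model with a normal prior, a checking function (the sample standard deviation) aimed at one of the listed failure types, dispersion misspecification, and replicate data drawn under the model.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical model under check: y_i ~ Normal(theta, 1), theta ~ Normal(0, 10^2).
# Checking function targeting dispersion misspecification: the sample sd,
# which the unit-variance model says should sit near 1 whatever theta is.
def check_stat(y):
    return y.std(ddof=1)

n = 40
# "Observed" data actually generated with sd 2 -- a built-in dispersion failure.
y_obs = rng.normal(0.0, 2.0, size=n)

# Replicate datasets drawn under the model (prior draw of theta, then data),
# giving the reference distribution of the checking function.
d_rep = np.empty(1000)
for r in range(1000):
    theta = rng.normal(0.0, 10.0)
    d_rep[r] = check_stat(rng.normal(theta, 1.0, size=n))

# Monte Carlo p-value: how extreme is the observed statistic among replicates?
p = (d_rep >= check_stat(y_obs)).mean()
print(f"observed sd = {check_stat(y_obs):.2f}, MC p-value = {p:.3f}")
```

A small p-value flags the dispersion failure; in the paper's scheme one such test would be run per potential failure type and, where stages are separable, per hierarchical stage.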
Bayes Estimate and Inference for Entropy and Information Index of Fit
Abstract

Cited by 2 (1 self)
Kullback-Leibler information is widely used for developing indices of distributional fit. The most celebrated of such indices is Akaike’s AIC, which is derived as an estimate of the minimum Kullback-Leibler information between the unknown data-generating distribution and a parametric model. In the derivation of AIC, the entropy of the data-generating distribution is bypassed because it is free from the parameters. Consequently, the AIC-type measures provide criteria for model comparison purposes only, and do not provide diagnostic information about the model fit. A nonparametric estimate of the entropy of the data-generating distribution is needed for assessing the model fit. Several entropy estimates are available and have been used for frequentist inference about information fit indices. A few entropy-based fit indices have been suggested for Bayesian inference. This paper develops a class of entropy estimates and provides a procedure for Bayesian inference on the entropy and a fit index. For the continuous case, we define a quantized entropy that approximates and converges to the entropy integral. The quantized entropy includes some well-known measures of sample entropy and the existing Bayes entropy estimates as its special cases. For inference about the fit, we use the candidate model as the expected distribution in the Dirichlet process prior and derive the posterior mean of the quantized entropy as the Bayes estimate. The maximum entropy characterization of the candidate model is then used to derive the prior and posterior distributions for the Kullback-Leibler information index of fit. The consistency of the proposed Bayes estimates for the entropy and for the information index is shown. As byproducts, the procedure also produces priors and posteriors for the model parameters and the moments.
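The idea of a quantized entropy that converges to the entropy integral can be illustrated in a much simplified, frequentist form: discretize the sample with bin width h, take the discrete entropy of the bin proportions, and add log h. The normal target and all tuning constants below are assumptions for illustration, not the paper's Bayes estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical histogram sketch of a quantized entropy for a continuous sample:
# H_hat = -sum_j p_j log p_j + log h, which approximates the entropy integral
# as the bin width h shrinks and the sample grows.
def quantized_entropy(y, n_bins=40):
    counts, edges = np.histogram(y, bins=n_bins)
    h = edges[1] - edges[0]          # common bin width
    p = counts / counts.sum()        # bin proportions
    p = p[p > 0]                     # 0 log 0 = 0 convention
    return -(p * np.log(p)).sum() + np.log(h)

y = rng.normal(0.0, 1.0, size=20000)
est = quantized_entropy(y)
true_h = 0.5 * np.log(2 * np.pi * np.e)   # entropy of N(0, 1)
print(f"estimate {est:.3f} vs true {true_h:.3f}")
```

The paper replaces the raw bin proportions with a posterior mean under a Dirichlet process prior centered at the candidate model; the histogram plug-in above is only the limiting special case.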
Parsimonious Estimation of Multiplicative Interaction in Analysis of Variance using Kullback-Leibler Information
Journal of Statistical Planning and Inference, 1999
Abstract

Cited by 2 (1 self)
Many standard methods for modeling interaction in two-way ANOVA require mn interaction parameters, where m and n are the number of rows and columns in the table. By viewing the interaction parameters as a matrix and performing a singular value decomposition, one arrives at the Additive Main Effects and Multiplicative Interaction (AMMI) model which is commonly used in agriculture. By using only those interaction components with the largest singular values, one can produce an estimate of interaction that requires far fewer than mn parameters while retaining most of the explanatory power of standard methods. The central inference problems of estimating the parameters and determining the number of interaction components have been difficult except in "ideal" situations (equal cell sizes, equal variance, etc.). The Bayesian methodology developed in this paper applies for unequal sample sizes and heteroscedastic data, and may be easily generalized to more complicated data structures...
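The SVD truncation behind the AMMI model can be sketched directly. The synthetic table below, with additive main effects plus a rank-1 interaction, is an invented example, not data from the paper; it shows how one singular component can carry nearly all of the interaction with far fewer than mn parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-way table: row and column main effects plus a
# rank-1 multiplicative interaction, observed with a little noise.
m, n = 6, 8
rows = rng.normal(0, 1, m)[:, None]
cols = rng.normal(0, 1, n)[None, :]
u = rng.normal(0, 1, m)[:, None]
v = rng.normal(0, 1, n)[None, :]
table = rows + cols + 2.0 * u * v + rng.normal(0, 0.1, (m, n))

# Double-center to strip main effects, leaving the interaction matrix.
inter = (table - table.mean(0, keepdims=True)
               - table.mean(1, keepdims=True) + table.mean())

# AMMI-style truncation: keep only the leading singular component(s).
U, s, Vt = np.linalg.svd(inter, full_matrices=False)
k = 1
inter_k = (U[:, :k] * s[:k]) @ Vt[:k]

explained = (s[:k] ** 2).sum() / (s ** 2).sum()
print(f"rank-{k} fit explains {explained:.1%} of interaction SS")
```

The truncated fit needs only m + n parameters per component instead of mn; choosing k, which the abstract calls the hard inference problem, is what the paper's Bayesian machinery addresses.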
Information Measures in Perspective
, 2010
Abstract

Cited by 1 (0 self)
Information-theoretic methodologies are increasingly being used in various disciplines. Frequently an information measure is adapted for a problem, yet the perspective of information as the unifying notion is overlooked. We set forth this perspective through presenting information-theoretic methodologies for a set of problems in probability and statistics. Our focal measures are Shannon entropy and Kullback-Leibler information. The background topics for these measures include notions of uncertainty and information, their axiomatic foundation, interpretations, properties, and generalizations. Topics with broad methodological applications include discrepancy between distributions, derivation of probability models, dependence between variables, and Bayesian analysis. More specific methodological topics include model selection, limiting distributions, optimal prior distribution and design of experiment, modeling duration variables, order statistics, data disclosure, and relative importance of predictors. Illustrations range from very basic to highly technical ones that draw attention to subtle points.
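The two focal measures named in the abstract are simple to state for discrete distributions, and a minimal sketch makes their basic properties concrete. The example distributions below are arbitrary illustrations.

```python
import numpy as np

# Shannon entropy of a discrete distribution p (natural log), with the
# usual 0 log 0 = 0 convention.
def entropy(p):
    p = np.asarray(p, float)
    p = p[p > 0]
    return -(p * np.log(p)).sum()

# Kullback-Leibler information K(p, q): nonnegative, zero iff p == q,
# and not symmetric in its arguments.
def kl(p, q):
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return (p[mask] * np.log(p[mask] / q[mask])).sum()

uniform = np.full(4, 0.25)
skewed = np.array([0.7, 0.1, 0.1, 0.1])

print(entropy(uniform))     # log 4: the maximum over four outcomes
print(entropy(skewed))      # strictly smaller: less uncertainty
print(kl(skewed, uniform))  # positive: the distributions differ
```

These two quantities are the building blocks for most of the methodological topics the abstract lists, from model selection criteria to dependence measures.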
Strategies for Inference Robustness in Complex Modelling: An Application to Longitudinal Performance Measures.
, 1999
Abstract

Cited by 1 (0 self)
Advances in computation mean it is now possible to fit a wide range of complex models, but selecting a model on which to base reported inferences is a difficult problem. Following an early suggestion of Box and Tiao, it seems reasonable to seek `inference robustness' in reported models, so that alternative assumptions that are reasonably well supported would not lead to substantially different conclusions. We propose a four-stage modelling strategy in which we: iteratively assess and elaborate an initial model, measure the support for each of the resulting family of models, assess the influence of adopting alternative models on the conclusions of primary interest, and identify whether an approximate model can be reported. These stages are semi-formal, in that they are embedded in a decision-theoretic framework but require substantive input for any specific application. The ideas are illustrated on a dataset comprising the success rates of 46 in-vitro fertilisation clinics over three years. The analysis supports a model that assumes 43 of the 46 clinics have odds on success that are evolving at a constant proportional rate (i.e. linear on a logit scale), while three clinics are outliers in the sense of showing nonlinear trends. For the 43 `linear' clinics, the intercepts and gradients can be assumed to follow a bivariate normal distribution except for one outlying intercept: the odds on success are significantly increasing for four clinics and significantly decreasing for three. This model displays considerable inference robustness and, although its conclusions could be approximated by other less-supported models, these would not be any more parsimonious. Technical issues include fitting mixture models of alternative hierarchical longitudinal models, t...
Detecting Stage-Wise Outliers in Hierarchical Bayesian Linear Models of Repeated Measures Data
, 2003
Abstract
... for multi-stage models. We propose numerical and graphical methods for outlier detection in hierarchical Bayes modeling and analyses of repeated measures regression data from multiple subjects; data from a single subject are generically called a “curve.” The first stage of our model has curve-specific regression coefficients with possibly autoregressive errors of a prespecified order. The first-stage regression vectors for different curves are linked in a second-stage modeling step, possibly involving additional regression variables. Detection of the stage at which the curve appears to be an outlier, and of the magnitude and specific component of the violation at that stage, is accomplished by embedding the null model into a larger parametric model that can accommodate such unusual observations. As a first diagnostic, we examine the posterior probabilities of first-stage and second-stage anomalies relative to the modeling assumptions for each curve. For curves where there is evidence of a model violation at either stage, we propose additional numerical and graphical diagnostics. For first-stage violations, the diagnostics identify the specific measurements within a curve that are anomalous. For second-stage violations, the diagnostics identify the curve parameters that are unusual relative to the pattern of parameter values for the majority of the curves. We give two examples to illustrate the diagnostics, develop a BUGS program to compute them using MCMC techniques, and examine the sensitivity of the conclusions to the prior modeling assumptions.
A Dirichlet Process Elaboration Diagnostic for Binomial Goodness of Fit
Abstract
Useful model checking tools can be constructed by measuring the distance between a prior distribution that concentrates most of its mass around a model of interest and the resulting posterior distribution. In this paper we use this approach to construct a diagnostic measure for detecting lack of fit in discrete data, with special focus on binomial data. We begin by constructing a suitable probability model "around" the model of interest, via a Dirichlet process elaboration. We derive the resulting diagnostic and show that, approximately, it is the sum of two terms: the first is the logarithm of the Bayes factor and the second is proportional to the Pearson chi-square statistic. We give details of a simulation algorithm for computing the diagnostic and illustrate its use in an application to biomedical data. Keywords: Bayesian model criticism, binomial data, logarithmic divergence, chi-square statistics.
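The Pearson chi-square ingredient of the diagnostic can be sketched for binomial data. The counts below and the grouping of outcomes into cells are invented for illustration, not the paper's biomedical application, and the sketch covers only the chi-square term, not the log Bayes factor or the Dirichlet process elaboration.

```python
import math
import numpy as np

# Hypothetical data: successes out of n = 8 trials in each of N experiments.
counts = np.array([2, 5, 3, 4, 6, 1, 4, 3, 5, 2])
n = 8
N = len(counts)
p_hat = counts.sum() / (N * n)   # MLE of the common success probability

# Expected frequency of each possible count k = 0..n under Binomial(n, p_hat).
expected = np.array([N * math.comb(n, k) * p_hat**k * (1 - p_hat)**(n - k)
                     for k in range(n + 1)])
observed = np.bincount(counts, minlength=n + 1)

# Pearson chi-square: large values signal lack of binomial fit.
x2 = ((observed - expected) ** 2 / expected).sum()
print(f"Pearson chi-square = {x2:.2f}")
```

In practice cells with very small expected counts would be pooled before computing the statistic; the paper's diagnostic adds this term, suitably scaled, to the log Bayes factor from the Dirichlet process elaboration.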