Results 11–20 of 148
All Maps of Parameter Estimates Are Misleading
Statistics in Medicine, 1998
Cited by 14 (8 self)
Maps are frequently used to display spatial distributions of parameters of interest, such as cancer rates or average pollutant concentrations by county. It's well known that plotting observed rates can have serious drawbacks when sample sizes vary by area, since very high (and low) observed rates are found disproportionately in poorly sampled areas. Unfortunately, adjusting the observed rates to account for the effects of small-sample noise can introduce an opposite effect, in which the highest adjusted rates tend to be found disproportionately in well-sampled areas. In either case, the maps can be difficult to interpret because the display of spatial variation in the underlying parameters of interest is confounded with spatial variation in sample sizes. As a result, spatial patterns occur in adjusted rates even if there is no spatial structure in the underlying parameters of interest, and adjusted rates tend to look too uniform in areas with little data. We introduce two models (normal...
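The shrinkage effect the abstract describes can be sketched with a toy normal-approximation adjustment (hypothetical data and an assumed between-area variance `tau2`; the paper's actual models are richer). Areas with small samples produce the extreme observed rates, and those same areas are shrunk most strongly toward the overall mean:

```python
import random

random.seed(1)

true_rate = 0.05                      # common underlying rate (no spatial structure)
sample_sizes = [random.choice([20, 50, 2000]) for _ in range(12)]
tau2 = 0.0004                         # assumed between-area variance of true rates

observed, adjusted = [], []
for n in sample_sizes:
    y = sum(random.random() < true_rate for _ in range(n)) / n   # observed rate
    s2 = true_rate * (1 - true_rate) / n                         # sampling variance
    w = tau2 / (tau2 + s2)                                       # shrinkage weight in [0, 1)
    observed.append(y)
    adjusted.append(w * y + (1 - w) * true_rate)                 # pull toward the mean
```

Because each adjusted rate is a convex combination of the observed rate and the overall mean, the adjusted map is never more extreme than the observed map, and is most uniform exactly where the data are thinnest.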
Exploratory Data Analysis for Complex Models
2002
Cited by 14 (6 self)
"Exploratory" and "confirmatory" data analysis can both be viewed as methods for comparing observed data to what would be obtained under an implicit or explicit statistical model.
A state-space model for National Football League scores
Journal of the American Statistical Association, 1998
Cited by 14 (2 self)
This paper develops a predictive model for National Football League (NFL) game scores using data from the period 1988–1993. The parameters of primary interest, measures of team strength, are expected to vary over time. Our model accounts for this source of variability by modeling football outcomes using a state-space model that assumes team strength parameters follow a first-order autoregressive process. Two sources of variation in team strengths are addressed in our model: week-to-week changes in team strength due to injuries and other random factors, and season-to-season changes resulting from changes in personnel and other longer-term factors. Our model also incorporates a home-field advantage while allowing for the possibility that the magnitude of the advantage may vary across teams. The aim of the analysis is to obtain plausible inferences concerning team strengths and other model parameters, and to predict future game outcomes. Iterative simulation is used to obtain samples fro...
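The core state-space assumption can be sketched as follows (all parameter values are assumed for illustration, not the paper's fitted estimates): team strengths evolve week to week as a first-order autoregressive process, and the expected score differential is the strength difference plus a home-field advantage:

```python
import random

random.seed(0)

n_teams, n_weeks = 4, 6
rho, sigma_w = 0.98, 1.0   # AR(1) coefficient and innovation sd (assumed values)
home_adv = 2.5             # assumed common home-field advantage, in points

# Week-to-week evolution: strength_t = rho * strength_{t-1} + noise.
strengths = [[random.gauss(0, 3) for _ in range(n_teams)]]
for _ in range(1, n_weeks):
    strengths.append([rho * s + random.gauss(0, sigma_w) for s in strengths[-1]])

def predicted_diff(week, home, away):
    """Expected home-minus-away score differential in a given week."""
    s = strengths[week]
    return s[home] - s[away] + home_adv
```

Season-to-season changes would add a second, larger innovation between the last week of one season and the first week of the next, and a team-specific home advantage would replace the common `home_adv`.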
Diagnostic Measures for Model Criticism
Journal of the American Statistical Association, 1996
Cited by 13 (1 self)
... In this article we present the general outlook and discuss general families of elaborations for use in practice; the exponential connection elaboration plays a key role. We then describe model elaborations for use in diagnosing departures from normality, goodness of fit in generalized linear models, variable selection in regression, and outlier detection. We illustrate our approach with two applications.
Document Structure Analysis and Performance Evaluation
1999
Cited by 12 (0 self)
Document Structure Analysis and Performance Evaluation, by Jisheng Liang. Chair of Supervisory Committee: Professor Robert M. Haralick, Electrical Engineering. The goal of document structure analysis is to find an optimal solution to partition the set of glyphs on a given document into a hierarchical tree structure, where entities within the hierarchy are associated with their physical properties and semantic labels. In this dissertation, we present a unified document structure extraction algorithm that is probability-based, where the probabilities are estimated from an extensive training set of various kinds of measurements of distances between the terminal and nonterminal entities with which the algorithm works. The offline probabilities estimated in the training then drive all decisions in the online segmentation module. An iterative, relaxation-like method is used to find the partitioning solution that maximizes the joint probability. This approach can be uniformly applied to the cons...
Diagnostic Checks for Discrete-Data Regression Models Using Posterior Predictive Simulations
1997
Cited by 11 (7 self)
Model checking with discrete-data regressions can be difficult because usual methods such as residual plots have complicated reference distributions that depend on the parameters in the model. Posterior predictive checks have been proposed as a Bayesian way to average the results of goodness-of-fit tests in the presence of uncertainty in estimation of the parameters. We try this approach using a variety of discrepancy variables for generalized linear models fit to a historical data set on behavioral learning. We then discuss the general applicability of our findings in the context of a recent applied example on which we have worked. We find that the following discrepancy variables work well, in the sense of being easy to interpret and sensitive to important model failures: (a) structured displays of the entire data set, (b) general discrepancy variables based on plots of binned or smoothed residuals versus predictors, and (c) specific discrepancy variables created based on the particul...
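Discrepancy variable (b), binned residuals, can be sketched on simulated logistic-regression data (hypothetical data, and the fitted probabilities are taken equal to the true ones for illustration): residuals are grouped into bins ordered by fitted probability, and the bin averages should hover near zero when the model is correct:

```python
import math
import random

random.seed(2)

n, n_bins = 1000, 10
x = [random.uniform(-2, 2) for _ in range(n)]
p = [1 / (1 + math.exp(-xi)) for xi in x]            # fitted = true probabilities here
y = [1 if random.random() < pi else 0 for pi in p]   # simulated binary outcomes
resid = [yi - pi for yi, pi in zip(y, p)]            # raw residuals

order = sorted(range(n), key=lambda i: p[i])         # bin by fitted probability
bin_means = []
for b in range(n_bins):
    idx = order[b * n // n_bins:(b + 1) * n // n_bins]
    bin_means.append(sum(resid[i] for i in idx) / len(idx))
```

A systematic trend in `bin_means` across the bins (rather than noise around zero) would indicate a failure of the assumed mean structure.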
Multiple imputation for model checking: Completed-data plots with missing and latent data
Biometrics, 2005
Cited by 11 (3 self)
Summary. In problems with missing or latent data, a standard approach is to first impute the unobserved data, then perform all statistical analyses on the completed dataset—corresponding to the observed data and imputed unobserved data—using standard procedures for completed-data inference. Here, we extend this approach to model checking by demonstrating the advantages of the use of completed-data model diagnostics on imputed completed datasets. The approach is set in the theoretical framework of Bayesian posterior predictive checks (but, as with missing-data imputation, our methods of missing-data model checking can also be interpreted as “predictive inference” in a non-Bayesian context). We consider the graphical diagnostics within this framework. Advantages of the completed-data approach include: (1) One can often check model fit in terms of quantities that are of key substantive interest in a natural way, which is not always possible using observed data alone. (2) In problems with missing data, checks may be devised that do not require modeling the missingness or inclusion mechanism; the latter is useful for the analysis of ignorable but unknown data collection mechanisms, such as are often assumed in the analysis of sample surveys and observational studies. (3) In many problems with latent data, it is possible to check qualitative features of the model (for example, independence of two variables) that can be naturally formalized with the help of the latent data. We illustrate with several applied examples.
Validation of software for Bayesian models using posterior quantiles
Journal of Computational and Graphical Statistics, 2006
Cited by 10 (4 self)
This article presents a simulation-based method designed to establish the computational correctness of software developed to fit a specific Bayesian model, capitalizing on properties of Bayesian posterior distributions. We illustrate the validation technique with two examples. The validation method is shown to find errors in software when they exist and, moreover, the validation output can be informative about the nature and location of such errors. We also compare our method with an earlier approach.
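The posterior-quantile idea can be sketched for a conjugate normal model with known variance, where the "software" under test is the closed-form posterior (a toy stand-in, not the paper's setting): draw the parameter from the prior, simulate data given it, and compute the posterior quantile of the true parameter; if the posterior is computed correctly, those quantiles are uniform on (0, 1):

```python
import math
import random

random.seed(3)

def normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu0, tau, sigma, n = 0.0, 1.0, 1.0, 5   # prior mean/sd, data sd, sample size

quantiles = []
for _ in range(500):
    theta = random.gauss(mu0, tau)                       # draw from the prior
    ybar = sum(random.gauss(theta, sigma) for _ in range(n)) / n
    # Conjugate posterior for a normal mean with known variance.
    prec = 1 / tau**2 + n / sigma**2                     # posterior precision
    post_mean = (mu0 / tau**2 + n * ybar / sigma**2) / prec
    post_sd = math.sqrt(1 / prec)
    quantiles.append(normal_cdf((theta - post_mean) / post_sd))

mean_q = sum(quantiles) / len(quantiles)   # should be near 0.5
```

A bug (say, an off-by-one in the sample size used by the fitter) would push the empirical distribution of `quantiles` away from uniformity in a direction that hints at where the error lies.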
Checking for prior-data conflict
Bayesian Analysis, 2006
Cited by 10 (7 self)
Inference proceeds from ingredients chosen by the analyst and data. To validate any inferences drawn, it is essential that the inputs chosen be deemed appropriate for the data. In the Bayesian context these inputs consist of both the sampling model and the prior. There are thus two possibilities for failure: the data may not have arisen from the sampling model, or the prior may place most of its mass on parameter values that are not feasible in light of the data (referred to here as prior-data conflict). Failure of the sampling model can only be fixed by modifying the model, while prior-data conflict can be overcome if sufficient data is available. We examine how to assess whether or not a prior-data conflict exists, and how to assess when its effects can be ignored for inferences. The concept of prior-data conflict is seen to lead to a partial characterization of what is meant by a noninformative prior or a noninformative sequence of priors.
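For a normal model with known sampling variance, a simple version of such a check can be sketched as a prior predictive tail probability for the sample mean (an assumed setup for illustration, not the paper's general construction): a small tail probability signals that the prior puts little mass near parameter values supported by the data:

```python
import math

def normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def prior_data_conflict_p(ybar, n, mu0, tau, sigma):
    """Two-sided prior predictive tail probability of the observed sample mean.

    Under prior theta ~ N(mu0, tau^2) and data y_i ~ N(theta, sigma^2),
    the prior predictive distribution of ybar is N(mu0, tau^2 + sigma^2/n).
    """
    pred_sd = math.sqrt(tau**2 + sigma**2 / n)
    z = (ybar - mu0) / pred_sd
    return 2 * (1 - normal_cdf(abs(z)))

# A sample mean near the prior center raises no alarm...
p_ok = prior_data_conflict_p(ybar=0.3, n=20, mu0=0.0, tau=1.0, sigma=1.0)
# ...while one far into the prior predictive tail signals conflict.
p_conflict = prior_data_conflict_p(ybar=6.0, n=20, mu0=0.0, tau=1.0, sigma=1.0)
```

As the abstract notes, such a conflict can fade as data accumulate: with larger `n` the posterior is dominated by the likelihood even when the prior was poorly centered.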