## Not Asked Or Not Answered: Multiple Imputation for Multiple Surveys (1998)

Venue: Journal of the American Statistical Association

Citations: 21 (8 self)

### BibTeX

@ARTICLE{Gelman98notasked,
    author = {Andrew Gelman and Gary King and Chuanhai Liu},
    title = {Not Asked Or Not Answered: Multiple Imputation for Multiple Surveys},
    journal = {Journal of the American Statistical Association},
    year = {1998},
    volume = {93},
    pages = {846--874}
}

### Abstract

We present a method of analyzing a series of independent cross-sectional surveys in which some questions are not answered in some surveys and some respondents do not answer some of the questions posed. The method is also applicable to a single survey in which different questions are asked, or different sampling methods used, in different strata or clusters. Our method involves multiply-imputing the missing items and questions by adding to existing methods of imputation designed for single surveys a hierarchical regression model that allows covariates at the individual and survey levels. Information from survey weights is exploited by including in the analysis the variables on which the weights were based, and then reweighting individual responses (observed and imputed) to estimate population quantities. We also develop diagnostics for checking the fit of the imputation model based on comparing imputed to nonimputed data. We illustrate with the example that motivated this project --- a ...
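
As a rough illustration of how multiply-imputed data sets are typically used downstream, the sketch below pools one estimate per completed data set using the standard combining rules from the multiple-imputation literature (Rubin, 1987). The abstract does not spell out these rules, so treat this as an assumption about the usual workflow rather than the authors' code; the function name `pool_estimates` and its inputs are hypothetical.

```python
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Pool M completed-data estimates with the standard MI combining rules.

    estimates: one point estimate per imputed data set (hypothetical input).
    variances: the matching squared standard errors.
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Example with three imputations of a single survey proportion (made-up numbers).
pooled, total_var = pool_estimates([0.52, 0.55, 0.50], [0.0004, 0.0005, 0.0004])
print(pooled, total_var ** 0.5)
```

The between-imputation term is what carries the extra uncertainty due to missing data: if every imputation gave the same estimate, the total variance would reduce to the average within-imputation variance.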

### Citations

3720 | Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images - Geman, Geman - 1984 |

1123 |
Statistical Analysis with Missing Data
- Little, Rubin
- 1987
Citation Context ...ogies of Bush and Dukakis were asked in fewer than half of the surveys, and they were excluded from the analysis. Gelman and King (1993) used a mixture of available-case and complete-case methods (see Little and Rubin, 1987)---available-case for the time-series plots by subgroup, and complete-case for the regressions. Compared to complete-case inference, these analyses are more difficult to set up---one must examine the ... |

744 |
Sampling-based approaches to calculating marginal densities
- Gelfand, Smith
- 1990
Citation Context ...otone data pattern for the pre-election surveys, with the variables arranged in decreasing order of proportion of missing data. We use the Gibbs sampler (Geman and Geman, 1984; Tanner and Wong, 1987; Gelfand and Smith, 1990), an iterative algorithm for obtaining draws of a set of $m$ variables $\theta_1, \ldots, \theta_m$ from their joint distribution. Each iteration of the Gibbs sampler consists of a sequence of steps, each takes... |

652 |
Multiple Imputation for Nonresponse in Surveys
- Rubin
- 1987
Citation Context ...tLib data archive. saturated models, to impute missing data, with the understanding that once the imputations have been obtained, later users can analyze the completed data sets as they see fit. (See Rubin, 1987, 1996, Belin et al., 1993, and Meng, 1994. Also see Rao, 1996, and Fay, 1996, for critical perspectives on multiple imputation). Algorithms are available and in use for imputing missing data in a sin... |

549 |
The calculation of posterior distributions by data augmentation
- Tanner, Wong
- 1987
Citation Context ...e data can be sorted in such a way that $y_{i,j}$ is observed if $y_{i+1,j}$ is observed for $j = 1, \ldots, m$ and $i = 1, \ldots, n - 1$. MDA is the algorithm that applies the data augmentation algorithm (Tanner and Wong, 1987) to a (complete) monotone-pattern data, which is created by including those missing values that destroy the monotone pattern. MDA promises fast converging iterative simulation methods by disregarding... |

377 |
Analysis of Incomplete Multivariate Data. Monographs on Statistics and Applied Probability 72. Boca Raton, USA
- Schafer
- 1997
Citation Context ..., 1996, for critical perspectives on multiple imputation). Algorithms are available and in use for imputing missing data in a single sample survey based on normal (Rubin and Schafer, 1990, Liu, 1993, Schafer, 1997) and t (Liu, 1995) distributions and the general location model (Schafer, 1997, Liu and Rubin, 1998). When imputing missing data from several sample surveys, there are two obvious ways to use existin... |

333 |
Survey Sampling
- Kish
- 1965
Citation Context .... Another relevant area of application is stratified and cluster sampling. Appropriate analysis of sample surveys includes information used in the design, including stratification and clustering (see Kish, 1965, Gelman et al., 1995, and Rubin, 1996, for perspectives from survey sampling practice, Bayesian inference, and multiple-imputation inference, respectively). If strata or clusters are expected to diff... |

261 |
Inference and Missing Data
- Rubin
- 1976
Citation Context ...$Q)' : i = 1, \ldots, N_s\} : s = 1, \ldots, S\}$. (1) We assume that the data are missing at random---that is, that the probability of missingness depends only on observed data included in the model (Rubin, 1976). This is a reasonable assumption here because almost all the missingness is due to unasked questions. If clear violations of missing at random occur (for example, a question about defense policy may... |

187 |
Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes
- Liu, Wong, et al.
- 1994
Citation Context ...he Gibbs sampler context, this has the effect of analytically integrating over (rather than sampling) the other missing elements in the data matrix, which tends to yield a faster-converging algorithm (Liu, Wong, and Kong, 1994). It is straightforward to implement Step 1 because, given $\Psi, \theta_1, \ldots, \theta_S, \beta, \Sigma$, and $y_{obs}$, the nonresponse components of any of the respondents in $y_{mp,mis}$ is independent of that o... |

150 |
Multiple Imputation After 18+ Years
- Rubin
- 1996
Citation Context ...n is stratified and cluster sampling. Appropriate analysis of sample surveys includes information used in the design, including stratification and clustering (see Kish, 1965, Gelman et al., 1995, and Rubin, 1996, for perspectives from survey sampling practice, Bayesian inference, and multiple-imputation inference, respectively). If strata or clusters are expected to differ in their mean responses (as will ge... |

72 |
Data analysis using Stein’s estimator and its generalizations
- Efron, Morris
- 1975
Citation Context ...s in that survey. This effect of partial pooling, with the amount of pooling depending on the amount of available data, is typical of Bayesian inference in hierarchical models or meta-analysis (e.g., Efron and Morris, 1975, Rubin, 1980, DuMouchel and Harris, 1983, Gatsonis et al., 1992, and Belin et al., 1993). The hierarchical regression structure also allows us to include covariates both at the individual and survey ... |

51 | Why are American presidential election campaign polls so variable when votes are so predictable? - Gelman, King - 1993 |

34 |
Multiple-imputation inferences with uncongenial sources of input
- Meng
- 1994
Citation Context ...ute missing data, with the understanding that once the imputations have been obtained, later users can analyze the completed data sets as they see fit. (See Rubin, 1987, 1996, Belin et al., 1993, and Meng, 1994. Also see Rao, 1996, and Fay, 1996, for critical perspectives on multiple imputation). Algorithms are available and in use for imputing missing data in a single sample survey based on normal (Rubin a... |

23 |
On Variance Estimation With Imputed Survey Data
- Rao
- 1996
Citation Context ...h the understanding that once the imputations have been obtained, later users can analyze the completed data sets as they see fit. (See Rubin, 1987, 1996, Belin et al., 1993, and Meng, 1994. Also see Rao, 1996, and Fay, 1996, for critical perspectives on multiple imputation). Algorithms are available and in use for imputing missing data in a single sample survey based on normal (Rubin and Schafer, 1990, Li... |

18 |
Alternative Paradigms for the Analysis of Imputed Survey Data
- Fay
- 1996
Citation Context ...ding that once the imputations have been obtained, later users can analyze the completed data sets as they see fit. (See Rubin, 1987, 1996, Belin et al., 1993, and Meng, 1994. Also see Rao, 1996, and Fay, 1996, for critical perspectives on multiple imputation). Algorithms are available and in use for imputing missing data in a single sample survey based on normal (Rubin and Schafer, 1990, Liu, 1993, Schafe... |

17 |
Bayes methods for combining the results of cancer studies in humans and other species
- DuMouchel, Harris
- 1983
Citation Context ...ial pooling, with the amount of pooling depending on the amount of available data, is typical of Bayesian inference in hierarchical models or meta-analysis (e.g., Efron and Morris, 1975, Rubin, 1980, DuMouchel and Harris, 1983, Gatsonis et al., 1992, and Belin et al., 1993). The hierarchical regression structure also allows us to include covariates both at the individual and survey levels. For an approach to hierarchical r... |

16 |
A split questionnaire survey design
- Raghunathan, Grizzle
- 1995
Citation Context ...advantage of immediately generalizing to the unsampled clusters. Our method might be particularly appropriate to surveys in which different questions are asked to respondents in different strata (see Raghunathan and Grizzle, 1995). In this paper, we present a specific method for extending a standard multiple imputation algorithm based on multivariate normal models. We illustrate with the example that motivated this work, a st... |

13 |
Missing data imputation using the multivariate t distribution
- Liu
- 1995
Citation Context ...erspectives on multiple imputation). Algorithms are available and in use for imputing missing data in a single sample survey based on normal (Rubin and Schafer, 1990, Liu, 1993, Schafer, 1997) and t (Liu, 1995) distributions and the general location model (Schafer, 1997, Liu and Rubin, 1998). When imputing missing data from several sample surveys, there are two obvious ways to use existing single-survey me... |

13 |
Using empirical Bayes techniques in the law school validity studies
- Rubin
- 1980
Citation Context ...ffect of partial pooling, with the amount of pooling depending on the amount of available data, is typical of Bayesian inference in hierarchical models or meta-analysis (e.g., Efron and Morris, 1975, Rubin, 1980, DuMouchel and Harris, 1983, Gatsonis et al., 1992, and Belin et al., 1993). The hierarchical regression structure also allows us to include covariates both at the individual and survey levels. For a... |

12 |
Bartlett's Decomposition of the Posterior Distribution of the Covariance for Normal Monotone Ignorable Missing Data
- Liu
- 1993
Citation Context ...96, and Fay, 1996, for critical perspectives on multiple imputation). Algorithms are available and in use for imputing missing data in a single sample survey based on normal (Rubin and Schafer, 1990, Liu, 1993, Schafer, 1997) and t (Liu, 1995) distributions and the general location model (Schafer, 1997, Liu and Rubin, 1998). When imputing missing data from several sample surveys, there are two obvious ways... |

12 | Bayesian Robust Multivariate Linear Regression With Incomplete Data - Liu - 1996 |

10 |
Efficiently creating multiple imputations for incomplete multivariate normal data
- Rubin, Schafer
- 1990
Citation Context ...g, 1994. Also see Rao, 1996, and Fay, 1996, for critical perspectives on multiple imputation). Algorithms are available and in use for imputing missing data in a single sample survey based on normal (Rubin and Schafer, 1990, Liu, 1993, Schafer, 1997) and t (Liu, 1995) distributions and the general location model (Schafer, 1997, Liu and Rubin, 1998). When imputing missing data from several sample surveys, there are two o... |

8 | Combining information from related regressions - Dominici, Parmigiani, et al. - 1997 |

7 |
Ellipsoidally symmetric extensions of the general location model for mixed categorical and continuous data
- Liu, Rubin
- 1998
Citation Context ... for imputing missing data in a single sample survey based on normal (Rubin and Schafer, 1990, Liu, 1993, Schafer, 1997) and t (Liu, 1995) distributions and the general location model (Schafer, 1997, Liu and Rubin, 1998). When imputing missing data from several sample surveys, there are two obvious ways to use existing single-survey methods: (1) separately imputing the missing data from each survey, or (2) combining... |

6 | Estimation across data sets: Two-stage auxiliary instrumental variables estimation - Franklin - 1989 |

4 |
Geographic variation of procedure utilization: a hierarchical model approach
- Gatsonis, Normand, et al.
- 1993
Citation Context ... of pooling depending on the amount of available data, is typical of Bayesian inference in hierarchical models or meta-analysis (e.g., Efron and Morris, 1975, Rubin, 1980, DuMouchel and Harris, 1983, Gatsonis et al., 1992, and Belin et al., 1993). The hierarchical regression structure also allows us to include covariates both at the individual and survey levels. For an approach to hierarchical regression using econome... |
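
The monotone missing-data pattern mentioned in the Tanner and Wong (1987) citation context can be made concrete with a small check. The sketch below is only an illustration of that definition, assuming a hypothetical boolean matrix `observed` whose entry `observed[i][j]` is true when variable `j` was recorded for respondent `i`; the function names are not from the paper.

```python
def is_monotone(observed):
    """True if, in the given row order, a cell is observed whenever the cell
    directly below it (next row, same column) is observed."""
    for upper, lower in zip(observed, observed[1:]):
        for up, low in zip(upper, lower):
            if low and not up:
                return False
    return True


def can_be_sorted_monotone(observed):
    """True if some row ordering makes the pattern monotone.

    Sorting rows by number of observed cells (most complete first) is enough:
    a monotone ordering exists only when the rows' observed sets are nested.
    """
    ordered = sorted(observed, key=sum, reverse=True)
    return is_monotone(ordered)


# Hypothetical pattern: three respondents, three questions.
pattern = [
    [True, False, False],
    [True, True, True],
    [True, True, False],
]
print(is_monotone(pattern), can_be_sorted_monotone(pattern))
```

Rows that break the nesting (for example, one respondent answering only the first question and another answering only the second) are the cells that a monotone data augmentation scheme would need to fill in to restore the pattern.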