## Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation (2000)

Venue: | American Political Science Review |

Citations: | 147 - 40 self |

### BibTeX

@ARTICLE{King00analyzingincomplete,
    author = {Gary King and James Honaker and Anne Joseph and Kenneth Scheve},
    title = {Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation},
    journal = {American Political Science Review},
    year = {2000},
    volume = {95},
    pages = {49--69}
}

### Abstract

We propose a remedy for the discrepancy between the way political scientists analyze data with missing values and the recommendations of the statistics community. Methodologists and statisticians agree that "multiple imputation" is a superior approach to the problem of missing data scattered through one's explanatory and dependent variables than the methods currently used in applied data analysis. The reason for this discrepancy lies with the fact that the computational algorithms used to apply the best multiple imputation models have been slow, difficult to implement, impossible to run with existing commercial statistical packages, and demanding of considerable expertise. In this paper, we adapt an existing algorithm, and use it to implement a general-purpose, multiple imputation model for missing data. This algorithm is considerably faster and easier to use than the leading method recommended in the statistics literature. We also quantify the risks of current missing data practices, ...
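As a rough illustration of what an analyst does once multiple imputed data sets exist, the sketch below applies the standard combining rules from Rubin (1987) to a single quantity of interest. It is not the paper's imputation algorithm; the function name `combine_estimates` and the numbers in the usage lines are hypothetical, chosen only for illustration.

```python
import math
from statistics import mean, variance


def combine_estimates(estimates, variances):
    """Combine m completed-data results with Rubin's (1987) rules.

    estimates: point estimates of one quantity, one per imputed data set.
    variances: the squared standard errors that go with those estimates.
    Returns (combined point estimate, combined standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)          # combined point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total)


# Hypothetical coefficient estimates and squared standard errors
# from m = 5 imputed data sets (made-up numbers).
q, se = combine_estimates(
    [0.42, 0.47, 0.40, 0.45, 0.44],
    [0.010, 0.011, 0.009, 0.010, 0.012],
)
print(f"combined estimate = {q:.3f}, standard error = {se:.3f}")
```

The between-imputation term is what separates this from simply averaging the results: it carries the extra uncertainty due to the missing values into the final standard error.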

### Citations

4856 |
Neural Networks for Pattern Recognition
- Bishop
- 1995
Citation Context ...near imputation model, then it may be worth developing an application-specific approach. Neural network models provide one such example that cannot be handled easily within the EMis imputation stage (Bishop, 1995). Finally, extreme distributional divergences from multivariate normal can also be a good reason to consider an alternative approach. Ordinal and dichotomous variables will often do well under EMis, ... |

1250 | Bayesian Data Analysis - Gelman, Carlin, et al. - 1995 |

948 | The EM Algorithm and Extensions - McLachlan, Krishnan - 1997 |

746 |
Sampling-based approaches to calculating marginal densities
- Gelfand, Smith
- 1990
Citation Context ... importance resampling (or "sampling importance/resampling"), an iterative simulation technique not based on Markov chains, to try to improve the small sample performance (Rubin, 1987a: 192-4, 1987b; Gelfand and Smith, 1990; Tanner, 1996; Gelman et al., 1996; Wei and Tanner, 1990). EMis follows the same steps as EMs except that draws of θ from its asymptotic distribution are treated only as first approximations to the t... |

660 |
Multiple Imputation for Nonresponse in Surveys
- Rubin
- 1987
Citation Context ... improve EMs with a round of importance resampling (or "sampling importance/resampling"), an iterative simulation technique not based on Markov chains, to try to improve the small sample performance (Rubin, 1987a: 192-4, 1987b; Gelfand and Smith, 1990; Tanner, 1996; Gelman et al., 1996; Wei and Tanner, 1990). EMis follows the same steps as EMs except that draws of θ from its asymptotic distribution are treat... |

556 | The calculation of posterior distributions by data augmentation - Tanner, Wong - 1987 |

514 |
Advanced Econometrics
- Amemiya
- 1985
Citation Context ... Approaches Application-specific approaches usually assume MAR or NI. The most common examples are models for selection bias, such as truncation or censoring (Achen, 1986; Brehm, 1993; Heckman, 1976; Amemiya, 1985: ch. 10; King, 1989: ch. 7; Winship and Mare, 1992). Such models have the advantage of including all information in the estimation. Unfortunately, almost all application-specific models allow missing... |

378 |
Analysis of incomplete multivariate data
- Schafer
- 1997
Citation Context ...t the additional outside information in an application-specific NI model (see Appendix B.1) would not add much, and may be outweighed by the costs of non-robustness and difficulty of use (Rubin 1996, Schafer 1997). Although this is surely not true in every application, the advantages make this approach an attractive option for a wide range of potential uses. The MAR assumption can also be made more realistic... |

360 |
The common structure of statistical models of truncation, sample selection and limited dependent variables and a simple estimator for such models
- Heckman
- 1976
Citation Context ...cation-Specific Approaches Application-specific approaches usually assume MAR or NI. The most common examples are models for selection bias, such as truncation or censoring (Achen, 1986; Brehm, 1993; Heckman, 1976; Amemiya, 1985: ch. 10; King, 1989: ch. 7; Winship and Mare, 1992). Such models have the advantage of including all information in the estimation. Unfortunately, almost all application-specific model... |

267 |
Inference and missing data
- Rubin
- 1976
Citation Context ... (for historical reasons). We try to clarify with Table 1 where each missingness process is characterized according to our ability to predict the values of M (i.e., which values of D will be missing) (Rubin 1976). For example, missing values in processes that are missing completely at random (MCAR) cannot be predicted any better with information in D, observed or not. More formally, M is independent of D: P ... |

232 | Markov chain Monte Carlo convergence diagnostics
- Cowles, Carlin
- 1996
Citation Context ...s of D, multiple imputation approaches to missing data problems. 4 Although software exists to check convergences, there is significant debate on how adequate these methods are (see Kass et al. 1998; Cowles and Carlin 1996). 5 Public domain software accompanying Schafer's (1997) superb book implements monotone data augmentation by the IP algorithm (see Section 4.3.1), the best currently available approach (Rubin and Sc... |

190 |
Covariance structure of the Gibbs sampler with applications to the comparison of estimators and augmentation schemes. Biometrika
- Liu, J, et al.
- 1994
Citation Context ...omain software accompanying Schafer's (1997) superb book implements monotone data augmentation by the IP algorithm (see Section 4.3.1), the best currently available approach (Rubin and Schafer, 1990; Liu, Wong, and Kong, 1994). The commercial programs Solas and SPlus have promised implementations. SPSS has released a missing data module, but the program only produces sufficient statistics so data analysis methods that req... |

169 | Making the most of statistical analyses: Improving interpretation and presentation - King, Tomz, et al. - 2000 |

153 |
A Monte Carlo Implementation of the EM Algorithm and the Poor Man's Data Augmentation Algorithms
- Wei, Tanner
- 1990
Citation Context ..., an iterative simulation technique not based on Markov chains, to try to improve the small sample performance (Rubin, 1987a: 192-4, 1987b; Gelfand and Smith, 1990; Tanner, 1996; Gelman et al., 1996; Wei and Tanner, 1990). EMis follows the same steps as EMs except that draws of θ from its asymptotic distribution are treated only as first approximations to the true (finite sample) posterior. We also put the parameters... |

151 |
Multiple imputation after 18+ years
- Rubin
- 1996
Citation Context ...ation so that the additional outside information in an application-specific NI model (see Appendix B.1) would not add much, and may be outweighed by the costs of non-robustness and difficulty of use (Rubin 1996, Schafer 1997). Although this is surely not true in every application, the advantages make this approach an attractive option for a wide range of potential uses. The MAR assumption can also be made ... |

135 |
Tools for Statistical inference: methods for exploration of posterior distributions and likelihood functions, 3rd edition
- Tanner
- 1996
Citation Context ...stribution of D mis. The problem is that the posterior of μ and Σ is hard to draw from. We approach this problem in two different ways. In this section, we use the asymptotic approximation (e.g., Tanner, 1996: 54-59), which we find works as expected: well in large data sets and poorly in small ones. To create imputations with this method, which we denote EMs, we first run EM to find the maximum poster... |

95 |
A Course in Econometrics
- Goldberger
- 1991
Citation Context ... 1 (since estimating additional parameters puts more demands on the data). Thus, the mean square error (a combination of bias and variance) may in some cases increase by including a control variable (Goldberger, 1991: 256). Fortunately, since we typically have a large number of observations, adding an extra variable does not do much harm so long as it does not introduce substantial collinearity and we often inclu... |

68 |
Markov chain Monte Carlo in practice: a round table discussion
- Kass, Carlin, et al.
- 1998
Citation Context ...nd missing portions of D, multiple imputation approaches to missing data problems. 4 Although software exists to check convergences, there is significant debate on how adequate these methods are (see Kass et al. 1998; Cowles and Carlin 1996). 5 Public domain software accompanying Schafer's (1997) superb book implements monotone data augmentation by the IP algorithm (see Section 4.3.1), the best currently availabl... |

68 |
Regression With Missing X's: A Review
- Little
- 1992
Citation Context ...t the process by which data become missing and then discuss, briefly in the conclusion to this section and more extensively in subsequent sections, how the various methods crucially depend upon them (Little 1992). First let D denote the Data matrix, which includes the dependent Y and explanatory X variables: D = {Y, X}. If D were fully observed, we could use a standard statistical method to analyze it and ig... |

58 |
Unifying Political Methodology: The Likelihood Theory of Statistical Inference. Ann Arbor
- King
- 1989
Citation Context ...-specific approaches usually assume MAR or NI. The most common examples are models for selection bias, such as truncation or censoring (Achen, 1986; Brehm, 1993; Heckman, 1976; Amemiya, 1985: ch. 10; King, 1989: ch. 7; Winship and Mare, 1992). Such models have the advantage of including all information in the estimation. Unfortunately, almost all application-specific models allow missingness only in or rela... |

53 |
Posterior Predictive p-Values
- Meng
- 1994
Citation Context ... Common Misconceptions Multiple imputation inferences have been shown to be statistically valid from both Bayesian and frequentist perspectives (Rubin 1987a; Schenker and Welsh 1988; Brownstone 1991; Meng 1994a; Rubin 1996; Schafer 1997). Since there is some controversy over the strength 13 For difficult cases, our software allows the user to substitute the heavier tailed t for the approximating density. T... |

52 |
Uninformed Votes: Information Effects in Presidential Elections
- Bartels
- 1996
Citation Context ...but are unobserved, although imputing values that the respondent really does not know can be of interest in specific applications, such as predicting how people would vote if they were more informed (Bartels, 1996). Finally, let D obs and D mis denote observed and missing portions of D, multiple imputation approaches to missing data problems. 4 Although software exists to check convergences, there is significa... |

47 | Multiple imputation for multivariate missing-data problems: A data analyst’s perspective
- Schafer, Olsen
- 1998
Citation Context ...ciple, but they have not made it into the toolbox of more than a few applied statisticians or social scientists. In fact, outside of the experts, "the method has remained largely unknown and unused" (Schafer and Olsen, 1998). The problem is only in part a lack of information and training. A bigger issue is that although this method is easy to use in theory, it requires in practice computational algorithms that can take ... |

41 |
The Statistical Analysis of Quasi-Experiments
- Achen
- 1986
Citation Context ...iple imputation. B.1 Application-Specific Approaches Application-specific approaches usually assume MAR or NI. The most common examples are models for selection bias, such as truncation or censoring (Achen, 1986; Brehm, 1993; Heckman, 1976; Amemiya, 1985: ch. 10; King, 1989: ch. 7; Winship and Mare, 1992). Such models have the advantage of including all information in the estimation. Unfortunately, almost al... |

37 |
Missing data
- Little, Roderick, et al.
- 1995
Citation Context ...licity, we assume that if a dataset meets the MAR assumption, it also meets the distinctness condition and is therefore ignorable. 3 are also unbiased and efficient under MAR (Little and Rubin, 1989; Little and Schenker, 1995). Both listwise deletion and basic multiple imputation approaches can be biased under NI, in which case additional steps or different models (discussed in Section 5 and Appendix B.1) must be taken to... |

37 |
Multiple-imputation inferences with uncongenial sources of input (with discussion)
- Meng
- 1994
Citation Context ... Common Misconceptions Multiple imputation inferences have been shown to be statistically valid from both Bayesian and frequentist perspectives (Rubin 1987a; Schenker and Welsh 1988; Brownstone 1991; Meng 1994a; Rubin 1996; Schafer 1997). Since there is some controversy over the strength 13 For difficult cases, our software allows the user to substitute the heavier tailed t for the approximating density. T... |

37 |
A noniterative sampling/importance resampling alternative to the data augmentation algorithm for creating a few imputations when the fraction of missing information is modest: the SIR algorithm. Discussion of
- Rubin
- 1987
Citation Context ... improve EMs with a round of importance resampling (or "sampling importance/resampling"), an iterative simulation technique not based on Markov chains, to try to improve the small sample performance (Rubin, 1987a: 192-4, 1987b; Gelfand and Smith, 1990; Tanner, 1996; Gelman et al., 1996; Wei and Tanner, 1990). EMis follows the same steps as EMs except that draws of θ from its asymptotic distribution are treat... |

31 |
Ideology and the theory of political choice. Ann Arbor
- Hinich, Munger
- 1994
Citation Context ...re-replication" (i.e., prior to publication) of Timothy Colton's (in press) test of his extensive model of vote choice in Russia's 1995 parliamentary and 1996 22 Consistent with the literature (e.g., Hinich and Munger 1994), we assume that ideology measures an individual's underlying policy preferences. Assuming that an individual has at least some policy views, they have an ideology, whether or not they are willing an... |

31 |
The analysis of social science data with missing values
- Little, Rubin
- 1990
Citation Context ...ext, for expository simplicity, we assume that if a dataset meets the MAR assumption, it also meets the distinctness condition and is therefore ignorable. 3 are also unbiased and efficient under MAR (Little and Rubin, 1989; Little and Schenker, 1995). Both listwise deletion and basic multiple imputation approaches can be biased under NI, in which case additional steps or different models (discussed in Section 5 and App... |

29 | A Statistical Model for Multiparty Electoral Data - Katz, King - 1999 |

29 | A unified model of cabinet dissolution in parliamentary democracies - King, Alt, et al. - 1990 |

28 | Performing Likelihood Ratio Tests With Multiply-Imputed Data Sets - Meng, Rubin - 1992 |

23 | On Variance Estimation With Imputed Survey Data - Rao - 1996 |

22 |
Formalizing Subjective Notions About the Effect of Nonrespondents in Sample Surveys
- Rubin
- 1977
Citation Context ...ing answers in combination with listwise deletion, we favor another procedure based on the concept of "multiple imputation" that is nearly as easy to use but avoids the problems of current practices (Rubin, 1977). 3 Multiple 1 These numbers come from our content analysis of five years (1993-97) of the American Political Science Review, the American Journal of Political Science, and the British Journal of Po... |

19 | Multiple imputation of industry and occupation codes in census public-use samples using Bayesian logistic regression - Clogg, Rubin, et al. - 1991 |

17 |
The Phantom Respondents: Opinion Surveys and Political Representation. Ann Arbor
- Brehm
- 1993
Citation Context ...on. B.1 Application-Specific Approaches Application-specific approaches usually assume MAR or NI. The most common examples are models for selection bias, such as truncation or censoring (Achen, 1986; Brehm, 1993; Heckman, 1976; Amemiya, 1985: ch. 10; King, 1989: ch. 7; Winship and Mare, 1992). Such models have the advantage of including all information in the estimation. Unfortunately, almost all application... |

16 |
A split questionnaire survey design
- Raghunathan, Grizzle
- 1995
Citation Context ...ombination of item and unit nonresponse. Some examples include entire variables missing from one of a series of cross-sectional surveys (Franklin, 1989; Gelman, King, and Liu, 1998), matrix sampling (Raghunathan and Grizzle, 1995), panel attrition, etc. 3 The most useful modern work on the subject related to our approach is Schafer (1997), which we rely on frequently. Schafer's book provides a detailed guide to the analysis o... |

13 |
Models for Sample Selection Bias
- Winship, Mare
- 1992
Citation Context ...s usually assume MAR or NI. The most common examples are models for selection bias, such as truncation or censoring (Achen, 1986; Brehm, 1993; Heckman, 1976; Amemiya, 1985: ch. 10; King, 1989: ch. 7; Winship and Mare, 1992). Such models have the advantage of including all information in the estimation. Unfortunately, almost all application-specific models allow missingness only in or related to Y rather than scattered ... |

10 | Are Americans Ambivalent Toward Racial Policies - Alvarez, Brehm - 1997 |

10 |
Efficiently creating multiple imputations for incomplete multivariate normal data
- Rubin, Schafer
- 1990
Citation Context ... Carlin 1996). 5 Public domain software accompanying Schafer's (1997) superb book implements monotone data augmentation by the IP algorithm (see Section 4.3.1), the best currently available approach (Rubin and Schafer, 1990; Liu, Wong, and Kong, 1994). The commercial programs Solas and SPlus have promised implementations. SPSS has released a missing data module, but the program only produces sufficient statistics so dat... |

9 | Partisan cues and the media: Information flows in the 1992 presidential election - Dalton, Beck, et al. - 1998 |

9 | Structure, behavior, and voter turnout in the United States - Timpone - 1998 |

7 |
When are inferences from multiple imputation valid
- Fay
- 1992
Citation Context ...variables (and information) in the analysis model, no bias is introduced and nominal confidence interval coverage will be at least as great as actual coverage, and equal when the two models coincide (Fay, 1992). When the information content is greater in the imputation than analysis model, multiple imputation is more efficient than even the "optimal" application-specific method. 14 Thus, even with a very s... |

7 | Amelia: A Program for Missing Data - Honaker, Joseph, et al. - 2001 |

7 |
Theory testing in a world of constrained research design: the significance of Heckman’s censored sampling bias correction for nonexperimental research
- Stolzenberg, Relles
- 1990
Citation Context ... assumptions apply, application-specific approaches are consistent and maximally efficient. However, in some cases inferences from these models tend to be sensitive to small changes in specification (Stolzenberg and Relles, 1990). Moreover, different models must be used for each type of application. As a result, with new types of data, application-specific methods are most likely to be used by those willing to devote more ti... |

6 | A simulation study to evaluate the performance of model-based multiple imputations in NCHS health examination surveys - Little, Ezzati-Rice, et al. - 1995 |

4 |
What We Know About 'Don't Knows': An Analysis of Seven Point Issue Placements." Paper presented at the poster session
- Globetti
- 1997
Citation Context ...he optimistic MCAR assumption, the degree of error will often be more than a standard error, and its direction will vary as a function of the application, pattern of missingness, and model estimated (Globetti, 1997; Sherman, 1998). Fortunately, better methods make this forced choice between suboptimal procedures unnecessary. 4 A Method for Analyzing Incomplete Data We now describe a general definition of multip... |

4 | Voting, Abstention, and Individual Expectations in the 1992 Presidential Election. Working Paper, Northwestern University - Herron - 1998 |

4 | Multiple Imputation of Missing Data - Schafer - 1993 |

3 | Panel Attrition and Panel Conditioning in American National Election Studies. Paper prepared for the 1998 meetings of the Society for Political Methodology - Bartels - 1998 |