## Listwise deletion is evil: What to do about missing data in political science (1998)

Venue: Paper Presented at the Annual Meeting of the American Political Science Association

Citations: 16 (2 self)

### BibTeX

@INPROCEEDINGS{King98listwisedeletion,
author = {Gary King and James Honaker and Anne Joseph and Kenneth Scheve and Mike Alvarez and John Barnard and Neal Beck and Larry Bartels and Ted Brader and Charles Franklin and Rob Van Houweling and Jas Sekhon and Brian Silver},
title = {Listwise deletion is evil: What to do about missing data in political science},
booktitle = {Paper Presented at the Annual Meeting of the American Political Science Association},
year = {1998}
}

### Abstract

We propose a remedy to the substantial discrepancy between the way political scientists analyze data with missing values and the recommendations of the statistics community. With a few notable exceptions, statisticians and methodologists have agreed on a widely applicable approach to many missing data problems based on the concept of "multiple imputation," but most researchers in our field and other social sciences still use far inferior methods. Indeed, we demonstrate that the threats to validity from current missing data practices rival the biases from the much better known omitted variable problem. As it turns out, this discrepancy is not entirely our fault, as the computational algorithms used to apply the best multiple imputation models have been slow, difficult to implement, impossible to run with existing commercial statistical packages, and demanding of considerable expertise on the part of the user (even experts disagree on how to use them). In this paper, we adapt an existing algorithm, and use it to implement a general-purpose, multiple imputation model for missing data. This algorithm is between 65 and

### Citations

1276 | Statistical analysis with missing data - Little, Rubin - 1987 |

768 | Multiple Imputation for Nonresponse in Surveys - Rubin - 1987 |
Citation Context: ...us imputation and 0 otherwise. 5.3 Clarifications and Common Misconceptions Multiple imputation inferences have been shown to be statistically valid from both a Bayesian and a frequentist perspective (Rubin 1987; Schenker and Welsh 1988; Brownstone 1991; Meng 1994; Rubin 1996; Schafer 1997). Since there is some controversy over the strength and applicability of the assumptions involved from a frequentist per...

578 | The calculation of posterior distributions by data augmentation - Tanner, Wong - 1987 |

571 | Advanced Econometrics - Amemiya - 1985 |
Citation Context: ...en come from economics or biostatistics and usually assume MAR or NI. The most common examples are models for selection bias, such as truncation or censoring (Achen, 1986; Brehm, 1993; Heckman, 1976; Amemiya, 1985: chapter 10; King, 1989: chapter 7; Winship and Mare, 1992). This approach explicitly models missingness M simultaneously with the outcome D. Such models have the advantage of including the maximum i...

426 | The common structure of statistical models of truncation, sample selection, and limited dependent variable models and a simple estimator for such models - Heckman - 1976 |
Citation Context: ... approaches often come from economics or biostatistics and usually assume MAR or NI. The most common examples are models for selection bias, such as truncation or censoring (Achen, 1986; Brehm, 1993; Heckman, 1976; Amemiya, 1985: chapter 10; King, 1989: chapter 7; Winship and Mare, 1992). This approach explicitly models missingness M simultaneously with the outcome D. Such models have the advantage of includin...

248 | Markov chain Monte Carlo convergence diagnostics: a comparative review - Cowles, Carlin - 1996 |
Citation Context: ...can be assumed to come from the posterior distribution. Unfortunately, there is considerable disagreement within the statistics literature on how to assess convergence of this and other MCMC methods (Cowles and Carlin, 1996; Kass et al., 1997). For multiple imputation problems, we have the additional requirement that the draws we use for imputations must be statistically independent, which is not a characteristic of suc...

223 | Making the Most of Statistical Analyses: Improving Interpretation and Presentation - King, Tomz, et al. - 2000 |
Citation Context: ...). If, instead of point estimates and standard errors, simulations of q are desired, we create 1/m-th the needed number of simulations from each completed data set (following the usual procedures; see King, Tomz, and Wittenberg, 1998) and combine them into one set of simulations. Most of the statistical procedures used to create multiple imputations assume that the data are MAR, conditional on the imputation model. Proponents cla...

201 | Covariance Structure of the Gibbs Sampler with Applications to the Comparisons of Estimators and Augmentation Schemes - Liu, Wong, et al. - 1994 |
Citation Context: ...d with rows as observations: D = {Y, X}. If D were entirely observed, 3 Public domain software accompanying Schafer's (1997) superb book implements monotone data augmentation (Rubin and Schafer, 1990; Liu, Wong, and Kong, 1994), the best available approach presently. The commercial programs Solas and SPlus have also promised implementations. SPSS recently released a missing data module that allows several types of imputati...

172 | Multiple Imputation after 18 years - Rubin - 1996 |
Citation Context: ...nceptions Multiple imputation inferences have been shown to be statistically valid from both a Bayesian and a frequentist perspective (Rubin 1987; Schenker and Welsh 1988; Brownstone 1991; Meng 1994; Rubin 1996; Schafer 1997). Since there is some controversy over the strength and applicability of the assumptions involved from a frequentist perspective, we focus on the far simpler Bayesian version. This vers...

168 | A Monte Carlo implementation of the EM algorithm and the poor man's data augmentation algorithms - Wei, Tanner - 1990 |
Citation Context: ...ce sampling (or "sampling importance/resampling"), an iterative simulation technique not based on Markov chains, to get the best of both worlds (Rubin, 1987: 192–4; Tanner, 1996; Gelman et al., 1996; Wei and Tanner, 1990). EMis (EM with importance sampling) follows the same steps as EMs except that draws of from its asymptotic distribution are treated only as approximations to the true (finite sample) posterior distri...

144 | Tools for statistical inference: methods for the exploration of posterior distributions and likelihood functions - Tanner - 1996 |
Citation Context: ...bution of Dmis. The problem is that the posterior distribution of and is not easy to draw from. We solve this problem in two different ways. In this section, we use the asymptotic approximation (e.g., Tanner, 1996: 54–59), which we find works as expected: well in large data sets due to the central limit theorem and poorly in small ones. To create multiple imputations with this method, which we denote EMs (EM wi...

83 | A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data - King - 1997 |

75 | Markov Chain Monte Carlo in Practice: A Roundtable Discussion - Kass, Carlin, et al. - 1998 |

67 | Unifying Political Methodology: The Likelihood Theory of Statistical Inference - King - 1998 |
Citation Context: ...iostatistics and usually assume MAR or NI. The most common examples are models for selection bias, such as truncation or censoring (Achen, 1986; Brehm, 1993; Heckman, 1976; Amemiya, 1985: chapter 10; King, 1989: chapter 7; Winship and Mare, 1992). This approach explicitly models missingness M simultaneously with the outcome D. Such models have the advantage of including the maximum information in the estima...

57 | The Statistical Analysis of Quasi-Experiments - Achen - 1987 |
Citation Context: ...uncation models; see Section 4). When possible, it is best to adapt one's statistical model specially to deal with missing data, as suggested by the two superb political science books on the subject (Achen, 1986; Brehm, 1993). Unfortunately, doing so in some situations puts heavy burdens on the investigator since optimal models for missing data are highly specialized and so often require unfamiliar methods t...

57 | Multiple Imputation for Multivariate Missing-Data Problems: A Data Analyst's Perspective. Multivariate Behavioral Research 33(4):545–571 - Schafer, Olsen - 1998 |
Citation Context: ...et researchers have frequently found it to work as well as much more complicated alternatives specially designed for categorical or mixed data (Ezzati-Rice et al., 1995; Graham and Schafer, in press; Schafer and Olsen, 1998; Schafer, 1997; Rubin and Schenker, 1986; Schafer et al., 1996). For our purposes, if there exists information in observed data that can be used to predict the missing data, multiple imputations from...

44 | Multiple-imputation inferences with uncongenial sources of input (disc - Meng - 1994 |
Citation Context: ...ommon Misconceptions Multiple imputation inferences have been shown to be statistically valid from both a Bayesian and a frequentist perspective (Rubin 1987; Schenker and Welsh 1988; Brownstone 1991; Meng 1994; Rubin 1996; Schafer 1997). Since there is some controversy over the strength and applicability of the assumptions involved from a frequentist perspective, we focus on the far simpler Bayesian versio...

33 | A statistical model for multiparty electoral data - Katz, King - 1999 |

26 | On Variance Estimation with Imputed Survey Data - Rao - 1996 |

24 | The Phantom Respondents: Opinion Surveys and Political Representation. Ann Arbor - Brehm - 1993 |
Citation Context: ...ls; see Section 4). When possible, it is best to adapt one's statistical model specially to deal with missing data, as suggested by the two superb political science books on the subject (Achen, 1986; Brehm, 1993). Unfortunately, doing so in some situations puts heavy burdens on the investigator since optimal models for missing data are highly specialized and so often require unfamiliar methods that differ wit...

17 | A Split Questionnaire Survey Design - Raghunathan, Grizzle - 1995 |
Citation Context: ...ombination of item and unit nonresponse. Some examples include entire variables missing from one of a series of cross-sectional surveys (Franklin, 1989; Gelman, King, and Liu, 1998), matrix sampling (Raghunathan and Grizzle, 1995), panel attrition, etc. 1now the choice of most statisticians at least in principle, but they have not made it into the toolbox of more than a few statisticians or social scientists. The problem is ...

13 | Models for Sample Selection Bias - Winship, Mare - 1992 |
Citation Context: ...y assume MAR or NI. The most common examples are models for selection bias, such as truncation or censoring (Achen, 1986; Brehm, 1993; Heckman, 1976; Amemiya, 1985: chapter 10; King, 1989: chapter 7; Winship and Mare, 1992). This approach explicitly models missingness M simultaneously with the outcome D. Such models have the advantage of including the maximum information in the estimation process. Indeed, NI problems a...

12 | Structure, behavior, and voter turnout in the United States - Timpone - 1998 |

11 | Are Americans Ambivalent Towards Racial Policies - Alvarez, Brehm - 1997 |

11 | Partisan cues and the media: Information flows in the 1992 presidential election - Dalton, Beck, et al. - 1998 |

10 | Theory testing in a world of constrained research design - Stolzenberg, Relles - 1990 |
Citation Context: ...ptions apply, application-specific approaches are maximally efficient. However, inferences about the quantities of interest from these models tend to be fairly sensitive to small changes in specification (Stolzenberg and Relles, 1990). Moreover, no single application-specific model works well across applications; instead, a different model must be used for each type of application. As a result, when applied to new types of data set...

7 | When are inferences from multiple imputation valid - Fay - 1992 |
Citation Context: ...ables (and information) used in the analysis model, no bias is introduced and nominal confidence interval coverage will be at least as great as actual coverage, and equal when the two stages coincide (Fay, 1992; Rubin, 1996). When the imputation model includes more information than the analysis model, multiple imputation is more efficient than even the corresponding "optimal" application-specific method. Thus,...

6 | A simulation study to evaluate the performance of model-based multiple imputations in NCHS health examination surveys - Little, Ezzati-Rice, et al. - 1995 |

5 | Multiple imputation of missing data - Schafer, Khare, et al. - 1993 |

4 | What We Know About 'Don't Knows': An Analysis of Seven Point Issue Placements. Paper presented at the poster session - Globetti - 1997 |
Citation Context: ...terest. Of course, because this result relies on the optimistic MCAR assumption, the degree of error will be more than a standard error in most real analyses, and it will not be in random directions (Globetti, 1997; Sherman, 1998). The actual case, rather than this "best" case, would seem to be a surprisingly serious problem. 4 This is one of the infeasible estimator's standard errors, which is equivalent to 71...

4 | Voting, Abstention, and Individual Expectations in the 1992 Presidential Election. Working Paper, Northwestern University - Herron - 1998 |

3 | Panel Attrition and Panel Conditioning in American National Election Studies. Paper prepared for the 1998 meetings of the Society for Political Methodology - Bartels - 1998 |

3 | Multiple Imputation for Interval Estimation From Single Random Samples With Ignorable Nonresponse - Rubin, Schenker - 1986 |
Citation Context: ...to work as well as much more complicated alternatives specially designed for categorical or mixed data (Ezzati-Rice et al., 1995; Graham and Schafer, in press; Schafer and Olsen, 1998; Schafer, 1997; Rubin and Schenker, 1986; Schafer et al., 1996). For our purposes, if there exists information in observed data that can be used to predict the missing data, multiple imputations from this normal model will almost always dom...

3 | Interstate Competition and State Strategies to Deregulate Interstate Banking 1982-1988 - Skalaban - 1992 |

2 | Estimation across data sets: two-stage auxiliary instrumental variables estimation (2SAIV). Political Analysis 1 - unknown authors - 1989 |

2 | A unified model of cabinet dissolution in parliamentary democracies - King, Alt, et al. - 1990 |

1 | Missing Data: A Review of the Literature. Pp. 415–494 - Anderson, Basilevsky, et al. - 1983 |
Citation Context: ...ight each be MAR or nonignorable, but they are not MCAR. Listwise deletion can result in drastically changed magnitudes or incorrect signs of the estimates of causal effects or descriptive inferences (Anderson et al., 1983). Listwise deletion will not always have such harmful effects; sometimes the fraction of missing observations will be small, and sometimes the assumptions will hold sufficiently well so that the bias is...

1 | Uninformed Votes: Information Effects in Presidential Elections - Bartels - 1996 |
Citation Context: ...t but are unobserved, although imputing values that the respondent really does not know can be of interest in specific applications, such as finding out how people would vote if they were more informed (Bartels, 1996). Finally, let Dobs and Dmis denote elements of D that are observed and missing, respectively, so D = {Dobs, Dmis}. Unfortunately, standard terminology describing possible missingness assumptions is un...

1 | Transitional Citizenship: Voting in Post-Soviet Russia - Colton - 1998 |

1 | Formalizing Subjective Notions about the Effect of Nonrespondents in Sample Surveys - Rubin - 1977 |
Citation Context: ...p answers in combination with listwise deletion to another method based on the concept of "multiple imputation" that is nearly as easy to use but avoids the statistical problems of current practices (Rubin, 1977). Multiple imputation methods have been around for about two decades, and are 1 The numbers in this paragraph come from our content analysis of the last five years (1993–97) of the American Political S...

1 | Efficiently Creating Multiple Imputations for Incomplete Multivariate Normal Data - Rubin, Schafer - 1990 |
Citation Context: ...planatory variables X, and with rows as observations: D = {Y, X}. If D were entirely observed, 3 Public domain software accompanying Schafer's (1997) superb book implements monotone data augmentation (Rubin and Schafer, 1990; Liu, Wong, and Kong, 1994), the best available approach presently. The commercial programs Solas and SPlus have also promised implementations. SPSS recently released a missing data module that allow...

1 | Analysis and Simulation of Incomplete Multivariate Data: Algorithms and Examples - Schafer - 1997 |
Citation Context: ...ently found it to work as well as much more complicated alternatives specially designed for categorical or mixed data (Ezzati-Rice et al., 1995; Graham and Schafer, in press; Schafer and Olsen, 1998; Schafer, 1997; Rubin and Schenker, 1986; Schafer et al., 1996). For our purposes, if there exists information in observed data that can be used to predict the missing data, multiple imputations from this normal mo...

1 | A Test of the Validity of Complete-Unit Analysis in Surveys Subject to Item Nonresponse or Attrition. Manuscript, Caltech - Sherman - 1998 |
Citation Context: ...e, because this result relies on the optimistic MCAR assumption, the degree of error will be more than a standard error in most real analyses, and it will not be in random directions (Globetti, 1997; Sherman, 1998). The actual case, rather than this "best" case, would seem to be a surprisingly serious problem. 4 This is one of the infeasible estimator's standard errors, which is equivalent to 71% of the listwi...
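
One of the citation contexts above (for King, Tomz, et al.) describes pooling simulations across imputed data sets: draw 1/m of the needed simulations of q from each of the m completed data sets, then combine them into one set. The snippet only says the draws follow "the usual procedures", so the Python sketch below assumes each completed data set supplies a point estimate and standard error and that draws come from a normal approximation. The function name `pool_simulations` and its parameters are hypothetical, chosen for illustration only.

```python
import random


def pool_simulations(completed_estimates, total_sims, seed=0):
    """Pool simulations of a quantity q across m completed data sets.

    completed_estimates: list of (point_estimate, standard_error) pairs,
        one pair per completed (imputed) data set.
    total_sims: total number of simulations wanted.

    ASSUMPTION: each data set's simulations are drawn from a normal
    distribution centered on its estimate; the source does not specify this.
    """
    m = len(completed_estimates)
    if m == 0:
        raise ValueError("need at least one completed data set")
    rng = random.Random(seed)
    per_set = total_sims // m  # 1/m of the simulations from each data set
    pooled = []
    for q_hat, se in completed_estimates:
        pooled.extend(rng.gauss(q_hat, se) for _ in range(per_set))
    return pooled


# Five completed data sets with made-up estimates and standard errors.
estimates = [(0.9, 0.1), (1.0, 0.1), (1.1, 0.1), (1.0, 0.1), (1.0, 0.1)]
sims = pool_simulations(estimates, total_sims=1000)
print(len(sims), sum(sims) / len(sims))
```

The pooled draws reflect both the within-data-set uncertainty (each standard error) and the variation between imputations (the spread of the point estimates). If `total_sims` is not a multiple of m, this sketch returns slightly fewer draws than requested.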