## Correcting for Survey Misreports Using Auxiliary Information with an Application to Estimating Turnout (2009)

### BibTeX

    @MISC{Katz09correctingfor,
        author = {Jonathan N. Katz and Gabriel Katz},
        title = {Correcting for Survey Misreports Using Auxiliary Information with an Application to Estimating Turnout},
        year = {2009}
    }

### Abstract

Misreporting is a problem that plagues researchers who use survey data. In this paper, we develop a parametric model that corrects for misclassified binary responses using information on the misreporting patterns obtained from auxiliary data sources. The model is implemented within the Bayesian framework via Markov Chain Monte Carlo (MCMC) methods, and can be easily extended to address other problems exhibited by survey data, such as missing response and/or covariate values. While the model is fully general, we illustrate its application in the context of estimating models of turnout using data from the American National Election Studies.
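
To make the idea of a misclassification-corrected binary response model concrete, here is a minimal Python sketch of one common way such a likelihood can be written. It is an illustration under stated assumptions, not the paper's implementation: the probit links, the function names, and the mapping of `gamma1` to false positives and `gamma2` to false negatives are all assumptions introduced here. The observed "yes" probability mixes the true response probability with the two misreport probabilities.

```python
import math


def std_normal_cdf(u):
    """Standard normal CDF, used here as an assumed probit link."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def observed_prob(x, z, beta, gamma1, gamma2):
    """Probability that a respondent REPORTS 1 (e.g. 'I voted').

    Assumed (hypothetical) parameterization:
      p_true      = Pr(true response = 1 | x)
      p_false_pos = Pr(report 1 | true 0, z), governed by gamma1
      p_false_neg = Pr(report 0 | true 1, z), governed by gamma2
    """
    p_true = std_normal_cdf(dot(x, beta))
    p_false_pos = std_normal_cdf(dot(z, gamma1))
    p_false_neg = std_normal_cdf(dot(z, gamma2))
    return (1.0 - p_false_neg) * p_true + p_false_pos * (1.0 - p_true)


def log_likelihood(y_obs, X, Z, beta, gamma1, gamma2):
    """Log-likelihood of the observed (possibly misreported) responses."""
    total = 0.0
    for y, x, z in zip(y_obs, X, Z):
        p = observed_prob(x, z, beta, gamma1, gamma2)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


if __name__ == "__main__":
    # Toy data: one covariate plus an intercept for both x and z.
    X = [[1.0, 0.5], [1.0, -1.2], [1.0, 0.1]]
    Z = [[1.0, 0.5], [1.0, -1.2], [1.0, 0.1]]
    y_reported = [1, 1, 0]
    print(log_likelihood(y_reported, X, Z,
                         beta=[0.2, 0.8],
                         gamma1=[-1.0, 0.3],
                         gamma2=[-2.0, 0.0]))
```

In a Bayesian treatment of the kind the abstract describes, a function like `log_likelihood` would be combined with priors on the coefficients (for instance, priors informed by a validation study) and explored with an MCMC sampler. When both misreport probabilities are essentially zero, the expression reduces to an ordinary probit model.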

### Citations

1526 |
Statistical analysis with missing data
- Little, Rubin
- 1987
Citation Context ...s the data are missing completely at random (MCAR), using list-wise deletion and restricting the analysis only to those respondents who are completely observed can lead to biased parameter estimates (Little and Rubin 2002).14 Furthermore, even if the data are MCAR, complete-case analyses may lead to discarding a large proportion of observations and can therefore be quite inefficient (Ibrahim et al. 2005). While several ... |

1007 | Monte Carlo Statistical Methods
- Robert, Casella
- 1999
Citation Context ...γ2 (Gelfand and Smith 1990). Under mild regularity conditions, for a sufficiently large number of iterations, samples from these conditional distributions approach samples from the joint posterior (Robert and Casella 2004). The posterior marginals obtained from these convergent samples can then be summarized and used to estimate the effect of the relevant individual characteristics on the true response and the misrepo... |

876 |
Sampling-Based Approaches to Calculating Marginal Densities
- Gelfand, Smith
- 1990
Citation Context ...him and Chen 2000).12 Although equation (8) is intractable analytically, inference can be performed using Gibbs sampling along with Metropolis steps to sample the full conditionals for β, γ1, and γ2 (Gelfand and Smith 1990). Under mild regularity conditions, for a sufficiently large number of iterations, samples from these conditional distributions approach samples from the joint posterior (Robert and Casella 2004). Th... |

626 | Markov chain Monte Carlo in practice - Gilks, Richardson, et al. - 1996 |

589 | Applied Nonparametric Regression - Härdle - 1990 |

521 |
Analysis of Incomplete Multivariate Data
- Schafer
- 1997
Citation Context ... Bernoulli distributions for the dichotomous variables, rather than having to specify a multivariate normal distribution for all covariates, as is generally the case with other imputation procedures (Schafer 1997). In our application, probit regression models were specified for all the dichotomous covariates in the model (Non-white, Own Home, Unemployed, Alone), while the remaining categorical covariates w... |

512 |
Bayesian analysis of binary and polychotomous response data
- Albert, Chib
- 1993
Citation Context ...his problem could be by using a Bayesian approach based on Gibbs sampling, which allows obtaining arbitrarily precise approximations to the posterior densities without relying on large-sample theory (Albert and Chib 1993). 1.1. An illustrative example Thus, we know the conditions under which misreporting will be a problem theoretically: when the probability of misreporting varies systematically with characteristics w... |

471 | Understanding the Metropolis-Hastings algorithm - Chib, Greenberg - 1995 |

361 |
Data Analysis Using Regression and Multilevel/Hierarchical Models
- Gelman, Hill
- 2007
Citation Context ... used to summarize the posterior distributions of the model’s coefficients and to compute the marginal effects of the regressors on the probability of voting through “average predictive comparisons” (Gelman and Hill 2007). Thus, we need only to have done a validation study at some point in order to account for misreporting in our model of voter turnout for 1994, although we must maintain the assumption that the proce... |

174 | Measurement Error in Survey Data - Bound, Brown, et al. - 2000 |

157 | Measurement Error in Nonlinear Models: A - Carroll, Ruppert, et al. - 2006 |

142 |
Who Votes
- Wolfinger, Rosenstone
- 1980
Citation Context ...x. We should note that, while this specification includes some of the variables most commonly used in models of voter turnout found in the literature (Bernstein, Chadha, and Montjoy 2001; Highton 2004; Wolfinger and Rosenstone 1980), it does not examine the effect of other factors we might plausibly believe could alter turnout, such as political information (Alvarez 1997) or differences in state-level ballot laws (Wolfinger and... |

122 | Rational Choice and Turnout - Aldrich - 1993 |

117 |
Bayesian Statistical Modelling
- Congdon
- 2001
Citation Context ... (Rubin 1976).11 If only the response variable had missing data, we would just specify the model presented in Section 2 and draw a value for each missing value of ỹi based on its predictive density (Congdon 2001). 12 In our case, however, we also have missing values in most of the covariates, so an important issue is the specification of a parametric model for the missing covariates (Ibrahim, Lipsitz and Ch... |

113 | Estimation of regression coefficients when some regres... - Robins, Rotnitzky, et al. - 1994 |

103 | Semiparametric Analysis of Discrete Response: Asymptotic Properties of the Maximum Score Estimator - Manski - 1985 |

87 | Semiparametric Efficiency in Multivariate Regression Models with Missing Data - Robins, Rotnitzky - 1995 |

81 | Misclassification of a Dependent Variable in a Discrete Response - Hausman, Morton - 1994 |

69 | Regression with a Binary Independent Variable Subject to Errors of Observations - Aigner - 1973 |

61 |
Pattern-Mixture Models for Multivariate Incomplete Data
- Little
- 1993
Citation Context ...that there are situations in which inference based on a complete-case analysis might yield unbiased estimates and outperform imputation methods even when the data are not missing completely at random (Little and Wang 1996; Ibrahim et al. 2005). 10 A detailed review of the different commonly used model-based imputation methods is beyond the scope of this paper. See Schafer and Graham (2002); Ibrahim et al. (2005); Horto... |

57 |
Explaining the Gibbs Sampler. The American Statistician 46, 167-174
- Casella, George
- 1992
Citation Context ...ing Gibbs sampling to repeatedly draw samples from each unknown parameter’s full conditional posterior distribution in order to form the corresponding marginal distributions (Gelfand and Smith 1990; Casella and George 1992). While the corresponding conditional posterior densities have no closed forms (see Appendix B), draws of β, γ1 and γ2 can be obtained using Adaptive Rejection Sampling (ARS) (Gilks and Wild 1992). U... |

52 |
Information and Elections. Ann Arbor
- Alvarez
- 1997
Citation Context ... Chadha, and Montjoy 2001; Highton 2004; Wolfinger and Rosenstone 1980), it does not examine the effect of other factors we might plausibly believe could alter turnout, such as political information (Alvarez 1997) or differences in state-level ballot laws (Wolfinger and Rosenstone 1980). The sample used in the analysis consists of 6,411 observations for the six elections under study and was constructed so tha... |

40 | Identification and Robustness with Contaminated and Corrupt Data - Horowitz, Manski - 1995 |

38 | Bounding Mean Regressions When A Binary Regressor is Mismeasured - Bollinger - 1996 |

38 | Effects of misspecification of the propensity score on estimators of treatment effect - Drake - 1993 |

31 | Inference and Missing Data. Biometrika 63:581-592 - Rubin - 1976 |

25 | Not asked and not answered: multiple imputation for multiple surveys (with discussion) - Gelman, King, et al. - 1998 |

25 | Unemployment Benefits and Labor Market Transitions - Poterba, Summers - 1995 |

23 | Scobit: an alternative estimator to logit and probit
- Nagler
- 1994
Citation Context ...en characteristics of interest. This almost always leads to estimation of the common logit or probit models, since the turnout decision is dichotomous, although there are alternatives such as scobit (Nagler 1994) or non-parametric models (Härdle 1990) for discrete choice models. A problem arises because we do not (easily) observe the decision to vote because of the use of the secret ballot in the U.S. Even if w... |

22 |
Semiparametric Estimation with Mismeasured Dependent Variables: An Application to Duration Models with Unemployment Spells. Annales d’Economie et de Statistique
- Abrevaya, Hausman
- 1999
Citation Context ...s, our model might be quite sensitive to distributional and modeling assumptions. Although semiparametric methods have been used to estimate discrete choice models with misclassified dependent variables (Abrevaya and Hausman 1999; Hausman, Abrevaya, and Scott-Morton 1998), they are also subject to potential misspecification (Molinari 2003). A different approach would be to adapt and implement nonparametric methods based on Hor... |

22 |
Adaptive rejection sampling for Gibbs sampling. Applied Statistics
- Gilks, Wild
- 1992
Citation Context ...Casella and George 1992). While the corresponding conditional posterior densities have no closed forms (see Appendix B), draws of β, γ1 and γ2 can be obtained using Adaptive Rejection Sampling (ARS) (Gilks and Wild 1992). Under mild regularity conditions (Gilks, Richardson and Spiegelhalter 1996), for a sufficiently large number of iterations, samples from these complete conditionals approach samples from the margin... |

21 |
Measurement and Mismeasurement of the Validity of Self-Reported Vote
- Anderson, Silver
- 1986
Citation Context ...d Claggett 1984, 1986, 1991; Hill and Hurley 1984; Katosh and Traugott 1981; Sigelman 1982; Silver, Anderson and Abramson 1986; Weir 1975) and even to a debate about how to best measure misreporting (Anderson and Silver 1986). All of these studies find that misreporting varies systematically with characteristics of interest, but none offers a complete characterization of when this misreporting will be a problem for inferen... |

20 |
The Non-Voting Voter in Voting Research
- Sigelman
- 1982
Citation Context ...ls. However, it has been long established that some survey respondents misreport voting, i.e., they report that they have voted when in fact they did not do so (Burden 2000; Katosh and Traugott 1981; Sigelman 1982). The evidence that misreporting is a problem can be found in a series of validation studies that the ANES conducted [American Journal of Political Science, Vol. 54, No. 3, July 2010, Pp. 815–835 ©20...] |

19 |
The Consequences of Validated and Self-Reported Voting Measures. Public Opinion Quarterly 45
- Katosh, Traugott
- 1981
Citation Context ...) for discrete choice models. However, it has been long established that some survey respondents misreport voting, i.e., they report that they have voted when in fact they did not do so (Burden 2000; Katosh and Traugott 1981; Sigelman 1982). The evidence that misreporting is a problem can be found in a series of validation studies that the ANES conducted ... |

18 |
Information and Election
- Alvarez
- 1997
Citation Context ...hat while this specification is similar to most found in the literature, it does not examine other factors we might plausibly believe alter turnout behavior, for example any role for political information (Alvarez 1997) or differences in state-level ballot laws (Wolfinger and Rosenstone 1980). The samples used in the analysis consist of 6452 observations for the 6 elections under study and were constructed so that... |

17 | Vote ‘Over’ Reporting - Presser, Traugott, et al. - 1990 |

16 |
Power prior distributions for regression models
- Ibrahim, Chen
- 2000
Citation Context ...radigm provides a flexible framework for summarizing and integrating historical or supplementary evidence on misreport patterns from different sources and levels of analysis (Dunson and Tindall 2000; Ibrahim and Chen 2000; Prescott and Garthwaite 2002). Using Markov Chain Monte Carlo (MCMC) methods, the model presented here allows placing prior restrictions on the misclassification probabilities or on relevant regress... |

16 | Missing covariates in generalized linear models when the missing data mechanism is non-ignorable - Ibrahim, Lipsitz, et al. - 1999 |

15 | Response Validity: Vote Report. Public Opinion Quarterly 32(4) - Clausen - 1968 |

15 |
Inference and missing data. Biometrika
- Rubin
- 1976
Citation Context ...that the data are missing at random (MAR) and that the parameters of the missing-data process are distinct from the parameters of the data model, so that the missing-data mechanism is ignorable (Rubin 1976).11 If only the response variable had missing data, we would just specify the model presented in Section 2 and draw a value for each missing value of ỹi based on its predictive density (Congdon 2001... |

14 |
Semiparametric and Nonparametric Estimation of Quantal Response Models
- Horowitz
- 1993
Citation Context ...nd Katz (2009), the estimates of γ1 and γ2 can be far away from the true coefficients when the model of misreporting is misspecified, particularly when the error terms are bimodal or heteroskedastic (Horowitz 1993; Zhao 2008). Nonetheless, the estimated covariate effects seem to be quite robust to the specification of the misreport model and much more accurate than those from standard parametric models when mi... |

14 |
Missing-Data Methods for Generalized Linear Models: A Comparative Review
- Ibrahim, Chen, et al.
- 2005
Citation Context ...rate samples. Our approach also enables us to simultaneously address another important problem with survey data, namely missing outcome and/or covariate values, using Bayesian model-based imputation (Ibrahim et al. 2005). Compared to alternative imputation techniques, Bayesian methods allow easily estimating standard errors in multiparameter problems and handling “nuisance” parameters and have been shown to be partic... |

13 | Missing data: Our view of the state of the art. Psychological Methods 7:147–177 - Schafer, Graham - 2002 |

12 | Much ado about nothing: A comparison of missing data methods and software to fit incomplete data regression models |

11 | Using historical controls to adjust for covariates in trend tests for binary data - Ibrahim, Ryan, et al. - 1998 |

10 |
Voter Turnout and the National Election Studies. Political Analysis
- Burden
- 2000
Citation Context ...(Härdle 1990) for discrete choice models. However, it has been long established that some survey respondents misreport voting, i.e., they report that they have voted when in fact they did not do so (Burden 2000; Katosh and Traugott 1981; Sigelman 1982). The evidence that misreporting is a problem can be found in a series of validation studies that the ANES conducted ... |

10 | Power prior distributions for generalized linear models - Chen, Ibrahim, et al. - 2000 |

10 |
A conditional model for incomplete covariates in parametric regression models
- Lipsitz, Ibrahim
- 1996
Citation Context ... complete-case analysis, including in the sample only those respondents who are completely observed. This approach has serious drawbacks which have been extensively documented (Little and Rubin 1987; Lipsitz and Ibrahim 1996; Ibrahim et al. 2005). First, simply omitting missing data from the analysis leads to valid inferences only if the data are missing completely at random. If, on the other hand, respondents with complete d... |

9 |
Nonvoters in Voters’ Clothing: The Impact of Voting Behavior Misreporting on Voting Behavior Research. Social Science Quarterly 65
- Hill, Hurley
- 1984
Citation Context ...spondents claiming to have voted did not in fact do so according to the public records. This finding led to a cottage industry analyzing the causes of misreporting (Abramson and Claggett 1984, 1986, 1991; Hill and Hurley 1984; Katosh and Traugott 1981; Sigelman 1982; Silver, Anderson and Abramson 1986; Weir 1975) and even to a debate about how to best measure misreporting (Anderson and Silver 1986). All of these studies f... |

8 | Attempts to Improve the Accuracy of Self-Reports of Voting - Abelson, Loftus, et al. - 1992 |