Results 1-10 of 81
Missing data: Our view of the state of the art
Psychological Methods, 2002
Cited by 689 (1 self)
Statistical procedures for missing data have vastly improved, yet misconception and unsound practice still abound. The authors frame the missing-data problem, review methods, offer advice, and raise issues that remain unresolved. They clear up common misunderstandings regarding the missing at random (MAR) concept. They summarize the evidence against older procedures and, with few exceptions, discourage their use. They present, in both technical and practical language, 2 general approaches that come highly recommended: maximum likelihood (ML) and Bayesian multiple imputation (MI). Newer developments are discussed, including some for dealing with missing data that are not MAR. Although not yet in the mainstream, these procedures may eventually extend the ML and MI methods that currently represent the state of the art. Why do missing data create such difficulty in scientific research? Because most data analysis procedures were not designed for them. Missingness is usually a nuisance, not the main focus of inquiry, but
Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation
American Political Science Review, 2000
Cited by 389 (49 self)
We propose a remedy for the discrepancy between the way political scientists analyze data with missing values and the recommendations of the statistics community. Methodologists and statisticians agree that "multiple imputation" is a superior approach to the problem of missing data scattered through one's explanatory and dependent variables than the methods currently used in applied data analysis. The reason for this discrepancy lies with the fact that the computational algorithms used to apply the best multiple imputation models have been slow, difficult to implement, impossible to run with existing commercial statistical packages, and demanding of considerable expertise. In this paper, we adapt an existing algorithm, and use it to implement a general-purpose, multiple imputation model for missing data. This algorithm is considerably faster and easier to use than the leading method recommended in the statistics literature. We also quantify the risks of current missing data practices, ...
What to do about missing values in time-series cross-section data
2009
Cited by 50 (6 self)
Applications of modern methods for analyzing data with missing values, based primarily on multiple imputation, have in the last half-decade become common in American politics and political behavior. Scholars in this subset of political science have thus increasingly avoided the biases and inefficiencies caused by ad hoc methods like listwise deletion and best guess imputation. However, researchers in much of comparative politics and international relations, and others with similar data, have been unable to do the same because the best available imputation methods work poorly with the time-series cross-section data structures common in these fields. We attempt to rectify this situation with three related developments. First, we build a multiple imputation model that allows smooth time trends, shifts across cross-sectional units, and correlations over time and space, resulting in far more accurate imputations. Second, we enable analysts to incorporate knowledge from area studies experts via priors on individual missing cell values, rather than on difficult-to-interpret model parameters. Third, because these tasks could not be accomplished within existing imputation algorithms, in that they cannot handle as many variables as needed even in the simpler cross-sectional data for which they were designed, we also develop a new algorithm that substantially expands the range of computationally feasible data types and sizes for which multiple imputation can be used. These developments also make it possible to implement the methods introduced here in freely available open source software that is considerably more reliable than existing algorithms. We develop an approach to analyzing data with
Not Asked Or Not Answered: Multiple Imputation for Multiple Surveys
Journal of the American Statistical Association, 1998
Cited by 42 (9 self)
We present a method of analyzing a series of independent cross-sectional surveys in which some questions are not answered in some surveys and some respondents do not answer some of the questions posed. The method is also applicable to a single survey in which different questions are asked, or different sampling methods used, in different strata or clusters. Our method involves multiply imputing the missing items and questions by adding to existing methods of imputation designed for single surveys a hierarchical regression model that allows covariates at the individual and survey levels. Information from survey weights is exploited by including in the analysis the variables on which the weights were based, and then reweighting individual responses (observed and imputed) to estimate population quantities. We also develop diagnostics for checking the fit of the imputation model based on comparing imputed to nonimputed data. We illustrate with the example that motivated this project, a ...
Simultaneous use of multiple imputation for missing data and disclosure limitation
Survey Methodol, 2004
Cited by 32 (13 self)
Several statistical agencies use, or are considering the use of, multiple imputation to limit the risk of disclosing respondents' identities or sensitive attributes in public use data files. For example, agencies can release partially synthetic datasets, comprising the units originally surveyed with some collected values, such as sensitive values at high risk of disclosure or values of key identifiers, replaced with multiple imputations. This article presents an approach for generating multiply imputed, partially synthetic datasets that simultaneously handles disclosure limitation and missing data. The basic idea is to fill in the missing data first to generate m completed datasets, then replace sensitive or identifying values in each completed dataset with r imputed values. This article also develops methods for obtaining valid inferences from such multiply imputed datasets. New rules for combining the multiple point and variance estimates are needed because the double duty of multiple imputation introduces two sources of variability into point estimates, which existing methods for obtaining inferences from multiply imputed datasets do not measure accurately. A reference t-distribution appropriate for inferences when m and r are moderate is derived using moment matching and Taylor series approximations.
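For orientation, the "existing methods" for ordinary (single-stage) multiple imputation are usually taken to be Rubin's combining rules. The sketch below illustrates those standard rules only; it is not the two-stage rules the article derives for m completed datasets with r replacements each. The function name `rubin_combine` and the input numbers are hypothetical and chosen purely for illustration.

```python
from statistics import mean, variance


def rubin_combine(estimates, variances):
    """Combine m completed-data point estimates and their variances
    using Rubin's standard rules for single-stage multiple imputation."""
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need m >= 2 estimates with matching variances")
    q_bar = mean(estimates)      # combined point estimate
    u_bar = mean(variances)      # average within-imputation variance
    b = variance(estimates)      # between-imputation variance (m - 1 denominator)
    t = u_bar + (1 + 1 / m) * b  # total variance of the combined estimate
    return q_bar, u_bar, b, t


# Hypothetical inputs: three completed-data estimates and their variances.
q_bar, u_bar, b, t = rubin_combine([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(q_bar, u_bar, b, t)
```

The total variance has only one between-imputation term, which is why a procedure that imputes in two nested stages needs a different rule: the second stage contributes a further source of variability that this formula does not separate out.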
Listwise deletion is evil: What to do about missing data in political science
Paper Presented at the Annual Meeting of the American Political Science Association, 1998
Cited by 27 (3 self)
We propose a remedy to the substantial discrepancy between the way political scientists analyze data with missing values and the recommendations of the statistics community. With a few notable exceptions, statisticians and methodologists have agreed on a widely applicable approach to many missing data problems based on the concept of "multiple imputation," but most researchers in our field and other social sciences still use far inferior methods. Indeed, we demonstrate that the threats to validity from current missing data practices rival the biases from the much better known omitted variable problem. As it turns out, this discrepancy is not entirely our fault, as the computational algorithms used to apply the best multiple imputation models have been slow, difficult to implement, impossible to run with existing commercial statistical packages, and demanding of considerable expertise on the part of the user (even experts disagree on how to use them). In this paper, we adapt an existing algorithm, and use it to implement a general-purpose, multiple imputation model for missing data. This algorithm is between 65 and
The Multiple Adaptations of Multiple Imputation
Cited by 26 (12 self)
Multiple imputation was first conceived as a tool that statistical agencies could use to handle nonresponse in large sample, public use surveys. In the last two decades, the multiple imputation framework has been adapted for other statistical contexts. As examples, individual researchers use multiple imputation to handle missing data in small samples; statistical agencies disseminate multiply imputed datasets for purposes of protecting data confidentiality; and survey methodologists and epidemiologists use multiple imputation to correct for measurement errors. In some of these settings, Rubin’s original rules for combining the point and variance estimates from the multiply imputed datasets are not appropriate, because what is known (and therefore in the conditional expectations and variances used to derive inferential methods) differs from the missing data context. These applications require new combining rules and methods of inference. In fact, more than ten combining rules exist in the
Multiple imputation in multivariate problems when the imputation and the analysis models differ
2001
Cited by 23 (0 self)
Bayesian multiple imputation (MI) has become a highly useful paradigm for handling missing values in many settings. In this paper, I compare Bayesian MI with other methods (maximum likelihood, in particular) and point out some of its unique features. One key aspect of MI, the separation of the imputation phase from the analysis phase, can be advantageous in settings where the models underlying the two phases do not agree. Key Words and Phrases: missing data, nonresponse.
Multiple imputation for model checking: Completed-data plots with missing and latent data
Biometrics, 2005
Cited by 21 (3 self)
Summary. In problems with missing or latent data, a standard approach is to first impute the unobserved data, then perform all statistical analyses on the completed dataset (corresponding to the observed data and imputed unobserved data) using standard procedures for complete-data inference. Here, we extend this approach to model checking by demonstrating the advantages of the use of completed-data model diagnostics on imputed completed datasets. The approach is set in the theoretical framework of Bayesian posterior predictive checks (but, as with missing-data imputation, our methods of missing-data model checking can also be interpreted as “predictive inference” in a non-Bayesian context). We consider the graphical diagnostics within this framework. Advantages of the completed-data approach include: (1) One can often check model fit in terms of quantities that are of key substantive interest in a natural way, which is not always possible using observed data alone. (2) In problems with missing data, checks may be devised that do not require modeling the missingness or inclusion mechanism; the latter is useful for the analysis of ignorable but unknown data collection mechanisms, such as are often assumed in the analysis of sample surveys and observational studies. (3) In many problems with latent data, it is possible to check qualitative features of the model (for example, independence of two variables) that can be naturally formalized with the help of the latent data. We illustrate with several applied examples.
Analyzing the changing gender wage gap based on multiply imputed right censored wages
IAB Discussion Paper, 2005
Cited by 19 (3 self)