Results 1-10 of 38
Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation
 American Political Science Review
, 2000
Cited by 237 (43 self)
We propose a remedy for the discrepancy between the way political scientists analyze data with missing values and the recommendations of the statistics community. Methodologists and statisticians agree that "multiple imputation" is a superior approach to the problem of missing data scattered through one's explanatory and dependent variables than the methods currently used in applied data analysis. The reason for this discrepancy lies with the fact that the computational algorithms used to apply the best multiple imputation models have been slow, difficult to implement, impossible to run with existing commercial statistical packages, and demanding of considerable expertise. In this paper, we adapt an existing algorithm, and use it to implement a general-purpose, multiple imputation model for missing data. This algorithm is considerably faster and easier to use than the leading method recommended in the statistics literature. We also quantify the risks of current missing data practices, ...
Bayesian measures of model complexity and fit
 Journal of the Royal Statistical Society, Series B
, 2002
Cited by 203 (3 self)
[Read before The Royal Statistical Society at a meeting organized by the Research
A multivariate technique for multiply imputing missing values using a sequence of regression models
 Survey Methodology 27
, 2001
Cited by 83 (5 self)
This article describes and evaluates a procedure for imputing missing values for a relatively complex data structure when the data are missing at random. The imputations are obtained by fitting a sequence of regression models and drawing values from the corresponding predictive distributions. The types of regression models used are linear, logistic, Poisson, generalized logit or a mixture of these depending on the type of variable being imputed. Two additional common features in the imputation process are incorporated: restriction to a relevant subpopulation for some variables and logical bounds or constraints for the imputed values. The restrictions involve subsetting the sample individuals that satisfy certain criteria while fitting the regression models. The bounds involve drawing values from a truncated predictive distribution. The development of this method was partly motivated by the analysis of two data sets which are used as illustrations. The sequential regression procedure is applied to perform multiple imputation analysis for the two applied problems. The sampling properties of inferences from multiply imputed data sets created using the sequential regression method are evaluated through simulated data sets. Key Words: Item nonresponse; Missing at random; Multiple imputation; Nonignorable missing mechanism; Regression; Sampling properties and simulations.
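The idea of imputing each variable from a regression on the others can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes every column is continuous, uses only ordinary least squares, ignores parameter uncertainty, and omits the subpopulation restrictions and truncated draws the abstract mentions. The function name `sequential_regression_impute` and the demo data are hypothetical.

```python
import numpy as np

def sequential_regression_impute(X, n_cycles=10, seed=0):
    """Fill NaNs in a numeric 2-D array, one column at a time.

    Simplifying assumptions (not from the article): all columns are
    continuous, each is modelled by ordinary least squares on the other
    columns, and uncertainty in the fitted coefficients is ignored.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()

    # Start from column means so every regression has complete predictors.
    col_means = np.nanmean(X, axis=0)
    for j in range(X.shape[1]):
        filled[missing[:, j], j] = col_means[j]

    for _ in range(n_cycles):
        for j in range(X.shape[1]):
            miss_j = missing[:, j]
            if not miss_j.any():
                continue
            # Design matrix: intercept plus all other columns as currently filled.
            others = np.delete(filled, j, axis=1)
            design = np.column_stack([np.ones(len(filled)), others])
            obs = ~miss_j
            beta, *_ = np.linalg.lstsq(design[obs], filled[obs, j], rcond=None)
            resid = filled[obs, j] - design[obs] @ beta
            dof = max(int(obs.sum()) - design.shape[1], 1)
            sigma = np.sqrt(resid @ resid / dof)
            # Draw imputations around the fitted values using the residual spread.
            noise = rng.normal(0.0, sigma, int(miss_j.sum()))
            filled[miss_j, j] = design[miss_j] @ beta + noise
    return filled

# Hypothetical demo: the third column depends on the first two and loses ~20% of its values.
demo_rng = np.random.default_rng(1)
data = demo_rng.normal(size=(200, 3))
data[:, 2] = data[:, 0] + 0.5 * data[:, 1] + demo_rng.normal(scale=0.1, size=200)
data_missing = data.copy()
data_missing[demo_rng.random(200) < 0.2, 2] = np.nan
completed = sequential_regression_impute(data_missing)
```

Running the function several times with different seeds would give several completed data sets, which is the starting point for a multiple imputation analysis.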
Multiple Imputation in Practice: Comparison of Software Packages for Regression Models With Missing Variables
Cited by 40 (0 self)
This article reviews multiple imputation, describes assumptions that it requires, and reviews software packages that implement this procedure. We apply the methods and compare the results using two examples: a child psychopathology dataset with missing outcomes and an artificial dataset with missing covariates. We conclude with some discussion of the strengths and weaknesses of these implementations as well as advantages and limitations of imputation
MICE: Multivariate Imputation by Chained Equations in R
Cited by 16 (0 self)
Multivariate Imputation by Chained Equations (MICE) is the name of software for imputing incomplete multivariate data by Fully Conditional Specification (FCS). MICE V1.0 appeared in the year 2000 as an S-PLUS library, and in 2001 as an R package. MICE V1.0 introduced predictor selection, passive imputation and automatic pooling. This article presents MICE V2.0, which extends the functionality of MICE V1.0 in several ways. In MICE V2.0, the analysis of imputed data is made completely general, whereas the range of models under which pooling works is substantially extended. MICE V2.0 adds new functionality for imputing multilevel data, automatic predictor selection, data handling, postprocessing imputed values, specialized pooling and model selection. Imputation of categorical data is improved in order to bypass problems caused by perfect prediction. Special attention is paid to transformations, sum scores, indices and interactions using passive imputation, and to the proper setup of the predictor matrix. MICE V2.0 is freely available from CRAN as an R package mice. This article provides a hands-on, stepwise approach to using mice for solving incomplete data problems in real data.
The Multiple Adaptations of Multiple Imputation
Cited by 12 (7 self)
Multiple imputation was first conceived as a tool that statistical agencies could use to handle nonresponse in large-sample, public-use surveys. In the last two decades, the multiple imputation framework has been adapted for other statistical contexts. As examples, individual researchers use multiple imputation to handle missing data in small samples; statistical agencies disseminate multiply-imputed datasets for purposes of protecting data confidentiality; and survey methodologists and epidemiologists use multiple imputation to correct for measurement errors. In some of these settings, Rubin’s original rules for combining the point and variance estimates from the multiply-imputed datasets are not appropriate, because what is known (and therefore in the conditional expectations and variances used to derive inferential methods) differs from the missing data context. These applications require new combining rules and methods of inference. In fact, more than ten combining rules exist in the
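For orientation, the standard combining rules for the ordinary missing-data setting can be sketched as follows. This shows only Rubin's original rules for a scalar estimate, not any of the adapted rules the abstract refers to; the function name `pool_rubin` and the example numbers are hypothetical.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Combine m point estimates and their variances with Rubin's rules.

    Returns the pooled estimate and its total variance
    T = U_bar + (1 + 1/m) * B, where U_bar is the mean within-imputation
    variance and B is the between-imputation variance.
    """
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    q_bar = estimates.mean()            # pooled point estimate
    u_bar = variances.mean()            # within-imputation variance
    b = estimates.var(ddof=1)           # between-imputation variance
    total_var = u_bar + (1.0 + 1.0 / m) * b
    return q_bar, total_var

# Hypothetical results from m = 3 imputed data sets.
pooled_estimate, pooled_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

In settings such as disclosure protection or measurement-error correction, the variance formula above is the part that changes.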
Multiple Imputation of Missing Income Data in the National Health Interview Survey
 Journal of the American Statistical Association
Cited by 6 (1 self)
The National Health Interview Survey (NHIS) provides a rich source of data for studying relationships between income and health and for monitoring health and health care for persons at different income levels. However, the nonresponse rates are high for two key items, total family income in the previous calendar year and personal earnings from employment in the previous calendar year. To handle the missing data on family income and personal earnings in the NHIS, multiple imputation of these items, along with employment status and ratio of family income to the federal poverty threshold (derived from the imputed values of family income), has been performed for the survey years 1997–2004. (There are plans to continue this work for years beyond 2004 as well.) Files of the imputed values, as well as documentation, are available at the NHIS website
Conducting tetrad tests of model fit and contrasts of tetrad-nested models: a new SAS macro
 Struct. Equ. Model
Cited by 4 (0 self)
This article describes a SAS macro to assess model fit of structural equation models by employing a test of the model-implied vanishing tetrads. Use of this test has been limited in the past, in part due to the lack of software that fully automates the test in a user-friendly way. The current SAS macro provides a straightforward method for researchers to use the vanishing tetrads implied by models to assess the fit of (a) structural equation models containing continuous endogenous variables; (b) structural equation models containing continuous endogenous variables nested for vanishing tetrads; and (c) structural equation models containing dichotomous, ordinal, or censored endogenous variables. Besides providing an alternative assessment of model fit to the usual likelihood-ratio test (LRT), the vanishing tetrads test occasionally provides a statistical assessment of competing models nested for vanishing tetrads but not nested for the LRT. The macro permits formal comparisons between tetrad-nested structural equation models containing dichotomous, ordinal, or censored endogenous variables. A key focus of structural equation modeling (SEM) is the assessment of model fit. The usual test applied for assessing model fit is the likelihood-ratio chi-square test
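As background, a tetrad is a difference of products of covariances among four variables, and a model "implies a vanishing tetrad" when that difference is zero in the model-implied covariance matrix. The sketch below only computes tetrads for a hypothetical one-factor model with four indicators; it does not implement the statistical test or the SAS macro, and the loadings and unique variances are made-up values.

```python
import numpy as np

def tetrad(cov, g, h, i, j):
    """Tetrad difference cov[g,h]*cov[i,j] - cov[g,i]*cov[h,j]."""
    return cov[g, h] * cov[i, j] - cov[g, i] * cov[h, j]

# Hypothetical one-factor model: covariance = loadings loadings' + diag(unique variances).
loadings = np.array([0.9, 0.8, 0.7, 0.6])
unique_var = np.array([0.19, 0.36, 0.51, 0.64])
implied_cov = np.outer(loadings, loadings) + np.diag(unique_var)

# The three tetrads among four variables; each involves only off-diagonal
# covariances, so each should vanish under the one-factor model.
tetrads = np.array([
    tetrad(implied_cov, 0, 1, 2, 3),
    tetrad(implied_cov, 0, 2, 3, 1),
    tetrad(implied_cov, 0, 3, 1, 2),
])
```

With sample covariances the tetrads are not exactly zero, which is why a formal test is needed.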
A toolkit in SAS for the evaluation of multiple imputation methods
 Statistica Neerlandica
, 2003
Cited by 3 (2 self)
This paper outlines a strategy to validate multiple imputation methods. Rubin’s criteria for proper multiple imputation are the point of departure. We describe a simulation method that yields insight into various aspects of bias and efficiency of the imputation process. We propose a new method for creating incomplete data under a general Missing At Random (MAR) mechanism. Software implementing the validation strategy is available as a SAS/IML module. The method is applied to investigate the behavior of polytomous regression imputation for categorical data. Key Words and Phrases: multiple imputation, proper imputation, missing data mechanism, simulation.
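To make the notion of a MAR mechanism concrete, here is a minimal sketch that deletes values in one column with a probability depending only on another, fully observed column. It is a simple logistic mechanism chosen for illustration, not the method the paper proposes; the function name `make_mar`, its parameters, and the demo data are hypothetical.

```python
import numpy as np

def make_mar(X, target_col, driver_col, intercept=-1.0, slope=1.5, seed=0):
    """Set entries of one column to NaN under a simple MAR mechanism.

    The probability that `target_col` is missing depends only on the
    standardized value of the fully observed `driver_col` (logistic link).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float).copy()
    driver = X[:, driver_col]
    z = (driver - driver.mean()) / driver.std()
    p_missing = 1.0 / (1.0 + np.exp(-(intercept + slope * z)))
    mask = rng.random(len(X)) < p_missing
    X[mask, target_col] = np.nan
    return X, mask

# Hypothetical demo: missingness in column 1 is driven by column 0.
demo_rng = np.random.default_rng(2)
complete = demo_rng.normal(size=(500, 2))
incomplete, mask = make_mar(complete, target_col=1, driver_col=0)
```

Because the deletion probability never looks at the deleted values themselves, the mechanism is MAR given the driver column; a validation study would impute `incomplete` repeatedly and compare the results with `complete`.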