MatchIt: Nonparametric Preprocessing for Parametric Causal Inference. Statistics & Probability Letters, 2009. Retrieved from: http://gking.harvard.edu/matchit/docs/matchit.pdf
Cited by 30 (8 self)

Abstract
MatchIt implements the suggestions of Ho, Imai, King, and Stuart (2007) for improving parametric statistical models by preprocessing data with nonparametric matching methods. MatchIt implements a wide range of sophisticated matching methods, making it possible to greatly reduce the dependence of causal inferences on hard-to-justify, but commonly made, statistical modeling assumptions. The software also easily fits into existing research practices since, after preprocessing data with MatchIt, researchers can use whatever parametric model they would have used without MatchIt, but produce inferences with substantially more robustness and less sensitivity to modeling assumptions. MatchIt is an R program, and also works seamlessly with Zelig.
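The preprocessing step this abstract describes (estimate a propensity score, match treated units to controls on it, then fit the usual parametric model on the matched sample) can be sketched outside R. Below is a hypothetical Python analogue on simulated data; MatchIt itself is an R package offering many more matching methods than this nearest-neighbor sketch:

```python
# Hypothetical sketch of propensity-score matching (not MatchIt itself):
# match each treated unit to the control with the closest estimated score.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=(n, 2))                             # observed confounders
treat = (x[:, 0] + rng.normal(size=n) > 0).astype(int)  # treatment depends on x

# Step 1: estimate propensity scores P(treat = 1 | x).
ps = LogisticRegression().fit(x, treat).predict_proba(x)[:, 1]

# Step 2: nearest-neighbor matching on the propensity score.
treated = np.flatnonzero(treat == 1)
controls = np.flatnonzero(treat == 0)
matches = {t: controls[np.argmin(np.abs(ps[controls] - ps[t]))] for t in treated}

# The matched sample (treated units plus their matched controls) is then
# analyzed with whatever parametric model would have been used anyway.
matched_idx = np.concatenate([treated, [matches[t] for t in treated]])
```
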
Origins of Homophily in an Evolving Social Network
Cited by 23 (2 self)

Abstract
The authors investigate the origins of homophily in a large university community, using network data in which interactions, attributes, and affiliations are all recorded over time. The analysis indicates that highly similar pairs do show greater than average propensity to form new ties; however, it also finds that tie formation is heavily biased by triadic closure and focal closure, which effectively constrain the opportunities among which individuals may select. In the case of triadic closure, moreover, selection to “friend of a friend” status is determined by an analogous combination of individual preference and structural proximity. The authors conclude that the dynamic interplay of choice homophily and induced homophily, compounded over many “generations” of biased selection of similar individuals to structurally proximate positions, can amplify even a modest preference for similar others, via a cumulative advantage–like process, to produce striking patterns of observed homophily.
MICE: Multivariate Imputation by Chained Equations in R
Cited by 15 (0 self)

Abstract
Multivariate Imputation by Chained Equations (MICE) is the name of software for imputing incomplete multivariate data by Fully Conditional Specification (FCS). MICE V1.0 appeared in the year 2000 as an S-PLUS library, and in 2001 as an R package. MICE V1.0 introduced predictor selection, passive imputation and automatic pooling. This article presents MICE V2.0, which extends the functionality of MICE V1.0 in several ways. In MICE V2.0, the analysis of imputed data is made completely general, whereas the range of models under which pooling works is substantially extended. MICE V2.0 adds new functionality for imputing multilevel data, automatic predictor selection, data handling, postprocessing imputed values, specialized pooling and model selection. Imputation of categorical data is improved in order to bypass problems caused by perfect prediction. Special attention is paid to transformations, sum scores, indices and interactions using passive imputation, and to the proper setup of the predictor matrix. MICE V2.0 is freely available from CRAN as an R package mice. This article provides a hands-on, stepwise approach to using mice for solving incomplete data problems in real data.
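The chained-equations (FCS) idea is not tied to R. As an illustration, scikit-learn's IterativeImputer runs the same cycle of imputing each incomplete column from the others; this sketch is a Python stand-in, not the mice package itself:

```python
# FCS/chained-equations sketch via scikit-learn's IterativeImputer
# (a Python analogue of the approach mice implements, not mice itself).
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
X[:, 2] += X[:, 0]                      # make the columns related
X[rng.random(X.shape) < 0.1] = np.nan   # knock out roughly 10% of cells

# Each incomplete column is imputed from the others, cycling until convergence.
imp = IterativeImputer(max_iter=10, random_state=0)
X_complete = imp.fit_transform(X)
```

For proper multiple imputation in the mice sense, one would run the imputer m times with `sample_posterior=True` and different seeds, then fit the analysis model on each completed data set and pool the results.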
The VGAM Package for Categorical Data Analysis
Cited by 11 (0 self)

Abstract
Classical categorical regression models such as the multinomial logit and proportional odds models are shown to be readily handled by the vector generalized linear and additive model (VGLM/VGAM) framework. Additionally, there are natural extensions, such as reduced-rank VGLMs for dimension reduction, and allowing covariates that have values specific to each linear/additive predictor, e.g., for consumer choice modeling. This article describes some of the framework behind the VGAM R package, its usage and implementation details.
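For readers outside R, the classical multinomial logit that VGAM generalizes can be fit with scikit-learn. A minimal sketch on simulated three-category data (the VGLM/VGAM extensions such as reduced-rank fits are not shown):

```python
# Minimal multinomial-logit example (VGAM itself is an R package);
# scikit-learn's LogisticRegression fits the same classical model.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 2))
# Three-category outcome generated from linear scores per category.
scores = X @ np.array([[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]])
y = scores.argmax(axis=1)

fit = LogisticRegression(max_iter=1000).fit(X, y)
probs = fit.predict_proba(X)            # one probability per category
```
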
How robust standard errors expose methodological problems they do not fix. Working paper, 2012
Cited by 4 (0 self)

Abstract
“Robust standard errors” are used in a vast array of scholarship to correct standard errors for model misspecification. However, when misspecification is bad enough to make classical and robust standard errors diverge, assuming that it is nevertheless not so bad as to bias everything else requires considerable optimism. And even if the optimism is warranted, we show that settling for a misspecified model will still bias estimators of all but a few quantities of interest. We suggest instead that robust and classical standard error differences be treated like canaries in the coal mine, providing clues about model misspecification and likely biases. At that point, we can use standard model checking diagnostics to find the problem and modern approaches to choosing a better model. With several simulations and real examples, we demonstrate that following these procedures can drastically reduce biases, improve statistical inferences, and change substantive conclusions.
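The diagnostic the authors propose, comparing classical and robust standard errors, is easy to reproduce. A minimal numpy sketch with deliberately heteroskedastic errors, using the textbook HC1 estimator:

```python
# Compare classical (homoskedastic) and HC1 "robust" OLS standard errors;
# a large gap between the two is the "canary" the abstract describes.
import numpy as np

rng = np.random.default_rng(3)
n = 500
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n) * (1 + np.abs(x))  # heteroskedastic noise

X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
resid = y - X @ beta

# Classical SEs assume one common error variance.
sigma2 = resid @ resid / (n - 2)
se_classical = np.sqrt(np.diag(sigma2 * XtX_inv))

# HC1 robust SEs use squared residuals observation by observation.
meat = X.T @ (X * resid[:, None] ** 2)
se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv) * n / (n - 2))

ratio = se_robust / se_classical   # far from 1 signals misspecification
```
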
Multiple Overimputation: A Unified Approach to Measurement Error and Missing Data, 2010
Cited by 1 (0 self)

Abstract
Social scientists typically devote considerable effort to mitigating measurement error during data collection but then ignore the issue during data analysis. Although many statistical methods have been proposed for reducing measurement error-induced biases, few have been widely used because of implausible assumptions, high levels of model dependence, difficult computation, or inapplicability with multiple mismeasured variables. We develop an easy-to-use alternative without these problems; it generalizes the popular multiple imputation (mi) framework by treating missing data problems as a special case of extreme measurement error and corrects for both. Like mi, the proposed “multiple overimputation” (mo) framework is a simple two-step procedure. First, multiple ( ≈ 5) completed copies of the data set are created where cells measured without error are held constant, those missing are imputed from the distribution of predicted values, and cells (or entire variables) with measurement error are “overimputed,” that is, imputed from the predictive distribution with observation-level priors defined by the mismeasured values and available external information, if any.
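A toy numerical sketch of the overimputation step, under strong simplifying assumptions (one mismeasured variable with a known normal error variance, one well-measured correlate, normal-normal conjugate updating). This illustrates the idea of a cell-level prior centered on the recorded value, not the authors' actual algorithm:

```python
# Toy overimputation sketch: each mismeasured cell gets an observation-level
# prior (its recorded proxy value, with known error sd), combined with a
# prediction from a well-measured correlate via a conjugate normal update.
import numpy as np

rng = np.random.default_rng(4)
n = 1000
truth = rng.normal(size=n)
proxy = truth + rng.normal(scale=0.7, size=n)   # mismeasured version, error sd known
z = truth + rng.normal(scale=0.3, size=n)       # well-measured correlate

# Predictive distribution from z (simple linear regression of proxy on z).
b = np.cov(proxy, z)[0, 1] / np.var(z)
pred_mean = proxy.mean() + b * (z - z.mean())
pred_var = np.var(proxy - pred_mean) - 0.7 ** 2  # strip the known error variance

# Combine the prediction with the cell-level prior (precision weighting).
post_var = 1.0 / (1.0 / pred_var + 1.0 / 0.7 ** 2)
post_mean = post_var * (pred_mean / pred_var + proxy / 0.7 ** 2)

# One "overimputed" copy; mo would draw about 5 such copies and pool results.
overimputed = rng.normal(post_mean, np.sqrt(post_var))
```
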
A Multi-Paradigm Modeling Definition and Language, 2013
Abstract
This paper discusses a single form for statistical models that accommodates a broad range of models, from ordinary least squares to agent-based microsimulations. The definition makes it almost trivial to define morphisms to transform and combine existing models to produce new models. It offers a unified means of expressing and implementing methods that are typically given disparate treatment in the literature, including Jacobian transformations, Bayesian updating, multilevel models, some missing data imputation methods, approaches to dealing with nuisance parameters, and several other common procedures. It especially offers benefit to simulation-type models, because of the value in being able to easily calculate robustness measures for simulation statistics and, where appropriate, test hypotheses. Running examples will be given using Apophenia, a software library based largely on the model form and transformations described here.
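As a hypothetical illustration of the "single form plus morphisms" idea (not Apophenia's actual C API), a model can be represented as a log-likelihood function and a morphism as a function from models to models, here pinning a nuisance parameter:

```python
# Hypothetical sketch: a model is (parameters, datum) -> log density, and a
# morphism builds a new model from an old one by fixing a nuisance parameter.
import math
from typing import Callable

Model = Callable[[list, float], float]   # (params, datum) -> log density

def normal(params: list, x: float) -> float:
    """Normal log density with params = [mu, sigma]."""
    mu, sigma = params
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def fix_parameter(model: Model, index: int, value: float) -> Model:
    """Morphism: return a model over fewer parameters, one of them pinned."""
    def fixed(params: list, x: float) -> float:
        full = list(params)
        full.insert(index, value)
        return model(full, x)
    return fixed

std_normal = fix_parameter(normal, 1, 1.0)   # the N(mu, 1) family
```

The same combinator style extends to the paper's other transformations, e.g. composing a Jacobian term onto the log likelihood or chaining an imputation model in front of an analysis model.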
Moving toward system genetics through multiple trait analysis in genome-wide association studies, 2012
Abstract
doi: 10.3389/fgene.2012.00001. Moving toward system genetics through multiple trait analysis in genome-wide association studies.
Amelia II: A Program . . ., 2011
Abstract
Amelia II is a complete R package for multiple imputation of missing data. The package implements a new expectation-maximization with bootstrapping algorithm that works faster, with larger numbers of variables, and is far easier to use, than various Markov chain Monte Carlo approaches, but gives essentially the same answers. The program also improves imputation models by allowing researchers to put Bayesian priors on individual cell values, thereby including a great deal of potentially valuable and extensive information. It also includes features to accurately impute cross-sectional datasets, individual time series, or sets of time series for different cross-sections. A full set of graphical diagnostics is also available. The program is easy to use, and the simplicity of the algorithm makes it far more robust; both a simple command line and extensive graphical user interface are included.
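The EMB recipe the abstract outlines (bootstrap the rows, estimate multivariate-normal parameters, draw missing cells from the conditional distribution, repeat m times) can be caricatured in a few lines of Python. In this toy sketch, complete-case estimates stand in for the EM step, so it is emphatically not Amelia's actual algorithm:

```python
# Stylized EMB sketch: each imputed data set uses parameters from a bootstrap
# resample (complete cases stand in for EM here), then missing cells are
# drawn from the conditional normal. A toy illustration, not Amelia itself.
import numpy as np

rng = np.random.default_rng(5)
n = 400
data = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=n)
miss = rng.random(n) < 0.2
data[miss, 1] = np.nan                  # column 1 missing for ~20% of rows

def impute_once(d, rng):
    boot = d[rng.integers(0, len(d), len(d))]   # bootstrap the rows
    cc = boot[~np.isnan(boot[:, 1])]            # complete cases ~ EM stand-in
    mu, cov = cc.mean(axis=0), np.cov(cc.T)
    out = d.copy()
    rows = np.isnan(out[:, 1])
    cond_mean = mu[1] + cov[0, 1] / cov[0, 0] * (out[rows, 0] - mu[0])
    cond_var = cov[1, 1] - cov[0, 1] ** 2 / cov[0, 0]
    out[rows, 1] = rng.normal(cond_mean, np.sqrt(cond_var))
    return out

imputations = [impute_once(data, rng) for _ in range(5)]   # m = 5 copies
```

Each of the m completed data sets is then analyzed with the usual model and the estimates pooled, exactly as with any multiple-imputation output.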