## Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference (2007)

Venue: | Political Analysis |

Citations: | 94 - 32 self |

### BibTeX

@ARTICLE{Ho07matchingas,

author = {Daniel E. Ho and Kosuke Imai and Gary King and Elizabeth A. Stuart},

title = {Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference},

journal = {Political Analysis},

year = {2007},

pages = {199--236}

}

### Abstract

Although published works rarely include causal estimates from more than a few model specifications, authors usually choose the presented estimates from numerous trial runs readers never see. Given the often large variation in estimates across choices of control variables, functional forms, and other modeling assumptions, how can researchers ensure that the few estimates presented are accurate or representative? How do readers know that publications are not merely demonstrations that it is possible to find a specification that fits the author’s favorite hypothesis? And how do we evaluate or even define statistical properties like unbiasedness or mean squared error when no unique model or estimator even exists? Matching methods, which offer the promise of causal inference with fewer assumptions, constitute one possible way forward, but crucial results in this fast-growing methodological

### Citations

4828 |
Neural Networks for Pattern Recognition
- Bishop
- 1995
Citation Context ...s that preprocess data so that subsequent analyses can be improved without modifying existing techniques, such as multiple imputation (Rubin 1987; King et al. 2001) and outlier and feature detection (Bishop 1995, chap. 8). ... conditions for matching as a general method of nonparametric preprocessing, suitable for improving any parametric method. Our general preprocessing strategy also ma... |

1012 |
The central role of the propensity score in observational studies for causal effects
- Rosenbaum, Rubin
- 1983
Citation Context ...n support, see Iacus and Porro (2006). 6.4 The Propensity Score Tautology. A commonly used matching procedure is to summarize all the variables in $X$ with a single variable called the propensity score (Rosenbaum and Rubin 1983). The propensity score is the true probability of unit $i$ receiving treatment, given the covariates $X_i$: $e(X_i) = p(T_i = 1 \mid X_i)$. It is usually estimated via a logistic regression of $T_i$ on a constant term... |
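
The estimation step described in this context can be sketched as follows. This is a minimal illustration, assuming a plain logistic regression of $T_i$ on a constant and the covariates, fitted by Newton's method with NumPy. The function name `estimate_propensity`, the small `ridge` stabilizer, and the simulated data are hypothetical and do not come from the paper or its software.

```python
import numpy as np

def estimate_propensity(X, T, n_iter=25, ridge=1e-8):
    """Estimate e(X_i) = p(T_i = 1 | X_i) with a logistic regression.

    Hypothetical helper: fits the model by Newton's method. The tiny
    ridge term only stabilizes the linear solve; it is not a penalty
    on the likelihood.
    """
    X = np.asarray(X, dtype=float)
    T = np.asarray(T, dtype=float)
    Z = np.column_stack([np.ones(len(X)), X])  # constant term plus covariates
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Z @ beta))    # current fitted probabilities
        grad = Z.T @ (T - p)                   # score of the log-likelihood
        hess = Z.T @ (Z * (p * (1.0 - p))[:, None]) + ridge * np.eye(Z.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-Z @ beta))

# Simulated example (assumed data, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
true_e = 1.0 / (1.0 + np.exp(-(0.5 + X[:, 0] - 0.5 * X[:, 1])))
T = rng.binomial(1, true_e)
e_hat = estimate_propensity(X, T)
```

Because the model includes a constant term, the fitted scores average to the observed treated share, which gives a quick sanity check on convergence.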

413 | Matching as an Econometric Evaluation Estimator: Evidence from Evaluating a Job Training Programme - Heckman, Ichimura, et al. - 1997 |

312 | The Design of Experiments - Fisher - 1935 |

304 |
Characterizing Selection Bias Using Experimental Data
- Heckman, Ichimura, et al.
- 1998
Citation Context ... treatment. The theoretical literature emphasizes that including variables only weakly related to treatment assignment usually reduces bias more than it will increase variance (Rubin and Thomas 1996; Heckman et al. 1998), and so most believe that all available control variables should always be included. However, the theoretical literature has focused primarily on the case where the pool of potential control units i... |

304 |
Statistics and causal inference
- Holland
- 1986
Citation Context ...inate ($T_i = 0$) an incumbent to run in district $i$, one of these potential outcomes is always a counterfactual and thus never observed. This is known as the “fundamental problem of causal inference” (Holland 1986). 2.2 Random Causal Effects. Now, imagine that the potential outcomes in equation (1) are realizations of corresponding random variables (for which we use the corresponding capital letters). This prod... |

275 |
Causal Effects in Non-experimental Studies: Reevaluating the Evaluation of Training Programmes
- Dehejia, Wahba
- 1999
Citation Context ...too (Smith 1997). If, instead, fewer controls are available than those treated, then matching with replacement (allowing each control unit to be matched to more than one treated unit) is a good option (Dehejia and Wahba 1999). Alternatively, we can consider switching the definition of treatment and control groups ... [footnote 14] When matching without replacement, two different approaches of matching nearest neighbors are available. T... |
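
Matching with replacement, as mentioned in this context, can be sketched in a few lines. The example below is an assumption-laden illustration: it matches on a one-dimensional score (for instance an estimated propensity score) by nearest neighbor, and the function name `match_with_replacement` and the toy scores are hypothetical.

```python
import numpy as np

def match_with_replacement(score_treated, score_control):
    """For each treated unit, return the index of the nearest control unit.

    Hypothetical helper: controls may be reused, so the result can contain
    the same control index more than once.
    """
    st = np.asarray(score_treated, dtype=float)
    sc = np.asarray(score_control, dtype=float)
    dist = np.abs(st[:, None] - sc[None, :])  # treated-by-control distance matrix
    return dist.argmin(axis=1)

# Three treated units but only two controls (assumed toy scores).
treated = np.array([0.80, 0.75, 0.30])
control = np.array([0.78, 0.10])
idx = match_with_replacement(treated, control)  # control 0 is used twice
```

With fewer controls than treated units, reuse is unavoidable if every treated unit is to receive a match, which is the situation the passage describes.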

275 |
Alternative methods for evaluating the impact of interventions: An overview
- Heckman, Robb
- 1985
Citation Context ...hat $X_i$ must include all variables that are causally prior to $T_i$, associated with $T_i$, and affect $Y_i$ conditional on $T_i$ (Goldberger 1991; King, Keohane, and Verba 1994), or “selection on observables” (Heckman and Robb 1985). In statistics, this same condition is known as “ignorability,” which means that $T_i$ and the unobserved potential outcomes are independent after conditioning on $X_i$ and the observed potential outcom... |

266 | Counterfactuals - Lewis - 1973 |

236 | Nonparametric Estimation of Average Treatment Effects under Exogeneity: A - Imbens - 2004 |

202 |
Does Matching Overcome LaLonde's Critique of Nonexperimental Estimators
- Smith, Todd
- 2005
Citation Context ...en we use it. If not, we try even more elaborate specifications (such as other functional forms such as CART, neural network analyses, or others) or more sophisticated matching methods (Frölich 2004; Smith and Todd 2005). 6.5 Deciding Which Observations to Match. The collective wisdom of the theoretical literature recommends the following three procedures for the actual process of choosing matched data sets. Unfortun... |

168 | Efficient estimation of average treatment effects using the estimated propensity score. Econometrica 71 - Hirano, Imbens, et al. - 2003 |

145 |
Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation
- King, Honaker, et al.
- 2001
Citation Context ...ea is also similar in spirit to methods in other areas that preprocess data so that subsequent analyses can be improved without modifying existing techniques, such as multiple imputation (Rubin 1987; King et al. 2001) and outlier and feature detection (Bishop 1995, chap. 8). ... conditions for matching as a general method of nonparametric preprocessing, suitable for improving any parametric me... |

122 |
Planning of Experiments
- Cox
- 1958
Citation Context ...llect in a vector denoted $X_i$. Whether preprocessing or not, variables that are even in part a consequence of the treatment variable should never be controlled for when estimating a causal effect (see Cox 1958, sec. 4.2; Rosenbaum 1984; Rosenbaum 2002, 73–4). This is of course a critical point since controlling for the consequences of a causal variable can severely bias a causal inference. For example, con... |

119 | Some thoughts on the distribution of earnings. Oxford Economic Papers - Roy - 1951 |

94 |
A Course in Econometrics
- Goldberger
- 1991
Citation Context ...cal methodology and econometrics as the absence of “omitted variable bias,” so that $X_i$ must include all variables that are causally prior to $T_i$, associated with $T_i$, and affect $Y_i$ conditional on $T_i$ (Goldberger 1991; King, Keohane, and Verba 1994), or “selection on observables” (Heckman and Robb 1985). In statistics, this same condition is known as “ignorability,” which means that $T_i$ and the unobserved poten... |

92 |
Implementing matching estimators for average treatment effects in Stata
- Abadie, Drukker, et al.
- 2004
Citation Context ...e of the two is incorrectly specified), causal estimates will still be consistent. 10 Appendix: Matching Software. A variety of excellent software is available to perform matching (Parsons 2000, 2001; Abadie et al. 2002; Becker and Ichino 2002; Bergstralh and Kosanke 2003; Leuven and Sianesi 2004; Sekhon 2004; Hansen 2005). However, each program implements only a specialized subset of available statistical procedure... |

83 |
How Robust is the Evidence on the Effects of College Quality? Evidence from Matching
- Black, Smith
- 2004
Citation Context ...proaches too, and so they should not be considered competitors. Other seemingly possible alternatives, such as Bayesian model averaging (Hoeting et al. 1999; Imai and King 2004) and cross-validation (Black and Smith 2004), are useful for predictive inference but not directly applicable in the context of causal inference. 2 Definition of Causal Effects. The notation, ideas, and running example in this section parallel ... |

64 |
Estimation of average treatment effects based on propensity scores
- Becker, Ichino
- 2002
Citation Context ...rrectly specified), causal estimates will still be consistent. 10 Appendix: Matching Software. A variety of excellent software is available to perform matching (Parsons 2000, 2001; Abadie et al. 2002; Becker and Ichino 2002; Bergstralh and Kosanke 2003; Leuven and Sianesi 2004; Sekhon 2004; Hansen 2005). However, each program implements only a specialized subset of available statistical procedures. Moreover, they are sp... |

58 |
Unifying Political Methodology: The Likelihood Theory of Statistical Inference. Ann Arbor
- King
- 1989
Citation Context ...s central: Since maximum likelihood is invariant to reparameterization, meaning, for example, that the maximum likelihood estimate (MLE) of $a$ is the same as the positive square root of the MLE of $a^2$ (King 1989, 75–6), we get the same estimate of the expected potential outcomes no matter how $g(\cdot)$ is defined. [footnote 6] When $\mu_0$ and $\mu_1$ are a function of $X_i$, the choice of $g(\cdot)$ is a difficult substantive decision typical... |

56 |
The Consequences of Adjustment for a Concomitant Variable That Has Been Affected by the Treatment
- Rosenbaum
- 1984
Citation Context ...noted $X_i$. Whether preprocessing or not, variables that are even in part a consequence of the treatment variable should never be controlled for when estimating a causal effect (see Cox 1958, sec. 4.2; Rosenbaum 1984; Rosenbaum 2002, 73–4). This is of course a critical point since controlling for the consequences of a causal variable can severely bias a causal inference. For example, controlling for aggregate vot... |

50 |
Bayesian model averaging: A tutorial. Statistical Science 14:382–417
- Hoeting, Madigan, et al.
- 1999
Citation Context ...processing via matching works well in combination with these approaches too, and so they should not be considered competitors. Other seemingly possible alternatives, such as Bayesian model averaging (Hoeting et al. 1999; Imai and King 2004) and cross-validation (Black and Smith 2004), are useful for predictive inference but not directly applicable in the context of causal inference. 2 Definition of Causal Effects. Th... |

41 | Statistical problems in agricultural experiments. Supplement to the - Neyman - 1935 |

39 |
Comparison of multivariate matching methods: Structure, distances, and algorithms
- Gu, Rosenbaum
- 1993
Citation Context ... [footnote 15] If some of the variables in $X$ represent binary variables with very few in one category, common practice is to include them in the propensity score but not in the Mahalanobis distance calculation (Gu and Rosenbaum 1993; Rubin and Thomas 2000). Finally, if finding a matching procedure with good balance and a large number of observations is difficult, subclassification can be a useful technique (Imai and van Dyk 2004... |

36 |
The effectiveness of adjustment by subclassification in removing bias in observational studies
- Cochran
- 1968
Citation Context ...nstruction, approximately constant and thus balanced. Many rely on the theoretical result that five or six subclasses are sufficient to adjust for a univariate covariate such as the propensity score (Cochran 1968; Rosenbaum and Rubin 1984), but applied researchers have not fully appreciated that as $n$ increases more subclasses are generally preferable. In addition, the number and definition of the subclasses s... |
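
Subclassification on a score, as discussed in this context, can be sketched as below. This is a rough illustration under stated assumptions: subclasses are formed at equally spaced quantiles of the score, the effect estimate is a size-weighted average of within-subclass differences in means, and subclasses lacking either treated or control units are skipped. The function name `subclassification_estimate` and the toy data are hypothetical.

```python
import numpy as np

def subclassification_estimate(score, T, Y, n_subclasses=5):
    """Size-weighted average of within-subclass treated-minus-control means.

    Hypothetical helper: subclasses are quantile bins of `score`.
    """
    score, T, Y = (np.asarray(a, dtype=float) for a in (score, T, Y))
    edges = np.quantile(score, np.linspace(0, 1, n_subclasses + 1))
    # Interior edges split the units into labels 0 .. n_subclasses - 1.
    labels = np.searchsorted(edges[1:-1], score, side="right")
    total, n_used = 0.0, 0
    for k in range(n_subclasses):
        in_k = labels == k
        t, c = in_k & (T == 1), in_k & (T == 0)
        if t.any() and c.any():  # skip subclasses without both groups
            total += in_k.sum() * (Y[t].mean() - Y[c].mean())
            n_used += in_k.sum()
    return total / n_used

# Toy data (assumed): the outcome depends on the score plus a constant effect of 3.
score = np.repeat([0.1, 0.3, 0.5, 0.7, 0.9], 4)
T = np.tile([1, 0], 10)
Y = 3.0 * T + 10.0 * score
est = subclassification_estimate(score, T, Y)
```

The `n_subclasses` argument is left adjustable because, as the passage notes, more subclasses are generally preferable as $n$ grows.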

32 | Causal Inference with General Treatment Regimes: Generalizing the Propensity Score - Imai, van Dyk - 2004 |

31 | Misunderstandings among experimentalists and observationalists: Balance test fallacies in causal inference. http://gking.harvard.edu/files/abs/matchse-abs.shtml (accessed September 1 - Imai, King, et al. - 2006 |

31 | The use of matched sampling and regression adjustment to remove bias in observational studies - Rubin - 1973 |

28 | The dangers of extreme counterfactuals. Political Analysis 14:131–59 - King, Zeng - 2006 |

28 | Multivariate and propensity score matching software with automated balance optimization: The matching package for r - Sekhon |

25 | Full Matching in an Observational Study of Coaching for the SAT - Hansen - 2004 |

25 |
Matching with Multiple Controls to Estimate Treatment Effects
- Smith
- 1997
Citation Context ... generally reduce both bias and variance of estimates from subsequent parametric analyses. Finally we note that, although matching discards data, it can actually increase the efficiency of estimates (Smith 1997). This may seem counterintuitive, as it would seem to violate a first principle of statistics, informally described as “more data are better.” However, more data are in fact better only when using ... |

24 | Nonexperimental versus experimental estimates of earnings impacts - Glazerman, Levy, et al. - 2003 |

24 | Techniques for Estimating Switching Regressions - Goldfeld, Quandt - 1976 |

20 | A note on the common support problem in applied evaluation studies. Annales d'Économie et de Statistique - Lechner - 2010 |

20 |
Characterizing the effect of matching using linear propensity score methods with normal distributions. Biometrika 79:797–809
- Rubin, Thomas
- 1992
Citation Context ...heoretical and simulation results that, in a wide range of scenarios, using matched samples can result in substantial bias and variance reduction, compared with using random samples of the same size (Rubin and Thomas 1992, 1996). Similarly, Imai and van Dyk (2004) found reductions in both bias and variance when using subclassification on estimated propensity scores, compared with analyses based on the full data. To be... |

17 | Controlling bias in observational studies: A review. Sankhya: The Indian Journal of Statistics, Series A 35(Part 4):417–66 - Cochran, Rubin - 1973 |

15 | Genetic matching for estimating causal effects: A new method of achieving balance in observational studies. http://sekhon.berkeley.edu/ (accessed September 1 - Diamond, Sekhon - 2005 |

12 |
Affinely Invariant Matching Methods with Discriminant Mixtures of Proportional Ellipsoidally Symmetric Distributions. Working Paper
- Rubin, Stuart
- 2005
Citation Context ...istributions like the normal or $t$ within classes defined by categorical covariates; Rubin and Thomas 1996), as well as “discriminant mixtures of proportional ellipsoidally symmetric distributions” (Rubin and Stuart 2006). ... a fraction of observations, and so variance usually does drop following properly applied matching. Third, the ultimate goal... |

9 | the media, agency waiting costs, and FDA drug approval - Carpenter |

8 | Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference - King, Stuart - 2007 |

7 | Random recursive partitioning: A matching method for the estimation of the average treatment effect - Iacus, Porro - 2006 |

7 |
Matching and Thick Description in an Observational Study of Mortality After Surgery
- Rosenbaum, Silber
- 2001
Citation Context ... are not available to match on (Rosenbaum 2002, chap. 3). Indeed, preprocessing can help researchers better understand their data when supplemented by good qualitative information and research (e.g., Rosenbaum and Silber 2001). If meeting these criteria for balance proves impossible, we then need to recognize that preprocessing by matching may not be helpful. Unfortunately, if preprocessing is ... [footnote 16] For example, the maximum ... |

4 | Did illegal overseas absentee ballots decide the 2000 U.S. presidential election? Perspectives on Politics 2(September):537–49 - Imai, King - 2004 |

3 |
Finite sample properties of propensity score matching and weighting estimators. Review of Economics and Statistics 86:77–90
- Frölich
- 2004
Citation Context ...that works, then we use it. If not, we try even more elaborate specifications (such as other functional forms such as CART, neural network analyses, or others) or more sophisticated matching methods (Frölich 2004; Smith and Todd 2005). 6.5 Deciding Which Observations to Match. The collective wisdom of the theoretical literature recommends the following three procedures for the actual process of choosing matche... |

3 | Using SAS software to perform a case-control match on propensity score in an observational study. http://www2.sas.com/proceedings/sugi25/25/po/25p225.pdf (accessed September 1 - Parsons - 2006 |

2 | Zelig: Everyone’s statistical software. http://gking.harvard.edu/zelig (accessed September 1 - Imai, King, et al. - 2006 |

2 | Gender stereotypes and citizens’ impressions of house candidates’ ideological orientation - Koch |

1 |
Estimation of the conditional variance in paired experiments. KSG working paper. http://ksghome.harvard.edu/;.aabadie.academic.ksg/cve.pdf (accessed September 1
- Abadie, Imbens
- 2006
Citation Context ...rametric estimation is to make as few assumptions as possible, the variance estimation as well as point estimation tend to be based on complicated and sometimes application-specific procedures (e.g., Abadie and Imbens 2006a). In contrast, our perspective (which is similar to the special cases analyzed by some statisticians; for example, Rubin and Thomas 2000) is to begin with what social scientists are now doing, which... |

1 |
Large sample properties of matching estimators for average treatment effects. Econometrica 74:235–67
- Abadie, Imbens
- 2006
Citation Context ...ities and Social Sciences for research support. Software to implement the methods in this paper is available at http://GKing.Harvard.edu/matchit and a replication data file is available as Ho et al. (2006). © The Author 2007. Published by Oxford University Press on behalf of the Society for Political Methodology. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.or... |