## The essential role of pair-matching in cluster-randomized experiments, with application to the Mexican universal health insurance evaluation (2007)

Citations: 16 (8 self)

### BibTeX

@TECHREPORT{Imai07theessential,
    author = {Kosuke Imai},
    title = {The essential role of pair-matching in cluster-randomized experiments, with application to the Mexican universal health insurance evaluation},
    institution = {},
    year = {2007}
}

### Abstract

A basic feature of many field experiments is that investigators are only able to randomize clusters of individuals (such as households, communities, firms, medical practices, schools or classrooms) even when the individual is the unit of interest. To recoup the resulting efficiency loss, some studies pair similar clusters and randomize treatment within pairs. However, many other studies avoid pairing, in part because of claims in the literature, echoed by clinical trials standards organizations, that this matched-pair, cluster-randomization design has serious problems. We argue that all such claims are unfounded. We also prove that the estimator recommended for this design in the literature is unbiased only in situations when matching is unnecessary; its standard error is also invalid. To overcome this problem without modeling assumptions, we develop a simple design-based estimator with much improved statistical properties. We also propose a model-based approach that includes some of the benefits of our design-based estimator as well as the estimator in the literature. Our methods also address individual-level
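The design described in the abstract (pair similar clusters, then randomize treatment within each pair) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `pair_and_randomize`, the use of a single numeric covariate to judge similarity, and the sort-then-pair-adjacent rule are all assumptions made for the example.

```python
import random


def pair_and_randomize(clusters, seed=0):
    """Pair similar clusters and randomize treatment within each pair.

    clusters: list of (name, covariate) tuples; the length must be even.
    Similarity is judged on one numeric covariate (an assumption of this
    sketch): clusters are sorted by it and adjacent clusters are paired.
    Returns (pairs, assignment), where assignment maps name -> 1 (treated)
    or 0 (control).
    """
    if len(clusters) % 2 != 0:
        raise ValueError("need an even number of clusters to form pairs")

    rng = random.Random(seed)  # seeded so the assignment is reproducible
    ordered = sorted(clusters, key=lambda c: c[1])

    pairs = []
    assignment = {}
    for i in range(0, len(ordered), 2):
        first, second = ordered[i][0], ordered[i + 1][0]
        # Fair coin flip decides which member of the pair is treated.
        treat_first = rng.random() < 0.5
        assignment[first] = 1 if treat_first else 0
        assignment[second] = 0 if treat_first else 1
        pairs.append((first, second))
    return pairs, assignment


if __name__ == "__main__":
    # Hypothetical clusters with a baseline covariate.
    demo = [("a", 1.0), ("b", 10.0), ("c", 1.2), ("d", 9.5)]
    demo_pairs, demo_assignment = pair_and_randomize(demo, seed=42)
    print(demo_pairs)
    print(demo_assignment)
```

Whatever the coin flips turn out to be, each pair contains exactly one treated and one control cluster, which is the defining property of the matched-pair design. Real applications typically match on several covariates at once; the single-covariate sort is only the simplest stand-in.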

### Citations

585 |
Identification of Causal Effects Using Instrumental Variables
- Angrist, Imbens, et al.
- 1996
Citation Context ...onsider the two types of causal quantities of interest in matched-pair cluster-randomized encouragement designs – the intention-to-treat (ITT) effect and the complier average causal effect (CACE) (Angrist et al., 1996). The ITT effect is the average causal effect of encouragement (rather than treatment) and is equivalent to the various versions of the average treatment effect in Section 3.3 (i.e., SATE, CATE, UATE... |

424 | Statistics and causal inference - Holland - 1986 |

402 | Statistics for experimenters - Box, Hunter, et al. - 1978 |

395 | Design of Experiments - Fisher - 1935 |

363 |
Statistical methods
- Snedecor, Cochran
- 1980
Citation Context ...hypothesis testing, denoted by α and β, respectively. In particular, the goal is to calculate the sample size required to achieve a given degree of power, 1−β, against a particular alternative (e.g., Snedecor and Cochran, 1989, Section 6.14). Such a calculation can be conducted by using the power functions derived in Section 5.3.1. For example, suppose that the estimand is UATE and cluster sizes are equal. Then, using Equa... |

356 | Nonparametric Estimation of Average Treatment Effects under Exogeneity: A - Imbens |

192 | On the application of probability theory to agricultural experiments. Essay on principles. Translated by D. M. Dabrowska and edited by T - Neyman - 1990 |

176 |
Sampling: Design and analysis
- Lohr
- 1999
Citation Context ...fact, ICC is of little use for design-based nonparametric analysis of cluster randomized experiments for more general reasons. Most importantly, “The ICC is only defined for clusters of equal sizes” (Lohr, 1999, p.140) whereas most cluster-randomized experiments involve clusters of unequal sizes. When cluster sizes vary, an alternative measure of within-cluster homogeneity sometimes used in the literature i... |

154 |
The Planning of Experiments
- Cox
- 1958
Citation Context ...s the difference between two unit-level potential outcomes that are the functions of the cluster-level treatment variable. Thus, in cluster-randomized trials, the usual assumption of no interference (Cox, 1958; Rubin, 1990) needs to be made only at the cluster level. Moreover, in the case of the matched-pair design, assuming no interference only between pairs of clusters is sufficient. This advantage of ma... |

127 |
Design and Analysis of Cluster Randomization Trials in Health Research
- Donner, Klar
- 2000
Citation Context ...ative reasons, researchers conducting field experiments are often unable to randomize treatment assignment to individuals and so instead randomize treatments to clusters of individuals (Murray, 1998; Donner and Klar, 2000; Raudenbush et al., 2007). For example, 19 (68%) of the 28 field experiments we found published in major political science journals since 2000 randomized households, precincts, city-blocks, or villag... |

57 |
Simple Sample Size Calculation for Cluster Randomized Trials
- Hayes, Bennett
- 1999
Citation Context ...on a weight using the harmonic mean of sample cluster sizes, which using our notation can be written as $\hat{\psi}(n_{1k} n_{2k}/(n_{1k} + n_{2k}))$ (see e.g., Donner, 1987; Donner and Donald, 1987; Donner and Klar, 1993; Hayes and Bennett, 1999; Bloom, 2006; Raudenbush, 1997; Turner et al., 2007). As we show below, however, this estimator relies on assumptions unlikely to be met in practice, and has undesirable properties if those assumptio... |

46 |
The Effect of High School Matriculation Awards: Evidence from Randomized Trials
- Angrist, Lavy
- 2004
Citation Context ...hough individuals are the units of interest (e.g., Sommer et al., 1986; Varnell et al., 2004); and numerous education researchers randomize schools, classrooms, or teachers instead of students (e.g., Angrist and Lavy, 2002). Since statistical efficiency drops when randomizing clusters of individuals instead of individuals themselves (Cornfield, 1978), many researchers attempt to recoup a portion of this lost efficiency... |

44 |
Statistical analysis and optimal design for cluster randomized trials
- Raudenbush
- 1997
Citation Context ... sample cluster sizes, which using our notation can be written as $\hat{\psi}(n_{1k} n_{2k}/(n_{1k} + n_{2k}))$ (see e.g., Donner, 1987; Donner and Donald, 1987; Donner and Klar, 1993; Hayes and Bennett, 1999; Bloom, 2006; Raudenbush, 1997; Turner et al., 2007). As we show below, however, this estimator relies on assumptions unlikely to be met in practice, and has undesirable properties if those assumptions are not met. We also prove t... |

34 | Misunderstandings among Experimentalists and Observationalists about Causal Inference - Imai, King, et al. - 2008 |

31 | Estimating causal effects - Maldonado, Greenland - 2002 |

25 | Randomization by group: A formal analysis - Cornfield - 1978 |

24 | Cluster trials in implementation research: estimation of intracluster correlation coefficients and sample size. Stat Med - Campbell, Mollison, et al. - 2001 |

21 |
Pitfalls and controversies in cluster randomization trials
- Donner, Klar
Citation Context .... . . [causal effects across clusters], and difficulties in estimating the intracluster correlation coefficient, a measure of similarity among cluster members” (Klar and Donner 1997, p.1754; see also Donner and Klar 2004), are also made by many other researchers and even various clinical trial standards organizations (e.g., Feng et al., 2001; Medical Research Council, 2002). Another reason to worry about the matched-... |

19 |
Optimal multivariate matching before randomization
- Greevy, Lu, et al.
- 2004
Citation Context ...g., Ball and Bogatz, 1972; Gail et al., 1992; Hill et al., 1999). Since matching prior to random treatment assignment can greatly improve the efficiency of causal effect estimation (Box et al., 1978; Greevy et al., 2004; Imai et al., In-press), and matching in pairs can be substantially more efficient than matching in larger blocks, matched-pair, cluster-randomization would appear to be a very attractive design for ... |

19 | What Do Randomized Studies of Housing Mobility Demonstrate? Causal Inference in the Face of Interference
- Sobel
Citation Context ...n $Y_{ijk}(T)$. Since $T_{1k} = Z_k$ and $T_{2k} = 1 - Z_k$, it is clear that $Y_{ijk}(T_{jk})$ only depends on $Z_k$. Given that in many social experiments the assumption of no interference between units is highly unrealistic (Sobel, 2006), cluster-randomized trials offer an attractive alternative in the field experiments where social interactions among units are expected to occur. Indeed, in our Mexican experiment, this assumption is... |

19 |
Design and analysis of group-randomized trials: A review of recent methodological developments
- Murray, Varnell, et al.
- 2004
Citation Context ...l, 2004), randomization occurs at the level of health clinics, physicians, or other administrative and geographical units even though individuals are the units of interest (e.g., Sommer et al., 1986; Varnell et al., 2004); and numerous education researchers randomize schools, classrooms, or teachers instead of students (e.g., Angrist and Lavy, 2002). Since statistical efficiency drops when randomizing clusters of ind... |

15 | Some aspects of the design and analysis of cluster randomization trials - Donner |

14 | Strategies for improving precision in group-randomized experiments. Educ Eval Policy Anal
- Raudenbush, Martinez, et al.
- 2007
Citation Context ...ers conducting field experiments are often unable to randomize treatment assignment to individuals and so instead randomize treatments to clusters of individuals (Murray, 1998; Donner and Klar, 2000; Raudenbush et al., 2007). For example, 19 (68%) of the 28 field experiments we found published in major political science journals since 2000 randomized households, precincts, city-blocks, or villages even though individual... |

12 | Impact of Vitamin A Supplementation on Childhood Mortality - Sommer - 1986 |

11 | Clustered Encouragement Designs with Individual Noncompliance: Bayesian Inference with Randomization, and Application to Advance Directive Forms (with discussion). Biostatistics 3:147–164 - Frangakis, Rubin, et al. - 2002 |

11 | The effect of matching on the power of randomized community intervention studies. Stat Med - Martin, Diehr, et al. - 1993 |

11 | Interference Between Units in Randomized Experiments - Rosenbaum - 2007 |

10 | Optimal permutation tests for the analysis of group randomized trials - Braun, Feng - 2001 |

10 |
Comments on “On the application of probability theory to agricultural experiments. Essay on principles. Section 9” by J. Splawa-Neyman translated from the Polish and edited by
- Rubin
- 1990
Citation Context ...rence between two unit-level potential outcomes that are the functions of the cluster-level treatment variable. Thus, in cluster-randomized trials, the usual assumption of no interference (Cox, 1958; Rubin, 1990) needs to be made only at the cluster level. Moreover, in the case of the matched-pair design, assuming no interference only between pairs of clusters is sufficient. This advantage of matched-pair de... |

9 |
Using cluster randomized field experiments to study voting behavior. The Annals of the American Academy of Political and Social Science 601(1):169
- Arceneaux
Citation Context ...periments we found published in major political science journals since 2000 randomized households, precincts, city-blocks, or villages even though individual voters were the inferential target (e.g., Arceneaux, 2005); in public health and medicine, where “the number of trials reporting a cluster design has risen exponentially since 1997” (Campbell, 2004), randomization occurs at the level of health clinics, phys... |

9 |
Statistics for experimenters. Wiley-Interscience, 1st edition
- Box, Hunter, et al.
- 1978
Citation Context ...ent assignment (e.g., Ball and Bogatz, 1972; Gail et al., 1992; Hill et al., 1999). Since matching prior to random treatment assignment can greatly improve the efficiency of causal effect estimation (Box et al., 1978; Greevy et al., 2004; Imai et al., In-press), and matching in pairs can be substantially more efficient than matching in larger blocks, matched-pair, cluster-randomization would appear to be a very a... |

9 |
A ‘politically robust’ experimental design for public policy evaluation, with application to the Mexican universal health insurance program. Journal of Policy Analysis and Management 26(3
- King, Gakidou, et al.
- 2007
Citation Context ...erventions by politicians and others that have ruined many policy evaluations, such as when office-holders unexpectedly arrange program benefits for their constituents in some control group clusters (King et al., 2007). Unfortunately, despite its apparent benefits and common usage, this experimental design has an uncertain status within parts of the methodological literature. For example, Klar and Donner (1997) cl... |

9 | The merits of matching in community intervention trials: a cautionary tale. Stat Med - Klar, Donner - 1997 |

8 |
Confidence interval construction for effect measures arising from cluster randomization trials
- Donner, Klar
- 1993
Citation Context ...al literature is based on a weight using the harmonic mean of sample cluster sizes, which using our notation can be written as $\hat{\psi}(n_{1k} n_{2k}/(n_{1k} + n_{2k}))$ (see e.g., Donner, 1987; Donner and Donald, 1987; Donner and Klar, 1993; Hayes and Bennett, 1999; Bloom, 2006; Raudenbush, 1997; Turner et al., 2007). As we show below, however, this estimator relies on assumptions unlikely to be met in practice, and has undesirable prop... |

8 |
Selected statistical issues in group randomized trials
- Feng, Diehr, et al.
- 2001
Citation Context ...similarity among cluster members” (Klar and Donner 1997, p.1754; see also Donner and Klar 2004), are also made by many other researchers and even various clinical trial standards organizations (e.g., Feng et al., 2001; Medical Research Council, 2002). Another reason to worry about the matched-pair cluster randomized design is that, to our knowledge, there exists no published formal evaluation of the statistical pr... |

8 | Standardization: A technique to control for extraneous variables - Kalton - 1968 |

7 | The design of the New York School Choice Scholarship Program evaluation - Hill, Rubin, et al. - 1999 |

6 |
The Core Analytics of Randomized Experiments for Social Research. Working Paper
- Bloom
Citation Context ...monic mean of sample cluster sizes, which using our notation can be written as $\hat{\psi}(n_{1k} n_{2k}/(n_{1k} + n_{2k}))$ (see e.g., Donner, 1987; Donner and Donald, 1987; Donner and Klar, 1993; Hayes and Bennett, 1999; Bloom, 2006; Raudenbush, 1997; Turner et al., 2007). As we show below, however, this estimator relies on assumptions unlikely to be met in practice, and has undesirable properties if those assumptions are not me... |

6 |
Evidence-based health policy: three generations of reform in Mexico. Lancet 362(9396
- Frenk, Sepulveda
- 2003
Citation Context ...we are conducting of Seguro Popular de Salud (SPS, the Mexican universal health insurance program). The program’s “aim is to provide social protection in health to the 50 million uninsured Mexicans” (Frenk et al., 2003, p.1667), constituting about half the population, through one of the largest health policy reforms in any country in the last twenty years (King et al., 2007). The government intends to spend an addi... |

6 | On design considerations and randomization-based inference for community intervention trials. Stat. Med - Gail, Mark, et al. - 1996 |

6 | The Cochrane Handbook for Systematic Reviews - Higgins, Green |

6 | Public policy for the poor? A randomised assessment of the Mexican universal health insurance programme. Lancet 373 - King, Gakidou - 2009 |

5 |
CONSORT statement: extension to cluster randomised trials
- Campbell, Elbourne, et al.
- 2004
Citation Context ...uct and analysis of cluster-randomized experiments, which closely follow current methodological literature. These include the extension to the “CONSORT” agreement among the major biomedical journals (Campbell et al., 2004), the Cochrane Collaboration requirements for reviewing research (Higgins and Green, 2006, sec. 8.11.2), the prominent Medical Research Council (2002) guidelines, and the education research What Work... |

5 |
Analysis of data arising from stratified design with the cluster as unit of randomization. Stat Med
- Donner, Donald
- 1987
Citation Context ...ended in the methodological literature is based on a weight using the harmonic mean of sample cluster sizes, which using our notation can be written as $\hat{\psi}(n_{1k} n_{2k}/(n_{1k} + n_{2k}))$ (see e.g., Donner, 1987; Donner and Donald, 1987; Donner and Klar, 1993; Hayes and Bennett, 1999; Bloom, 2006; Raudenbush, 1997; Turner et al., 2007). As we show below, however, this estimator relies on assumptions unlikely to be met in practice, a... |

5 | Finite Mixture Models - McLachlan - 2000 |

4 |
Extending CONSORT to include cluster trials
- Campbell
Citation Context ...hough individual voters were the inferential target (e.g., Arceneaux, 2005); in public health and medicine, where “the number of trials reporting a cluster design has risen exponentially since 1997” (Campbell, 2004), randomization occurs at the level of health clinics, physicians, or other administrative and geographical units even though individuals are the units of interest (e.g., Sommer et al., 1986; Varnell... |

4 | Variance identification and efficiency analysis in randomized experiments under the matched-pair design. Statistics in Medicine 27(24):4857 - Imai |

4 | Statistical methods. 8th ed. Iowa State Univ. Press, Ames - Snedecor, Cochran - 1989 |

3 |
Statistical methodology for paired cluster designs
- Donner
- 1987
Citation Context ...ommonly recommended in the methodological literature is based on a weight using the harmonic mean of sample cluster sizes, which using our notation can be written as $\hat{\psi}(n_{1k} n_{2k}/(n_{1k} + n_{2k}))$ (see e.g., Donner, 1987; Donner and Donald, 1987; Donner and Klar, 1993; Hayes and Bennett, 1999; Bloom, 2006; Raudenbush, 1997; Turner et al., 2007). As we show below, however, this estimator relies on assumptions unlikely to be met in practice, and has undesirable properties if those assumptions are not met... |

3 | Randomization-based inference and efficiency analysis in experiments under the matched-pair design - Imai - 2007 |
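
Several citation contexts above describe the estimator commonly recommended in the literature as one that weights each pair by $n_{1k} n_{2k}/(n_{1k} + n_{2k})$, where $n_{1k}$ and $n_{2k}$ are the sample sizes of the two clusters in pair $k$. The sketch below shows what such a weighted estimate might look like. It assumes, beyond what the excerpts state, that the estimator is a weighted average of within-pair differences in cluster means (treated minus control); the function name and data layout are hypothetical. It illustrates the estimator the paper criticizes, not the design-based estimator the paper proposes.

```python
def harmonic_weighted_estimate(pairs):
    """Weighted average of within-pair differences in cluster means.

    pairs: list of dicts with keys "treated" and "control", each holding a
    non-empty list of individual-level outcomes for that cluster.
    Pair k receives weight n_1k * n_2k / (n_1k + n_2k), as quoted in the
    citation contexts. The weighted-average form itself is an assumption
    of this sketch.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for pair in pairs:
        n_t = len(pair["treated"])
        n_c = len(pair["control"])
        # Difference between the treated and control cluster means.
        diff = sum(pair["treated"]) / n_t - sum(pair["control"]) / n_c
        weight = n_t * n_c / (n_t + n_c)
        weighted_sum += weight * diff
        total_weight += weight
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Hypothetical outcomes for two matched pairs of clusters.
    demo = [
        {"treated": [1, 3], "control": [0, 2]},
        {"treated": [5, 5, 5, 5], "control": [1, 1, 1, 1]},
    ]
    print(harmonic_weighted_estimate(demo))
```

In the demo data the first pair has a difference of 1 with weight 1 and the second a difference of 4 with weight 2, so the estimate is (1·1 + 2·4)/3 = 3. Because the weights are normalized by their sum, any constant factor in the weight (such as the factor of 2 in the usual harmonic mean) cancels out.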