## The Foundations of Causal Inference (2010)

Venue: Submitted to Sociological Methodology

Citations: 6 (2 self)

### BibTeX

@MISC{Pearl10thefoundations,
  author = {Judea Pearl},
  title = {The Foundations of Causal Inference},
  year = {2010}
}

### Abstract

This paper reviews recent advances in the foundations of causal inference and introduces a systematic methodology for defining, estimating and testing causal claims in experimental and observational studies. It is based on non-parametric structural equation models (SEM) – a natural generalization of those used by econometricians and social scientists in the 1950-60s, and provides a coherent mathematical foundation for the analysis of causes and counterfactuals. In particular, the paper surveys the development of mathematical tools for inferring the effects of potential interventions (also called “causal effects” or “policy evaluation”), as well as direct and indirect effects (also known as “mediation”), in both linear and non-linear systems. Finally, the paper clarifies the role of propensity score matching in causal analysis, defines the relationships between the structural and

### Citations

7054 |
Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference
- Pearl
- 1988
Citation Context ...tality of all those assumptions implies that Z is unassociated with Y in every stratum of X. Such testable implications can be read off the diagrams using a graphical criterion known as d-separation (=-=Pearl, 1988-=-). Definition 1 (d-separation) A set S of nodes is said to block a path p if either (1) p contains at least one arrow-emitting node that is in S, or (2) p contains at least one collision node that is ... |

1372 | The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations
- Baron, Kenny
- 1986
Citation Context ...th coefficients associated with the structural equations. Yet despite its ubiquity, the analysis of mediation has long been a thorny issue in the social and behavioral sciences (Judd and Kenny, 1981; =-=Baron and Kenny, 1986-=-; Muller et al., 2005; Shrout and Bolger, 2002; MacKinnon et al., 2007a) primarily because structural equation modeling in those sciences was deeply entrenched in linear analysis, where the distincti... |

1069 |
Econometric analysis of cross section and panel data
- Wooldridge
- 2002
Citation Context ...nships of this type explain the slow acceptance of causal analysis among health scientists and statisticians, and why most economists and social scientists continue to use structural equation models (=-=Wooldridge, 2002-=-; Stock and Watson, 2003; Heckman, 2008) instead of the potential-outcome alternatives advocated in Angrist et al. (1996); Holland (1988); Sobel (1998); and Sobel (2008). On the other hand, the algebr... |

1014 |
The central role of the propensity score in observational studies for causal effects. Biometrika
- Rosenbaum, Rubin
- 1983
Citation Context ...ematically to diagrams of any size and shape, thus freeing analysts from judging whether “X is conditionally ignorable given S,” a formidable mental task required in the potential-response framework (=-=Rosenbaum and Rubin, 1983-=-). The criterion also enables the analyst to search for an optimal set of covariates—namely, a set S that minimizes measurement cost or sampling variability (Tian et al., 1998). All in all, one can sa... |

749 |
Structural Equations with Latent Variables
- Bollen
- 1989
Citation Context ...omitted factors, we allow in effect for the presence of latent variables affecting both X and Y , as shown explicitly in Figure 1(c), which is the standard representation in the SEM literature (e.g., =-=Bollen, 1989-=-). In contrast to traditional latent variable models, however, our attention will not be focused on the connections among such latent variables but, rather, on the causal effects that those variables ... |

496 | Causation, Prediction, and Search - Spirtes, Glymour, et al. - 1993 |

468 |
Estimating causal effects of treatments in randomized and nonrandomized studies
- Rubin
- 1974
Citation Context ...y of the judgments upon which the analysis so crucially depends. How do we recognize causal expressions in the social science literature? Those versed in the potential-outcome notation (Neyman, 1923; =-=Rubin, 1974-=-; Holland, 1988; Sobel, 1996) can recognize such expressions through the subscripts that are attached to counterfactual events and variables—for exam... [Footnote 2: By “untested” I mean untested using frequency dat...] |

441 |
Graphical models in applied multivariate statistics
- Whittaker
- 1990
Citation Context ... defined procedurally by $\beta \triangleq E(Y \mid do(x_0 + 1)) - E(Y \mid do(x_0)) = \frac{\partial}{\partial x} E(Y \mid do(x)) = \frac{\partial}{\partial x} E(Y_x)$. Naturally, all attempts to give β statistical interpretation have ended in frustrations (Holland, 1988; =-=Whittaker, 1990-=-; Wermuth, 1992; Wermuth and Cox, 1993), some persisting well into the twenty-first century (Sobel, 2008). ... where assumptions are conveyed through the missing arrows in the diagram. If numerical or fun... |

429 |
Identification of causal effects using instrumental variables
- Angrist, Imbens, et al.
- 1996
Citation Context ...; and Sobel (2008). On the other hand, the algebraic machinery offered by the counterfactual notation, Yx(u), once a problem is properly formalized, can be extremely powerful in refining assumptions (=-=Angrist et al., 1996-=-; Heckman and Vytlacil, 2005), deriving consistent estimands (Robins, 1986), bounding probabilities of necessary and sufficient causation (Tian and Pearl, 2000), and combining data from experimental a... |

275 |
Causal Effects in Non-experimental Studies: Reevaluating the Evaluation of Training Programmes
- Dehejia, Wahba
- 1999
Citation Context ...d untreated subjects somehow eliminates confounding from the data and contributes therefore to overall bias reduction. This tendency was further reinforced by empirical studies (Heckman et al., 1998; =-=Dehejia and Wahba, 1999-=-) in which agreement was found between propensity score analysis and randomized trials, and in which the agreement was attributed to the ability of the former to “balance” treatment and control groups... |

241 | The Moderator-Mediator Variable Distinction - Baron, Kenny - 1986 |

217 |
Equivalence and synthesis of causal models
- Verma, Pearl
- 1990
Citation Context ... [Figure 2 caption: The diagrams associated with (a) the structural model of equation (5) and (b) the modified model of equation (6), representing the intervention do(X = x0).] ...[con]ditions for model equivalence (=-=Verma and Pearl, 1990-=-; Ali et al., 2009) that are mathematically proven and should therefore supersede the heuristic (and occasionally false) rules prevailing in social science research (Lee and Hershberger, 1990). 3.2 Fr... |

205 | A theory of inferred causation - Pearl, Verma - 1991 |

202 |
Does Matching Overcome LaLonde's Critique of Nonexperimental Estimators?
- Smith, Todd
- 2005
Citation Context ...wever, have taken a more critical view of propensity scores, noting with disappointment that a substantial bias is sometimes measured when careful comparisons are made to results of clinical studies (=-=Smith and Todd, 2005-=-; Luellen et al., 2005; Peikes et al., 2008). The reason for these disappointments lies in a popular belief that adding more covariates can cause no harm (Rosenbaum, 2002, p. 76), which seems to absolve one from thinking about the causal relationships among those covariates, the treatment, the outcome and, most importantly, the confounders left unmeasured (Rubin, 2009... |

186 |
Conditional independence in statistical theory
- Dawid
- 1979
Citation Context ...it a unique solution to the query of interest. For example, if we can [Footnote 26: The notation Y ⊥ X | Z stands for the conditional independence relationship P(Y = y, X = x|Z = z) = P(Y = y|Z = z)P(X = x|Z = z) (=-=Dawid, 1979-=-).] plausibly assume that in Figure 4 a set Z of covariates satisfies the conditional independence Yx ⊥ X | Z (38) (an assumption termed “conditional ignorability” by Rosenbaum and Rubin (1983)), then the... |

173 | Matching as an econometric evaluation estimator. Review of Economic Studies 65 - Heckman, Ichimura, et al. - 1998 |

172 | Causal diagrams for empirical research - Pearl - 1995 |

162 |
Observational Studies
- Rosenbaum
- 2002
Citation Context ...ults of clinical studies (Smith and Todd, 2005; Luellen et al., 2005; Peikes et al., 2008). The reason for these disappointments lies in a popular belief that adding more covariates can cause no harm (=-=Rosenbaum, 2002-=-, p. 76), which seems to absolve one from thinking about the causal relationships among those covariates, the treatment, the outcome and, most importantly, the confounders left unmeasured (Rubin, 2009... |

157 |
Matching as an econometric evaluation estimator
- Heckman, Ichimura, et al.
- 1998
Citation Context ...L, matching treated and untreated subjects somehow eliminates confounding from the data and contributes therefore to overall bias reduction. This tendency was further reinforced by empirical studies (=-=Heckman et al., 1998-=-; Dehejia and Wahba, 1999) in which agreement was found between propensity score analysis and randomized trials, and in which the agreement was attributed to the ability of the former to “balance” tre... |

153 |
A new approach to causal inference in mortality studies with sustained exposure periods - Application to control of the healthy worker survivor effect
- Robins
- 1986
Citation Context ...ioning, and the axioms of conditional independence. Naturally, these hypothetical entities are not entirely whimsy. They are assumed to be connected to observed variables via consistency constraints (=-=Robins, 1986-=-) such as $X = x \implies Y_x = Y$, (35) which states that, for every u, if the actual value of X turns out to be x, then the value that Y would take on if “X were x” is equal to the actual value of Y (Pearl, ... |

145 |
On the Application of Probability Theory to Agricultural Experiments. Essay on Principles
- Neyman
- 1923
Citation Context ...the reliability of the judgments upon which the analysis so crucially depends. How do we recognize causal expressions in the social science literature? Those versed in the potential-outcome notation (=-=Neyman, 1923-=-; Rubin, 1974; Holland, 1988; Sobel, 1996) can recognize such expressions through the subscripts that are attached to counterfactual events and variables—for exam... [Footnote 2: By “untested” I mean untested using ...] |

143 | Correlation and Causation - Wright - 1921 |

141 | Mediation in experimental and nonexperimental studies: New procedures and recommendations
- Shrout, Bolger
- 2002
Citation Context ...al equations. Yet despite its ubiquity, the analysis of mediation has long been a thorny issue in the social and behavioral sciences (Judd and Kenny, 1981; Baron and Kenny, 1986; Muller et al., 2005; =-=Shrout and Bolger, 2002-=-; MacKinnon et al., 2007a) primarily because structural equation modeling in those sciences was deeply entrenched in linear analysis, where the distinction between causal parameters and their regress... |

138 | Issues and Opinion on Structural Equation Modeling - Chin - 1998 |

132 | Causal diagrams for epidemiologic research. Epidemiology - Greenland, Pearl, et al. - 1999 |

117 | Recent developments in the econometrics of program evaluation - Imbens, Wooldridge - 2009 |

113 | Nonparametric bounds on treatment effects - Manski - 1990 |

95 | Comment: Graphical models, causality, and intervention
- Pearl
- 1993
Citation Context ...of W3 on Y), and, finally, we combine the two effects together and obtain $P(y \mid do(x)) = \sum_{w_3} P(w_3 \mid do(x))\, P(y \mid do(w_3))$. (28) In this example, the variable W3 acts as a “mediating instrumental variable” (=-=Pearl, 1993-=-b; Chalak and White, 2006; Morgan and Winship, 2007). The analysis used in the derivation and validation of such results invokes mathematical rules of transforming causal quantities, represented by ex... |

95 | Identifiability and exchangeability for direct and indirect effects - Robins, Greenland - 1992 |

93 |
The Statistical Implications of a System of Simultaneous Equations
- Haavelmo
- 1943
Citation Context ...hosen structural equation models and their associated causal diagrams as the primary language for causal analysis. Influenced by the pioneering work of Sewall Wright (1923) and early econometricians (=-=Haavelmo, 1943-=-; Simon, 1953; Marschak, 1950; Koopmans, 1953), Blalock (1964) and Duncan (1975) considered SEM a mathematical tool for drawing causal conclusions from a combination of observational data and theoreti... |

85 |
Causal Inference without Counterfactuals (with Discussion)
- Dawid
- 2000
Citation Context ...y to the joint statement “Y would be y if X = x and Y would be y′ if X = x′.” 17 Such concerns have been a source of objections to treating counterfactuals as jointly distributed random variables (=-=Dawid, 2000-=-). The definition of Yx and Yx′ in terms of two distinct submodels neutralizes these objections (Pearl, 2000b), since the contradictory joint statement is mapped into an ordinary event, one where the... |

83 |
A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators
- Muthén
- 1984
Citation Context ...conomics and the behavioral and social sciences (Goldberger, 1972; Duncan, 1975; Bollen, 1989). However, the bulk of SEM methodology was developed for linear analysis, with only a few attempts (e.g., =-=Muthén, 1984-=-; Winship and Mare, 1983; Bollen, 1989, ch. 9) to extend its capabilities to models involving discrete variables, nonlinear dependencies, and heterogeneous effect modifications. 9 A central requiremen... |

74 | Direct and indirect effects
- Pearl
Citation Context ... These include direct and indirect effects, or “mediation,” a topic with long tradition in social science research, which only recently has been given a satisfactory formulation in nonlinear systems (=-=Pearl, 2001-=-, 2010b). [Section 2: From Association to Causation. Section 2.1: The Basic Distinction and Its Implications.] The aim of standard statistical analysis, typified by regression, estimation, and hypothesis testing techniques... |

72 |
Causal Ordering and Identifiability
- Simon
- 1953
Citation Context ... equation models and their associated causal diagrams as the primary language for causal analysis. Influenced by the pioneering work of Sewall Wright (1923) and early econometricians (Haavelmo, 1943; =-=Simon, 1953-=-; Marschak, 1950; Koopmans, 1953), Blalock (1964) and Duncan (1975) considered SEM a mathematical tool for drawing causal conclusions from a combination of observational data and theoretical assumptio... |

71 | The Analysis of Randomized and Non-Randomized AIDS Treatment Trials Using a New Approach to Causal Inference in Longitudinal Studies - Robins - 1989 |

68 |
Causal inference, path analysis, and recursive structural equation models
- Holland
- 1988
Citation Context ...ments upon which the analysis so crucially depends. How do we recognize causal expressions in the social science literature? Those versed in the potential-outcome notation (Neyman, 1923; Rubin, 1974; =-=Holland, 1988-=-; Sobel, 1996) can recognize such expressions through the subscripts that are attached to counterfactual events and variables—for exam... [Footnote 2: By “untested” I mean untested using frequency data in nonexperim...] |

67 |
Process analysis: Estimating mediation in treatment evaluations
- Judd, Kenny
- 1981
Citation Context ...trix represents the path coefficients associated with the structural equations. Yet despite its ubiquity, the analysis of mediation has long been a thorny issue in the social and behavioral sciences (=-=Judd and Kenny, 1981-=-; Baron and Kenny, 1986; Muller et al., 2005; Shrout and Bolger, 2002; MacKinnon et al., 2007a) primarily because structural equation modeling in those sciences was deeply entrenched in linear analys... |

66 |
Introduction to structural equation models
- Duncan
- 1975
Citation Context ... From Linear to Nonparametric Models and Graphs Structural equation modeling (SEM) has been the main vehicle for effect analysis in economics and the behavioral and social sciences (Goldberger, 1972; =-=Duncan, 1975-=-; Bollen, 1989). However, the bulk of SEM methodology was developed for linear analysis, with only a few attempts (e.g., Muthén, 1984; Winship and Mare, 1983; Bollen, 1989, ch. 9) to extend its capabi... |

66 |
Introduction to Econometrics
- Stock, Watson
Citation Context ...e explain the slow acceptance of causal analysis among health scientists and statisticians, and why most economists and social scientists continue to use structural equation models (Wooldridge, 2002; =-=Stock and Watson, 2003-=-; Heckman, 2008) instead of the potential-outcome alternatives advocated in Angrist et al. (1996); Holland (1988); Sobel (1998); and Sobel (2008). On the other hand, the algebraic machinery offered by... |

65 | Axiomatizing causal reasoning - Halpern |

65 | Using matching, instrumental variables, and control functions to estimate economic choice models. The Review of Economics and Statistics 86: 30–57 - Heckman, Navarro-Lozano |

61 | Probabilistic Causality - Eells - 1991 |

61 |
Probabilistic evaluation of sequential plans from causal models with hidden variables
- Pearl, Robins
- 1995
Citation Context ...cted to interventions on a single variable; it is applicable to simultaneous or sequential interventions such as those invoked in the analysis of time-varying treatment with time-varying confounders (=-=Pearl and Robins, 1995-=-; Arjas and Parner, 2004). For example, if X and Z2 are both treatment variables, and Z1 and Z3 are measured covariates, then the postintervention distribution would be P(z1, z3, y|do(x), do(z2)) = P(... |

60 | Structural equations, treatment effects, and econometric policy evaluation
- Heckman, Vytlacil
- 2005
Citation Context ... the other hand, the algebraic machinery offered by the counterfactual notation, Yx(u), once a problem is properly formalized, can be extremely powerful in refining assumptions (Angrist et al., 1996; =-=Heckman and Vytlacil, 2005-=-), deriving consistent estimands (Robins, 1986), bounding probabilities of necessary and sufficient causation (Tian and Pearl, 2000), and combining data from experimental and nonexperimental studies (... |

56 | Bounds on Treatment Effects from Studies with Imperfect Compliance
- Balke, Pearl
- 1997
Citation Context ...s in their corresponding equations. As an example, consider the model shown in Figure 5, which serves as the canonical representation for the analysis of instrumental variables (Angrist et al., 1996; =-=Balke and Pearl, 1997-=-). This model displays the following parent sets: $PA_Z = \{\emptyset\}$, $PA_X = \{Z\}$, $PA_Y = \{X\}$. (42) Consequently, the exclusion restrictions translate into $X_z = X_{yz}$, $Z_y = Z_{xy} = Z_x = Z$, $Y_x = Y_{xz}$, (43) and the abs... |

56 | Recursive causal models - Kiiveri, Speed, et al. - 1984 |

56 | Causal inference from graphical models - Lauritzen - 1999 |

56 | A general identification condition for causal effects
- Tian, Pearl
Citation Context ..., time-varying treatments), conditional policies, and surrogate experiments were developed in Pearl and Robins (1995), Kuroki and Miyakawa (1999), and Pearl (2000a, chs. 3–4). A more recent analysis (=-=Tian and Pearl, 2002-=-) shows that the key to identifiability lies not in blocking paths between X and Y but rather in blocking paths between X and its immediate successors on the pathways to Y . All existing criteria for ... |

55 |
Counterfactuals and causal inference: Methods and principles for social research. Cambridge Univ Pr
- Morgan, Winship
- 2007
Citation Context ... the two effects together and obtain $P(y \mid do(x)) = \sum_{w_3} P(w_3 \mid do(x))\, P(y \mid do(w_3))$. (28) In this example, the variable W3 acts as a “mediating instrumental variable” (Pearl, 1993b; Chalak and White, 2006; =-=Morgan and Winship, 2007-=-). The analysis used in the derivation and validation of such results invokes mathematical rules of transforming causal quantities, represented by expressions such as P(Y = y|do(x)), into do-free expr... |

47 | Principal stratification in causal inference - Frangakis, Rubin - 2002 |
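
Equation (28), quoted in two of the citation contexts above, composes the effect of X on the mediating variable W3 with the effect of W3 on Y. The following is a minimal Python sketch of that summation only. The binary variables and all probability values are hypothetical, chosen purely for illustration; they are not taken from the paper, and the sketch assumes the two interventional distributions have already been identified.

```python
# Hypothetical interventional distributions for binary X, W3, Y.
# These numbers are illustrative assumptions, not results from the paper.
p_w3_do_x = {            # P(w3 | do(x)), indexed as [x][w3]
    0: {0: 0.8, 1: 0.2},
    1: {0: 0.3, 1: 0.7},
}
p_y_do_w3 = {            # P(y | do(w3)), indexed as [w3][y]
    0: {0: 0.9, 1: 0.1},
    1: {0: 0.4, 1: 0.6},
}


def p_y_do_x(y, x):
    """Sum over w3 of P(w3 | do(x)) * P(y | do(w3)), as in equation (28)."""
    return sum(p_w3_do_x[x][w3] * p_y_do_w3[w3][y] for w3 in p_w3_do_x[x])


# Tabulate P(y | do(x)) for every combination of x and y.
effect = {x: {y: p_y_do_x(y, x) for y in (0, 1)} for x in (0, 1)}

for x in (0, 1):
    print(f"P(Y=1 | do(X={x})) = {effect[x][1]:.2f}")
```

With these assumed inputs, P(Y=1 | do(X=0)) comes out to 0.20 and P(Y=1 | do(X=1)) to 0.45, so the hypothetical average effect of moving X from 0 to 1 is 0.25.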