## Causal inference in statistics: An overview

Venue: Statistics Surveys

Citations: 23 (8 self)

### BibTeX

@ARTICLE{Pearl_causalinference,

author = {Judea Pearl},

title = {Causal inference in statistics: An overview},

journal = {Statistics Surveys},

year = {},

pages = {350}

}

### Abstract

This review presents empirical researchers with recent advances in causal inference, and stresses the paradigmatic shifts that must be undertaken in moving from traditional statistical analysis to causal analysis of multivariate data. Special emphasis is placed on the assumptions that underlie all causal inferences, the languages used in formulating those assumptions, the conditional nature of all causal and counterfactual claims, and the methods that have been developed for the assessment of such claims. These advances are illustrated using a general theory of causation based on the Structural Causal Model (SCM) described in Pearl (2000a), which subsumes and unifies other approaches to causation, and provides a coherent mathematical foundation for the analysis of causes and counterfactuals. In particular, the paper surveys the development of mathematical tools for inferring (from a combination of data and assumptions) answers to three types of causal queries: (1) queries about the effects of potential interventions (also called “causal effects” or “policy evaluation”), (2) queries about probabilities of counterfactuals (including assessment of “regret,” “attribution,” or “causes of effects”), and (3) queries about direct and indirect effects (also known as “mediation”). Finally, the paper defines the formal and conceptual relationships between the structural and potential-outcome frameworks and presents tools for a symbiotic analysis that uses the strong features of both.

### Citations

7054 |
Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference
- Pearl
- 1988
Citation Context ... social science (Goldberger, 1973; Duncan, 1975), the potential-outcome framework of Neyman (1923) and Rubin (1974), and the graphical models developed for probabilistic reasoning and causal analysis (Pearl, 1988; Lauritzen, 1996; Spirtes et al., 2000; Pearl, 2000a). Although the basic elements of SCM were introduced in the mid-1990s (Pearl, 1995a), and have been adapted widely by epidemiologists (Greenland ... |

1118 |
Causality: Models, Reasoning, and Inference
- Pearl
- 2000
Citation Context ...e and familiar conceptual framework. 6. Weeding out myths and misconceptions from outdated traditions (Meek and Glymour, 1994; Greenland et al., 1999; Cole and Hernán, 2002; Arah, 2008; Shrier, 2009; Pearl, 2009b). This section provides a gentle introduction to the structural framework and uses it to present the main advances in causal inference that have emerged in the past two decades. 3.1. Introduction to ... |

1102 |
Graphical Models
- Lauritzen
- 1996
Citation Context ...ce (Goldberger, 1973; Duncan, 1975), the potential-outcome framework of Neyman (1923) and Rubin (1974), and the graphical models developed for probabilistic reasoning and causal analysis (Pearl, 1988; Lauritzen, 1996; Spirtes et al., 2000; Pearl, 2000a). Although the basic elements of SCM were introduced in the mid-1990s (Pearl, 1995a), and have been adapted widely by epidemiologists (Greenland et al., 1999; Gly... |

1069 |
Econometric analysis of cross section and panel data
- Wooldridge
- 2002
Citation Context ...nships of this type explain the slow acceptance of causal analysis among health scientists and statisticians, and why most economists and social scientists continue to use structural equation models (Wooldridge, 2002; Stock and Watson, 2003; Heckman, 2008) instead of the potential-outcome alternatives advocated in Angrist et al. (1996); Holland (1988); Sobel (1998, 2008). On the other hand, the algebraic machiner... |

1014 |
The central role of the propensity score in observational studies for causal effects. Biometrika
- Rosenbaum, Rubin
- 1983
Citation Context ...ly ignorable given S,” a formidable mental task required in the potential-response framework (Rosenbaum and Rubin, 1983). [Page header: imsart-ss ver. 2009/05/21, file: r350.tex, date: August 21, 2009; J. Pearl/Causal Inference in Statistics 20.] The criterion also enables the analyst to search for an optimal set of covariates, namely, a set S that minimizes measurement cost or sampling variability (Tian et al., 1998). All in all, one can saf... |

749 |
Structural Equations with Latent Variables
- Bollen
- 1989
Citation Context ...o Nonparametric Models and Graphs. Structural equation modeling (SEM) has been the main vehicle for effect analysis in economics and the behavioral and social sciences (Goldberger, 1972; Duncan, 1975; Bollen, 1989). However, the bulk of SEM methodology was developed for ... [Footnote 7: Additional implications called “dormant independence” (Shpitser and Pearl, 2008) may be deduced from some graphs with correlated errors.] ... |

496 |
Causation, Prediction, and Search
- Spirtes, Glymour, et al.
- 1993
Citation Context ...ect” as a general capacity to transmit changes among variables. Such an extension, based on simulating hypothetical interventions in the model, was proposed in (Haavelmo, 1943; Strotz and Wold, 1960; Spirtes et al., 1993; Pearl, 1993a, 2000a; Lindley, 2002) and has led to new ways of defining and estimating causal effects in nonlinear and nonparametric models (that is, models in which the functional form of the equat... |

468 |
Estimating causal effects of treatments in randomized and non randomized studies
- Rubin
- 1974
Citation Context ...y of the judgments upon which the analysis so crucially depends. How does one recognize causal expressions in the statistical literature? Those versed in the potential-outcome notation (Neyman, 1923; Rubin, 1974; Holland, 1988), can recognize such expressions through the subscripts that are attached to counterfactual events and variables, e.g., $Y_x(u)$ or $Z_{xy}$. (Some authors use parenthetical expressions, e.g. Y ... |

441 |
Graphical models in applied multivariate statistics
- Whittaker
- 1990
Citation Context ...fect of X on Y, defined by $\beta \triangleq E(Y \mid do(x_0 + 1)) - E(Y \mid do(x_0)) = \frac{\partial}{\partial x} E(Y \mid do(x)) = \frac{\partial}{\partial x} E(Y_x)$. Naturally, all attempts to give β statistical interpretation have ended in frustrations (Holland, 1988; Whittaker, 1990; Wermuth, 1992; Wermuth and Cox, 1993), some persisting well into the 21st century (Sobel, 2008). ... |

429 |
Identification of causal effects using instrumental variables
- Angrist, Imbens, et al.
- 1996
Citation Context ...Sobel (1998, 2008). On the other hand, the algebraic machinery offered by the counterfactual notation, $Y_x(u)$, once a problem is properly formalized, can be extremely powerful in refining assumptions (Angrist et al., 1996; Heckman and Vytlacil, 2005), deriving consistent estimands (Robins, 1986), bounding probabilities of necessary ... [Footnote 19: Inquisitive readers are invited to guess whether $X_z \perp Z \mid Y$ holds in Fig. 2(a), then re...] |

417 | Discrete Multivariate Analysis: Theory and Practice - Bishop, Fienberg, et al. - 1975 |

205 | A theory of inferred causation - Pearl, Verma - 1991 |

186 |
Conditional independence in statistical theory
- Dawid
- 1979
Citation Context ...his translation may in fact be the hardest part of the problem. The reader may appreciate this aspect by attempting to judge whether the assumption of conditio... [Footnote 18: The notation $Y \perp X \mid Z$ stands for the conditional independence relationship $P(Y = y, X = x \mid Z = z) = P(Y = y \mid Z = z)\,P(X = x \mid Z = z)$ (Dawid, 1979).] |

172 | Causal diagrams for empirical research
- Pearl
- 1995
Citation Context ...ted, or the language in which those assumptions are cast. The structural theory that we use in this survey satisfies the criteria above. It is based on the Structural Causal Model (SCM) developed in (Pearl, 1995a, 2000a) which combines features of the structural equation models (SEM) used in ... [Footnote 5: These notational clues should be useful for detecting inadequate definitions of causal concepts; any definition of c...] |

153 |
A new approach to causal inference in mortality studies with sustained exposure periods - Application to control of the healthy worker survivor effect
- Robins
- 1986
Citation Context ...cted to interventions on a single variable; it is applicable to simultaneous or sequential interventions such as those invoked in the analysis of time-varying treatment with time-varying confounders (Robins, 1986; Arjas and Parner, 2004). For example, if $X$ and $Z_2$ are both treatment variables, and $Z_1$ and $Z_3$ are measured covariates, then the post-intervention distribution would be $P(z_1, z_3, y \mid do(x), do(z_2)) =$ P (... |

145 |
On the Application of Probability Theory to Agricultural Experiments. Essay on Principles
- Neyman
- 1923
Citation Context ...the reliability of the judgments upon which the analysis so crucially depends. How does one recognize causal expressions in the statistical literature? Those versed in the potential-outcome notation (Neyman, 1923; Rubin, 1974; Holland, 1988), can recognize such expressions through the subscripts that are attached to counterfactual events and variables, e.g., $Y_x(u)$ or $Z_{xy}$. (Some authors use parenthetical express... |

143 | Correlation and Causation - Wright - 1921 |

141 | Mediation in experimental and nonexperimental studies: New procedures and recommendations - Shrout, Bolger - 2002 |

138 | All of Statistics: A Concise Course in Statistical Inference - Wasserman - 2005 |

132 |
Causal diagrams for epidemiologic research. Epidemiology
- Greenland, Pearl, et al.
- 1999
Citation Context ...earl, 1988; Lauritzen, 1996; Spirtes et al., 2000; Pearl, 2000a). Although the basic elements of SCM were introduced in the mid-1990s (Pearl, 1995a), and have been adapted widely by epidemiologists (Greenland et al., 1999; Glymour and Greenland, 2008), statisticians (Cox and Wermuth, 2004; Lauritzen, 2001), and social scientists (Morgan and Winship, 2007), its potentials as a comprehensive theory of causation are yet ... |

122 |
Planning of Experiments
- Cox
- 1958
Citation Context ...he verbal description with which investigators justify assumptions. For example, the assumption that a covariate not be affected by a treatment, a necessary assumption for the control of confounding (Cox, 1958, p. 48), is expressed in plain English, not in a mathematical expression. Remarkably, though the necessity of explicit causal notation is now recognized by many academic scholars, the use of such not... |

117 | Recent developments in the econometrics of program evaluation
- Imbens, Wooldridge
- 2009
Citation Context ... standard in epidemiology research (Robins, 2001; Petersen et al., 2006; VanderWeele and Robins, 2007; Hafeman and Schwartz, 2009; VanderWeele, 2009) yet still lacking in econometrics (Heckman, 2008; Imbens and Wooldridge, 2009). ... 1. Exclusion restrictions: For every variable $Y$ having parents $PA_Y$ and for every set of en... |

115 |
A Probabilistic Theory of Causality
- Suppes
- 1970
Citation Context ... The unification of the graphical, potential outcome, structural equations, decision analytical (Dawid, 2002), interventional (Woodward, 2003), sufficient component (Rothman, 1976) and probabilistic (Suppes, 1970) approaches to causation; with each approach viewed as a restricted version of the SCM. 2. The definition, axiomatization and algorithmization of counterfactuals and joint probabilities of counterfac... |

113 | Nonparametric bounds on treatment effects - Manski - 1990 |

104 | The interpretation of interaction in contingency tables - Simpson - 1951 |

95 | Comment: Graphical models, causality, and intervention
- Pearl
- 1993
Citation Context ...city to transmit changes among variables. Such an extension, based on simulating hypothetical interventions in the model, was proposed in (Haavelmo, 1943; Strotz and Wold, 1960; Spirtes et al., 1993; Pearl, 1993a, 2000a; Lindley, 2002) and has led to new ways of defining and estimating causal effects in nonlinear and nonparametric models (that is, models in which the functional form of the equations is unkno... |

95 | Identifiability and exchangeability for direct and indirect effects - Robins, Greenland - 1992 |

93 |
The Statistical Implications of a System of Simultaneous Equations
- Haavelmo
- 1943
Citation Context ...cient in an equation, and redefine “effect” as a general capacity to transmit changes among variables. Such an extension, based on simulating hypothetical interventions in the model, was proposed in (Haavelmo, 1943; Strotz and Wold, 1960; Spirtes et al., 1993; Pearl, 1993a, 2000a; Lindley, 2002) and has led to new ways of defining and estimating causal effects in nonlinear and nonparametric models (that is, mod... |

85 |
Causal Inference without Counterfactuals (with Discussion)
- Dawid
- 2000
Citation Context ...ty to the joint statement “$Y$ would be $y$ if $X = x$ and $Y$ would be $y'$ if $X = x'$.” (footnote 14) Such concerns have been a source of objections to treating counterfactuals as jointly distributed random variables (Dawid, 2000). The definition of $Y_x$ and $Y_{x'}$ in terms of two distinct submodels neutralizes these objections (Pearl, 2000b), since the contradictory joint statement is mapped into an ordinary event, one where the... |

74 | Direct and indirect effects
- Pearl
Citation Context ...rom $x$ to $x'$ is defined as the expected change in $Y$ affected by holding $X$ constant, at $X = x$, and changing $Z$ to whatever value it would have attained had $X$ been set to $X = x'$. Formally, this reads (Pearl, 2001): $IE_{x,x'}(Y) \triangleq E\big(Y_{x,Z_{x'}}\big) - E(Y_x)$ (49), which is almost identical to the direct effect (Eq. (47)) save for exchanging $x$ and $x'$. Indeed, it can be shown that, in general, the total effect TE ... |

72 |
Causal Ordering and Identifiability
- Simon
- 1953
Citation Context ...remain constant. A system of such functions are said to be structural if they are assumed to be autonomous, that is, each function is invariant to possible changes in the form of the other functions (Simon, 1953; Koopmans, 1953). [Section 3.2.1, Representing interventions] This feature of invariance permits us to use structural equations as a basis for modeling causal effects and counterfactuals. This is done through ... |

72 | Causal diagrams for empirical research. Biometrika 82(4):669–710 - Pearl - 1995 |

71 | The Analysis of Randomized and Non-Randomized AIDS Treatment Trials Using a New Approach to Causal Inference in Longitudinal Studies - Robins - 1989 |

68 |
Causal inference, path analysis, and recursive structural equation models
- Holland
- 1988
Citation Context ...ments upon which the analysis so crucially depends. How does one recognize causal expressions in the statistical literature? Those versed in the potential-outcome notation (Neyman, 1923; Rubin, 1974; Holland, 1988), can recognize such expressions through the subscripts that are attached to counterfactual events and variables, e.g., $Y_x(u)$ or $Z_{xy}$. (Some authors use parenthetical expressions, e.g., $Y(0)$, $Y(1)$, Y (... |

66 |
Introduction to structural equation models
- Duncan
- 1975
Citation Context ...rators, can safely be discarded as inadequate. ... economics and social science (Goldberger, 1973; Duncan, 1975), the potential-outcome framework of Neyman (1923) and Rubin (1974), and the graphical models developed for probabilistic reasoning and causal analysis (Pearl, 1988; Lauritzen, 1996; Spirtes et al., 2... |

66 |
Introduction to Econometrics
- Stock, Watson
Citation Context ...e explain the slow acceptance of causal analysis among health scientists and statisticians, and why most economists and social scientists continue to use structural equation models (Wooldridge, 2002; Stock and Watson, 2003; Heckman, 2008) instead of the potential-outcome alternatives advocated in Angrist et al. (1996); Holland (1988); Sobel (1998, 2008). On the other hand, the algebraic machinery offered by the counter... |

65 | Using matching, instrumental variables, and control functions to estimate economic choice models. The Review of Economics and Statistics 86: 30–57
- Heckman, Navarro-Lozano
Citation Context ...n (2009) goes as far as stating that refraining from conditioning on an available measurement is “nonscientific ad hockery” for it goes against the tenets of Bayesian philosophy (see (Pearl, 2009b,c; Heckman and Navarro-Lozano, 2004) for a discussion of this fallacy). ... and sufficient causation (Tian and Pearl, 2000), and comb... |

65 | Influence diagrams for causal modelling and inference - Dawid |

61 |
Probabilistic Causality
- Eells
- 1991
Citation Context ...-level causes (e.g., “Drinking hemlock causes death”) and singular or unit-level causes (e.g., “Socrates’ drinking hemlock caused his death”), which many philosophers have regarded as irreconcilable (Eells, 1991), introduces no tension at all in the structural theory. The two types of sentences differ merely in the level of situation-specific information that is brought to bear on a problem, that is, in the ... |

61 | Probabilistic evaluation of sequential plans from causal models with hidden variables - Pearl, Robins - 1995 |

60 | Structural equations, treatment effects, and econometric policy evaluation
- Heckman, Vytlacil
- 2005
Citation Context ... the other hand, the algebraic machinery offered by the counterfactual notation, $Y_x(u)$, once a problem is properly formalized, can be extremely powerful in refining assumptions (Angrist et al., 1996; Heckman and Vytlacil, 2005), deriving consistent estimands (Robins, 1986), bounding probabilities of necessary ... [Footnote 19: Inquisitive readers are invited to guess whether $X_z \perp Z \mid Y$ holds in Fig. 2(a), then reflect on why causality is so...] |

57 |
Instrumental Variables
- Bowden, Turkington
- 1984
Citation Context ... this equation by $z$ and taking expectations, gives $\beta = \mathrm{Cov}(Z, Y)/\mathrm{Cov}(Z, X)$ (31), which reduces β to correlations among observed measurements. Eq. (31) is known as the instrumental variable estimand (Bowden and Turkington, 1984). Similarly, Angrist and Imbens (1991) have shown that a broader class of nonlinear functions $f_X$ and $f_Y$ may render the causal effect identifiable. Angrist et al. (1996) and Heckman and Vytlacil (2005... |

56 | Bounds on Treatment Effects from Studies with Imperfect Compliance
- Balke, Pearl
- 1997
Citation Context ...ding bias cannot be removed by adjustment. Moreover, it can be shown that, in the absence of additional assumptions, the treatment effect in such graphs cannot be identified by any method whatsoever (Balke and Pearl, 1997); one must therefore resort to approximate methods of assessment. It is interesting to note that it is our insistence on allowing arbitrary functions in Eq. (5) that curtails our ability to infer the... |

56 | Recursive causal models - Kiiveri, Speed, et al. - 1984 |

56 | Causal inference from graphical models
- Lauritzen
- 1999
Citation Context ...s of SCM were introduced in the mid-1990s (Pearl, 1995a), and have been adapted widely by epidemiologists (Greenland et al., 1999; Glymour and Greenland, 2008), statisticians (Cox and Wermuth, 2004; Lauritzen, 2001), and social scientists (Morgan and Winship, 2007), its potentials as a comprehensive theory of causation are yet to be fully utilized. Its ramifications thus far include: 1. The unification of the g... |

56 | A general identification condition for causal effects
- Tian, Pearl
Citation Context ...., time-varying treatments), conditional policies, and surrogate experiments were developed in Pearl and Robins (1995), Kuroki and Miyakawa (1999), and Pearl (2000a, Chapters 3–4). A recent analysis (Tian and Pearl, 2002) shows that the key to identifiability lies not in blocking paths between X and Y but, rather, in blocking paths between X and its immediate successors on the pathways to Y. All existing criteria fo... |

55 |
Counterfactuals and causal inference: Methods and principles for social research. Cambridge Univ Pr
- Morgan, Winship
- 2007
Citation Context ...s (Pearl, 1995a), and have been adapted widely by epidemiologists (Greenland et al., 1999; Glymour and Greenland, 2008), statisticians (Cox and Wermuth, 2004; Lauritzen, 2001), and social scientists (Morgan and Winship, 2007), its potentials as a comprehensive theory of causation are yet to be fully utilized. Its ramifications thus far include: 1. The unification of the graphical, potential outcome, structural equations,... |

48 | A graphical approach to the identification and estimation of causal parameters in mortality studies with sustained exposure periods - Robins - 1987 |

47 |
Theory of Probability: A Critical Introductory Treatment
- Finetti, Machi, et al.
- 1993
Citation Context ...s causation, because the concept, although widely used, does not seem to be well-defined” (p. 51). Instead, they attribute the paradox to another untestable relationship in the story, exchangeability (DeFinetti, 1974), which is cognitively formidable yet, at least formally, can be cast as a property of some imaginary probability function. The same fear of extending the boundaries of probability language can be fou... |

47 | Principal stratification in causal inference - Frangakis, Rubin - 2002 |