Results 1-10 of 14
From association to causation: Some remarks on the history of statistics
Statist. Sci., 1999
Cited by 23 (6 self)
The “numerical method” in medicine goes back to Pierre Louis’ study of pneumonia (1835), and John Snow’s book on the epidemiology of cholera (1855). Snow took advantage of natural experiments and used convergent lines of evidence to demonstrate that cholera is a waterborne infectious disease. More recently, investigators in the social and life sciences have used statistical models and significance tests to deduce cause-and-effect relationships from patterns of association; an early example is Yule’s study on the causes of poverty (1899). In my view, this modeling enterprise has not been successful. Investigators tend to neglect the difficulties in establishing causal relations, and the mathematical complexities obscure rather than clarify the assumptions on which the analysis is based. Formal statistical inference is, by its nature, conditional. If maintained hypotheses A, B, C,... hold, then H can be tested against the data. However, if A, B, C,... remain in doubt, so must inferences about H. Careful scrutiny of maintained hypotheses should therefore be a critical part of empirical work, a principle honored more often in the breach than the observance. Snow’s work on cholera will be contrasted with modern studies that depend on statistical models and tests of significance. The examples may help to clarify the limits of current statistical techniques for making causal inferences from patterns of association.
On regression adjustments to experimental data
In press, Advances in Applied Mathematics, 2007. http://www.stat.berkeley.edu/users/census/neyregr.pdf
Cited by 23 (3 self)
Regression adjustments are often made to experimental data. Since randomization does not justify the models, almost anything can happen. Here, we evaluate results using Neyman’s nonparametric model, where each subject has two potential responses, one if treated and the other if untreated. Only one of the two responses is observed. Regression estimates are generally biased, but the bias is small with large samples. Adjustment may improve precision, or make precision worse; standard errors computed according to usual procedures may overstate the precision, or understate, by quite large factors. Asymptotic expansions make these ideas more precise.
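As a rough illustration of the setup this abstract describes, the sketch below simulates Neyman's model: each subject has two fixed potential responses, treatment is assigned at random, and only one response is observed. It compares the simple difference in means with the treatment coefficient from a regression on one covariate. The population, the covariate `z`, the response formulas, and all function names are hypothetical choices made for illustration, not taken from the paper.

```python
import random
from statistics import fmean


def adjusted_estimate(y, z, treated):
    """Treatment coefficient from an OLS fit of y on (1, treatment, z).

    Uses the equivalent closed form: difference in group means minus the
    pooled within-group slope times the difference in covariate means.
    """
    groups = {True: [], False: []}
    for yi, zi, ti in zip(y, z, treated):
        groups[ti].append((yi, zi))
    ybar = {g: fmean(p[0] for p in pts) for g, pts in groups.items()}
    zbar = {g: fmean(p[1] for p in pts) for g, pts in groups.items()}
    sxy = sum((zi - zbar[g]) * (yi - ybar[g])
              for g, pts in groups.items() for yi, zi in pts)
    sxx = sum((zi - zbar[g]) ** 2
              for g, pts in groups.items() for _, zi in pts)
    slope = sxy / sxx
    return (ybar[True] - ybar[False]) - slope * (zbar[True] - zbar[False])


def simulate(n=100, n_treated=50, reps=2000, seed=0):
    """Re-randomize a fixed (hypothetical) population many times."""
    rng = random.Random(seed)
    # Fixed population: covariate and both potential responses per subject.
    z = [rng.gauss(0, 1) for _ in range(n)]
    y0 = [zi ** 2 + rng.gauss(0, 0.5) for zi in z]   # response if untreated
    y1 = [a + 1.0 + zi for a, zi in zip(y0, z)]      # response if treated
    true_effect = fmean(b - a for a, b in zip(y0, y1))

    diff_means, adjusted = [], []
    for _ in range(reps):
        chosen = set(rng.sample(range(n), n_treated))
        treated = [i in chosen for i in range(n)]
        # Only one of the two potential responses is observed.
        y = [y1[i] if treated[i] else y0[i] for i in range(n)]
        yt = fmean(y[i] for i in range(n) if treated[i])
        yc = fmean(y[i] for i in range(n) if not treated[i])
        diff_means.append(yt - yc)
        adjusted.append(adjusted_estimate(y, z, treated))
    return true_effect, diff_means, adjusted
```

Comparing the average and spread of the two returned lists against `true_effect`, for different hypothetical populations, is one way to explore the abstract's points about bias and precision.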
On specifying graphical models for causation, and the identification problem
Evaluation Review, 2004
Cited by 16 (1 self)
This paper (which is mainly expository) sets up graphical models for causation, having a bit less than the usual complement of hypothetical counterfactuals. Assuming the invariance of error distributions may be essential for causal inference, but the errors themselves need not be invariant. Graphs can be interpreted using conditional distributions, so that we can better address connections between the mathematical framework and causality in the world. The identification problem is posed in terms of conditionals. As will be seen, causal relationships cannot be inferred from a data set by running regressions unless there is substantial prior knowledge about the mechanisms that generated the data. There are few successful applications of graphical models, mainly because few causal pathways can be excluded on a priori grounds. The invariance conditions themselves remain to be assessed.
On regression adjustments in experiments with several treatments
Cited by 12 (1 self)
Regression adjustments are often made to experimental data. Since randomization does not justify the models, bias is likely; nor are the usual variance calculations to be trusted. Here, we evaluate regression adjustments using Neyman’s nonparametric model. Previous results are generalized, and more intuitive proofs are given. A bias term is isolated, and conditions are given for unbiased estimation in finite samples.
Statistical Models for Causation: What Inferential Leverage Do They Provide?
Evaluation Review, 30, 691–713, 2006. http://www.stat.berkeley.edu/users/census/oxcauser.pdf
Cited by 11 (4 self)
Experiments offer more reliable evidence on causation than observational studies, which is not to gainsay the contribution to knowledge from observation. Experiments should be analyzed as experiments, not as observational studies. A simple comparison of rates might be just the right tool, with little value added by “sophisticated” models. This article discusses current models for causation, as applied to experimental and observational data. The intention-to-treat principle and the effect of treatment on the treated will also be discussed. Flaws in per-protocol and treatment-received estimates will be demonstrated.
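The four estimates named in this abstract can be compared on a toy data set. The sketch below is only an illustration: the counts are hypothetical, and the effect-on-the-treated ratio (intention-to-treat difference divided by uptake) is one common estimator that assumes nobody in the control group receives treatment and that assignment affects the outcome only through treatment received. The abstract itself gives no formulas.

```python
from statistics import fmean


def make_records(assigned, received, n, events):
    """Build n subject records, the first `events` of which had the event."""
    return [{"assigned": assigned, "received": received, "event": int(i < events)}
            for i in range(n)]


def rate(records, keep):
    """Event rate among the records that satisfy the predicate `keep`."""
    return fmean(r["event"] for r in records if keep(r))


def estimates(records):
    control = rate(records, lambda r: not r["assigned"])
    itt = rate(records, lambda r: r["assigned"]) - control
    # Share of those assigned to treatment who actually received it.
    uptake = fmean(int(r["received"]) for r in records if r["assigned"])
    return {
        "intention_to_treat": itt,
        "effect_on_treated": itt / uptake,
        "per_protocol": rate(records, lambda r: r["assigned"] and r["received"]) - control,
        "treatment_received": rate(records, lambda r: r["received"])
        - rate(records, lambda r: not r["received"]),
    }


# Hypothetical trial: those who decline treatment are sicker at baseline.
records = (
    make_records(True, True, 60, 6)        # assigned treatment, took it
    + make_records(True, False, 40, 20)    # assigned treatment, declined
    + make_records(False, False, 100, 32)  # assigned control
)

if __name__ == "__main__":
    for name, value in estimates(records).items():
        print(f"{name}: {value:+.3f}")
```

In this made-up example the per-protocol and treatment-received comparisons look much more favorable than the intention-to-treat difference, because the subjects who declined treatment differ from those who accepted it.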
Randomization does not justify logistic regression
Advances in Applied Mathematics, 2008
Cited by 7 (1 self)
Logit models are often used to analyze experimental data. However, randomization does not justify the model, and estimators may be inconsistent. Here, Neyman’s nonparametric setup is used as a benchmark. Each subject has two potential responses, one if treated and the other if untreated; only one of the two responses is observed. A consistent estimator is proposed for use with the logit model. There is a brief literature review, and some recommendations for practice.
The swine flu vaccine and Guillain-Barré syndrome: a case study in relative risk and specific causation
Evaluation Review, 1999
Cited by 5 (1 self)
Epidemiologic methods were developed to prove general causation: identifying exposures that increase the risk of particular diseases. Courts often are more interested in specific causation: on balance of probabilities, was the plaintiff's disease caused by exposure to the agent in question? Some authorities have suggested that a relative risk greater than 2.0 meets the standard of proof for specific causation. Such a definite criterion is appealing, but there are difficulties. Bias and confounding are familiar problems; individual differences must be considered too. The issues are explored in the context of the swine flu vaccine and Guillain-Barré syndrome. The conclusion: there is a considerable gap between relative risks and proof of specific causation. 1. Introduction. In a toxic tort case, the plaintiff is exposed to a toxic agent, suffers injury, and sues. To win, the plaintiff must prove (i) "general causation" (the agent is capable of producing th...
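The arithmetic behind the "relative risk greater than 2.0" criterion can be sketched as follows. This is the simple textbook calculation, shown only to make the threshold concrete: it treats the excess fraction (RR - 1) / RR as the chance that an exposed case was caused by the exposure, which holds only in the absence of bias and confounding and when individuals do not differ, the very conditions the abstract calls into question. The function names and counts are hypothetical.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Ratio of the disease rate among the exposed to the rate among the unexposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


def excess_fraction(rr):
    """Share of exposed cases in excess of the background rate: (RR - 1) / RR."""
    return (rr - 1.0) / rr


if __name__ == "__main__":
    # Hypothetical cohort: 30 cases per 1000 exposed, 10 per 1000 unexposed.
    rr = relative_risk(30, 1000, 10, 1000)
    print(f"relative risk: {rr:.2f}")
    print(f"excess fraction: {excess_fraction(rr):.2f}")
```

Under these idealized assumptions the excess fraction passes one half exactly when the relative risk passes 2.0, which is where the suggested legal threshold comes from.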
Identification and likelihood inference for recursive linear models with correlated errors
2007
Cited by 2 (0 self)
In recursive linear models, the multivariate normal joint distribution of all variables exhibits a dependence structure induced by recursive systems of linear structural equations. Such models appear in particular in seemingly unrelated regressions, structural equation modelling, simultaneous equation systems, and in Gaussian graphical modelling. We show that recursive linear models that are ‘bow-free’ are well-behaved statistical models, namely, they are everywhere identifiable and form curved exponential families. Here, ‘bow-free’ refers to models satisfying the condition that if a variable x occurs in the structural equation for y, then the errors for x and y are uncorrelated. For the computation of maximum likelihood estimates in ‘bow-free’ recursive linear models we introduce the Residual Iterative Conditional Fitting (RICF) algorithm. Compared to existing algorithms, RICF is easily implemented, requiring only least squares computations, has clear convergence properties, and finds parameter estimates in closed form whenever possible.
Statistical Models for Causation
2005
Cited by 1 (0 self)
We review the basis for inferring causation by statistical modeling. Parameters should be stable under interventions, and so should error distributions. There are also statistical conditions on the errors. Stability is difficult to establish a priori, and the statistical conditions are equally problematic. Therefore, causal relationships are seldom to be inferred from a data set by running statistical algorithms, unless there is substantial prior knowledge about the mechanisms that generated the data. We begin with linear models (regression analysis) and then turn to graphical models, which may in principle be nonlinear.
The Salience of Ethnic Categories: Field and Natural Experimental Evidence from Indian Village Councils
2011
Cited by 1 (1 self)
collaborators at Bangalore University, and especially to Dr. B.S. Padmavathi of the international Academy for Creative Teaching (iACT) for assistance with fieldwork. Janhavi Nilekani and Rishabh Khosla of Yale College provided superb research assistance. Previous versions of this paper were presented at Yale, Princeton, and the annual meetings of the Society for Political Methodology. I received helpful advice and comments from seminar participants and from