The earth is round (p < .05)
American Psychologist, 1994
Cited by 113 (0 self)
After 4 decades of severe criticism, the ritual of null hypothesis significance testing, mechanical dichotomous decisions around a sacred .05 criterion, still persists. This article reviews the problems with this practice, including its near-universal misinterpretation of p as the probability that H0 is false, the misinterpretation that its complement is the probability of successful replication, and the mistaken assumption that if one rejects H0 one thereby affirms the theory that led to the test. Exploratory data analysis and the use of graphic methods, a steady improvement in and a movement toward standardization in measurement, an emphasis on estimating effect sizes using confidence intervals, and the informed use of available statistical methods is suggested. For generalization, psychologists must finally rely, as has been done in all the older sciences,
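The first misinterpretation listed above can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the article: it applies Bayes' rule under assumed values for the prior probability that H0 is true, the significance level, and the power of the test. The function name `prob_null_given_rejection` and all numeric inputs are hypothetical.

```python
def prob_null_given_rejection(prior_null, alpha, power):
    """Return P(H0 is true | the test rejects H0) via Bayes' rule.

    prior_null: assumed share of tested hypotheses for which H0 is true
    alpha:      significance level, P(reject | H0 true)
    power:      P(reject | H0 false)
    """
    for name, value in (("prior_null", prior_null), ("alpha", alpha), ("power", power)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")

    false_rejections = alpha * prior_null          # H0 true, yet rejected
    true_rejections = power * (1.0 - prior_null)   # H0 false, and rejected
    total = false_rejections + true_rejections
    if total == 0.0:
        raise ValueError("the test never rejects under these inputs")
    return false_rejections / total


if __name__ == "__main__":
    # Hypothetical inputs: alpha = .05, power = .50, two different priors.
    print(round(prob_null_given_rejection(0.5, 0.05, 0.5), 3))  # 0.091
    print(round(prob_null_given_rejection(0.9, 0.05, 0.5), 3))  # 0.474
```

Under these assumed inputs, when H0 is true in 90% of the studies run, nearly half of the rejections at the .05 level are false alarms, so a result with p < .05 does not mean that H0 is false with probability .95.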
Statistical significance testing: a historical overview of misuse and misinterpretation with implications for the editorial policies of educational journals
Research in the Schools, 1998
Cited by 11 (0 self)
Statistical significance tests (SSTs) have been the object of much controversy among social scientists. Proponents have hailed SSTs as an objective means for minimizing the likelihood that chance factors have contributed to research results; critics have both questioned the logic underlying SSTs and bemoaned the widespread misapplication and misinterpretation of the results of these tests. The present paper offers a framework for remedying some of the common problems associated with SSTs via modification of journal editorial policies. The controversy surrounding SSTs is overviewed, with attention given to both historical and more contemporary criticisms of bad practices associated with misuse of SSTs. Examples from the editorial policies of Educational and Psychological Measurement and several other journals that have established guidelines for reporting results of SSTs are overviewed, and suggestions are provided regarding additional ways that educational journals may address the problem. Statistical significance testing has existed in some form for approximately 300 years (Huberty, 1993) and has served an important purpose in the advancement of inquiry in the social sciences. However, there has been much controversy over the misuse and misinterpretation of statistical significance testing (Daniel, 1992b).
Journal of Economic Perspectives
In economics and other sciences, “statistical significance” is by custom, habit, and education a necessary and sufficient condition for proving an empirical result (Ziliak and McCloskey, 2008; McCloskey and Ziliak, 1996). The canonical routine is to calculate what’s called a t-statistic and then to compare its estimated value against a theoretically expected value of it, which is found in “Student’s” t table. A result yielding a t-value greater than or equal to about 2.0 is said to be “statistically significant at the 95 percent level.” Alternatively, a regression coefficient is said to be “statistically significantly different from the null, p ≤ .05.” Canonically speaking, if a coefficient clears the 95 percent hurdle, it warrants additional scientific attention. If not, not. The first presentation of “Student’s” test of significance came a century ago, in “The Probable Error of a Mean” (1908b), published by an anonymous “Student.” The author’s commercial employer required that his identity be shielded from competitors, but we have known for some decades that the article was written by William Sealy Gosset (1876–1937), whose entire career was spent at Guinness’s brewery in Dublin, where Gosset was a master brewer and experimental scientist (E. S. Pearson, 1937). Perhaps surprisingly, the ingenious “Student” did not give a hoot for a single finding of “statistical” significance, even at the 95 percent level of significance as established by his own tables. Beginning in 1904, “Student,” who was a businessman besides a scientist, took an economic approach to the logic of uncertainty, arguing finally that statistical significance is “nearly valueless” in itself (Gosset, 1937, quoted in E. S. Pearson,
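The canonical routine described above, computing a t-statistic and comparing it with a cutoff of about 2.0, can be sketched in a few lines. This is a minimal illustration for the one-sample case, not code from the article; the function names, the use of the absolute value (a two-sided reading of the rule), and the sample data are assumptions.

```python
import math
import statistics


def t_statistic(sample, null_mean=0.0):
    """One-sample t statistic: (mean - null_mean) / (s / sqrt(n))."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    if sd == 0.0:
        raise ValueError("sample has zero variance; t is undefined")
    return (mean - null_mean) / (sd / math.sqrt(n))


def clears_conventional_hurdle(t, threshold=2.0):
    """Apply the rule of thumb |t| >= about 2.0 (threshold is an assumption)."""
    return abs(t) >= threshold


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical measurements
    t = t_statistic(data)
    print(t, clears_conventional_hurdle(t))
```

The second function reproduces exactly the dichotomous decision the abstract questions; it says nothing about the size or the economic importance of the estimated effect.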
Frequentist statistics as a theory of inductive inference
2nd Lehmann Symposium: Optimality, IMS Lecture Notes Monograph Series, 2006
After some general remarks about the interrelation between philosophical and statistical thinking, the discussion centres largely on significance tests. These are defined as the calculation of p-values rather than as formal procedures for “acceptance” and “rejection”. A number of types of null hypothesis are described and a principle for evidential interpretation set out governing the implications of p-values in the specific circumstances of each application, as contrasted with a long-run interpretation. A number of more complicated situations are discussed in which modification of the simple p-value may be essential.
1. Statistics and inductive philosophy
1.1. What is the Philosophy of Statistics?
The philosophical foundations of statistics may be regarded as the study of the epistemological, conceptual and logical problems revolving around the use and interpretation of statistical methods, broadly conceived. As with other domains of philosophy of science, work in statistical science progresses largely without worrying about “philosophical foundations”. Nevertheless, even in statistical practice, debates about the different approaches to statistical analysis may influence and be influenced by general issues of the nature of inductive-statistical inference, and thus are concerned with foundational or philosophical matters. Even those who are largely concerned with applications are often interested in identifying general principles that underlie and justify the procedures they have come to value on relatively pragmatic grounds. At one level of analysis at least, statisticians and philosophers of science ask many of the same questions.
• What should be observed and what may justifiably be inferred from the resulting data?
• How well do data confirm or fit a model?
• What is a good test?
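To illustrate the distinction drawn in the abstract above between reporting a p-value and issuing an accept/reject verdict, here is a minimal sketch for a two-sided test based on a standard normal statistic. It is an illustration under stated assumptions rather than anything from the paper: the function name and the choice of a z statistic are hypothetical.

```python
import math


def two_sided_p_value(z):
    """Two-sided p-value for a standard normal statistic: P(|Z| >= |z|).

    Uses the identity 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Report the p-value itself as a graded measure, not a verdict.
    for z in (0.5, 1.96, 3.0):  # hypothetical observed statistics
        print(f"z = {z:4.2f}  p = {two_sided_p_value(z):.4f}")
```

Returning the p-value as a number, rather than a boolean, leaves its interpretation to the specific circumstances of the application.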