Results 1–10 of 43
The earth is round (p < .05)
American Psychologist, 1994
Cited by 129 (0 self)
Abstract:
After 4 decades of severe criticism, the ritual of null hypothesis significance testing (mechanical dichotomous decisions around a sacred .05 criterion) still persists. This article reviews the problems with this practice, including its near-universal misinterpretation of p as the probability that H0 is false, the misinterpretation that its complement is the probability of successful replication, and the mistaken assumption that if one rejects H0 one thereby affirms the theory that led to the test. Exploratory data analysis and the use of graphic methods, a steady improvement in and a movement toward standardization in measurement, an emphasis on estimating effect sizes using confidence intervals, and the informed use of available statistical methods are suggested. For generalization, psychologists must finally rely, as has been done in all the older sciences, ...
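Cohen's point about the misinterpretation of p can be made concrete with a short simulation (our own sketch, not from the article; the 50/50 base rate, the small effect size d = 0.2, and n = 25 are illustrative assumptions): even among results that reach p < .05, the fraction of true nulls depends on the base rate and statistical power, and need not be anywhere near .05.

```python
import math
import random

random.seed(1)

def two_sided_p(effect, n=25):
    """One simulated study: z-test of the mean of n draws from N(effect, 1)."""
    mean = sum(random.gauss(effect, 1.0) for _ in range(n)) / n
    z = mean * math.sqrt(n)  # sigma is known to be 1
    phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))  # standard normal CDF
    return 2 * (1 - phi)

# Illustrative assumptions: half of all tested hypotheses are truly null,
# and real effects are small (d = 0.2), studied with n = 25.
null_sig = sum(two_sided_p(0.0) < .05 for _ in range(20000))
alt_sig = sum(two_sided_p(0.2) < .05 for _ in range(20000))

# Fraction of "significant" findings for which H0 is actually true:
print(f"P(H0 true | p < .05) is roughly {null_sig / (null_sig + alt_sig):.2f}")
```

Under these assumptions roughly a fifth of the significant results come from true nulls, illustrating that p is a statement about the data given H0, not about H0 given the data.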
Consequences of prejudice against the null hypothesis
Psychological Bulletin, 1975
Cited by 41 (9 self)
Abstract:
The consequences of prejudice against accepting the null hypothesis were examined through (a) a mathematical model intended to simulate the research-publication process and (b) case studies of apparent erroneous rejections of the null hypothesis in published psychological research. The input parameters for the model characterize investigators' probabilities of selecting a problem for which the null hypothesis is true, of reporting, following up on, or abandoning research when data do or do not reject the null hypothesis, and they characterize editors' probabilities of publishing manuscripts concluding in favor of or against the null hypothesis. With estimates of the input parameters based on a questionnaire survey of a sample of social psychologists, the model output indicates a dysfunctional research-publication system. In particular, the model indicates that there may be relatively few publications on problems for which the null hypothesis is (at least to a reasonable approximation) true, and of these, a high proportion will erroneously reject the null hypothesis. The case studies provide additional support for this conclusion. Accordingly, it is ...
The case against statistical significance testing
Harvard Educational Review, 1978
Cited by 32 (0 self)
Abstract:
In recent years the use of traditional statistical methods in educational research has increasingly come under attack. In this article, Ronald P. Carver exposes the fantasies often entertained by researchers about the meaning of statistical significance. The author recommends abandoning all statistical significance testing and suggests other ways of evaluating research results. Carver concludes that we should return to the scientific method of examining data and replicating results rather than relying on statistical significance testing to provide equivalent information. Statistical significance testing has involved more fantasy than fact. The emphasis on statistical significance over scientific significance in educational research represents a corrupt form of the scientific method. Educational research would be better off if it stopped testing its results for statistical significance. The case against statistical significance testing has been developed by many critics (see Morrison & Henkel, 1970b). For example, after a detailed analysis Bakan (1966) concluded that "the test of statistical significance in psychological research may be taken as an instance of a kind of essential mindlessness in the conduct of research" (p. 436); and as early as 1963 ...
Psychology will be a much better science when we change the way we analyze data
Current Directions in Psychological Science, 1996
Cited by 23 (2 self)
Abstract:
... because I believed that within it dwelt some of the most fundamental and challenging problems of the extant sciences. Who could not be intrigued, for example, by the relation between consciousness and behavior, or the rules guiding interactions in social situations, or the processes that underlie development from infancy to maturity? Today, in 1996, my fascination with these problems is undiminished. But I've developed a certain angst over the intervening thirty-something years: a constant, nagging feeling that our field spends a lot of time spinning its wheels without really making all that much progress. This problem shows up in obvious ways, for instance in the regularity with which findings seem not to replicate. It also shows up in subtler ways: one doesn't often hear psychologists saying, "Well, this problem is solved now; let's move on to the next one" (as, for example, Johannes Kepler must have said over three centuries ago, after he had cracked the problem of describing planetary motion). I've come to believe that at least part of this problem revolves around our tools, particularly the tools that we use in the critical domains of data analysis and data interpretation. What we do, I sometimes feel, is akin to trying to build a violin using a stone mallet and a chainsaw. The tool-to-task fit is not all that good, and as a result, we wind up building a lot of poor-quality violins. My purpose here is to elaborate on these issues. In what follows, I will summarize our major data-analysis and data-interpretation tools, and describe what I believe to be amiss with them. I will then offer some suggestions for change.
Statistical significance testing: a historical overview of misuse and misinterpretation with implications for the editorial policies of educational journals
Research in the Schools, 1998
Cited by 11 (0 self)
Abstract:
Statistical significance tests (SSTs) have been the object of much controversy among social scientists. Proponents have hailed SSTs as an objective means for minimizing the likelihood that chance factors have contributed to research results; critics have both questioned the logic underlying SSTs and bemoaned the widespread misapplication and misinterpretation of the results of these tests. The present paper offers a framework for remedying some of the common problems associated with SSTs via modification of journal editorial policies. The controversy surrounding SSTs is overviewed, with attention given to both historical and more contemporary criticisms of bad practices associated with misuse of SSTs. Examples from the editorial policies of Educational and Psychological Measurement and several other journals that have established guidelines for reporting results of SSTs are overviewed, and suggestions are provided regarding additional ways that educational journals may address the problem. Statistical significance testing has existed in some form for approximately 300 years (Huberty, 1993) and has served an important purpose in the advancement of inquiry in the social sciences. However, there has been much controversy over the misuse and misinterpretation of statistical significance testing (Daniel, 1992b).
The data analysis dilemma: Ban or abandon. A review of null hypothesis significance testing
Research in the Schools, 1998
Cited by 9 (0 self)
Abstract:
Null Hypothesis Significance Testing (NHST) is reviewed in a historical context. The most vocal criticisms of NHST that have appeared in the literature over the past 50 years are outlined. The authors conclude, based on the criticism of NHST and the alternative methods that have been proposed, that viable alternatives to NHST are currently available. The use of effect magnitude measures with surrounding confidence intervals and indications of the reliability of the study are recommended for individual research studies. Advances in the use of meta-analytic techniques provide us with opportunities to advance cumulative knowledge, and all research should be aimed at this goal. The authors provide discussions and references to more information on effect magnitude measures, replication techniques, and meta-analytic techniques. A brief situational assessment of the research landscape and strategies for change are offered. It is generally accepted that the purpose of scientific inquiry is to advance the knowledge base of humankind by seeking evidence of a phenomenon via valid experiments. In the educational arena, the confirmation of a phenomenon should give teachers confidence in their methods and policy makers confidence that their policies will lead to better education for children and adults. We ...
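The recommended practice of reporting effect magnitudes with surrounding confidence intervals can be sketched as follows. This is a minimal illustration, not code from the review: the two groups of scores are made up, and the interval uses a common large-sample approximation to the standard error of Cohen's d rather than any formula the authors give.

```python
import math
import random

random.seed(0)

# Hypothetical data, invented for illustration: two groups of 30 scores.
treat = [random.gauss(0.5, 1.0) for _ in range(30)]
control = [random.gauss(0.0, 1.0) for _ in range(30)]

def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    pooled = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (ma - mb) / pooled

def d_interval(d, na, nb, z=1.96):
    """Approximate 95% CI from the large-sample standard error of d."""
    se = math.sqrt((na + nb) / (na * nb) + d * d / (2 * (na + nb)))
    return d - z * se, d + z * se

d = cohens_d(treat, control)
lo, hi = d_interval(d, len(treat), len(control))
print(f"d = {d:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

Reporting the interval rather than a bare p value conveys both the size of the effect and the precision with which it was estimated, which is exactly what the review argues individual studies should contribute to cumulative, meta-analytic knowledge.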
Effect sizes and p values: What should be reported ...?
1996
Cited by 9 (0 self)
Abstract:
Despite publication of many well-argued critiques of null hypothesis testing (NHT), behavioral science researchers continue to rely heavily on this set of practices. Although we agree with most critics' catalogs of NHT's flaws, this article also takes the unusual stance of identifying virtues that may explain why NHT continues to be so extensively used. These virtues include providing results in the form of a dichotomous (yes/no) hypothesis evaluation and providing an index (p value) that has a justifiable mapping onto confidence in repeatability of a null hypothesis rejection. The most-criticized flaws of NHT can be avoided when the importance of a hypothesis, rather than the p value of its test, is used to determine that a finding is worthy of report, and when p = .05 is treated as insufficient basis for confidence in the replicability of an isolated non-null finding. Together with many recent critics of NHT, we also urge reporting of important hypothesis tests in enough descriptive detail to permit secondary uses such as meta-analysis.
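The point that p = .05 is a weak basis for replication confidence can be illustrated with a small simulation (the setup is our own, not the authors'): if a first study lands exactly at the two-sided .05 boundary (z = 1.96), and the true effect is assumed to equal the observed one, an exact replication reaches p < .05 only about half the time, not 95% of the time.

```python
import random

random.seed(2)

# Assumption: the replication's test statistic is N(1.96, 1), i.e. the true
# effect is taken to equal the first study's observed effect (z = 1.96).
CRIT = 1.96  # two-sided .05 critical value
trials = 20000
successes = sum(abs(random.gauss(CRIT, 1.0)) > CRIT for _ in range(trials))

print(f"replication rate is roughly {successes / trials:.2f}")
```

This is the sense in which an isolated p just under .05 maps onto only modest confidence in repeatability, which is why the authors treat it as insufficient on its own.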
Rationality in psychological research: The good-enough principle
American Psychologist, 1985
Cited by 9 (2 self)
Abstract:
This article reexamines a number of methodological and procedural issues raised by Meehl (1967, 1978) that seem to question the rationality of psychological inquiry. The first issue concerns the asymmetry in theory testing between psychology and physics and the resulting paradox that, because the psychological null hypothesis is always false, increases in precision in psychology always lead to weaker tests of a theory, whereas the converse is true in physics. The second issue, related to the first, regards the slow progress observed in psychological research and the seeming unwillingness of social scientists to take seriously the Popperian requirements for intellectual honesty. We propose a good-enough principle to resolve Meehl's methodological paradox and appeal ...
Experimental Comparison of the Comprehensibility of a UML-based Formal Specification versus a Textual One
Proceedings of the 11th International Conference on Evaluation and Assessment in Software Engineering (EASE), 2007
Cited by 9 (4 self)
Abstract:
The authors wish to acknowledge the support of UK EPSRC, which has funded the ...
The epistemology of mathematical and statistical modeling: A quiet methodological revolution
American Psychologist, 2010
Cited by 6 (0 self)
Abstract:
A quiet methodological revolution, a modeling revolution, has occurred over the past several decades, almost without discussion. In contrast, the 20th century ended with contentious argument over the utility of null hypothesis significance testing (NHST). The NHST controversy may have been at least partially irrelevant, because in certain ways the modeling revolution obviated the NHST argument. I begin with a history of NHST and modeling and their relation to one another. Next, I define and illustrate principles involved in developing and evaluating mathematical models. I then discuss the difference between using statistical procedures within a rule-based framework and building mathematical models from a scientific epistemology. Only the former is treated carefully in most psychology graduate training. The pedagogical implications of this imbalance and the revised pedagogy required to account for the modeling revolution are described. To conclude, I discuss how attention to modeling implies shifting statistical practice in certain progressive ways. The epistemological basis of statistics has moved away from being a set of procedures applied mechanistically, and toward building and evaluating statistical and scientific models.