Using confidence intervals in within-subject designs
 Psychonomic Bulletin & Review
, 1994
Wolford, and two anonymous reviewers for very useful comments on earlier drafts of the manuscript. Correspondence may be addressed to
The earth is round (p < .05)
 American Psychologist
, 1994
After 4 decades of severe criticism, the ritual of null hypothesis significance testing (mechanical dichotomous decisions around a sacred .05 criterion) still persists. This article reviews the problems with this practice, including its near-universal misinterpretation of p as the probability that H0 is false, the misinterpretation that its complement is the probability of successful replication, and the mistaken assumption that if one rejects H0 one thereby affirms the theory that led to the test. Exploratory data analysis and the use of graphic methods, a steady improvement in and a movement toward standardization in measurement, an emphasis on estimating effect sizes using confidence intervals, and the informed use of available statistical methods are suggested. For generalization, psychologists must finally rely, as has been done in all the older sciences,
Consequences of prejudice against the null hypothesis
 Psychological Bulletin
, 1975
The consequences of prejudice against accepting the null hypothesis were examined through (a) a mathematical model intended to simulate the research-publication process and (b) case studies of apparent erroneous rejections of the null hypothesis in published psychological research. The input parameters for the model characterize investigators' probabilities of selecting a problem for which the null hypothesis is true, of reporting, following up on, or abandoning research when data do or do not reject the null hypothesis, and they characterize editors' probabilities of publishing manuscripts concluding in favor of or against the null hypothesis. With estimates of the input parameters based on a questionnaire survey of a sample of social psychologists, the model output indicates a dysfunctional research-publication system. Particularly, the model indicates that there may be relatively few publications on problems for which the null hypothesis is (at least to a reasonable approximation) true, and of these, a high proportion will erroneously reject the null hypothesis. The case studies provide additional support for this conclusion. Accordingly, it is
The case against statistical significance testing
 Harvard Educational Review
, 1978
In recent years the use of traditional statistical methods in educational research has increasingly come under attack. In this article, Ronald P. Carver exposes the fantasies often entertained by researchers about the meaning of statistical significance. The author recommends abandoning all statistical significance testing and suggests other ways of evaluating research results. Carver concludes that we should return to the scientific method of examining data and replicating results rather than relying on statistical significance testing to provide equivalent information. Statistical significance testing has involved more fantasy than fact. The emphasis on statistical significance over scientific significance in educational research represents a corrupt form of the scientific method. Educational research would be better off if it stopped testing its results for statistical significance. The case against statistical significance testing has been developed by many critics (see Morrison & Henkel, 1970b). For example, after a detailed analysis Bakan (1966) concluded that "the test of statistical significance in psychological research may be taken as an instance of a kind of essential mindlessness in the conduct of research" (p. 436); and as early as 1963
Under What Conditions Does Theory Obstruct Research Progress?
 Psychological Review
, 1986
Psychology will be a much better science when we change the way we analyze data
 Current Directions in Psychological Science
, 1996
because I believed that within it dwelt some of the most fundamental and challenging problems of the extant sciences. Who could not be intrigued, for example, by the relation between consciousness and behavior, or the rules guiding interactions in social situations, or the processes that underlie development from infancy to maturity? Today, in 1996, my fascination with these problems is undiminished. But I've developed a certain angst over the intervening thirty-something years: a constant, nagging feeling that our field spends a lot of time spinning its wheels without really making all that much progress. This problem shows up in obvious ways, for instance, in the regularity with which findings seem not to replicate. It also shows up in subtler ways; for instance, one doesn't often hear psychologists saying, "Well this problem is solved now; let's move on to the next one" (as, for example, Johannes Kepler must have said over three centuries ago, after he had cracked the problem of describing planetary motion). I've come to believe that at least part of this problem revolves around our tools, particularly the tools that we use in the critical domains of data analysis and data interpretation. What we do, I sometimes feel, is akin to trying to build a violin using a stone mallet and a chainsaw. The tool-to-task fit is not all that good, and as a result, we wind up building a lot of poor-quality violins. My purpose here is to elaborate on these issues. In what follows, I will summarize our major data-analysis and data-interpretation tools, and describe what I believe to be amiss with them. I will then offer some suggestions for change.
Multivariate Discretization for Set Mining
 Knowledge and Information Systems
, 2000
Many algorithms in data mining can be formulated as a set mining problem where the goal is to find conjunctions (or disjunctions) of terms that meet user-specified constraints. Set mining techniques have been largely designed for categorical or discrete data where variables can only take on a fixed number of values. However, many data sets also contain continuous variables, and a common method of dealing with these is to discretize them by breaking them into ranges. Most discretization methods are univariate and consider only a single feature at a time (sometimes in conjunction with a class variable). We argue that this is a suboptimal approach for knowledge discovery, as univariate discretization can destroy hidden patterns in data. Discretization should consider the effects on all variables in the analysis, and two regions X and Y should only be in the same interval after discretization if the instances in those regions have similar multivariate distributions (F_x ~ F_y) across all variables and combinations of variables. We present a bottom-up merging algorithm to discretize continuous variables based on this rule. Our experiments indicate that the approach is feasible, that it will not destroy hidden patterns, and that it will generate meaningful intervals.
Multivariate discretization of continuous variables for set mining
 In Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
, 2000
Many algorithms in data mining can be formulated as a set mining problem where the goal is to find conjunctions (or disjunctions) of terms that meet user-specified constraints. Set mining techniques have been largely designed for categorical or discrete data where variables can only take on a fixed number of values. However, many data sets also contain continuous variables, and a common method of dealing with these is to discretize them by breaking them into ranges. Most discretization methods are univariate and consider only a single feature at a time (sometimes in conjunction with the class variable). We argue that this is a suboptimal approach for knowledge discovery, as univariate discretization can destroy hidden patterns in data. Discretization should consider the effects on all variables in the analysis, and two regions X and Y should only be in the same cell after discretization if the instances in those regions have similar multivariate distributions (F_x ~ F_y) across all variables and combinations of variables. We present a bottom-up merging algorithm to discretize continuous variables based on this rule. Our experiments indicate that the approach is feasible, that it does not destroy hidden patterns, and that it generates meaningful intervals.
Misinterpretations of Significance: A Problem Students Share with Their Teachers?
The use of significance tests in science has been debated from the invention of these tests until the present time. Apart from theoretical critiques on their appropriateness for evaluating scientific hypotheses, significance tests also receive criticism for inviting misinterpretations. We presented six common misinterpretations to psychologists who work in German universities and found that they are still surprisingly widespread, even among instructors who teach statistics to psychology students. Although these misinterpretations are well documented among students, until now there has been little research on pedagogical methods to remove them. Rather, they are considered “hard facts” that are impervious to correction. We discuss the roots of these misinterpretations and propose a pedagogical concept to teach significance tests, which involves explaining the meaning of statistical significance in an appropriate way.
Statistical significance testing: a historical overview of misuse and misinterpretation with implication for the editorial policies of educational journals
 Research in the Schools
, 1998
Statistical significance tests (SSTs) have been the object of much controversy among social scientists. Proponents have hailed SSTs as an objective means for minimizing the likelihood that chance factors have contributed to research results; critics have both questioned the logic underlying SSTs and bemoaned the widespread misapplication and misinterpretation of the results of these tests. The present paper offers a framework for remedying some of the common problems associated with SSTs via modification of journal editorial policies. The controversy surrounding SSTs is overviewed, with attention given to both historical and more contemporary criticisms of bad practices associated with misuse of SSTs. Examples from the editorial policies of Educational and Psychological Measurement and several other journals that have established guidelines for reporting results of SSTs are overviewed, and suggestions are provided regarding additional ways that educational journals may address the problem. Statistical significance testing has existed in some form for approximately 300 years (Huberty, 1993) and has served an important purpose in the advancement of inquiry in the social sciences. However, there has been much controversy over the misuse and misinterpretation of statistical significance testing (Daniel, 1992b).