The earth is round (p < .05)
American Psychologist, 1994
Cited by 346
After 4 decades of severe criticism, the ritual of null hypothesis significance testing, mechanical dichotomous decisions around a sacred .05 criterion, still persists. This article reviews the problems with this practice, including its near-universal misinterpretation of p as the probability that H0 is false, the misinterpretation that its complement is the probability of successful replication, and the mistaken assumption that if one rejects H0 one thereby affirms the theory that led to the test. Exploratory data analysis and the use of graphic methods, a steady improvement in and a movement toward standardization in measurement, an emphasis on estimating effect sizes using confidence intervals, and the informed use of available statistical methods is suggested. For generalization, psychologists must finally rely, as has been done in all the older sciences,
Statistical significance testing and cumulative knowledge in psychology: Implications for the training of researchers
Psychological Methods, 1996
Cited by 193
Data analysis methods in psychology still emphasize statistical significance testing, despite numerous articles demonstrating its severe deficiencies. It is now possible to use meta-analysis to show that reliance on significance testing retards the development of cumulative knowledge. But reform of teaching and practice will also require that researchers learn that the benefits that they believe flow from use of significance testing are illusory. Teachers must revamp their courses to bring students to understand that (a) reliance on significance testing retards the growth of cumulative research knowledge; (b) benefits widely believed to flow from significance testing do not in fact exist; and (c) significance testing methods must be replaced with point estimates and confidence intervals in individual studies and with meta-analyses in the integration of multiple studies. This reform is essential to the future progress of cumulative knowledge in psychological research. In 1990, Aiken, West, Sechrest, and Reno published an important article surveying the teaching of quantitative methods in graduate psychology programs. They were concerned about what was not being taught or was being inadequately taught to future researchers and the harm this might cause to research progress in psychology. For example, they found that new and important quantitative methods such as causal modeling, confirmatory factor analysis, and meta-analysis were not being taught in the majority of graduate programs. This is indeed a legitimate cause for concern. But in this article, I am concerned about the opposite: An earlier version of this article was presented as the presidential address to the Division of Evaluation,
Personality in scientific and artistic creativity
In R. J. Sternberg (Ed.), Handbook of human creativity, 1999
Cited by 109
Theory and research in both personality psychology and creativity share an essential commonality: emphasis on the uniqueness of the individual. Both disciplines also share an emphasis on temporal consistency and have a 50-year history, and yet no quantitative review of the literature on the creative personality has been conducted. The 3 major goals of this article are to present the results of the first meta-analytic review of the literature on personality and creative achievement, to present a conceptual integration of underlying potential psychological mechanisms that personality and creativity have in common, and to show how the topic of creativity has been important to personality psychologists and can be to social psychologists. A common system of personality description was obtained by classifying trait terms or scales onto one of the Five-Factor Model (or Big Five) dimensions: neuroticism, extraversion, openness, agreeableness, and conscientiousness. Effect size was measured using Cohen's d (Cohen, 1988). Comparisons on personality traits were made on 3 sets of samples: scientists versus nonscientists, more creative versus less creative scientists, and artists versus nonartists. In general, creative people are more open to new experiences, less
Null Hypothesis Significance Testing: A Review of an Old and Continuing Controversy
Psychological Methods, 2000
Cited by 88
Null hypothesis significance testing (NHST) is arguably the most widely used approach to hypothesis evaluation among behavioral and social scientists. It is also very controversial. A major concern expressed by critics is that such testing is misunderstood by many of those who use it. Several other objections to its use have also been raised. In this article the author reviews and comments on the claimed misunderstandings as well as on other criticisms of the approach, and he notes arguments that have been advanced in support of NHST. Alternatives and supplements to NHST are considered, as are several related recommendations regarding the interpretation of experimental data. The concluding opinion is that NHST is easily misunderstood and misused but that when applied with good judgment it can be an effective aid to the interpretation of experimental data. Null hypothesis statistical testing (NHST) is arguably the most widely used method of analysis of data collected in psychological experiments and has been so for about 70 years. One might think that a method that had been embraced by an entire research community would be well understood and noncontroversial after many decades of constant use. However, NHST is very controversial. Criticism of the method, which essentially began with the introduction of the technique (Pearce, 1992), has waxed and waned over the years; it has been intense in the recent past. Apparently, controversy regarding the idea of NHST more generally extends back more than two and a half
Psychology will be a much better science when we change the way we analyze data
Current Directions in Psychological Science, 1996
Cited by 73
because I believed that within it dwelt some of the most fundamental and challenging problems of the extant sciences. Who could not be intrigued, for example, by the relation between consciousness and behavior, or the rules guiding interactions in social situations, or the processes that underlie development from infancy to maturity? Today, in 1996, my fascination with these problems is undiminished. But I've developed a certain angst over the intervening thirtysomething years: a constant, nagging feeling that our field spends a lot of time spinning its wheels without really making all that much progress. This problem shows up in obvious ways, for instance, in the regularity with which findings seem not to replicate. It also shows up in subtler ways; for instance, one doesn't often hear psychologists saying, "Well this problem is solved now; let's move on to the next one" (as, for example, Johannes Kepler must have said over three centuries ago, after he had cracked the problem of describing planetary motion). I've come to believe that at least part of this problem revolves around our tools, particularly the tools that we use in the critical domains of data analysis and data interpretation. What we do, I sometimes feel, is akin to trying to build a violin using a stone mallet and a chainsaw. The tool-to-task fit is not all that good, and as a result, we wind up building a lot of poor-quality violins. My purpose here is to elaborate on these issues. In what follows, I will summarize our major data-analysis and data-interpretation tools, and describe what I believe to be amiss with them. I will then offer some suggestions for change.
The integration of continuous and discrete latent variable models: Potential problems and promising opportunities
Psychological Methods, 2004
Cited by 48
Structural equation mixture modeling (SEMM) integrates continuous and discrete latent variable models. Drawing on prior research on the relationships between continuous and discrete latent variable models, the authors identify 3 conditions that may lead to the estimation of spurious latent classes in SEMM: misspecification of the structural model, nonnormal continuous measures, and nonlinear relationships among observed and/or latent variables. When the objective of a SEMM analysis is the identification of latent classes, these conditions should be considered as alternative hypotheses and results should be interpreted cautiously. However, armed with greater knowledge about the estimation of SEMMs in practice, researchers can exploit the flexibility of the model to gain a fuller understanding of the phenomenon under study. In recent years, many exciting developments have taken place in structural equation modeling, but perhaps none more so than the development of structural equation models that account for unobserved popula
Evaluating statistical difference, equivalence, and indeterminacy using inferential confidence intervals: An integrated alternative method of conducting null hypothesis statistical tests
Psychological Methods, 2001
Cited by 37
Null hypothesis statistical testing (NHST) has been debated extensively but always successfully defended. The technical merits of NHST are not disputed in this article. The widespread misuse of NHST has created a human factors problem that this article intends to ameliorate. This article describes an integrated, alternative inferential confidence interval approach to testing for statistical difference, equivalence, and indeterminacy that is algebraically equivalent to standard NHST procedures and therefore exacts the same evidential standard. The combined numeric and graphic tests of statistical difference, equivalence, and indeterminacy are designed to avoid common interpretive problems associated with NHST procedures. Multiple comparisons, power, sample size, test reliability, effect size, and cause-effect ratio are discussed. A section on the proper interpretation of confidence intervals is followed by a decision rule summary and caveats. The longstanding controversy surrounding null hypothesis statistical testing (NHST) has typically been argued on its technical merits, and they are not dis
The role of constraints in expert memory
Journal of Experimental Psychology: Learning, Memory, and Cognition, 2003
Cited by 24
A great deal of research has been devoted to developing process models of expert memory. However, K. J. Vicente and J. H. Wang (1998) proposed (a) that process theories do not provide an adequate account of expert recall in domains in which memory recall is a contrived task and (b) that a product theory, the constraint attunement hypothesis (CAH), has received a significant amount of empirical support. We compared 1 process theory (the template theory; TT; F. Gobet & H. A. Simon, 1996c) with the CAH in chess. Chess players (N = 36) differing widely in skill levels were required to recall briefly presented chess positions that were randomized in various ways. Consistent with TT, but inconsistent with the CAH, there was a significant skill effect in a condition in which both the location and distribution of the pieces were randomized. These and other results suggest that process models such as TT can provide a viable account of expert memory in chess. Ever since the works of Piaget (1954), Brunswik (1956), and Simon (1969), environment has played an important role in psychological theories. However, although all psychologists agree that cognitive systems adapt to the structure of their environment, there are disagreements about the consequence of this on the optimal
The noncentral chi-square distribution in misspecified structural equation models: Finite sample results from a Monte Carlo simulation
Multivariate Behavioral Research, 2002
Cited by 24
The noncentral chi-square distribution plays a key role in structural equation modeling (SEM). The likelihood ratio test statistic that accompanies virtually all SEMs asymptotically follows a noncentral chi-square under certain assumptions relating to misspecification and multivariate distribution. Many scholars use the noncentral chi-square distribution in the construction of fit indices, such as Steiger and Lind’s (1980) Root Mean Square Error of Approximation (RMSEA) or the family of baseline fit indices (e.g., RNI, CFI), and for the computation of statistical power for model hypothesis testing. Despite this wide use, surprisingly little is known about the extent to which the test statistic follows a noncentral chi-square in applied research. Our study examines several hypotheses about the suitability of the noncentral chi-square distribution for the usual SEM test statistic under conditions commonly encountered in practice. We designed Monte Carlo computer simulation experiments to empirically test these research hypotheses. Our experimental This work was funded in part by grant DA13148 awarded by the National Institute on Drug Abuse to the first two authors. The authors would like to thank Steve Gregorich for several helpful discussions on this topic. Correspondence should be addressed to Patrick
The data analysis dilemma: Ban or abandon. A review of null hypothesis significance testing
Research in the Schools, 1998
Cited by 23
Null Hypothesis Significance Testing (NHST) is reviewed in a historical context. The most vocal criticisms of NHST that have appeared in the literature over the past 50 years are outlined. The authors conclude, based on the criticism of NHST and the alternative methods that have been proposed, that viable alternatives to NHST are currently available. The use of effect magnitude measures with surrounding confidence intervals and indications of the reliability of the study are recommended for individual research studies. Advances in the use of meta-analytic techniques provide us with opportunities to advance cumulative knowledge, and all research should be aimed at this goal. The authors provide discussions and references to more information on effect magnitude measures, replication techniques and meta-analytic techniques. A brief situational assessment of the research landscape and strategies for change are offered. It is generally accepted that the purpose of scientific inquiry is to advance the knowledge base of humankind by seeking evidence of a phenomenon via valid experiments. In the educational arena, the confirmation of a phenomenon should give teachers confidence in their methods and policy makers confidence that their policies will lead to better education for children and adults. We