Some Practical Guidelines for Effective Sample-Size Determination
, 2001
Sample-size determination is often an important step in planning a statistical study, and it is usually a difficult one. Among the important hurdles to be surpassed, one must obtain an estimate of one or more error variances, and specify an effect size of importance. There is the temptation to take some shortcuts. This paper offers some suggestions for successful and meaningful sample-size determination. Also discussed is the possibility that sample size may not be the main issue, that the real goal is to design a high-quality study. Finally, criticism is made of some ill-advised shortcuts relating to power and sample size. Key words: Power; Sample size; Observed power; Retrospective power; Study design; Cohen's effect measures; Equivalence testing. I wish to thank Kate Cowles, John Castelloe, Steve Simon, two referees, an editor, and an associate editor for their helpful comments on earlier drafts of this paper. Much of this work was done with the support of the Obermann ...
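As a rough illustration of why the abstract singles out an error-variance estimate and an effect size of importance, the sketch below applies the textbook normal-approximation formula for a two-sided, two-sample comparison of means, n per group = 2 (z_{1-alpha/2} + z_{power})^2 sigma^2 / delta^2. The choice of formula, the helper name n_per_group, and the example numbers are assumptions made for illustration; they are not taken from the paper.

```python
import math
from statistics import NormalDist


def n_per_group(sigma, delta, alpha=0.05, power=0.80):
    """Approximate per-group sample size (hypothetical helper).

    sigma: assumed error standard deviation
    delta: smallest difference in means considered important
    Uses the normal approximation, so it slightly understates
    the n that an exact t-based calculation would give.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = z.inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)


# Assumed inputs: sigma = 1.0, important difference = 0.5
print(n_per_group(sigma=1.0, delta=0.5))
```

Because delta enters the formula squared, halving the effect size of importance roughly quadruples the required n, which is why the two inputs named in the abstract dominate the result.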
The scientific status of projective techniques
 Psychological Science in the Public Interest
, 2001
Although projective techniques continue to be widely used in clinical and forensic settings, their scientific status remains highly controversial. In this monograph, we review the current state of the literature concerning the psychometric properties (norms, reliability, validity, incremental validity, treatment utility) of three major projective instruments: Rorschach Inkblot Test, Thematic Apperception Test (TAT), and human figure drawings. We conclude that there is empirical support for the validity of a small number of indexes derived from the Rorschach and TAT. However, the substantial majority of Rorschach and TAT indexes are not empirically supported. The validity evidence for human figure drawings is even more limited. With a few exceptions, projective indexes have not consistently demonstrated incremental validity above and beyond other psychometric data. In addition, we summarize
Effects of Field of View on Performance with Head-Mounted Displays
, 2000
The field of view (FOV) in most head-mounted displays (HMDs) is no more than 60 degrees wide, far narrower than our normal FOV of about 200° wide. This mismatch arises mostly from the difficulty and expense of building wide-FOV HMDs. Restricting a person's FOV, however, has been shown in real environments to affect people's behavior and degrade task performance. Previous work in virtual reality too has shown that restricting FOV to 50° or less in an HMD can degrade performance. I conducted experiments with a custom, wide-FOV HMD and found that performance is degraded even at the relatively high FOV of 112°, and further at 48°. The experiments used a prototype tiled wide-FOV HMD to measure performance in VR at up to 176° total horizontal FOV, and a custom large-area tracking system to establish new findings on performance while walking about a large virtua...
The persistence of underpowered studies in psychological research: Causes, consequences, and remedies
 Psychological Methods
, 2004
Underpowered studies persist in the psychological literature. This article examines reasons for their persistence and the effects on efforts to create a cumulative science. The “curse of multiplicities” plays a central role in the presentation. Most psychologists realize that testing multiple hypotheses in a single study affects the Type I error rate, but corresponding implications for power have largely been ignored. The presence of multiple hypothesis tests leads to 3 different conceptualizations of power. Implications of these 3 conceptualizations are discussed from the perspective of the individual researcher and from the perspective of developing a coherent literature. Supplementing significance tests with effect size measures and confidence intervals is shown to address some but not necessarily all problems associated with multiple testing. The primary purpose of this article is to examine the importance of statistical power for the formulation of a coherent body of scientific literature. The article addresses this goal through consideration of four interrelated subtopics: (a) why underpowered studies persist, (b) the undesir
The power of statistical tests in meta-analysis
 Psychological Methods
, 2001
Calculations of the power of statistical tests are important in planning research studies (including meta-analyses) and in interpreting situations in which a result has not proven to be statistically significant. The authors describe procedures to compute statistical power of fixed- and random-effects tests of the mean effect size, tests for heterogeneity (or variation) of effect size parameters across studies, and tests for contrasts among effect sizes of different studies. Examples are given using 2 published meta-analyses. The examples illustrate that statistical power is not always high in meta-analysis. The use of quantitative methods to summarize the results of several empirical research studies, or meta-analysis, is now widespread in psychology, medicine, and the social sciences. Meta-analysis involves describing the results of each study using a numerical index (an estimate of effect size such as a correlation coefficient, a standardized mean difference, or an odds ratio) and then combining these estimates across studies to obtain a summary. Although inference procedures for meta-analysis have been available for well over a decade, there is little work on the calculation of the power of statistical tests in meta-analysis. However, power calculations are always part of sound statistical planning (Cohen, 1977). Moreover, power calculations are often a required component of research grant proposals in primary research, and the requirement of providing some estimate of statistical power is increasingly an issue in evaluating research synthesis projects as well. Although meta-analyses with large numbers of studies investigating even medium-sized effects may have quite powerful tests, meta-analyses of smaller numbers of studies and meta-analyses in areas in which effects are expected to be small do not necessarily have very powerful statistical tests. The purpose of this article is to provide procedures
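To make the idea of power for a test of the mean effect size concrete, here is a minimal sketch for the fixed-effects case. It assumes the usual large-sample z test with inverse-variance weights, where the weighted mean has variance v = 1 / sum(1 / v_i) and the two-sided power is 1 - Phi(c - lambda) + Phi(-c - lambda) with lambda = theta / sqrt(v). The function name and the example inputs are hypothetical, and the excerpt does not confirm that this matches the article's procedure in every detail.

```python
from statistics import NormalDist


def fixed_effects_power(theta, variances, alpha=0.05):
    """Power of the two-sided z test that the mean effect is zero.

    theta: assumed true mean effect size
    variances: assumed sampling variances of the individual studies
    (hypothetical helper; fixed-effects model only)
    """
    v_mean = 1.0 / sum(1.0 / v for v in variances)  # variance of weighted mean
    lam = theta / v_mean ** 0.5                      # standardized true effect
    z = NormalDist()
    c = z.inv_cdf(1 - alpha / 2)                     # two-sided critical value
    return 1 - z.cdf(c - lam) + z.cdf(-c - lam)


# Assumed scenario: 10 studies, each with sampling variance 0.1, true effect 0.2
print(round(fixed_effects_power(0.2, [0.1] * 10), 3))
```

Under these assumed inputs the power is only a little above one half, which is in line with the abstract's point that power in meta-analysis is not automatically high.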
The European Smoking Prevention Framework Approach (EFSA): an example of integral prevention
 Health Education Research
, 2003
The European Smoking Prevention Framework Approach (ESFA) study in six countries tested the effects of a comprehensive smoking prevention approach after 24 (T3; N = 10 751) and 30 months (T4; N = 9282). The programme targeted four levels, i.e. adolescents in schools, school policies, parents and the community. In Portugal, 12.4% of the T1 nonsmokers in the control group had started smoking at T4 compared to 7.9% of the experimental group. Smoking onset in the experimental group was thus 36% lower. In Finland, 32.4% of the T1 nonsmokers in the control group started smoking compared to 27.6% of the experimental group, implying a 15% lower onset in the experimental group. In Spain, 33.0% of the T1 nonsmokers in the control group had started smoking, compared to 29.1% of the experimental group, implying a 12% lower onset. In The Netherlands, the ESFA programme 1
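The "lower onset" percentages quoted above are relative reductions, (control - experimental) / control, not differences in percentage points. A short sketch reproducing the three reported figures from the onset rates in the abstract (the function name is a hypothetical helper):

```python
def relative_reduction(control_pct, experimental_pct):
    """Relative reduction in onset, in percent of the control-group rate."""
    return 100.0 * (control_pct - experimental_pct) / control_pct


# Onset rates (control %, experimental %) as quoted in the abstract
onset = {
    "Portugal": (12.4, 7.9),
    "Finland": (32.4, 27.6),
    "Spain": (33.0, 29.1),
}

for country, (control, experimental) in onset.items():
    print(country, round(relative_reduction(control, experimental)))
```

The rounded results are 36, 15, and 12, matching the abstract. In percentage points the differences are 4.5, 4.8, and 3.9, so the large relative effect in Portugal reflects its low baseline onset rate.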
Effects of contextualized math instruction on problem solving of average and below-average achieving students
 Journal of Special Education
, 1999
The purpose of the study was to investigate the effect of contextualized math instruction on the problem-solving performance of 17 middle school students in one remedial class and 49 middle school average-achieving students in two pre-algebra classes. The study employed experimental and quasi-experimental designs to compare the impact of word problem instruction and contextualized problem instruction on computation skills and problem-solving performance. Results showed that students in the contextualized problem remedial and pre-algebra groups outperformed students in the word problem groups on a contextualized and a transfer problem. In an extended transfer activity, students in the remedial class applied what they had learned in order to plan and build two skateboard ramps. Results support the use of contextualized problems to enhance the problem-solving skills of students in general and remedial classes. All students, including those with learning difficulties, need to be mathematically proficient to a level that will allow them to “figure out” math-related problems they encounter in the community and in future work situations. Unfortunately, evidence clearly shows that many students, not just those in spe
Statistical Power and its subcomponents - missing and misunderstood concepts in Software Engineering Empirical Research
 Journal of Information and Software Technology
, 1997
Recently we have witnessed a welcomed increase in the amount of empirical evaluation of Software Engineering methods and concepts. It is hoped that this increase will lead to establishing Software Engineering as a well-defined subject with a sound scientifically proven underpinning rather than a topic based upon unsubstantiated theories and personal belief. For this to happen the empirical work must be of the highest standard. Unfortunately producing meaningful empirical evaluations is a highly hazardous activity, full of uncertainties and often unseen difficulties. Any researcher can overlook or neglect a seemingly innocuous factor, which in fact invalidates all of the work. More serious is that large sections of the community can overlook essential experimental design guidelines, which bring into question the validity of much of the work undertaken to date. In this paper, the authors address one such factor: Statistical Power Analysis. It is believed and will be demonstrated that a...
The presence of something or the absence of nothing: Increasing theoretical precision in management research
 Organizational Research Methods
, 2010
In management research, theory testing confronts a paradox described by Meehl in which designing studies with greater methodological rigor puts theories at less risk of falsification. This paradox exists because most management theories make predictions that are merely directional, such as stating that two variables will be positively or negatively related. As methodological rigor increases, the probability that an estimated effect will differ from zero likewise increases, and the likelihood of finding support for a directional prediction boils down to a coin toss. This paradox can be resolved by developing theories with greater precision, such that their propositions predict something more meaningful than deviations from zero. This article evaluates the precision of theories in management research, offers guidelines for making theories more precise, and discusses ways to overcome barriers to the pursuit of theoretical precision.
Methods for the Behavioral, Educational, and Social Sciences (MBESS) [Computer software and manual]. Retrievable from www.cran.r-project.org
, 2007
package for R (R Development Core Team, 2007b), an open source statistical programming language and environment. MBESS implements methods that are not widely available elsewhere, yet are especially helpful for the idiosyncratic techniques used within the behavioral, educational, and social sciences. The major categories of functions are those that relate to confidence interval formation for noncentral t, F, and χ² parameters, confidence intervals for standardized effect sizes (which require noncentral distributions), and sample size planning issues from the power analytic and accuracy in parameter estimation perspectives. In addition, MBESS contains collections of other functions that should be helpful to substantive researchers and methodologists. MBESS is a long-term project that will continue to be updated and expanded so that important methods can continue to be made available to researchers in the behavioral, educational, and social sciences. R is an open source statistical programming language and environment for (essentially) all operating systems that has gained a widespread following in quantitative disciplines (R Development Core Team, 2007b). This following is perhaps most prevalent in the statistical sciences, where many published works now provide R routines