Results 1–10 of 17
The earth is round (p < .05)
American Psychologist, 1994
Cited by 346 (0 self)
Abstract:
After 4 decades of severe criticism, the ritual of null hypothesis significance testing—mechanical dichotomous decisions around a sacred .05 criterion—still persists. This article reviews the problems with this practice, including its near-universal misinterpretation of p as the probability that H0 is false, the misinterpretation that its complement is the probability of successful replication, and the mistaken assumption that if one rejects H0 one thereby affirms the theory that led to the test. Exploratory data analysis and the use of graphic methods, a steady improvement in and a movement toward standardization in measurement, an emphasis on estimating effect sizes using confidence intervals, and the informed use of available statistical methods are suggested. For generalization, psychologists must finally rely, as has been done in all the older sciences, …
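The abstract's recommendation to report effect sizes with confidence intervals, rather than a dichotomous p < .05 verdict, can be sketched in a few lines. The following is a minimal illustration, not Cohen's own procedure: it computes a standardized mean difference (Cohen's d) for two independent samples with an approximate large-sample 95% confidence interval; the function name and the variance formula used are illustrative assumptions.

```python
import math
from statistics import mean, stdev

def cohens_d_with_ci(a, b, z=1.96):
    """Cohen's d for two independent samples, with an approximate
    95% confidence interval based on a common large-sample formula
    for the sampling variance of d (an illustrative choice)."""
    na, nb = len(a), len(b)
    # Pooled standard deviation across the two samples
    sp = math.sqrt(((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2)
                   / (na + nb - 2))
    d = (mean(a) - mean(b)) / sp
    # Large-sample approximation to the variance of d
    var_d = (na + nb) / (na * nb) + d ** 2 / (2 * (na + nb))
    se = math.sqrt(var_d)
    return d, (d - z * se, d + z * se)

d, (lo, hi) = cohens_d_with_ci([5.1, 4.9, 5.3, 5.0, 5.2],
                               [4.6, 4.4, 4.8, 4.5, 4.7])
print(f"d = {d:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```

Reporting the interval communicates both the magnitude of the effect and its uncertainty, which a bare "significant/non-significant" label discards.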
Could Fisher, Jeffreys, and Neyman Have Agreed on Testing?
, 2002
Cited by 49 (3 self)
Abstract:
Ronald Fisher advocated testing using p-values; Harold Jeffreys proposed use of objective posterior probabilities of hypotheses; and Jerzy Neyman recommended testing with fixed error probabilities. Each was quite critical of the other approaches.
Reporting and interpretation in genome-wide association studies
Int. J. Epidemiol., 2008
Cited by 16 (1 self)
Abstract:
Background: In the context of genome-wide association studies we critique a number of methods that have been suggested for flagging associations for further investigation. Methods: The P-value is by far the most commonly used measure, but requires careful calibration when the a priori probability of an association is small, and discards information by not considering the power associated with each test. The q-value is a frequentist method by which the false discovery rate (FDR) may be controlled. Results: We advocate the use of the Bayes factor as a summary of the information in the data with respect to the comparison of the null and alternative hypotheses, and describe a recently proposed approach to the calculation of the Bayes factor that is easily implemented. The combination of data across studies is straightforward using the Bayes factor approach, as are power calculations. Conclusions: The Bayes factor and the q-value provide complementary information and, when used in addition to the P-value, may be used to reduce the number of reported findings that are subsequently not reproduced.
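The "easily implemented" Bayes factor calculation the abstract alludes to can be sketched as an approximate Bayes factor for a single association test: treat the estimated log-odds ratio as normal with known standard error, and place a normal prior on the true effect under the alternative. This is a generic Wakefield-style approximation offered for illustration; the function name and the default prior standard deviation are assumptions, not values taken from the paper.

```python
import math

def approx_bayes_factor(beta_hat, se, prior_sd=0.2):
    """Approximate Bayes factor BF01 comparing H0 (no association)
    to H1, where beta_hat is an estimated log-odds ratio treated as
    N(true effect, se^2), and the true effect under H1 is given a
    N(0, prior_sd^2) prior. prior_sd=0.2 is an illustrative choice.
    Values below 1 favour the alternative hypothesis."""
    V, W = se ** 2, prior_sd ** 2   # sampling and prior variances
    z = beta_hat / se               # the usual Wald statistic
    # Ratio of the marginal densities of beta_hat under H0 and H1
    return math.sqrt((V + W) / V) * math.exp(-z ** 2 * W / (2 * (V + W)))

# A strongly associated variant (z = 5) gives a very small BF01,
# i.e. strong evidence against the null.
print(approx_bayes_factor(beta_hat=0.25, se=0.05))
```

Unlike a bare P-value, this quantity incorporates the power of each test through the standard error, and per-study Bayes factors can be combined across studies by multiplication under independence.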
Standing waters
The Fresh Waters of Scotland, 1994
Cited by 10 (0 self)
Abstract:
quality trend detection in the presence of changes in analytical laboratory protocols
Alphabet Soup: Blurring the Distinctions Between p’s and α’s in Psychological Research
Cited by 7 (0 self)
Abstract:
Confusion over the reporting and interpretation of results of commonly employed classical statistical tests is recorded in a sample of 1,645 papers from 12 psychology journals for the period 1990 through 2002. The confusion arises because researchers mistakenly believe that their interpretation is guided by a single unified theory of statistical inference. But this is not so: classical statistical testing is a nameless amalgamation of the rival and often contradictory approaches developed by Ronald Fisher, on the one hand, and Jerzy Neyman and Egon Pearson, on the other. In particular, there is extensive failure to acknowledge the incompatibility of Fisher’s evidential p value with the Type I error rate, α, of Neyman–Pearson statistical orthodoxy. The distinction between evidence (p’s) and errors (α’s) is not trivial. Rather, it reveals the basic differences underlying Fisher’s ideas on significance testing and inductive inference, and Neyman–Pearson views on hypothesis testing and inductive behavior. So complete is this misunderstanding over measures of evidence …
Why We Don’t Really Know What Statistical Significance Means: Implications for Educators
Journal of Marketing Education
Cited by 4 (0 self)
Abstract:
shortcomings are our responsibility. Why We Don’t Really Know What “Statistical Significance” Means: A Major Educational Failure. The Neyman–Pearson theory of hypothesis testing, with the Type I error rate, α, as the significance level, is widely regarded as statistical testing orthodoxy. Fisher’s model of significance testing, where the evidential p value denotes the level of significance, nevertheless dominates statistical testing practice. This paradox has occurred because these two incompatible theories of classical statistical testing have been anonymously mixed together, creating the false impression of a single, coherent model of statistical inference. We show that this hybrid approach to testing, with its misleading p < α statistical significance criterion, is common in marketing research textbooks, as well as in a large random sample of papers from twelve marketing journals. That is, researchers attempt the impossible by simultaneously interpreting the p value as a Type I error rate and as a measure of evidence against the null hypothesis. The upshot is that many investigators do not know what our most cherished, and ubiquitous, research desideratum—“statistical significance”—really means. This, in turn, signals an educational …
Confidence of compliance: a Bayesian approach for percentile standards
Water Resources, 2001
Cited by 2 (0 self)
Abstract:
Rules for assessing compliance with percentile standards commonly limit the number of exceedances permitted in a batch of samples taken over a defined assessment period. Such rules are commonly developed using classical statistical methods. Results from alternative Bayesian methods are presented (using beta-distributed prior information and a binomial likelihood), resulting in "confidence of compliance" graphs. These allow simple reading of the consumer's risk and the supplier's risk for any proposed rule. The influence of the prior assumptions required by the Bayesian technique on the confidence results is demonstrated, using two reference priors (uniform and Jeffreys') and also using optimistic and pessimistic user-defined priors. All four give less pessimistic results than does the classical technique, because interpreting classical results as "confidence of compliance" actually invokes a Bayesian approach with an extreme prior distribution. Jeffreys' prior is shown to be the most generally appropriate choice of prior distribution. Cost savings can be expected using rules based on this approach. © 2001 Elsevier Science Ltd. All rights reserved. Keywords: percentile standards, supplier's risk, consumer's risk, Bayesian approach, exceedance probability, prior distribution
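The beta-binomial calculation this abstract describes is straightforward to sketch: with a Beta(a, b) prior on the exceedance probability and k observed exceedances in n samples, the posterior is Beta(a + k, b + n − k), and "confidence of compliance" is the posterior probability that the exceedance rate is within the standard. The sketch below estimates that probability by Monte Carlo using only the standard library; the function name, parameter names, and default values are illustrative assumptions, not the paper's implementation.

```python
import random

def confidence_of_compliance(n, k, p_std=0.05, a=0.5, b=0.5,
                             draws=100_000, seed=1):
    """Posterior probability that the true exceedance rate p meets
    a percentile standard (e.g. p <= 0.05 for a 95th-percentile
    limit), given k exceedances in n samples, a Beta(a, b) prior
    (defaults give the Jeffreys prior the abstract favours) and a
    binomial likelihood. Estimated by sampling from the
    Beta(a + k, b + n - k) posterior."""
    rng = random.Random(seed)
    post_a, post_b = a + k, b + n - k
    hits = sum(rng.betavariate(post_a, post_b) <= p_std
               for _ in range(draws))
    return hits / draws

# 1 exceedance in 50 samples against a 95th-percentile standard
print(confidence_of_compliance(n=50, k=1))
```

Varying the prior parameters a and b reproduces the abstract's comparison of uniform, Jeffreys', and optimistic or pessimistic user-defined priors, and plotting the result against the number of exceedances yields the "confidence of compliance" graphs it describes.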
Data Analysis Considerations in Producing ‘Comparable’ Information for Water Quality Management Purposes
Cited by 1 (0 self)
Abstract:
Water quality monitoring is being used at local, regional, and national scales to measure how water quality variables behave in the natural environment. A common problem arising from monitoring is how to relate the information contained in data to the information needed by water resource management for decision-making. This is generally attempted through statistical analysis of the monitoring data. However, how the selection of methods with which to routinely analyze the data affects the quality and comparability of the information produced is not as well understood as may first appear. To help understand the connection between the selection of methods for routine data analysis and the information produced to support management, the following three tasks were performed: an examination of the methods that are currently being used to analyze water quality monitoring data, including published criticisms of them; an exploration of how the selection of methods to analyze water quality data can impact the comparability of information used for water quality management purposes; and development of options by which data analysis methods employed in water quality …