Using confidence intervals in within-subject designs
 Psychonomic Bulletin & Review
, 1994
Cited by 178 (21 self)
Wolford, and two anonymous reviewers for very useful comments on earlier drafts of the manuscript. Correspondence may be addressed to
The effects of costless preplay communication: Experimental evidence from games with Pareto-ranked equilibria
, 2007
Severe Testing as a Basic Concept in a Neyman-Pearson Philosophy of Induction
 British Journal for the Philosophy of Science
, 2006
Cited by 35 (14 self)
Despite the widespread use of key concepts of the Neyman–Pearson (N–P) statistical paradigm (type I and II errors, significance levels, power, confidence levels), they have been the subject of philosophical controversy and debate for over 60 years. Both current and longstanding problems of N–P tests stem from unclarity and confusion, even among N–P adherents, as to how a test’s (pre-data) error probabilities are to be used for (post-data) inductive inference as opposed to inductive behavior. We argue that the relevance of error probabilities is to ensure that only statistical hypotheses that have passed severe or probative tests are inferred from the data. The severity criterion supplies a meta-statistical principle for evaluating proposed statistical inferences, avoiding classic fallacies from tests that are overly sensitive, as well as those not sensitive enough to particular errors and discrepancies.
Could Fisher, Jeffreys, and Neyman Have Agreed on Testing?
, 2002
Cited by 29 (2 self)
Ronald Fisher advocated testing using p-values; Harold Jeffreys proposed use of objective posterior probabilities of hypotheses; and Jerzy Neyman recommended testing with fixed error probabilities. Each was quite critical of the other approaches.
Review Enrichment or depletion of a GO category within a class of genes: which test?
Cited by 29 (1 self)
Motivation: A number of available program packages determine the significant enrichments and/or depletions of GO categories among a class of genes of interest. Whereas a correct formulation of the problem leads to a single exact null distribution, these GO tools use a large variety of statistical tests whose denominations often do not clarify the underlying p-value computations. Summary: We review the different formulations of the problem and the tests they lead to: the binomial, chi-square, equality of two probabilities, Fisher’s exact, and hypergeometric tests. We clarify the relationships existing between these tests, in particular the equivalence between the hypergeometric test and Fisher’s exact test. We recall that the other tests are valid only for large samples, the test of equality of two probabilities and the chi-square test being equivalent. We discuss the appropriateness of one- and two-sided p-values, as well as some discreteness and conservatism issues.
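The equivalence between the hypergeometric test and the one-sided Fisher's exact test mentioned in this abstract can be checked numerically. The following is a minimal sketch using only the standard library; the function names and the gene counts are illustrative assumptions, not taken from the paper.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which are annotated with the GO category (enrichment p-value)."""
    total = comb(N, n)
    upper = min(K, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible configurations contribute nothing to the sum.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]:
    probability, with all margins fixed, of a top-left cell >= a."""
    row1, col1, N = a + b, a + c, a + b + c + d
    total = comb(N, col1)
    p = 0.0
    for x in range(a, min(row1, col1) + 1):
        p += comb(row1, x) * comb(N - row1, col1 - x) / total
    return p


# Hypothetical counts: 100 genes, 20 in the GO category, 10 in the class
# of interest, 5 of which are in the category.
p_hyper = hypergeom_upper_tail(k=5, N=100, K=20, n=10)
p_fisher = fisher_one_sided(a=5, b=5, c=15, d=75)
print(p_hyper, p_fisher)
```

The two functions condition on different margins of the same table, yet return the same tail probability up to floating-point rounding, which is the equivalence the abstract refers to.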
Estimating Functions for Discretely Sampled Diffusion-Type Models. Chapter of the Handbook of Financial Econometrics, Aït-Sahalia and Hansen eds. http://home.uchicago.edu/ lhansen/handbook.htm Birgé
 in Festschrift for Lucien Le Cam: Research Papers in Probability and Statistics
, 2004
Cited by 26 (9 self)
Estimating functions provide a general framework for finding estimators and studying their properties in many different kinds of statistical models, including stochastic process models. An estimating function is a function of the data as well as of the parameter to be estimated. An estimator is obtained by equating the estimating function to zero and solving the resulting
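The idea of obtaining an estimator by setting an estimating function to zero can be illustrated with a small sketch. It assumes a far simpler model than the diffusion-type models the chapter treats: i.i.d. exponential data, with the score function as the estimating function. The bisection solver, the data values, and all names are hypothetical.

```python
def solve_estimating_equation(g, lo, hi, tol=1e-10, max_iter=200):
    """Find a root of g on [lo, hi] by bisection.
    Assumes g is continuous and changes sign on the interval."""
    g_lo, g_hi = g(lo), g(hi)
    if g_lo == 0:
        return lo
    if g_hi == 0:
        return hi
    if g_lo * g_hi > 0:
        raise ValueError("g must change sign on [lo, hi]")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        g_mid = g(mid)
        if g_mid == 0 or (hi - lo) / 2 < tol:
            return mid
        if g_lo * g_mid < 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return (lo + hi) / 2


# Illustrative data, assumed to be i.i.d. exponential with unknown rate.
data = [0.8, 1.3, 0.4, 2.1, 0.9]


def estimating_function(rate):
    """Score of the exponential model: a function of data and parameter."""
    return len(data) / rate - sum(data)


rate_hat = solve_estimating_equation(estimating_function, 0.01, 100.0)
print(rate_hat)
```

In this toy case the root has the closed form `len(data) / sum(data)`, which gives a direct check on the numerical solution.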
On the Evaluation of Document Analysis Components by Recall, Precision, and Accuracy
 International Conference on Document Analysis and Recognition, India
, 1999
Cited by 23 (0 self)
In document analysis, it is common to prove the usefulness of a component by an experimental evaluation. By applying the respective algorithms to a test sample, some effectiveness measures such as recall, precision, and accuracy are computed. The goal of such an evaluation is twofold: on the one hand, it shows that the absolute effectiveness of the algorithm is acceptable for practical use. On the other hand, the evaluation can prove that the algorithm has a better or worse effectiveness than another algorithm. In this paper we argue that experimental evaluation on relatively small test sets, as is very common in document analysis, has to be taken with extreme care from a statistical point of view. In fact, it is surprising how weak the statements derived from such evaluations are.
1 Introduction
The task of document analysis is to transform printed documents into an equivalent electronic representation. Typical problems of document analysis systems are: image processing, layout se...
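To give a feel for how weak such statements can be, the sketch below computes a confidence interval for an accuracy measured on a small test set. It uses the Wilson score interval, which is an assumption made here for illustration and not necessarily the method used in the paper; the counts are hypothetical.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion such as
    accuracy, recall, or precision (z=1.96 is the normal quantile)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Hypothetical results: algorithm A gets 90 of 100 test items right,
# algorithm B gets 85 of 100.
interval_a = wilson_interval(90, 100)
interval_b = wilson_interval(85, 100)
print(interval_a, interval_b)
```

With only 100 test items the two intervals overlap substantially, so a five-point accuracy difference gives only rough, informal evidence that one algorithm is better; a proper comparison would need a test designed for the difference itself.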
Geometric morphometrics: ten years of progress following the ‘revolution’
 Italian Journal of Zoology
, 2004
Cited by 22 (0 self)
The analysis of shape is a fundamental part of much biological research. As the field of statistics developed, so has the sophistication of the analysis of these types of data. This led to multivariate morphometrics, in which suites of measurements were analyzed together using canonical variates analysis, principal components analysis, and related methods. In the 1980s, a fundamental change began in the nature of the data gathered and analyzed. This change focused on the coordinates of landmarks and the geometric information about their relative positions. As a byproduct of such an approach, results of multivariate analyses could be visualized as configurations of landmarks back in the original space of the organism rather than only as statistical scatter plots. This new approach, called “geometric morphometrics”, had benefits that led Rohlf and Marcus (1993) to proclaim a “revolution” in morphometrics. In this paper, we briefly update the discussion in that paper and summarize the advances in the ten years since the paper by Rohlf and Marcus. We also speculate on future directions in morphometric analysis.
Knowledge Discovery Through Induction with Randomization Testing
, 1991
Cited by 15 (3 self)
IRT embodies a view of induction as a four-phase process (shown in Figure 1). The process alters a current model by generating a group of new competitor models, fitting those competitor models to data, comparing the competitors to each other, and then testing the statistical significance of the competitors. The process is iterative: it can be repeated until no competitor can be found that is significantly better than the current model.
[Figure 1: IRT's inductive process. Flowchart nodes: current model; generate competitors; fit competitors; compare competitors; test significance; "any competitor significantly better?" (yes/no); replace current model with competitor; "continue searching?" (yes/no); accept current model.]
Generating competitors creates one or more models with a different structure than the current model. Examples include creating decision trees with different attributes and structure or classification rules with different conditions and operators. Each of these competitors is a candidate to replac...
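The iterative generate, fit, compare, and test loop described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: models are opaque objects, fitting is folded into a hypothetical `score` callable that returns per-case scores, and significance is assessed with a basic two-sample randomization (permutation) test on mean scores.

```python
import random


def mean(values):
    return sum(values) / len(values)


def randomization_p_value(current_scores, competitor_scores,
                          n_shuffles=2000, seed=0):
    """One-sided randomization test: how often does a random relabelling
    of the pooled scores favour the competitor at least as much as observed?"""
    rng = random.Random(seed)
    observed = mean(competitor_scores) - mean(current_scores)
    pooled = list(current_scores) + list(competitor_scores)
    n_cur = len(current_scores)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = mean(pooled[n_cur:]) - mean(pooled[:n_cur])
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)


def irt_search(current, generate_competitors, score, alpha=0.05, max_rounds=10):
    """Replace the current model while some competitor is significantly better.
    `generate_competitors` and `score` are hypothetical callables."""
    for _ in range(max_rounds):
        candidates = generate_competitors(current)
        if not candidates:
            break
        # Compare competitors to each other: keep the best mean score.
        best = max(candidates, key=lambda m: mean(score(m)))
        if randomization_p_value(score(current), score(best)) < alpha:
            current = best
        else:
            break
    return current
```

A toy usage, where models are integers and larger integers score better: `irt_search(0, lambda m: [m + 1] if m < 3 else [], lambda m: [0.5 + 0.1 * m + 0.001 * i for i in range(8)])` walks from model 0 up to model 3. The sketch does not adjust for testing many competitors at once, which a careful treatment would have to address.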
The second order ancillary: A differential view with continuity. Bernoulli
, 2010
Cited by 12 (4 self)
Second order approximate ancillaries have evolved as the primary ingredient for recent likelihood development in statistical inference. This uses quantile functions rather than the equivalent distribution functions, and the intrinsic ancillary contour is given explicitly as the plug-in estimate of the vector quantile function. The derivation uses a Taylor expansion of the full quantile function, and the linear term gives a tangent to the observed ancillary contour. For the scalar parameter case, there is a vector field that integrates to give the ancillary contours, but for the vector case, there are multiple vector fields and the Frobenius conditions for mutual consistency may not hold. We demonstrate, however, that the conditions hold in a restricted way and that this verifies the second order ancillary contours in moderate deviations. The methodology can generate an appropriate exact ancillary when such exists or an approximate ancillary for the numerical or Monte Carlo calculation of p-values and confidence quantiles. Examples are given, including nonlinear regression and several enigmatic examples from the literature.