Interval estimation for a binomial proportion
 Statist. Sci
, 2001
Cited by 80 (2 self)
Abstract. We revisit the problem of interval estimation of a binomial proportion. The erratic behavior of the coverage probability of the standard Wald confidence interval has previously been remarked on in the literature (Blyth and Still, Agresti and Coull, Santner and others). We begin by showing that the chaotic coverage properties of the Wald interval are far more persistent than is appreciated. Furthermore, common textbook prescriptions regarding its safety are misleading and defective in several respects and cannot be trusted. This leads us to consideration of alternative intervals. A number of natural alternatives are presented, each with its motivation and context. Each interval is examined for its coverage probability and its length. Based on this analysis, we recommend the Wilson interval or the equal-tailed Jeffreys prior interval for small n and the interval suggested in Agresti and Coull for larger n. We also provide an additional frequentist justification for use of the Jeffreys interval. Key words and phrases: Bayes, binomial distribution, confidence intervals, coverage probability, Edgeworth expansion, expected length, Jeffreys prior, normal approximation, posterior.
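As a rough illustration of the comparison this abstract describes, the following Python sketch computes the Wald, Wilson, and Agresti-Coull intervals from their standard textbook formulas and evaluates exact coverage by summing binomial probabilities. The function names and the 95% default are assumptions made for this example and are not taken from the paper. The Jeffreys interval is omitted because it needs a beta quantile function, which the Python standard library does not provide.

```python
import math
from statistics import NormalDist


def _z(conf):
    # Two-sided standard normal critical value for the given confidence level.
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)


def wald_interval(x, n, conf=0.95):
    # Standard Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n).
    z = _z(conf)
    p_hat = x / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def wilson_interval(x, n, conf=0.95):
    # Wilson (score) interval.
    z = _z(conf)
    p_hat = x / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def agresti_coull_interval(x, n, conf=0.95):
    # Agresti-Coull interval: a Wald-type interval around the adjusted estimate.
    z = _z(conf)
    n_adj = n + z * z
    p_adj = (x + z * z / 2) / n_adj
    half = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - half, p_adj + half


def coverage(interval, n, p, conf=0.95):
    # Exact coverage probability: total binomial mass of the outcomes x
    # whose interval contains the true proportion p.
    total = 0.0
    for x in range(n + 1):
        lo, hi = interval(x, n, conf)
        if lo <= p <= hi:
            total += math.comb(n, x) * p**x * (1 - p) ** (n - x)
    return total
```

For instance, comparing `coverage(wald_interval, 20, 0.05)` with `coverage(wilson_interval, 20, 0.05)` shows the Wald interval falling well short of the nominal level at small n, because its interval for x = 0 collapses to the single point 0.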
Severe Testing as a Basic Concept in a Neyman–Pearson Philosophy of Induction
 BRITISH JOURNAL FOR THE PHILOSOPHY OF SCIENCE
, 2006
Cited by 35 (14 self)
Despite the widespread use of key concepts of the Neyman–Pearson (N–P) statistical paradigm (type I and II errors, significance levels, power, confidence levels), they have been the subject of philosophical controversy and debate for over 60 years. Both current and longstanding problems of N–P tests stem from unclarity and confusion, even among N–P adherents, as to how a test’s (pre-data) error probabilities are to be used for (post-data) inductive inference as opposed to inductive behavior. We argue that the relevance of error probabilities is to ensure that only statistical hypotheses that have passed severe or probative tests are inferred from the data. The severity criterion supplies a metastatistical principle for evaluating proposed statistical inferences, avoiding classic fallacies from tests that are overly sensitive, as well as those not sensitive enough to particular errors and discrepancies.
Design of Experiments to Evaluate CAD Algorithms: Which Improvements Are Due to Improved Heuristic and Which Are Merely Due to Chance?
, 1998
The Dempster–Shafer calculus for statisticians
 International Journal of Approximate Reasoning
, 2007
Cited by 23 (1 self)
The Dempster–Shafer (DS) theory of probabilistic reasoning is presented in terms of a semantics whereby every meaningful formal assertion is associated with a triple (p, q, r) where p is the probability “for” the assertion, q is the probability “against” the assertion, and r is the probability of “don’t know”. Arguments are presented for the necessity of “don’t know”. Elements of the calculus are sketched, including the extension of a DS model from a margin to a full state space, and DS combination of independent DS uncertainty assessments on the full space. The methodology is applied to inference and prediction from Poisson counts, including an introduction to the use of join-tree model structure to simplify and shorten computation. The relation of DS theory to statistical significance testing is elaborated, introducing along the way the new concept of “dull” null hypothesis. Key words: Dempster–Shafer; belief functions; state space; Poisson model; join-tree computation; statistical significance; dull null hypothesis
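One way to picture the (p, q, r) semantics is to combine two independent assessments of a single assertion using Dempster's rule on the two-element frame {true, false}. The Python sketch below illustrates only that special case, not the paper's calculus on full state spaces. The function name `ds_combine` is hypothetical.

```python
from fractions import Fraction


def ds_combine(a, b):
    """Combine two independent (p, q, r) triples for the same assertion.

    p = mass "for", q = mass "against", r = mass "don't know".
    Uses Dempster's rule on a two-element frame: multiply the masses,
    discard the conflicting products, and renormalize.
    """
    p1, q1, r1 = a
    p2, q2, r2 = b
    # Mass where one source says "for" and the other says "against".
    conflict = p1 * q2 + q1 * p2
    if conflict == 1:
        raise ValueError("totally conflicting assessments cannot be combined")
    norm = 1 - conflict
    p = (p1 * p2 + p1 * r2 + r1 * p2) / norm
    q = (q1 * q2 + q1 * r2 + r1 * q2) / norm
    r = (r1 * r2) / norm
    return p, q, r


# Two sources that each give 1/2 "for" and 1/2 "don't know".
half = Fraction(1, 2)
combined = ds_combine((half, Fraction(0), half), (half, Fraction(0), half))
```

Combining with the vacuous triple (0, 0, 1), which is entirely "don't know", leaves the other assessment unchanged. This is one reason the r component is useful for representing ignorance.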
Exploiting the generic viewpoint assumption
 IJCV
, 1996
Cited by 22 (1 self)
The "generic viewpoint" assumption states that an observer is not in a special position relative to the scene. It is commonly used to disqualify scene interpretations that assume special viewpoints, following a binary decision that the viewpoint was either generic or accidental. In this paper, we apply Bayesian statistics to quantify the probability of a view, and so derive a useful tool to estimate scene parameters. This approach may increase the scope and accuracy of scene estimates. It applies to a range of vision problems. We show shape from shading examples, where we rank shapes or reflectance functions in cases where these are otherwise unknown. The rankings agree with the perceived values.
Inference and Hierarchical Modeling in the Social Sciences
, 1995
Cited by 21 (6 self)
In this paper I (1) examine three levels of inferential strength supported by typical social science data-gathering methods, and call for a greater degree of explicitness, when hierarchical models (HMs) and other models are applied, in identifying which level is appropriate; (2) reconsider the use of HMs in school effectiveness studies and meta-analysis from the perspective of causal inference; and (3) recommend the increased use of Gibbs sampling and other Markov-chain Monte Carlo (MCMC) methods in the application of HMs in the social sciences, so that comparisons between MCMC and better-established fitting methods (including full or restricted maximum likelihood estimation based on the EM algorithm, Fisher scoring or iterative generalized least squares) may be more fully informed by empirical practice.
Bayesian Decision Theory, the Maximum Local Mass Estimate, and Color Constancy
 IN PROCEEDINGS: FIFTH INTERNATIONAL CONFERENCE ON COMPUTER VISION, PP 210-217, (IEEE COMPUTER
, 1995
Cited by 18 (4 self)
Vision algorithms are often developed in a Bayesian framework. Two estimators are commonly used: maximum a posteriori (MAP), and minimum mean squared error (MMSE). We argue that neither is appropriate for perception problems. The MAP estimator makes insufficient use of structure in the posterior probability. The squared error penalty of the MMSE estimator does not reflect typical penalties. We describe a new
Whereof One Cannot Speak: When input distributions are unknown
, 1996
Cited by 18 (2 self)
One of the major criticisms of probabilistic risk assessment is that the requisite input distributions are often not available. Several approaches to this problem have been suggested, including creating a library of standard empirically fitted distributions, employing maximum entropy criteria to synthesize distributions from a priori constraints, and even using "default" inputs such as the triangular distribution. Since empirical information is often sparse, analysts commonly must make assumptions to select the input distributions without empirical justification. This practice diminishes the credibility of the assessment and any decisions based on it. There is no absolute necessity, however, of assuming particular shapes for input distributions in probabilistic risk assessments. It is possible to make the needed calculations using inputs specified only as bounds on probability distributions. We describe such bounds for a variety of circumstances where empirical information is extremely limited, and illustrate how these bounds can be used in computations to represent uncertainty about input distributions far more comprehensively than is possible with current approaches.
A tutorial on conformal prediction
 Journal of Machine Learning Research
, 2008
Cited by 18 (1 self)
Conformal prediction uses past experience to determine precise levels of confidence in new predictions. Given an error probability ε, together with a method that makes a prediction ŷ of a label y, it produces a set of labels, typically containing ŷ, that also contains y with probability 1 − ε. Conformal prediction can be applied to any method for producing ŷ: a nearest-neighbor method, a support-vector machine, ridge regression, etc. Conformal prediction is designed for an online setting in which labels are predicted successively, each one being revealed before the next is predicted. The most novel and valuable feature of conformal prediction is that if the successive examples are sampled independently from the same distribution, then the successive predictions will be right 1 − ε of the time, even though they are based on an accumulating data set rather than on independent data sets. In addition to the model under which successive examples are sampled independently, other online compression models can also use conformal prediction. The widely used Gaussian linear model is one of these. This tutorial presents a self-contained account of the theory of conformal prediction and works through several numerical examples. A more comprehensive treatment of the topic is provided in
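The set-valued prediction described in this abstract can be sketched for a toy classification problem with one numeric feature. The sketch is illustrative only. The nonconformity score used here (distance to the nearest other example with the same label) is a simple choice made for this example rather than the one the tutorial necessarily uses, and the names `nonconformity` and `conformal_set` are hypothetical.

```python
def nonconformity(bag, i):
    # Score for example i: distance to the nearest *other* example in the
    # bag that carries the same label. Larger means "stranger".
    x_i, y_i = bag[i]
    dists = [abs(x_i - x_j)
             for j, (x_j, y_j) in enumerate(bag)
             if j != i and y_j == y_i]
    return min(dists) if dists else float("inf")


def conformal_set(train, x_new, labels, eps):
    # For each candidate label y, tentatively add (x_new, y) to the bag,
    # score every example, and compute the p-value: the fraction of
    # examples at least as nonconforming as the new one.
    # Labels whose p-value exceeds eps are kept in the prediction set.
    region = []
    for y in labels:
        bag = train + [(x_new, y)]
        scores = [nonconformity(bag, i) for i in range(len(bag))]
        p_value = sum(s >= scores[-1] for s in scores) / len(bag)
        if p_value > eps:
            region.append(y)
    return region


# Toy data: label 'a' clusters near 0-3, label 'b' near 50-53.
train = [(0.0, "a"), (1.0, "a"), (2.0, "a"), (3.0, "a"),
         (50.0, "b"), (51.0, "b"), (52.0, "b"), (53.0, "b")]
narrow = conformal_set(train, 1.5, ["a", "b"], eps=0.2)
wide = conformal_set(train, 1.5, ["a", "b"], eps=0.05)
```

With these eight training examples the smallest possible p-value is 1/9, so any ε below that keeps every label in the set. A smaller error probability gives a larger, less informative prediction set.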
Improved likelihood inference for discrete data
 J. R. Statist. Soc. B
, 2006
Cited by 16 (6 self)
Summary. Discrete data, particularly count and contingency table data, are typically analyzed using methods that are accurate to first order, such as normal approximations for maximum likelihood estimators. By contrast continuous data can quite generally be analyzed using third order procedures, with major improvements in accuracy and with intrinsic separation of information concerning parameter components. This paper extends these higher order results to discrete data, yielding a methodology that is widely applicable and accurate to second order. The extension can be described in terms of an approximating exponential model expressed in terms of a score variable. The development is outlined and the flexibility of the approach illustrated by examples.