Results 1–10 of 22
The Interplay of Bayesian and Frequentist Analysis
Statist. Sci., 2004
"... Statistics has struggled for nearly a century over the issue of whether the Bayesian or frequentist paradigm is superior. This debate is far from over and, indeed, should continue, since there are fundamental philosophical and pedagogical issues at stake. At the methodological level, however, the fi ..."
Abstract

Cited by 28 (0 self)
 Add to MetaCart
Statistics has struggled for nearly a century over the issue of whether the Bayesian or frequentist paradigm is superior. This debate is far from over and, indeed, should continue, since there are fundamental philosophical and pedagogical issues at stake. At the methodological level, however, the fight has become considerably muted, with the recognition that each approach has a great deal to contribute to statistical practice and each is actually essential for full development of the other approach. In this article, we embark upon a rather idiosyncratic walk through some of these issues. Key words and phrases: Admissibility; Bayesian model checking; conditional frequentist; confidence intervals; consistency; coverage; design; hierarchical models; nonparametric
Testing the Significance of Attribute Interactions
In Proc. of the 21st International Conference on Machine Learning (ICML), 2004
"... Attribute interactions are the irreducible dependencies between attributes. Interactions underlie feature relevance and selection, the structure of joint probability and classification models: if and only if the attributes interact, they should be connected. While the issue of 2way interactions, es ..."
Abstract

Cited by 20 (3 self)
 Add to MetaCart
Attribute interactions are the irreducible dependencies between attributes. Interactions underlie feature relevance and selection, the structure of joint probability and classification models: if and only if the attributes interact, they should be connected. While the issue of 2-way interactions, especially of those between an attribute and the label, has already been addressed, we introduce an operational definition of a generalized n-way interaction by highlighting two models: the reductionistic part-to-whole approximation, where the model of the whole is reconstructed from models of the parts, and the holistic reference model, where the whole is modelled directly.
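The 2-way interaction between an attribute and the label that this abstract refers to is commonly quantified as mutual information; the sketch below (an illustration of that baseline notion, not the paper's own algorithm) estimates it from co-occurrence counts.

```python
from collections import Counter
from math import log2

def mutual_information(pairs):
    """Estimate I(X;Y) in bits from a list of (x, y) observations."""
    n = len(pairs)
    pxy = Counter(pairs)               # joint counts of (x, y)
    px = Counter(x for x, _ in pairs)  # marginal counts of X
    py = Counter(y for _, y in pairs)  # marginal counts of Y
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

# Two perfectly dependent binary attributes interact with exactly 1 bit:
print(mutual_information([(0, 0), (1, 1)] * 50))  # 1.0
```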
Information, Divergence and Risk for Binary Experiments
Journal of Machine Learning Research, 2009
"... We unify fdivergences, Bregman divergences, surrogate regret bounds, proper scoring rules, cost curves, ROCcurves and statistical information. We do this by systematically studying integral and variational representations of these various objects and in so doing identify their primitives which all ..."
Abstract

Cited by 17 (6 self)
 Add to MetaCart
We unify f-divergences, Bregman divergences, surrogate regret bounds, proper scoring rules, cost curves, ROC curves and statistical information. We do this by systematically studying integral and variational representations of these various objects and in so doing identify their primitives, which all are related to cost-sensitive binary classification. As well as developing relationships between generative and discriminative views of learning, the new machinery leads to tight and more general surrogate regret bounds and generalised Pinsker inequalities relating f-divergences to variational divergence. The new viewpoint also illuminates existing algorithms: it provides a new derivation of Support Vector Machines in terms of divergences and relates Maximum Mean Discrepancy to Fisher Linear Discriminants.
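As a concrete anchor for the unification the abstract describes, here is a minimal sketch (assumed finite discrete setting, not the paper's notation) of an f-divergence D_f(P||Q) = sum_i q_i f(p_i/q_i); choosing the generator f(t) = t log t recovers the KL divergence.

```python
from math import log

def f_divergence(p, q, f):
    """D_f(P||Q) for finite distributions p, q with all q_i > 0."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

kl_generator = lambda t: t * log(t)  # f(t) = t log t yields KL divergence
p, q = [0.5, 0.5], [0.9, 0.1]
print(f_divergence(p, q, kl_generator))
```

The same `f_divergence` skeleton gives total variation with f(t) = |t - 1| / 2, which is one face of the unification the paper studies.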
Multiple testing in statistical analysis of systems-based information retrieval experiments
 ACM TOIS
"... Highquality reusable test collections and formal statistical hypothesis testing together support a rigorous experimental environment for information retrieval research. But as Armstrong et al. [2009b] recently argued, global analysis of experiments suggests that there has actually been little real ..."
Abstract

Cited by 6 (0 self)
 Add to MetaCart
High-quality reusable test collections and formal statistical hypothesis testing together support a rigorous experimental environment for information retrieval research. But as Armstrong et al. [2009b] recently argued, global analysis of experiments suggests that there has actually been little real improvement in ad hoc retrieval effectiveness over time. We investigate this phenomenon in the context of simultaneous testing of many hypotheses using a fixed set of data. We argue that the most common approaches to significance testing ignore a great deal of information about the world. Taking into account even a fairly small amount of this information can lead to very different conclusions about systems than those that have appeared in published literature. We demonstrate how to model a set of IR experiments for analysis both mathematically and practically, and show that doing so can cause p-values from statistical hypothesis tests to increase by orders of magnitude. This has major consequences for the interpretation of experimental results using reusable test collections: it is very difficult to conclude that anything is significant once we have modeled many of the sources of randomness in experimental design and analysis.
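The effect the abstract reports, p-values growing by orders of magnitude once many simultaneous comparisons are accounted for, can be previewed with even the crudest adjustment; this Bonferroni sketch is an illustration of that general phenomenon, not the authors' analysis.

```python
def bonferroni(p_values):
    """Adjust p-values for m simultaneous tests, each capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# A p-value that looks significant in isolation may not survive
# correction across many system comparisons:
raw = [0.001, 0.012, 0.04, 0.3]
print(bonferroni(raw))  # [0.004, 0.048, 0.16, 1.0]
```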
Modelling modelled
 S.E.E.D. Journal
"... A model is one of the most fundamental concepts: it is a formal and generalized explanation of a phenomenon. Only with models we can bridge the particulars and predict the unknown. Virtually all our intellectual work turns around finding models, evaluating models, using models. Because models are so ..."
Abstract

Cited by 1 (1 self)
 Add to MetaCart
A model is one of the most fundamental concepts: it is a formal and generalized explanation of a phenomenon. Only with models can we bridge the particulars and predict the unknown. Virtually all our intellectual work revolves around finding models, evaluating models, and using models. Because models are so pervasive, it makes sense to take a look at modelling itself. We will approach this problem, of course, by …
Forecasting without significance tests?
2008
"... Statistical significance testing has little useful purpose in business forecasting, and other tools are to be preferred. For selecting or ranking forecasting methods (especially those based on models) there exist simple but powerful and practical alternative approaches that are not tests. It is sugg ..."
Abstract
 Add to MetaCart
Statistical significance testing has little useful purpose in business forecasting, and other tools are to be preferred. For selecting or ranking forecasting methods (especially those based on models) there exist simple but powerful and practical alternative approaches that are not tests. It is suggested that forecasters place less emphasis on p-values and more emphasis on the predictive ability of models.
Probabilistic Inference: Test and Multiple Tests
2009
"... In this paper, we view that realworld scientific inference about an assertion of interest on unknown quantities is to produce a probability triplet (p,q,r), conditioned on available data. The probabilities p and q are for and against the truth of the assertion, whereas r = 1 − p − q is the remainin ..."
Abstract
 Add to MetaCart
In this paper, we take the view that real-world scientific inference about an assertion concerning unknown quantities is to produce a probability triplet (p, q, r), conditioned on available data. The probabilities p and q are for and against the truth of the assertion, whereas r = 1 − p − q is the remaining probability, called the probability of “don’t know”. Such a (p, q, r)-formulation provides a promising way of representing realistic uncertainty assessment in statistical inference. With a brief discussion of what we call inferential models for producing (p, q, r) probability triplets for assertions, we focus on a particular inferential model for inference about an unobserved sorted uniform sample. We show how this inferential model can be used for (i) single tests, (ii) robust estimation of the empirical null distribution in the context of the local FDR method of Bradley Efron, and (iii) large-scale simultaneous hypothesis problems, including the many-normal-means problem and the problem of identifying significantly expressed genes in microarray data analysis. These examples indicate that hypothesis testing problems can be formulated and solved in a new way through probabilistic inference.
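The bookkeeping behind the (p, q, r) triplet described in the abstract fits in a few lines; this is a hypothetical sketch of the representation only, not the paper's inferential model.

```python
def pqr_triplet(p, q):
    """Return (p, q, r) with r = 1 - p - q, the residual 'don't know' mass."""
    r = 1.0 - p - q
    if min(p, q, r) < 0:
        raise ValueError("p, q and r must all be non-negative")
    return (p, q, r)

# Evidence 0.6 for and 0.25 against an assertion leaves 0.15 'don't know':
p, q, r = pqr_triplet(0.6, 0.25)
```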
Preprint of the Book Chapter: “Bayesian Versus Frequentist Inference”
"... Throughout this book, the topic of orderrestricted inference is dealt with almost exclusively from a Bayesian perspective. Some readers may wonder why the other main school for statistical inference – frequentist inference – has received so little attention here. Isn’t it true that in the field of ..."
Abstract
 Add to MetaCart
Throughout this book, the topic of order-restricted inference is dealt with almost exclusively from a Bayesian perspective. Some readers may wonder why the other main school for statistical inference – frequentist inference – has received so little attention here. Isn’t it true that in the field of psychology, almost all inference is frequentist inference? The first goal of this chapter is to highlight why frequentist inference is a less-than-ideal method for statistical inference. The most fundamental limitation of standard frequentist inference is that it does not condition on the observed data. The resulting paradoxes have sparked a philosophical debate that statistical practitioners have conveniently ignored. What cannot be so easily ignored are the practical limitations of frequentist inference, such as its restriction to nested model comparisons. The second goal of this chapter is to highlight the theoretical and practical advantages of a Bayesian analysis. From a theoretical perspective, Bayesian inference is principled and prescriptive, and – in contrast to frequentist inference – a method that does condition on the observed data. From a practical perspective, Bayesian inference …
The evidence for and against astronomical impacts on climate change and mass extinctions: A review