Testing the Significance of Attribute Interactions
- In Proc. of the 21st International Conference on Machine Learning (ICML), 2004
Cited by 16 (2 self)
Attribute interactions are the irreducible dependencies between attributes. Interactions underlie feature relevance and selection, the structure of joint probability and classification models: if and only if the attributes interact, they should be connected. While the issue of 2-way interactions, especially of those between an attribute and the label, has already been addressed, we introduce an operational definition of a generalized n-way interaction by highlighting two models: the reductionistic part-to-whole approximation, where the model of the whole is reconstructed from models of the parts, and the holistic reference model, where the whole is modelled directly.
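The notion of an irreducible dependency can be illustrated with a small sketch. It is not the paper's test: it assumes the classical interaction information I(A;B;C) = I(A,B;C) - I(A;C) - I(B;C) as the measure, and the function names and the XOR data set are hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(samples):
    """Empirical Shannon entropy (in bits) of a sequence of hashable values."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def interaction_information(a, b, c):
    """I(A;B;C) = I(A,B;C) - I(A;C) - I(B;C): what the pair (A,B) tells us
    about C beyond what A and B tell us separately (assumed measure)."""
    ab = list(zip(a, b))
    return (mutual_information(ab, c)
            - mutual_information(a, c)
            - mutual_information(b, c))


# Hypothetical data: every combination of two bits, with label C = A xor B.
rows = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2)]
A, B, C = (list(col) for col in zip(*rows))

pairwise_ac = mutual_information(A, C)         # no 2-way dependence on the label
pairwise_bc = mutual_information(B, C)         # no 2-way dependence on the label
three_way = interaction_information(A, B, C)   # the dependence is purely 3-way
```

On this XOR data neither attribute alone is informative about the label, yet the pair determines it completely, so the whole cannot be reconstructed from the 2-way parts.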
Information, Divergence and Risk for Binary Experiments
- Journal of Machine Learning Research, 2009
Cited by 4 (2 self)
We unify f-divergences, Bregman divergences, surrogate regret bounds, proper scoring rules, cost curves, ROC-curves and statistical information. We do this by systematically studying integral and variational representations of these various objects and in so doing identify their primitives which all are related to cost-sensitive binary classification. As well as developing relationships between generative and discriminative views of learning, the new machinery leads to tight and more general surrogate regret bounds and generalised Pinsker inequalities relating f-divergences to variational divergence. The new viewpoint also illuminates existing algorithms: it provides a new derivation of Support Vector Machines in terms of divergences and relates Maximum Mean Discrepancy to Fisher Linear Discriminants.
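For orientation, the best-known special case of a Pinsker-type bound is the classical inequality KL(P, Q) >= V(P, Q)^2 / 2, where V is the variational (L1) divergence and KL is measured in nats. The sketch below checks that classical bound numerically on discrete distributions; it does not reproduce the paper's generalised inequalities, and the function names are hypothetical.

```python
from math import log


def kl_divergence(p, q):
    """KL(P || Q) in nats for discrete distributions given as sequences.
    Assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def variational_divergence(p, q):
    """V(P, Q) = sum_i |p_i - q_i|, which ranges over [0, 2]."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))


def pinsker_gap(p, q):
    """KL(P || Q) - V(P, Q)^2 / 2; the classical inequality says this is >= 0."""
    return kl_divergence(p, q) - 0.5 * variational_divergence(p, q) ** 2


# Check the bound on a grid of Bernoulli pairs (hypothetical test points).
grid = [i / 10 for i in range(1, 10)]
gaps = [pinsker_gap((a, 1 - a), (b, 1 - b)) for a in grid for b in grid]
bound_holds = all(g >= -1e-12 for g in gaps)
```

The gap is zero when the two distributions coincide and positive otherwise, which is the sense in which the KL divergence is bounded below by a function of the variational divergence.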
Modelling modelled
- S.E.E.D. Journal
Cited by 1 (1 self)
A model is one of the most fundamental concepts: it is a formal and generalized explanation of a phenomenon. Only with models can we bridge the particulars and predict the unknown. Virtually all our intellectual work revolves around finding models, evaluating models, and using models. Because models are so pervasive, it makes sense to take a look at modelling itself. We will approach this problem, of course, by
Dempster-Shafer Inference with Weak Beliefs (preprint)
Beliefs specified for predicting an unobserved realization of pivotal variables in the context of fiducial and Dempster-Shafer (DS) inference can be weakened for credible inference. We consider predictive random sets for predicting an unobserved random sample from a known distribution, e.g., the uniform distribution U(0, 1). More specifically, we choose our beliefs for inference in two steps: (i) define a class of weak beliefs in terms of DS models for predicting an unobserved sample, and (ii) seek a belief within that class to balance the trade-off between credibility and efficiency of the resulting DS inference. We call this approach the Maximal Belief (MB) method. The MB method is illustrated with two examples: (1) inference about µ based on a sample of size n from the Gaussian model N(µ, 1), and (2) inference about the number of outliers (µi ≠ 0) based on the observed data X1, ..., Xn with the model Xi ∼ N(µi, 1), independently for each i. The first example shows that MB-DS analysis does a type of conditional inference. The second example demonstrates that MB posterior probabilities are easy to interpret for hypothesis testing.
Forecasting without significance tests?
- 2008
Statistical significance testing has little useful purpose in business forecasting, and other tools are to be preferred. For selecting or ranking forecasting methods (especially those based on models) there exist simple but powerful and practical alternative approaches that are not tests. It is suggested that forecasters place less emphasis on p values and more emphasis on the predictive ability of models.
Probabilistic Inference: Test and Multiple Tests
- 2009
In this paper, we take the view that real-world scientific inference about an assertion of interest on unknown quantities should produce a probability triplet (p,q,r), conditioned on available data. The probabilities p and q are for and against the truth of the assertion, whereas r = 1 − p − q is the remaining probability, called the probability of “don’t know”. Such a (p,q,r)-formulation provides a promising way of representing realistic uncertainty assessment in statistical inference. After a brief discussion of what we call inferential models for producing (p,q,r) probability triplets for assertions, we focus on a particular inferential model for inference about an unobserved sorted uniform sample. We show how this inferential model can be used for (i) single tests, (ii) robust estimation of the empirical null distribution in the context of the local FDR method of Bradley Efron, and (iii) large-scale simultaneous hypothesis problems, including the many-normal-means problem and the problem of identifying significantly expressed genes in microarray data analysis. These examples indicate that hypothesis testing problems can be formulated and solved in a new way through probabilistic inference.
Preprint of the Book Chapter: “Bayesian Versus Frequentist Inference”
Throughout this book, the topic of order-restricted inference is dealt with almost exclusively from a Bayesian perspective. Some readers may wonder why the other main school for statistical inference – frequentist inference – has received so little attention here. Isn’t it true that in the field of psychology, almost all inference is frequentist inference? The first goal of this chapter is to highlight why frequentist inference is a less-than-ideal method for statistical inference. The most fundamental limitation of standard frequentist inference is that it does not condition on the observed data. The resulting paradoxes have sparked a philosophical debate that statistical practitioners have conveniently ignored. What cannot be so easily ignored are the practical limitations of frequentist inference, such as its restriction to nested model comparisons. The second goal of this chapter is to highlight the theoretical and practical advantages of a Bayesian analysis. From a theoretical perspective, Bayesian inference is principled and prescriptive, and – in contrast to frequentist inference – a method that does condition on the observed data. From a practical perspective, Bayesian inference
The evidence for and against astronomical impacts on climate change and mass extinctions: A review

