Results 1–10 of 10
A New Metric-Based Approach to Model Selection
 In Proceedings of the Fourteenth National Conference on Artificial Intelligence (AAAI-97), 1997
"... We introduce a new approach to model selection that performs better than the standard complexitypenalization and holdout error estimation techniques in many cases. The basic idea is to exploit the intrinsic metric structure of a hypothesis space, as determined by the natural distribution of unlabel ..."
Abstract

Cited by 42 (5 self)
We introduce a new approach to model selection that performs better than the standard complexity-penalization and holdout error estimation techniques in many cases. The basic idea is to exploit the intrinsic metric structure of a hypothesis space, as determined by the natural distribution of unlabeled training patterns, and use this metric as a reference to detect whether the empirical error estimates derived from a small (labeled) training sample can be trusted in the region around an empirically optimal hypothesis. Using simple metric intuitions we develop new geometric strategies for detecting overfitting and performing robust yet responsive model selection in spaces of candidate functions. These new metric-based strategies dramatically outperform previous approaches in experimental studies of classical polynomial curve fitting. Moreover, the technique is simple, efficient, and can be applied to most function learning tasks. The only requirement is access to an auxiliary collection ...
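The core construction above — a metric on hypotheses induced by unlabeled data — is easy to illustrate. The following sketch (a toy polynomial setup of my own, not the paper's code) measures the distance between two fits on the small training sample versus a large unlabeled pool; a ratio far from 1 signals that error estimates in that region of hypothesis space cannot be trusted:

```python
import numpy as np

def hypothesis_distance(f, g, xs):
    """Mean absolute disagreement between two hypotheses on a sample xs.
    On a large unlabeled pool this approximates the true metric d(f, g)."""
    return np.mean(np.abs(f(xs) - g(xs)))

def distance_ratio(f, g, x_train, x_unlabeled):
    """Training-sample distance over unlabeled-pool distance.
    A ratio well below 1 means f and g look similar on the training
    points but differ elsewhere -- the signature of overfitting."""
    return (hypothesis_distance(f, g, x_train) /
            hypothesis_distance(f, g, x_unlabeled))

# Toy data: quadratic target, small labeled sample, large unlabeled pool.
rng = np.random.default_rng(0)
x_train = rng.uniform(-1, 1, 10)
y_train = x_train**2 + rng.normal(0, 0.1, 10)
x_unlabeled = rng.uniform(-1, 1, 10_000)

f = np.poly1d(np.polyfit(x_train, y_train, 2))  # well-matched degree
g = np.poly1d(np.polyfit(x_train, y_train, 9))  # interpolates the noise

print(distance_ratio(f, g, x_train, x_unlabeled))
```

Because the degree-9 fit passes near every noisy training point, it sits close to f on the training sample but swings between the points, so the ratio comes out small.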
Metric-Based Methods for Adaptive Model Selection and Regularization
 Machine Learning, 2001
"... We present a general approach to model selection and regularization that exploits unlabeled data to adaptively control hypothesis complexity in supervised learning tasks. The idea is to impose a metric structure on hypotheses by determining the discrepancy between their predictions across the di ..."
Abstract

Cited by 20 (0 self)
We present a general approach to model selection and regularization that exploits unlabeled data to adaptively control hypothesis complexity in supervised learning tasks. The idea is to impose a metric structure on hypotheses by determining the discrepancy between their predictions across the distribution of unlabeled data. We show how this metric can be used to detect untrustworthy training error estimates, and devise novel model selection strategies that exhibit theoretical guarantees against overfitting (while still avoiding underfitting). We then extend the approach to derive a general training criterion for supervised learning, yielding an adaptive regularization method that uses unlabeled data to automatically set regularization parameters. This new criterion adjusts its regularization level to the specific set of training data received, and performs well on a variety of regression and conditional density estimation tasks. The only proviso for these methods is that s...
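One way the metric can drive model selection is to inflate each model's training error by the worst-case ratio of unlabeled-pool to training-sample distance to any simpler model in the nested sequence. The sketch below is in the spirit of the paper's criteria, not the authors' exact formula; the sine target, sample sizes, and penalty form are my assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
target = lambda x: np.sin(2 * np.pi * x)

n = 15
x = rng.uniform(0, 1, n)
y = target(x) + rng.normal(0, 0.2, n)
x_unlabeled = rng.uniform(0, 1, 5_000)   # plentiful unlabeled data

degrees = range(1, 10)
models = {d: np.poly1d(np.polyfit(x, y, d)) for d in degrees}

def emp_err(h):
    """Mean squared error on the labeled training sample."""
    return np.mean((h(x) - y) ** 2)

def dist(h1, h2, pts):
    """Squared distance between two hypotheses, measured on a sample."""
    return np.mean((h1(pts) - h2(pts)) ** 2)

def adjusted_err(k):
    """Training error of model k, inflated by the worst disagreement
    between unlabeled-pool and training-sample distances to any
    simpler model in the nested sequence (an illustrative penalty)."""
    ratios = [dist(models[k], models[j], x_unlabeled) /
              max(dist(models[k], models[j], x), 1e-12)
              for j in degrees if j < k]
    return emp_err(models[k]) * max(ratios, default=1.0)

best = min(degrees, key=adjusted_err)
print("selected degree:", best)
```

High-degree fits earn tiny training errors but large distance ratios, so the adjusted criterion avoids them without any hand-set regularization parameter.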
Characterizing the Generalization Performance of Model Selection Strategies
 In ICML-97, 1997
"... : We investigate the structure of model selection problems via the bias/variance decomposition. In particular, we characterize the essential structure of a model selection task by the bias and variance profiles it generates over the sequence of hypothesis classes. This leads to a new understanding o ..."
Abstract

Cited by 15 (4 self)
We investigate the structure of model selection problems via the bias/variance decomposition. In particular, we characterize the essential structure of a model selection task by the bias and variance profiles it generates over the sequence of hypothesis classes. This leads to a new understanding of complexity-penalization methods. First, the penalty terms in effect postulate a particular profile for the variances as a function of model complexity; if the postulated and true profiles do not match, then systematic underfitting or overfitting results, depending on whether the penalty terms are too large or too small. Second, it is usually best to penalize according to the true variances of the task, and therefore no fixed penalization strategy is optimal across all problems. We then use this bias/variance characterization to identify the notion of easy and hard model selection problems. In particular, we show that if the variance profile grows too rapidly in relation to the biases t...
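The bias and variance profiles this abstract refers to can be estimated directly by refitting each model class on many synthetic training sets (the sine target, sample size, and noise level below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
target = lambda x: np.sin(2 * np.pi * x)

x_test = np.linspace(0, 1, 200)
degrees = range(1, 8)
n, trials, noise = 20, 200, 0.3

# Collect each polynomial class's predictions over many training sets.
preds = {d: [] for d in degrees}
for _ in range(trials):
    x = rng.uniform(0, 1, n)
    y = target(x) + rng.normal(0, noise, n)
    for d in degrees:
        preds[d].append(np.poly1d(np.polyfit(x, y, d))(x_test))

# bias^2 falls and variance rises along the nested sequence of classes;
# a fixed penalty term implicitly assumes one particular variance profile.
bias2, var = {}, {}
for d in degrees:
    p = np.array(preds[d])
    bias2[d] = np.mean((p.mean(axis=0) - target(x_test)) ** 2)
    var[d] = np.mean(p.var(axis=0))
    print(f"degree {d}: bias^2 = {bias2[d]:.3f}, variance = {var[d]:.3f}")
```

How steeply the variance profile climbs relative to the bias profile is exactly what, on the abstract's account, makes a model selection problem easy or hard.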
An Adaptive Regularization Criterion for Supervised Learning
 In Proceedings of ICML-2000, 2000
"... We introduce a new regularization criterion that exploits unlabeled data to adaptively control hypothesiscomplexity in general supervised learning tasks. The technique is based on an abstract metricspace view of supervised learning that has been successfully applied to model selection in pre ..."
Abstract

Cited by 8 (2 self)
We introduce a new regularization criterion that exploits unlabeled data to adaptively control hypothesis complexity in general supervised learning tasks. The technique is based on an abstract metric-space view of supervised learning that has been successfully applied to model selection in previous research. The new regularization criterion we introduce involves no free parameters and yet performs well on a variety of regression and conditional density estimation tasks. The only proviso is that sufficient unlabeled training data be available. We demonstrate the effectiveness of our approach on learning radial basis functions and polynomials for regression, and learning logistic regression models for conditional density estimation. 1. Introduction In the canonical supervised learning task one is given a training set ⟨x1, y1⟩, ..., ⟨xt, yt⟩ and attempts to infer a hypothesis function h : X → Y that achieves a small prediction error err(h(x), y) on future test exampl...
Ordering and finding the best of K > 2 supervised learning algorithms
 IEEE Trans. Pattern Anal. Mach. Intell., 2006
"... Abstract—Given a data set and a number of supervised learning algorithms, we would like to find the algorithm with the smallest expected error. Existing pairwise tests allow a comparison of two algorithms only; range tests and ANOVA check whether multiple algorithms have the same expected error and ..."
Abstract

Cited by 6 (2 self)
Given a data set and a number of supervised learning algorithms, we would like to find the algorithm with the smallest expected error. Existing pairwise tests allow a comparison of two algorithms only; range tests and ANOVA check whether multiple algorithms have the same expected error and cannot be used for finding the smallest. We propose a methodology, the MultiTest algorithm, whereby we order supervised learning algorithms taking into account 1) the result of pairwise statistical tests on expected error (what the data tells us), and 2) our prior preferences, e.g., due to complexity. We define the problem in graph-theoretic terms and propose an algorithm to find the “best” learning algorithm in terms of these two criteria, or in the more general case, order learning algorithms in terms of their “goodness.” Simulation results using five classification algorithms on 30 data sets indicate the utility of the method. Our proposed method can be generalized to regression and other loss functions by using a suitable pairwise test. Index Terms—Machine learning, classifier design and evaluation, experimental design.
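The graph-theoretic idea can be sketched as follows. All algorithm names, test outcomes, and the tie-breaking rule here are hypothetical illustrations; the paper's MultiTest procedure is more refined:

```python
# Algorithms listed in order of prior preference (simplest first).
algorithms = ["nearest_mean", "knn", "decision_tree", "svm"]

# significant_wins[(a, b)] = True: a pairwise test (e.g. a 5x2 cv paired
# t-test) found a's expected error significantly lower than b's.
# These outcomes are made up for illustration.
significant_wins = {
    ("svm", "nearest_mean"): True,
    ("knn", "nearest_mean"): True,
    ("svm", "decision_tree"): True,
}

def multitest_order(algos, wins):
    """Repeatedly pick the most prior-preferred algorithm that is not
    significantly beaten by any remaining one: an ordering that respects
    the pairwise test results (the data) and breaks ties by prior
    preference (e.g. simplicity)."""
    remaining = list(algos)
    order = []
    while remaining:
        undominated = [a for a in remaining
                       if not any(wins.get((b, a), False)
                                  for b in remaining if b != a)]
        # Fall back to the prior order if the test results are cyclic.
        pick = undominated[0] if undominated else remaining[0]
        order.append(pick)
        remaining.remove(pick)
    return order

print(multitest_order(algorithms, significant_wins))
```

With these made-up outcomes, knn and svm are both undominated, so the prior preference for the simpler algorithm puts knn first.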
CROSS-VALIDATION FOR SUPERVISED LEARNING
, 2005
"... I would like to thank Prof. Dr. Ethem Alpaydın for his supervision in this Ph.D. study and for his support and advises. I am also grateful to all my friends, my family, my teachers and especially my love AYSEL (ISELL). I would also thank to all Matlab helpers including Oya Aran, Onur Dikmen, Berk Gö ..."
Abstract
I would like to thank Prof. Dr. Ethem Alpaydın for his supervision of this Ph.D. study and for his support and advice. I am also grateful to all my friends, my family, my teachers, and especially my love AYSEL (ISELL). I would also like to thank all the Matlab helpers, including Oya Aran, Onur Dikmen, Berk Gökberk, Itır Karaç, and Mehmet Aydın Ulaş. I also want to thank Arzucan Özgür for her support in printing and submitting this thesis. Without their support, this thesis would not have been accomplished.
DELPHI Collaboration
, 2006
"... In the reaction e + e − → WW → (q1¯q2)(q3¯q4) the usual hadronization models treat the colour singlets q1¯q2 and q3¯q4 coming from two W bosons independently. However, since the final state partons may coexist in space and time, crosstalk between the two evolving hadronic systems may be possible d ..."
Abstract
In the reaction e+e− → WW → (q1q̄2)(q3q̄4), the usual hadronization models treat the colour singlets q1q̄2 and q3q̄4 coming from the two W bosons independently. However, since the final-state partons may coexist in space and time, cross-talk between the two evolving hadronic systems may be possible during fragmentation through soft gluon exchange. This effect is known as Colour Reconnection. In this article the results of the investigation of Colour Reconnection effects in fully hadronic decays of W pairs in DELPHI at LEP are presented. Two complementary analyses were performed, studying the particle flow between jets and W mass estimators, with negligible correlation between them, and the results were combined and compared to models. In the framework of the SK-I model, the value of its κ parameter most compatible with the data was found to be: κ_SK-I = 2.2 (+2.5, −1.3)
Review of the Properties of the W Boson at LEP, and the Precision Determination of its Mass
, 2008
"... We review the precision measurement of the mass and couplings of the W Boson at LEP. The total and differential W + W − cross section is used to extract the WWZ and WWγ couplings. We discuss the techniques used by the four LEP experiments to determine the W mass in different decay channels, and pres ..."
Abstract
We review the precision measurement of the mass and couplings of the W boson at LEP. The total and differential W+W− cross section is used to extract the WWZ and WWγ couplings. We discuss the techniques used by the four LEP experiments to determine the W mass in different decay channels, and present the details of methods used to evaluate the sources of systematic uncertainty.
ELECTROWEAK CORRECTIONS UNCERTAINTY ON THE W MASS MEASUREMENT AT LEP
, 2005
"... The systematic uncertainty on the W mass and width measurement resulting from the imperfect knowledge of electroweak radiative corrections is discussed. The intrinsic uncertainty in the 4f generator used by the DELPHI Collaboration is studied following the guidelines of the authors of YFSWW, on whi ..."
Abstract
The systematic uncertainty on the W mass and width measurement resulting from imperfect knowledge of electroweak radiative corrections is discussed. The intrinsic uncertainty in the 4f generator used by the DELPHI Collaboration is studied following the guidelines of the authors of YFSWW, on which its radiative-corrections part is based. The full DELPHI simulation, reconstruction, and analysis chain is used for the uncertainty assessment. A comparison with the other available 4f calculation implementing DPA O(α) corrections, RacoonWW, is also presented. The uncertainty on the W mass is found to be below 10 MeV for all the WW decay channels used in the measurement. PACS: 12.15.Lk, 13.38.Be, 13.40.Ks, 13.66.Jn, 14.70.Fm. Published by SIS-Pubblicazioni