Combining Estimates in Regression and Classification
Journal of the American Statistical Association, 1993
Cited by 66 (0 self)
We consider the problem of how to combine a collection of general regression fit vectors in order to obtain a better predictive model. The individual fits may be from subset linear regression, ridge regression, or something more complex like a neural network. We develop a general framework for this problem and examine a recent cross-validation-based proposal called "stacking" in this context. Combination methods based on the bootstrap and analytic methods are also derived and compared in a number of examples, including best subsets regression and regression trees. Finally, we apply these ideas to classification problems where the estimated combination weights can yield insight into the structure of the problem.
1 Introduction
Consider a standard regression setup: we have predictor measurements $x_i = (x_{i1}, \ldots, x_{ip})^T$ and a response measurement $y_i$ on $N$ independent training cases. Let $z$ represent the entire training sample. Our goal is to derive a function $c_z(x)$ that accurately p...
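As a rough illustration of the cross-validation-based "stacking" idea named in this abstract, the sketch below combines two regression fits by regressing the response on their cross-validated predictions. It is a minimal sketch, not the paper's procedure: the two learners, the ridge penalty `lam`, the fold count, and the use of unconstrained least squares for the weights are all assumptions chosen for brevity, and every function name here is hypothetical.

```python
import numpy as np

def ols_fit_predict(X_train, y_train, X_test):
    """Ordinary least squares with an intercept (illustrative learner)."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    beta, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ beta

def ridge_fit_predict(X_train, y_train, X_test, lam=10.0):
    """Ridge regression on centred data; lam is an assumed penalty value."""
    mu_x, mu_y = X_train.mean(axis=0), y_train.mean()
    Xc = X_train - mu_x
    p = X_train.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ (y_train - mu_y))
    return mu_y + (X_test - mu_x) @ beta

def stacking_weights(X, y, learners, n_folds=5, seed=0):
    """Return (weights, Z), where Z[i, k] is learner k's prediction for
    case i made by a fit that did not see case i."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    Z = np.empty((n, len(learners)))
    for held_out in folds:
        train = np.setdiff1d(np.arange(n), held_out)
        for k, learner in enumerate(learners):
            Z[held_out, k] = learner(X[train], y[train], X[held_out])
    # Unconstrained least squares of y on the cross-validated predictions.
    w, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return w, Z
```

The combined predictor for a new point would then be the weighted sum of the learners' predictions, with each learner refitted on the full training sample. Constraints on the weights (for example non-negativity) could be added, but are left out of this sketch.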
Minimizing Statistical Bias with Queries
1995
Cited by 15 (0 self)
I describe an exploration criterion that attempts to minimize the error of a learner by minimizing its estimated squared bias. I describe experiments with locally-weighted regression on two simple kinematics problems, and observe that this "bias-only" approach outperforms the more common "variance-only" exploration approach, even in the presence of noise.
The Covariance Inflation Criterion for Adaptive Model Selection
J. Roy. Statist. Soc. B, 1999
Cited by 15 (0 self)
We propose a new criterion for model selection in prediction problems. The covariance inflation criterion adjusts the training error by the average covariance of the predictions and responses, when the prediction rule is applied to permuted versions of the dataset. This criterion can be applied to general prediction problems (for example regression or classification), and to general prediction rules (for example stepwise regression, tree-based models and neural nets). As a byproduct we obtain a measure of the effective number of parameters used by an adaptive procedure. We relate the covariance inflation criterion to other model selection procedures and illustrate its use in some regression and classification problems. We also revisit the conditional bootstrap approach to model selection. Keywords: model selection, adaptive, permutation, bootstrap, cross-validation 1 Introduction This article concerns the selection of a prediction rule from a set of training data. The training set z =...
The Out-of-Bootstrap Method for Model Averaging and Selection
1996
Cited by 6 (0 self)
We propose a bootstrap-based method for model averaging and selection that focuses on training points that are left out of individual bootstrap samples. This information can be used to estimate optimal weighting factors for combining estimates from different bootstrap samples, and also for finding the best subsets in the linear model setting. These proposals provide alternatives to Bayesian approaches to model averaging and selection, requiring less computation and fewer subjective choices.
1 Introduction
In this article we use the bootstrap to attempt to "enjoy the Bayesian omelette" without making a mess in the kitchen. We try to mimic Bayesian methods for model averaging and selection without having to impose full Bayesian structure. We consider the prediction problem in which we have a training set X = (X 1 ...
Author affiliations: Cleveland Clinic; srao@bio.ri.ccf.org. Department of Preventive Medicine and Biostatistics, and Department of Statistics, University of Toronto; tibs@utstat.toronto.edu.
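The central ingredient described in this abstract, evaluating a fit on the training points that a bootstrap sample happened to leave out, can be sketched as follows. This is only an illustration of the left-out-points idea under assumed choices (squared-error loss, a least-squares learner, 50 bootstrap samples); it does not reproduce the paper's weighting or subset-selection rules, and the function names are hypothetical.

```python
import numpy as np

def least_squares_fit_predict(X_train, y_train, X_test):
    """Least squares with an intercept (illustrative learner)."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    beta, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ beta

def out_of_bootstrap_errors(X, y, fit_predict, n_boot=50, seed=0):
    """Mean squared error of each bootstrap fit on the cases it left out."""
    rng = np.random.default_rng(seed)
    n = len(y)
    errors = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)           # sample with replacement
        left_out = np.setdiff1d(np.arange(n), idx)  # cases not drawn
        if left_out.size == 0:
            continue  # rare: every case was drawn, nothing to evaluate on
        pred = fit_predict(X[idx], y[idx], X[left_out])
        errors.append(np.mean((y[left_out] - pred) ** 2))
    return np.array(errors)
```

Each entry of the returned array is an error estimate that is honest in the sense that the evaluated cases played no part in that particular fit.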
Model Selection by Normalized Maximum Likelihood
2005
Cited by 6 (1 self)
The Minimum Description Length (MDL) principle is an information theoretic approach to inductive inference that originated in algorithmic coding theory. In this approach, data are viewed as codes to be compressed by the model. From this perspective, models are compared on their ability to compress a data set by extracting useful information in the data apart from random noise. The goal of model selection is to identify the model, from a set of candidate models, that permits the shortest description length (code) of the data. Since Rissanen originally formalized the problem using the crude ‘two-part code’ MDL method in the 1970s, many significant strides have been made, especially in the 1990s, with the culmination of the development of the refined ‘universal code’ MDL method, dubbed Normalized Maximum Likelihood (NML). It represents an elegant solution to the model selection problem. The present paper provides a tutorial review on these latest developments with a special focus on NML. An application example of NML in cognitive modeling is also provided.
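To make the NML code length concrete, here is a small sketch for the simplest case, a Bernoulli model on binary sequences, where the normalizing sum over all possible data sets can be computed exactly. The choice of the Bernoulli model and the function names are assumptions made for illustration; the tutorial itself is not limited to this case.

```python
from math import comb, log2

def bernoulli_nml_normalizer(n):
    """Sum, over all binary sequences of length n >= 1, of the likelihood
    maximized for that sequence (grouped by the number of ones, k)."""
    total = 0.0
    for k in range(n + 1):
        p = k / n  # maximum-likelihood estimate for a sequence with k ones
        total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total

def bernoulli_nml_code_length(bits):
    """NML code length in bits: minus log2 of the maximized likelihood,
    plus log2 of the normalizer (the model-complexity term)."""
    n, k = len(bits), sum(bits)
    p = k / n
    max_lik = p**k * (1 - p) ** (n - k)
    return -log2(max_lik) + log2(bernoulli_nml_normalizer(n))
```

Under this criterion, competing models would be compared by their total code length for the same data, with the shorter code preferred.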
Selection of Tree-based Classifiers with the Bootstrap 632+ Rule
Biometrical Journal, 1997
Cited by 5 (4 self)
This paper introduces a novel model selection procedure for tree-based classifiers. The method is based on the bootstrap 632+ rule recently proposed by Efron and Tibshirani. The rule allows selecting compact, non-overfitting classification trees by reweighting the contributions of the resubstitution and standard bootstrap estimated errors. The proposed method is applied in a medical entomology problem for modeling the risk of parasite presence. Keywords: bootstrap 632+, model selection, classification and regression trees
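The reweighting described in this abstract can be sketched with one common formulation of the 632+ estimate, in which the weight on the bootstrap error grows with the relative overfitting rate. The sketch assumes the bootstrap error is the one measured on cases left out of each bootstrap sample, and it clips that error at the no-information rate before combining; both are assumptions of this illustration, and the function names are hypothetical.

```python
def no_information_error(y_true, y_pred):
    """Error rate expected if predictions were independent of the labels:
    sum over classes of P(true = c) * (1 - P(predicted = c))."""
    n = len(y_true)
    gamma = 0.0
    for c in set(y_true) | set(y_pred):
        p = sum(1 for t in y_true if t == c) / n
        q = sum(1 for t in y_pred if t == c) / n
        gamma += p * (1 - q)
    return gamma

def err_632_plus(err_resub, err_boot, gamma):
    """Combine resubstitution and bootstrap errors with an adaptive weight."""
    err_boot = min(err_boot, gamma)  # clip at the no-information rate
    if err_boot > err_resub and gamma > err_resub:
        overfit = (err_boot - err_resub) / (gamma - err_resub)
    else:
        overfit = 0.0                # no measurable overfitting
    w = 0.632 / (1 - 0.368 * overfit)  # ranges from 0.632 up to 1
    return (1 - w) * err_resub + w * err_boot
```

With no overfitting the weight stays at 0.632, and as the relative overfitting rate approaches 1 the estimate moves entirely onto the bootstrap error, which is what penalizes large trees that fit the training data perfectly.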
A Connectionist Approach to Quality Assessment of Food Products
1995
Cited by 2 (2 self)
Colour development of a product is often vital in the food industry. The study of the baking of biscuits reveals interesting colour development characteristic curves. Neural network methods are used to both represent and classify products according to these characteristics. Using self-organising maps, well-defined characteristic curves are extracted. Colour data histogrammed along these curves are then accurately classified by feedforward neural networks trained by backpropagation. Image segmentation is implicit within this colour histogramming technique. The overall system has been shown to considerably outperform a human expert. Keywords: Image Analysis, Neural Networks, Self-Organising Maps, Gaussian Filtering, Backpropagation.
1 Introduction
An important criterion for the assessment of quality of a food product is its colour. Skilled human judgement of food product quality by inspection of its colour is liable to both short-term and long-term inconsistencies. On the other hand an ...
DETECTION OF ACUTE CORONARY SYNDROMES IN CHEST PAIN PATIENTS USING NEURAL NETWORK ENSEMBLES
Cited by 2 (2 self)
Patients with suspicion of acute coronary syndrome (ACS) are difficult to diagnose and they belong to a very heterogeneous group of patients. Some require immediate treatment while others, with only minor disorders, may be sent home. Detecting ACS patients using a machine learning approach would be advantageous in many situations. This study is based on patients with chest pain attending the emergency department of Lund University Hospital. A total of 915 cases were incorporated, of which 190 were diagnosed as ACS and 725 as non-ACS. We have developed classifiers using neural network ensembles that can provide a prediction of ACS for patients with chest pain at an emergency department. We compared two different ensemble strategies, Bagging and K-fold cross splitting. The obtained results were also compared with the results of a standard multiple logistic regression model. Our results show that it is possible to construct a machine learning tool that can predict the presence of ACS among patients with chest pain with a ROC area of 77.8%, corresponding to a level of 40% specificity and 95% sensitivity.
An Application of the Bootstrap 632+ Rule to Ecological Data
1997
Cited by 1 (1 self)
We applied the novel bootstrap 632+ rule to choose tree-based classifiers trained for modeling the risk of parasite presence in a host population of ungulates. The method is designed to control overfitting: compact classification trees (CART) are selected using a nonlinear combination of the resubstitution error and the standard bootstrap error estimate. Model selection based on the 632+ rule offers a gain over cross-validation for CART models. The tree classifier selected by the new rule for this application compared favourably with standard multivariate GLIM models.
Bootstrapping Likelihood for Model Selection with Small Samples
1998
Cited by 1 (0 self)
In this report we compare model-selection performance of AIC, EIC, a bootstrap-smoothed likelihood cross-validation (BCV) and its modification (632CV) in small-sample linear regression, logistic regression and Cox regression. Simulation results show that EIC largely overcomes AIC's over-fitting problem and that BCV may be better than EIC. Hence, the three methods based on bootstrapping the likelihood establish themselves as important alternatives to AIC in model selection with small samples. Key words: AIC; Cox regression; Cross-validation; EIC; Linear regression; Logistic regression; Maximum likelihood.

