Results 1-10 of 49
Bagging Predictors
Machine Learning, 1996
Cited by 2479 (1 self)
Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy. The vital element is the instability of the prediction method. If perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy. 1. Introduction. A learning set $\mathcal{L}$ consists of data $\{(y_n, x_n),\ n = 1, \ldots, N\}$ where the y's are either class labels or a numerical response. We have a procedure for using this learning set to form a predictor $\varphi(x, \mathcal{L})$: if the input is x we ...
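As a rough illustration of the procedure this abstract describes, the sketch below forms bootstrap replicates of a learning set, fits one predictor per replicate, and aggregates by averaging or by plurality vote. It is a minimal sketch and not the paper's implementation: the function names, the (y, x) tuple layout, the default of 25 replicates, and the one-split regression stump used as an unstable base learner are all assumptions made for illustration.

```python
import random
from collections import Counter
from statistics import mean


def bootstrap_replicate(learning_set, rng):
    """Draw N cases with replacement from a learning set of N cases."""
    n = len(learning_set)
    return [learning_set[rng.randrange(n)] for _ in range(n)]


def bag(fit, learning_set, n_replicates=25, seed=0):
    """Fit one predictor per bootstrap replicate (replicate count is an assumption)."""
    rng = random.Random(seed)
    return [fit(bootstrap_replicate(learning_set, rng)) for _ in range(n_replicates)]


def predict_numeric(predictors, x):
    """Aggregate by averaging over the versions (numerical outcome)."""
    return mean(p(x) for p in predictors)


def predict_class(predictors, x):
    """Aggregate by plurality vote over the versions (class outcome)."""
    votes = Counter(p(x) for p in predictors)
    return votes.most_common(1)[0][0]


def fit_stump(cases):
    """Hypothetical base learner: a one-split regression stump on scalar x.

    cases is a list of (y, x) pairs. The split threshold minimising the
    squared error is chosen, so small changes in the data can move the split.
    """
    xs = sorted({x for _, x in cases})
    overall = mean(y for y, _ in cases)
    if len(xs) < 2:
        return lambda x: overall  # nothing to split on
    best = None
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [y for y, x in cases if x <= t]
        right = [y for y, x in cases if x > t]
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    _, t, ml, mr = best
    return lambda x: ml if x <= t else mr
```

A call such as `predict_numeric(bag(fit_stump, learning_set), x)` then returns the bagged prediction at `x`; any other base learner with the same `fit(cases) -> predictor` shape could be substituted.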
A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection
International Joint Conference on Artificial Intelligence, 1995
Cited by 749 (12 self)
We review accuracy estimation methods and compare the two most common methods: cross-validation and bootstrap. Recent experimental results on artificial data and theoretical results in restricted settings have shown that for selecting a good classifier from a set of classifiers (model selection), ten-fold cross-validation may be better than the more expensive leave-one-out cross-validation. We report on a large-scale experiment (over half a million runs of C4.5 and a Naive-Bayes algorithm) to estimate the effects of different parameters on these algorithms on real-world datasets. For cross-validation, we vary the number of folds and whether the folds are stratified or not; for bootstrap, we vary the number of bootstrap samples. Our results indicate that for real-world datasets similar to ours, the best method to use for model selection is ten-fold stratified cross-validation, even if computation power allows using more folds.
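To make the recommended procedure concrete, here is a minimal sketch of stratified k-fold cross-validation for accuracy estimation. It is an illustration under stated assumptions and not the experimental code from the paper: the function names, the round-robin way of dealing each class across folds, and the `fit(train) -> classifier` interface are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split case indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    folds = [[] for _ in range(k)]
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled cases of this class round-robin across the folds,
        # continuing where the previous class stopped to keep fold sizes even.
        for j, i in enumerate(indices):
            folds[(offset + j) % k].append(i)
        offset += len(indices)
    return folds


def cross_validated_accuracy(fit, xs, labels, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and pool the results."""
    folds = stratified_folds(labels, k, seed)
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train = [(xs[i], labels[i]) for i in range(len(xs)) if i not in held_out]
        model = fit(train)
        correct += sum(model(xs[i]) == labels[i] for i in test_idx)
    return correct / len(xs)
```

Setting `k` to the number of cases gives leave-one-out cross-validation as a special case, although stratification is then no longer meaningful.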
Wrappers For Performance Enhancement And Oblivious Decision Graphs
1995
Cited by 107 (8 self)
In this doctoral dissertation, we study three basic problems in machine learning and two new hypothesis spaces with corresponding learning algorithms. The problems we investigate are: accuracy estimation, feature subset selection, and parameter tuning. The latter two problems are related and are studied under the wrapper approach. The hypothesis spaces we investigate are: decision tables with a default majority rule (DTMs) and oblivious read-once decision graphs (OODGs).
Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data
Bioinformatics, 2003
Cross-Validation and the Bootstrap: Estimating the Error Rate of a Prediction Rule
1995
Cited by 38 (0 self)
A training set of data has been used to construct a rule for predicting future responses. What is the error rate of this rule? The traditional answer to this question is given by cross-validation. The cross-validation estimate of prediction error is nearly unbiased, but can be highly variable. This article discusses bootstrap estimates of prediction error, which can be thought of as smoothed versions of cross-validation. A particular bootstrap method, the .632+ rule, is shown to substantially outperform cross-validation in a catalog of 24 simulation experiments. Besides providing point estimates, we also consider estimating the variability of an error rate estimate. All of the results here are nonparametric, and apply to any possible prediction rule; however, we only study classification problems with 0-1 loss in detail. Our simulations include "smooth" prediction rules like Fisher's Linear Discriminant Function, and unsmooth ones like Nearest Neighbors.
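For orientation, the sketch below computes a bootstrap estimate of prediction error under 0-1 loss. Note the substitution: it implements the simpler .632 estimator (0.368 times the resubstitution error plus 0.632 times the leave-one-out bootstrap error), not the .632+ rule studied in the article, which adds a correction for overfitting that is omitted here. The function name, the (x, label) tuple layout, and the default of 50 bootstrap samples are assumptions for illustration.

```python
import random


def err632(fit, cases, n_boot=50, seed=0):
    """Return (resubstitution error, leave-one-out bootstrap error, .632 estimate).

    cases is a list of (x, label) pairs and fit(cases) returns a classifier.
    This is the plain .632 estimator, not the .632+ rule.
    """
    rng = random.Random(seed)
    n = len(cases)

    # Resubstitution error: train and test on the same cases (optimistic).
    full = fit(cases)
    resub = sum(full(x) != y for x, y in cases) / n

    # Leave-one-out bootstrap error: each case is scored only by models
    # whose bootstrap sample did not contain it.
    errors = [0] * n
    counts = [0] * n
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        drawn = set(idx)
        model = fit([cases[i] for i in idx])
        for i in range(n):
            if i not in drawn:
                x, y = cases[i]
                errors[i] += model(x) != y
                counts[i] += 1
    per_case = [e / c for e, c in zip(errors, counts) if c > 0]
    if not per_case:
        raise ValueError("no case was ever left out; increase n_boot")
    loo_boot = sum(per_case) / len(per_case)

    return resub, loo_boot, 0.368 * resub + 0.632 * loo_boot
```

The weights reflect that a bootstrap sample contains about 63.2% of the distinct original cases on average, so the leave-one-out bootstrap error tends to be pessimistic while the resubstitution error is optimistic.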
Small Sample Statistics for Classification Error Rates I: Error Rate Measurements
Dept. of Inf. and Comp. Sci, 1996
Cited by 25 (1 self)
Several methods (independent subsamples, leave-one-out, cross-validation, and bootstrapping) have been proposed for estimating the error rates of classifiers. The rationale behind the various estimators and the causes of the sometimes conflicting claims regarding their bias and precision are explored in this paper. The biases and variances of each of the estimators are examined empirically. Cross-validation, 10-fold or greater, seems to be the best approach; the other methods are biased, have poorer precision, or are inconsistent. Though unbiased for linear discriminant classifiers, the .632b bootstrap estimator is biased for nearest neighbors classifiers, more so for single nearest neighbor than for three nearest neighbors. The .632b estimator is also biased for CART-style decision trees. Weiss' loo* estimator is unbiased and has better precision than cross-validation for discriminant and nearest neighbors classifiers, but its lack of bias and improved precision for those classifiers do...
Towards Perceptual Intelligence: Statistical Modeling of Human Individual and Interactive Behaviors
Prediction of Human Behavior, IEEE Intelligent Vehicles, 1995
Cited by 12 (5 self)
This thesis presents a computational framework for the automatic recognition and prediction of different kinds of human behaviors from video cameras and other sensors, via perceptually intelligent systems that automatically sense and correctly classify human behaviors, by means of Machine Perception and Machine Learning techniques. In the thesis I develop the statistical machine learning algorithms (dynamic graphical models) necessary for detecting and recognizing individual and interactive behaviors. In the case of interactions, two Hidden Markov Models (HMMs) are coupled in a novel architecture called Coupled Hidden Markov Models (CHMMs) that explicitly captures the interactions between them. The algorithms for learning the parameters from data as well as for doing inference with those models are developed and described. Four systems that experimentally evaluate the proposed paradigm are presented: (1) LAFTER, an automatic face detection and tracking system with facial expression recognition; (2) a Tai Chi gesture recognition system; (3) a pedestrian surveillance system that recognizes typical human-to-human interactions; and (4) a SmartCar for driver maneuver recognition. These systems capture human behaviors of different nature and increasing complexity: first, isolated, single-user facial expressions, then, two-hand gestures and human-to-human interactions,...
Gene Expression Data Analysis of Human Lymphoma Using Support Vector Machines and Output Coding Ensembles
2001
Cited by 12 (5 self)
The large amount of data generated by DNA microarrays was originally analysed using unsupervised methods, such as clustering or self-organizing maps. Recently supervised methods such as decision trees, dot-product support vector machines (SVM) and multi-layer perceptrons (MLP) have been applied in order to classify normal and tumoural tissues. We propose methods based on non-linear SVM with polynomial and Gaussian kernels, and output coding (OC) ensembles of learning machines to separate normal from malignant tissues, to classify different types of lymphoma and to analyse the role of sets of coordinately expressed genes in carcinogenic processes of lymphoid tissues. Using gene expression data from "Lymphochip", a specialized DNA microarray developed at Stanford University School of Medicine, we show that SVM can correctly separate normal from tumoural tissues, and OC ensembles can be successfully used to classify different types of lymphoma. Moreover, we identify a group of coordinately expressed genes related to the separation of two distinct subgroups inside Diffuse Large B-Cell Lymphoma (DLBCL), validating Alizadeh's earlier hypothesis about the existence of two distinct diseases inside DLBCL.
Controlling Variable Selection By the Addition of Pseudo-Variables
Journal of the American Statistical Association, 2006
Cited by 5 (3 self)
We propose a new approach to variable selection designed to control the false selection rate (FSR), defined as the proportion of uninformative variables included in selected models. The method works by adding a known number of pseudo variables to the real data set, running a variable selection procedure, and monitoring the proportion of pseudo variables falsely selected. Information obtained from bootstrap-like replications of this process is used to estimate the proportion of falsely selected real variables and to tune the selection procedure to control the false selection rate. KEY WORDS: False selection rate; forward selection; model error; model selection; subset selection.