Results 1–10 of 12
The variable selection problem
 Journal of the American Statistical Association
, 2000
Abstract

Cited by 39 (2 self)
The problem of variable selection is one of the most pervasive model selection problems in statistical applications. Often referred to as the problem of subset selection, it arises when one wants to model the relationship between a variable of interest and a subset of potential explanatory variables or predictors, but there is uncertainty about which subset to use. This vignette reviews some of the key developments that have led to the wide variety of approaches for this problem.
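The subset selection problem this abstract describes can be made concrete with a small sketch: exhaustively scoring every subset of candidate predictors with an information criterion such as BIC. This is an illustration on synthetic data, not any paper's own procedure, and all names are illustrative; for p predictors it enumerates all 2^p subsets, which is exactly why the literature reviewed here developed cheaper alternatives.

```python
# Minimal sketch of exhaustive (best-subset) selection for linear
# regression, scored by BIC, on synthetic data.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.normal(size=(n, p))
# Only columns 0 and 2 truly matter in this toy example.
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=0.5, size=n)

def bic(subset):
    """BIC of an OLS fit using the given column subset (plus intercept)."""
    Z = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    rss = float(np.sum((y - Z @ beta) ** 2))
    k = Z.shape[1]
    return n * np.log(rss / n) + k * np.log(n)

# Enumerate all 2^p subsets and keep the best-scoring one.
subsets = [s for r in range(p + 1) for s in combinations(range(p), r)]
best = min(subsets, key=bic)
print(best)
```

With a strong signal the truly relevant columns survive the BIC penalty, while most irrelevant ones are excluded; the exponential enumeration is what makes this infeasible beyond a few dozen predictors.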
Feature Selection with Neural Networks
 Behaviormetrika
, 1998
Abstract

Cited by 15 (0 self)
Features gathered from the observation of a phenomenon are not all equally informative: some of them may be noisy, correlated or irrelevant. Feature selection aims at selecting a feature set that is relevant for a given task. This problem is complex and remains an important issue in many domains. In the field of neural networks, feature selection has been studied for the last ten years, and classical as well as original methods have been employed. This paper is a review of neural network approaches to feature selection. We first briefly introduce baseline statistical methods used in regression and classification. We then describe families of methods which have been developed specifically for neural networks. Representative methods are then compared on different test problems. Keywords: Feature Selection, Subset Selection, Variable Sensitivity, Sequential Search. Philippe LERAY and Patrick GALLINARI.
Nonparametric Selection of Input Variables for Connectionist Learning
, 1996
Abstract

Cited by 5 (0 self)
However, for a range of explored problems, the relative ordering of mutual information estimates remains correct, despite inaccuracies in individual estimates. Analysis of forward selection explores the amount of data required to select a certain number of relevant input variables: the amount of required data increases roughly exponentially with the number of relevant input variables considered. It is also shown that the chances of forward selection ending up in a local minimum are reduced by bootstrapping the data. Finally, the method is compared to two connectionist methods for input variable selection: Sensitivity Based Pruning and Automatic Relevance Determination. It is shown that the new method outperforms these two when the number of independent, candidate input variables is large. However, the method requires the number of relevant input variables to be relatively small. These results are confirmed o...
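As a rough illustration of the idea above, the following sketch ranks candidate inputs by an estimate of their mutual information with the target and grows the selected set greedily. The histogram MI estimator and the linear residualization step are illustrative assumptions, not necessarily the estimator the paper uses.

```python
# Hedged sketch: greedy forward selection of inputs by estimated
# mutual information (simple histogram estimator).
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram estimate of I(X; Y) in nats."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def forward_select(X, y, k):
    """Greedily pick k inputs; each candidate is scored by its MI with
    the residual of y given the already-selected inputs."""
    selected, remaining = [], list(range(X.shape[1]))
    resid = y.copy()
    for _ in range(k):
        best = max(remaining, key=lambda j: mutual_information(X[:, j], resid))
        selected.append(best)
        remaining.remove(best)
        # Crude linear residualization on the selected columns.
        Z = np.column_stack([np.ones(len(y))] + [X[:, j] for j in selected])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
    return selected

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 6))
y = 3.0 * X[:, 1] + 2.0 * X[:, 4] + rng.normal(scale=0.3, size=500)
chosen = sorted(forward_select(X, y, 2))
print(chosen)
```

Consistent with the abstract's caveat, only the ordering of the MI estimates matters here; the individual estimates are biased, but the strongly relevant inputs still rank first.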
A New Approach to Variable Selection Using the TLS Approach
Abstract

Cited by 4 (0 self)
Abstract—The problem of variable selection is one of the most important model selection problems in statistical applications. It is also known as the subset selection problem and arises when one wants to explain the observations or data adequately by a subset of possible explanatory variables. The objective is to identify factors of importance and to include only variables that contribute significantly to the reduction of the prediction error. Numerous selection procedures have been proposed in the classical multiple linear regression model. We extend one of the most popular methods developed in this context, the backward selection procedure, to a more general class of models. In the basic linear regression model, errors are present on the observations only; if errors are present on the regressors as well, one gets the errors-in-variables model, which for Gaussian noise becomes the total least squares (TLS) model. This is the context considered here. Index Terms—Least squares (LS) problem, matrix perturbation, stepwise regression, Student test, subset selection, total least squares (TLS) problem.
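The classical backward selection procedure that the authors extend can be sketched in the ordinary least-squares setting: fit the full model, drop the variable with the smallest t-statistic, and repeat until every remaining t-statistic clears a threshold. This is the textbook OLS version, not the TLS extension the paper contributes; the threshold and the synthetic data are illustrative.

```python
# Sketch of classical backward selection (stepwise elimination by
# t-statistic) for ordinary least squares on synthetic data.
import numpy as np

def backward_select(X, y, t_threshold=2.0):
    n = len(y)
    cols = list(range(X.shape[1]))
    while cols:
        Z = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        dof = n - Z.shape[1]
        sigma2 = float(resid @ resid) / dof
        cov = sigma2 * np.linalg.inv(Z.T @ Z)
        # t-statistics of the non-intercept coefficients.
        t = np.abs(beta[1:]) / np.sqrt(np.diag(cov)[1:])
        worst = int(np.argmin(t))
        if t[worst] >= t_threshold:
            break               # every remaining variable is significant
        cols.pop(worst)         # eliminate the least significant variable
    return cols

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))
y = 1.5 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.5, size=200)
kept = backward_select(X, y)
print(kept)
```

The TLS variant replaces the OLS fit and the Student test above with their errors-in-variables counterparts, which is the paper's contribution.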
A new approach to fitting linear models in high dimensional spaces
, 2000
Abstract

Cited by 2 (0 self)
This thesis presents a new approach to fitting linear models, called “pace regression”, which also overcomes the dimensionality determination problem. Its optimality in minimizing the expected prediction loss is theoretically established when the number of free parameters is infinitely large. In this sense, pace regression outperforms existing procedures for fitting linear models. Dimensionality determination, a special case of fitting linear models, turns out to be a natural byproduct. A range of simulation studies are conducted; the results support the theoretical analysis. Throughout the thesis, a deeper understanding is gained of the problem of fitting linear models. Many key issues are discussed. Existing procedures, namely OLS, AIC, BIC, RIC, CIC, CV(d), BS(m), RIDGE, NN-GAROTTE and LASSO, are reviewed and compared, both theoretically and empirically, with the new methods. Estimating a mixing distribution is an indispensable part of pace regression. A measure-based minimum distance approach, including probability measures and nonnegative measures, is proposed, and strongly consistent estimators are produced. Of all minimum distance methods for estimating a mixing distribution, only the ...
Pace Regression
, 1999
Abstract

Cited by 2 (0 self)
This paper articulates a new method of linear regression, “pace regression,” that addresses many drawbacks of standard regression reported in the literature, particularly the subset selection problem. Pace regression improves on classical ordinary least squares (OLS) regression by evaluating the effect of each variable and using a clustering analysis to improve the statistical basis for estimating their contribution to the overall regression. As well as outperforming OLS, it also outperforms, in a remarkably general sense, other linear modeling techniques in the literature, including subset selection procedures; the reduction in dimensionality that these seek falls out as a natural byproduct of pace regression. The paper defines six procedures that share the fundamental idea of pace regression, all of which are theoretically justified in terms of asymptotic performance. Experiments confirm the performance improvement over other techniques. Keywords: Linear regression; subset model sele...
Perceived Time as a Measure of Mental Workload: Effects of Time Constraints and Task Success
Abstract

Cited by 1 (1 self)
The mental workload imposed by systems is important to their operation and usability. Consequently, researchers and practitioners need reliable, valid, and easy-to-administer methods for measuring mental workload. The ratio of perceived time to clock time appears to be such a method, yet mental workload has multiple dimensions, of which the perceived time ratio has mainly been linked to the task-related dimension. This study investigates how the perceived time ratio is affected by time constraints, which make time an explicit concern in the execution of tasks, and by task success, which is a performance-related rather than task-related dimension of mental workload. A higher perceived time ratio is found for timed than for untimed tasks. According to subjective workload ratings and pupil-diameter measurements, the timed tasks impose higher mental workload. This finding contradicts the prospective paradigm, which asserts that perceived time decreases with increasing mental workload. A higher perceived time ratio was also found for solved than for unsolved tasks, whereas subjective workload ratings indicate lower mental workload for the solved tasks. This finding shows that the relationship between the perceived time ratio and mental workload is reversed for task success compared to time constraints. Implications for the use of perceived time as a measure of mental workload are discussed.
Environment and Climate DG XII
Abstract
In this document, we try to present all these methods using unified notations and definitions. The reader may refer to the next page for a global view of our notations. Basic ingredients of feature selection methods. A feature selection technique typically requires the following ingredients:
- a feature evaluation criterion to compare subsets of variables; it will be used to perform a choice among the variables;
- a search procedure, to search the set of possible variable combinations;
- a stop criterion, which could be a significance threshold on the evaluation criterion or the final feature space dimension.
Depending on the task (e.g. prediction or classification) and on the model (linear, logistic, neural networks...), several evaluation criteria, based either on sound statistical grounds or on heuristics, have been proposed for measuring the importance of a variable subset. For classification, classical criteria use probabilistic distances or entropy measures, often replaced in practice by simple interclass distance measures or even plain distances. For approximation or prediction, classical candidates are distance measures. Some methods consider only the data for computing the relevant variables; others take into account the model which will be used for the modeling task. In this case, the evaluation criterion may be based on the performance of the model for the candidate subset of variables. Several measures of performance exist; this point is non-trivial but will not be discussed further here. In general, evaluation criteria are non-monotonic, and exact comparison of feature subsets amounts to a combinatorial problem, which rapidly becomes computationally unfeasible, even for moderate input size. Due to these limitations, most algorithms are based upon heuristic ...
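The three ingredients listed above can be wired together in a small generic loop. In this hedged sketch, the evaluation criterion (here R² of a least-squares fit), the search procedure (greedy forward search), and the stop criterion (a minimum score improvement) are all pluggable; every name is illustrative rather than drawn from the document.

```python
# Generic wrapper combining the three ingredients of feature selection:
# evaluation criterion, search procedure, and stop criterion.
import numpy as np

def greedy_search(n_features, evaluate, min_improvement=1e-3):
    """Forward search: grow the subset while `evaluate` keeps improving."""
    selected, remaining = [], list(range(n_features))
    best_score = evaluate(selected)
    while remaining:
        cand = max(remaining, key=lambda j: evaluate(selected + [j]))
        score = evaluate(selected + [cand])
        if score - best_score < min_improvement:   # stop criterion
            break
        selected.append(cand)
        remaining.remove(cand)
        best_score = score
    return selected

# Example evaluation criterion: R^2 of an OLS fit on the candidate subset.
rng = np.random.default_rng(3)
X = rng.normal(size=(300, 6))
y = 2.0 * X[:, 2] + 1.0 * X[:, 5] + rng.normal(scale=0.4, size=300)

def r_squared(subset):
    Z = np.column_stack([np.ones(len(y))] + [X[:, j] for j in subset])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    rss = float(np.sum((y - Z @ beta) ** 2))
    tss = float(np.sum((y - y.mean()) ** 2))
    return 1.0 - rss / tss

picked = greedy_search(6, r_squared)
print(picked)
```

Swapping `r_squared` for an entropy-based criterion, or `greedy_search` for backward or randomized search, changes the method without touching the loop, which is the modularity the passage describes.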
Evaluation and selection of models for out-of-sample prediction when the sample size is small relative to the complexity of ...
, 2007
Quality evaluation of ancient digitized documents for binarization prediction
 International conference on document image analysis, Washington, United States
, 2013
Abstract
Abstract—This article proposes an approach to predict the result of binarization algorithms on a given document image according to its state of degradation. Indeed, historical documents suffer from different types of degradation which result in binarization errors. We intend to characterize the degradation of a document image by using different features based on the intensity, quantity and location of the degradation. These features allow us to build prediction models of binarization algorithms that are very accurate according to R² values and p-values. The prediction models are used to select the best binarization algorithm for a given document image. Obviously, this image-by-image strategy improves the binarization of the entire dataset.