Results 1–10 of 28
An introduction to variable and feature selection
 Journal of Machine Learning Research
, 2003
"... Variable and feature selection have become the focus of much research in areas of application for which datasets with tens or hundreds of thousands of variables are available. ..."
Abstract

Cited by 698 (14 self)
Variable and feature selection have become the focus of much research in areas of application for which datasets with tens or hundreds of thousands of variables are available.
SPECTRUM ESTIMATION FOR LARGE DIMENSIONAL COVARIANCE MATRICES USING RANDOM MATRIX THEORY
 SUBMITTED TO THE ANNALS OF STATISTICS
"... Estimating the eigenvalues of a population covariance matrix from a sample covariance matrix is a problem of fundamental importance in multivariate statistics; the eigenvalues of covariance matrices play a key role in many widely techniques, in particular in Principal Component Analysis (PCA). In ma ..."
Abstract

Cited by 27 (4 self)
Estimating the eigenvalues of a population covariance matrix from a sample covariance matrix is a problem of fundamental importance in multivariate statistics; the eigenvalues of covariance matrices play a key role in many widely used techniques, in particular in Principal Component Analysis (PCA). In many modern data analysis problems, statisticians are faced with large datasets where the sample size, n, is of the same order of magnitude as the number of variables p. Random matrix theory predicts that in this context, the eigenvalues of the sample covariance matrix are not good estimators of the eigenvalues of the population covariance. We propose to use a fundamental result in random matrix theory, the Marčenko–Pastur equation, to better estimate the eigenvalues of large dimensional covariance matrices. The Marčenko–Pastur equation holds in very wide generality and under weak assumptions. The estimator we obtain can be thought of as “shrinking” in a nonlinear fashion the eigenvalues of the sample covariance matrix to estimate the population eigenvalues. Inspired by ideas of random matrix theory, we also suggest a change of point of view when thinking about estimation of high-dimensional vectors: we do not try to estimate the vectors directly but rather a probability measure that describes them. We think this is a theoretically more fruitful way to think about these problems. Our estimator is fast and gives good or very good results in extended simulations. Our algorithmic approach is based on convex optimization. We also show that the proposed estimator is consistent.
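The distortion this abstract starts from is easy to reproduce. A minimal NumPy sketch (an illustration of the problem, not the paper's estimator): with p of the same order as n and a population covariance equal to the identity, the sample eigenvalues spread out over roughly the Marčenko–Pastur support instead of concentrating at 1.

```python
import numpy as np

# Illustration only: every population eigenvalue below is 1, yet the sample
# eigenvalues spread over approximately [(1-sqrt(c))^2, (1+sqrt(c))^2]
# with c = p/n, as random matrix theory predicts.
rng = np.random.default_rng(0)
n, p = 400, 200                      # sample size and dimension of the same order
X = rng.standard_normal((n, p))      # population covariance = identity
S = X.T @ X / n                      # sample covariance matrix
eigs = np.linalg.eigvalsh(S)

c = p / n
lower, upper = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
print(eigs.min(), eigs.max())        # far from 1, near the Marchenko-Pastur edges
```

The average eigenvalue (trace over p) still estimates 1 well; it is the individual eigenvalues that are badly biased, which is what the paper's nonlinear shrinkage corrects.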
Detecting Correlation in Stock Market
 Physica A: Statistical Mechanics and its Applications, Volume 344, Issues 1–2
, 2004
"... We present a new method for detecting dependencies in the stock market. In order to find hidden correlations in the daily returns, we build cross prediction models and use the normalized modeling error as a generalized correlation measure that extends the concept of the classical correlation matrix. ..."
Abstract

Cited by 7 (5 self)
We present a new method for detecting dependencies in the stock market. In order to find hidden correlations in the daily returns, we build cross prediction models and use the normalized modeling error as a generalized correlation measure that extends the concept of the classical correlation matrix.
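A hedged sketch of the idea: measure dependence between two return series by how well one predicts the other, using a normalized prediction error in place of the Pearson coefficient. A plain linear predictor is assumed here; the paper builds more general cross-prediction models, so the function name and normalization below are illustrative only.

```python
import numpy as np

def cross_prediction_score(x, y):
    """1 - (MSE of predicting y from x) / var(y); equals r^2 for a linear fit."""
    a, b = np.polyfit(x, y, 1)              # least-squares linear predictor
    mse = np.mean((y - (a * x + b)) ** 2)
    return 1.0 - mse / np.var(y)

rng = np.random.default_rng(1)
x = rng.standard_normal(1000)
y_dep = 0.8 * x + 0.2 * rng.standard_normal(1000)   # dependent series
y_ind = rng.standard_normal(1000)                   # independent series
print(cross_prediction_score(x, y_dep))  # close to 1
print(cross_prediction_score(x, y_ind))  # close to 0
```

With a nonlinear model class in place of `np.polyfit`, the same score can pick up dependencies that the classical correlation matrix misses, which is the point of the generalization.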
Tradeoffs in the Empirical Evaluation of Competing Algorithm Designs
"... Abstract. We propose an empirical analysis approach for characterizing tradeoffs between different methods for comparing a set of competing algorithm designs. Our approach can provide insight into performance variation both across candidate algorithms and across instances. It can also identify the b ..."
Abstract

Cited by 7 (1 self)
We propose an empirical analysis approach for characterizing tradeoffs between different methods for comparing a set of competing algorithm designs. Our approach can provide insight into performance variation both across candidate algorithms and across instances. It can also identify the best tradeoff between evaluating a larger number of candidate algorithm designs, performing these evaluations on a larger number of problem instances, and allocating more time to each algorithm run. We applied our approach to a study of the rich algorithm design spaces offered by three highly parameterized, state-of-the-art algorithms for satisfiability and mixed integer programming, considering six different distributions of problem instances. We demonstrate that the resulting algorithm design scenarios differ in many ways, with important consequences for both automatic and manual algorithm design. We expect that both our methods and our findings will lead to tangible improvements in algorithm design methods.
V-FOLD CROSS-VALIDATION IMPROVED: V-FOLD PENALIZATION
 SUBMITTED TO THE ANNALS OF STATISTICS
, 2008
"... We study the efficiency of Vfold crossvalidation (VFCV) for model selection from the nonasymptotic viewpoint, and suggest an improvement on it, which we call “Vfold penalization”. Considering a particular (though simple) regression problem, we prove that VFCV with a bounded V is suboptimal for m ..."
Abstract

Cited by 4 (2 self)
We study the efficiency of V-fold cross-validation (VFCV) for model selection from the non-asymptotic viewpoint, and suggest an improvement on it, which we call “V-fold penalization”. Considering a particular (though simple) regression problem, we prove that VFCV with a bounded V is suboptimal for model selection, because it “overpenalizes” all the more as V is large. Hence, asymptotic optimality requires V to go to infinity. However, when the signal-to-noise ratio is low, it appears that overpenalizing is necessary, so that the optimal V is not always the largest one, despite the variability issue. This is confirmed by some simulated data. In order to improve on the prediction performance of VFCV, we define a new model selection procedure, called “V-fold penalization” (penVF). It is a V-fold subsampling version of Efron’s bootstrap penalties, so that it has the same computational cost as VFCV, while being more flexible. In a heteroscedastic regression framework, assuming the models to have a particular structure, we prove that penVF satisfies a non-asymptotic oracle inequality with a leading constant that tends to 1 when the sample size goes to infinity. In particular, this implies adaptivity to the smoothness of the regression function, even with a highly heteroscedastic noise. Moreover, it is easy to overpenalize with penVF, independently of the V parameter. A simulation study shows that this results in a significant improvement on VFCV in non-asymptotic situations.
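For readers unfamiliar with the baseline being improved, here is plain VFCV for model selection, sketched on a toy regression problem where polynomial degrees play the role of the model collection. This shows only the standard procedure; the paper's penVF penalty is not reproduced.

```python
import numpy as np

def vfold_cv_error(x, y, degree, V=5):
    """Average held-out squared error of a degree-`degree` polynomial fit."""
    n = len(x)
    folds = np.array_split(np.arange(n), V)
    err = 0.0
    for test_idx in folds:
        train = np.setdiff1d(np.arange(n), test_idx)
        coef = np.polyfit(x[train], y[train], degree)
        pred = np.polyval(coef, x[test_idx])
        err += np.sum((y[test_idx] - pred) ** 2)
    return err / n

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 200)
y = x ** 2 + 0.1 * rng.standard_normal(200)   # true signal is quadratic
errors = {d: vfold_cv_error(x, y, d) for d in range(1, 7)}
best = min(errors, key=errors.get)            # degree selected by VFCV
```

The paper's point is that the implicit penalty of this procedure depends on V in a way that overpenalizes for bounded V, which penVF decouples from the choice of V.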
Practical feature selection: from correlation to causality. Mining Massive Data Sets for Security
 Advances in Data Mining, Search, Social Networks and Text Mining, and their Applications to Security
, 2008
"... Feature selection encompasses a wide variety of methods for selecting a restricted number of input variables or “features”, which are “relevant ” to a problem at hand. In this report, we guide practitioners through the maze of methods, which have recently appeared in the literature, particularly for ..."
Abstract

Cited by 3 (1 self)
Feature selection encompasses a wide variety of methods for selecting a restricted number of input variables or “features”, which are “relevant” to a problem at hand. In this report, we guide practitioners through the maze of methods, which have recently appeared in the literature, particularly for supervised feature selection. Starting from the simplest methods of feature ranking with correlation coefficients, we branch in various directions and explore various topics, including “conditional relevance”, “local relevance”, “multivariate selection”, and “causal relevance”. We make recommendations for assessment methods and stress the importance of matching the complexity of the method employed to the available amount of training data. Software and teaching material associated with this tutorial are available [12].
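The tutorial's starting point, feature ranking with correlation coefficients, is a one-liner in spirit. A minimal sketch (the function name and data are illustrative, not from the report):

```python
import numpy as np

def rank_features_by_correlation(X, y):
    """Return feature indices sorted from most to least |Pearson correlation| with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = (Xc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
    )
    return np.argsort(-np.abs(corr))

rng = np.random.default_rng(3)
X = rng.standard_normal((500, 10))
y = 2.0 * X[:, 3] + 0.5 * X[:, 7] + 0.1 * rng.standard_normal(500)
ranking = rank_features_by_correlation(X, y)
print(ranking[:2])   # the two truly relevant features should lead
```

This univariate filter is exactly what the notions of “conditional relevance” and “multivariate selection” in the report go beyond: it cannot detect a feature that is relevant only jointly with others.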
Dimensionality reduction by Canonical Contextual Correlation projections
 the 8th European Conference on Computer Vision, pp.562
, 2004
"... Abstract. A linear, discriminative, supervised technique for reducing feature vectors extracted from image data to a lowerdimensional representation is proposed. It is derived from classical Fisher linear discriminant analysis (LDA) and useful, for example, in supervised segmentation tasks in which ..."
Abstract

Cited by 3 (1 self)
A linear, discriminative, supervised technique for reducing feature vectors extracted from image data to a lower-dimensional representation is proposed. It is derived from classical Fisher linear discriminant analysis (LDA) and is useful, for example, in supervised segmentation tasks in which a high-dimensional feature vector describes the local structure of the image. In general, the main idea of the technique is applicable in discriminative and statistical modelling that involves contextual data. LDA is a basic, well-known and useful technique in many applications. Our contribution is that we extend the use of LDA to cases where there is dependency between the output variables, i.e., the class labels, and not only between the input variables. The latter can be dealt with in standard LDA. The principal idea is that where standard LDA merely takes into account a single class label for every feature vector, the new technique incorporates class labels of its neighborhood in its analysis as well. In this way, the spatial class label configuration in the vicinity of every feature vector is accounted for, resulting …
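The classical two-class Fisher LDA that the paper takes as its starting point can be sketched in a few lines; only the standard analysis is shown here, not the contextual extension.

```python
import numpy as np

def fisher_lda_direction(X0, X1):
    """Fisher discriminant direction w = Sw^{-1} (mu1 - mu0) for two classes."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)  # within-class scatter
    return np.linalg.solve(Sw, mu1 - mu0)

rng = np.random.default_rng(4)
X0 = rng.standard_normal((300, 5))                            # class 0
X1 = rng.standard_normal((300, 5)) + np.array([2.0, 0, 0, 0, 0])  # class 1, shifted
w = fisher_lda_direction(X0, X1)
p0, p1 = X0 @ w, X1 @ w      # projections should be well separated along w
```

The paper's extension replaces the single label per feature vector with the label configuration of its spatial neighborhood, which changes how the scatter matrices are formed but keeps this generalized-eigenproblem structure.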
Prediction of the Inter-Observer Visual Congruency (IOVC) and application to image ranking
"... This paper proposes an automatic method for predicting the interobserver visual congruency (IOVC). The IOVC reflects the congruence or the variability among different subjects looking at the same image. Predicting this congruence is of interest for image processing applications where the visual per ..."
Abstract

Cited by 2 (2 self)
This paper proposes an automatic method for predicting the inter-observer visual congruency (IOVC). The IOVC reflects the congruence or the variability among different subjects looking at the same image. Predicting this congruence is of interest for image processing applications where the visual perception of a picture matters, such as website design, advertisement, etc. This paper makes several new contributions. First, a computational model of the IOVC is proposed. This new model is a mixture of low-level visual features extracted from the input picture, where the model’s parameters are learned by using a large eye-tracking database. Once the parameters have been learned, it can be used for any new picture. Second, regarding low-level visual feature extraction, we propose a new scheme to compute the depth of field of a picture. Finally, once the training and the feature extraction have been carried out, a score ranging from 0 (minimal congruency) to 1 (maximal congruency) is computed. A value of 1 indicates that observers would focus on the same locations and suggests that the picture presents strong locations of interest. A second database of eye movements is used to assess the performance of the proposed model. Results show that our IOVC criterion outperforms the Feature Congestion measure [33]. To illustrate the interest of the proposed model, we have used it to automatically rank personalized photographs.
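To make the target quantity concrete: congruency among observers can be measured from eye-tracking data by comparing per-observer fixation maps. The sketch below uses mean pairwise correlation of fixation density maps as a simple congruency score; this is an illustration of the notion being predicted, not the paper's learned feature-mixture model, and the map representation is assumed.

```python
import numpy as np

def congruency_score(maps):
    """Mean pairwise Pearson correlation of flattened fixation maps, clipped to [0, 1]."""
    flat = [m.flatten() for m in maps]
    k = len(flat)
    corrs = [np.corrcoef(flat[i], flat[j])[0, 1]
             for i in range(k) for j in range(i + 1, k)]
    return float(np.clip(np.mean(corrs), 0.0, 1.0))

rng = np.random.default_rng(5)
base = rng.random((20, 20))                                   # shared saliency pattern
congruent = [base + 0.1 * rng.random((20, 20)) for _ in range(5)]  # observers agree
divergent = [rng.random((20, 20)) for _ in range(5)]               # observers differ
print(congruency_score(congruent))   # near 1
print(congruency_score(divergent))   # near 0
```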
A g-prior extension for p > n
, 801
"... For the normal linear model regression setup, Zellner’s gprior is extended for the case where the number of predictors p exceeds the number of observations n. Exact analytical calculation of the marginal density under this prior is seen to lead to a new closed form variable selection criterion. Thi ..."
Abstract

Cited by 1 (1 self)
For the normal linear model regression setup, Zellner’s g-prior is extended to the case where the number of predictors p exceeds the number of observations n. Exact analytical calculation of the marginal density under this prior is seen to lead to a new closed-form variable selection criterion. These results are also applicable to the multivariate regression setup.
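For context, the classical (p < n) g-prior already yields a closed-form model comparison criterion: up to a model-independent constant, the marginal is proportional to (1 + g)^((n - k - 1)/2) · (1 + g(1 - R²))^(-(n - 1)/2), where k is the number of predictors in the model and R² its usual fit. The sketch below implements only this standard criterion with the common unit-information choice g = n; the paper's p > n extension is not reproduced.

```python
import numpy as np

def log_gprior_marginal(X, y, g=None):
    """Log marginal (up to a constant) of a linear model under Zellner's g-prior."""
    n, k = X.shape
    g = float(n) if g is None else g           # unit-information choice g = n
    Z = np.column_stack([np.ones(n), X])       # design with intercept
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    yc = y - y.mean()
    r2 = 1.0 - (resid ** 2).sum() / (yc ** 2).sum()
    return 0.5 * (n - k - 1) * np.log(1 + g) - 0.5 * (n - 1) * np.log(1 + g * (1 - r2))

rng = np.random.default_rng(6)
n = 100
X = rng.standard_normal((n, 4))
y = 1.5 * X[:, 0] + 0.1 * rng.standard_normal(n)
# The criterion should favor the true single-predictor model over the full one.
m_true = log_gprior_marginal(X[:, :1], y)
m_full = log_gprior_marginal(X, y)
```

The dimension penalty in the first term is what makes this a variable selection criterion; the paper's contribution is making such a closed form available when k can exceed n.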