Results 11–20 of 51

Practical feature selection: from correlation to causality
In: Mining Massive Data Sets for Security: Advances in Data Mining, Search, Social Networks and Text Mining, and their Applications to Security, 2008
Cited by 3 (1 self)
Abstract:
Feature selection encompasses a wide variety of methods for selecting a restricted number of input variables, or “features”, which are “relevant” to the problem at hand. In this report, we guide practitioners through the maze of methods that have recently appeared in the literature, particularly for supervised feature selection. Starting from the simplest methods of feature ranking with correlation coefficients, we branch out in various directions and explore topics including “conditional relevance”, “local relevance”, “multivariate selection”, and “causal relevance”. We make recommendations for assessment methods and stress the importance of matching the complexity of the method employed to the available amount of training data. Software and teaching material associated with this tutorial are available [12].
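The simplest starting point described above, ranking features by their correlation with the target, can be sketched as follows. This is an illustrative example on synthetic data, not the tutorial's accompanying software:

```python
import numpy as np

def rank_features_by_correlation(X, y):
    """Rank features by absolute Pearson correlation with the target y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Pearson correlation of each column of X with y
    corr = (Xc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
    )
    # Best-ranked feature first
    return np.argsort(-np.abs(corr)), corr

rng = np.random.default_rng(0)
n = 200
y = rng.normal(size=n)
X = np.column_stack([
    y + 0.1 * rng.normal(size=n),   # strongly relevant feature
    rng.normal(size=n),             # irrelevant feature
    -y + 1.0 * rng.normal(size=n),  # moderately relevant feature
])
order, corr = rank_features_by_correlation(X, y)
```

A univariate ranking like this is exactly the method the report then improves on: it cannot detect features that are relevant only in combination with others.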

Dimensionality reduction by Canonical Contextual Correlation Projections
In: Proceedings of the 8th European Conference on Computer Vision, pp. 562, 2004
Cited by 3 (1 self)
Abstract:
A linear, discriminative, supervised technique for reducing feature vectors extracted from image data to a lower-dimensional representation is proposed. It is derived from classical Fisher linear discriminant analysis (LDA) and is useful, for example, in supervised segmentation tasks in which a high-dimensional feature vector describes the local structure of the image. In general, the main idea of the technique is applicable in discriminative and statistical modelling that involves contextual data. LDA is a basic, well-known and useful technique in many applications. Our contribution is that we extend the use of LDA to cases where there is dependency between the output variables, i.e., the class labels, and not only between the input variables; the latter can be dealt with in standard LDA. The principal idea is that where standard LDA merely takes into account a single class label for every feature vector, the new technique incorporates the class labels of its neighborhood in its analysis as well. In this way, the spatial class label configuration in the vicinity of every feature vector is accounted for, resulting …
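As a point of reference for the extension described above, classical two-class Fisher LDA (the starting point of the paper, not its contextual variant) can be sketched in a few lines; the toy data below is invented for illustration:

```python
import numpy as np

def fisher_lda_direction(X, y):
    """Classical two-class Fisher LDA: the projection direction
    w ∝ Sw^{-1} (mu1 - mu0) maximizing between-class over
    within-class scatter."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter matrix
    Sw = np.cov(X0, rowvar=False) * (len(X0) - 1) \
       + np.cov(X1, rowvar=False) * (len(X1) - 1)
    w = np.linalg.solve(Sw, mu1 - mu0)
    return w / np.linalg.norm(w)

rng = np.random.default_rng(1)
X0 = rng.normal([0.0, 0.0], 1.0, size=(100, 2))  # class 0
X1 = rng.normal([3.0, 0.0], 1.0, size=(100, 2))  # class 1, shifted in x
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)
w = fisher_lda_direction(X, y)
```

The paper's contribution replaces the single label per feature vector with the label configuration of its spatial neighborhood, which this classical formulation cannot express.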
Estimation of fitness landscape contours in EAs
 in Proceedings of GECCO2007, 2007
"... Evolutionary algorithms applied in real domain should profit from information about the local fitness function curvature. This paper presents an initial study of an evolutionary strategy with a novel approach for learning the covariance matrix of a Gaussian distribution. The learning method is based ..."
Abstract

Cited by 3 (1 self)
 Add to MetaCart
(Show Context)
Evolutionary algorithms applied in real-valued domains should profit from information about the local curvature of the fitness function. This paper presents an initial study of an evolution strategy with a novel approach to learning the covariance matrix of a Gaussian distribution. The learning method is based on estimating the contour line of the fitness landscape between the selected and discarded individuals. The distribution learned this way is then used to generate new population members. The algorithm presented here is the first attempt to construct the Gaussian distribution this way and should be considered only a proof of concept; nevertheless, an empirical comparison on low-dimensional quadratic functions shows that our approach is viable and, with respect to the number of evaluations needed to find a solution of a certain quality, is comparable to the state-of-the-art CMA-ES on the sphere function and outperforms CMA-ES on the elliptical function.
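For intuition only, here is a toy strategy that refits a full Gaussian (mean and covariance matrix) to the selected individuals each generation. This is the general selection-driven covariance adaptation the paper builds on, not its contour-line estimation method; the sphere objective and all settings are made up:

```python
import numpy as np

def gaussian_es(f, x0, pop=40, elite=10, iters=40, seed=2):
    """Toy Gaussian estimation-of-distribution strategy:
    sample a population, keep the elite, refit mean and full
    covariance to the survivors, repeat."""
    rng = np.random.default_rng(seed)
    dim = len(x0)
    mean, cov = np.asarray(x0, dtype=float), np.eye(dim)
    best = np.inf
    for _ in range(iters):
        X = rng.multivariate_normal(mean, cov, size=pop)
        fit = np.array([f(x) for x in X])
        best = min(best, float(fit.min()))
        sel = X[np.argsort(fit)[:elite]]   # truncation selection
        mean = sel.mean(axis=0)            # refit the Gaussian
        cov = np.cov(sel, rowvar=False) + 1e-9 * np.eye(dim)
    return best

sphere = lambda x: float(np.dot(x, x))
best = gaussian_es(sphere, x0=[3.0] * 5)
```

Refitting the covariance directly to the elite shrinks it too aggressively (a known weakness of this naive scheme); learning it from the contour between selected and discarded individuals, as the paper proposes, is one way to obtain a better-shaped distribution.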

A g-prior extension for p > n
, 801
Cited by 2 (1 self)
Abstract:
For the normal linear model regression setup, Zellner's g-prior is extended to the case where the number of predictors p exceeds the number of observations n. Exact analytical calculation of the marginal density under this prior is seen to lead to a new closed-form variable selection criterion. These results are also applicable to the multivariate regression setup.
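For context, the classical g-prior being extended takes the following form; this states the standard setup (which requires X to have full column rank, hence p ≤ n), not the paper's extension, whose construction is not given in this excerpt:

```latex
% Normal linear model with Zellner's g-prior on the coefficients:
y \mid \alpha, \beta, \sigma^2 \sim \mathcal{N}\!\left(\alpha \mathbf{1}_n + X\beta,\; \sigma^2 I_n\right),
\qquad
\beta \mid \sigma^2 \sim \mathcal{N}\!\left(0,\; g\,\sigma^2 \left(X^\top X\right)^{-1}\right).
% Under the reference prior p(\alpha, \sigma^2) \propto 1/\sigma^2, the marginal
% likelihood of a model with k predictors has the well-known closed form
p(y \mid g) \;\propto\; (1+g)^{(n-1-k)/2}\,\bigl[\,1 + g\,(1 - R_k^2)\,\bigr]^{-(n-1)/2},
% where R_k^2 is the model's coefficient of determination. For p > n the
% matrix X^\top X is singular, so the prior covariance above is undefined
% and the g-prior must be extended.
```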
Risk Bounds for Embedded Variable Selection in Classification Trees
2012
Cited by 2 (2 self)
Abstract:
The problems of model and variable selection for classification trees are jointly considered. A penalized criterion is proposed which explicitly takes into account the number of variables, and a risk bound inequality is provided for the tree classifier minimizing this criterion. This penalized criterion is compared to the one used during the pruning step of the CART algorithm. It is shown that the two criteria are similar under some specific margin assumptions. In practice, the tuning parameter of the CART penalty has to be calibrated by hold-out. Simulation studies are performed which confirm that the hold-out procedure mimics the form of the proposed penalized criterion.
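The CART pruning step referred to above minimizes a cost-complexity criterion of the form R(T) + α|T|, where |T| is the number of leaves and α is the tuning parameter calibrated by hold-out. In scikit-learn this penalty is exposed as ccp_alpha, sketched below on synthetic data; this illustrates CART pruning itself, not the paper's variable-count penalty:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=300) > 0).astype(int)

# Effective alphas at which subtrees of the fully grown tree are pruned away
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y)

# A larger alpha penalizes leaves more heavily, so the pruned tree shrinks
leaves = [
    DecisionTreeClassifier(random_state=0, ccp_alpha=a).fit(X, y).get_n_leaves()
    for a in path.ccp_alphas
]
```

The sequence of pruned trees is nested, so the leaf counts are non-increasing in alpha, ending at the single root leaf.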

Annotation-based Distance Measures for Patient Subgroup Discovery in Clinical Microarray Studies
Cited by 1 (0 self)
Abstract:
Motivation: Clustering algorithms are widely used in the analysis of microarray data. In clinical studies, they are often applied to find groups of co-regulated genes. Clustering, however, can also stratify patients by similarity of their gene expression profiles, thereby defining novel disease entities based on molecular characteristics. Several distance-based clustering algorithms have been suggested, but little attention has been given to the distance measure between patients. Even with the Euclidean metric, including and excluding genes from the analysis leads to different distances between the same objects, and consequently different clustering results. Results: We describe a new clustering algorithm, in which gene selection is used to derive biologically meaningful clusterings of samples by combining expression profiles and functional annotation data. According to gene annotations, candidate gene sets with specific functional characterizations are generated. Each set defines a different distance measure between patients, leading to different clusterings. These clusterings are filtered using a resampling-based significance measure. Significant clusterings are reported together with the underlying gene sets and their functional definition. Conclusions: Our method reports clusterings defined by biologically focused sets of genes. In annotation-driven clusterings, we have recovered clinically relevant patient subgroups through biologically plausible sets of genes, as well as new subgroupings. We conjecture that our method has the potential to reveal so far unknown, clinically relevant classes of patients in an unsupervised manner. Availability: We provide the R package adSplit as part of Bioconductor release 1.9 and on …
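The core observation, that the distance between the same two patients changes with the gene set used, can be illustrated with plain Euclidean distances on synthetic expression data. The gene sets below are arbitrary index ranges standing in for annotation-derived categories; this is not the adSplit implementation:

```python
import numpy as np

rng = np.random.default_rng(4)
n_patients, n_genes = 6, 100
expr = rng.normal(size=(n_patients, n_genes))  # synthetic expression matrix

# Hypothetical annotation-derived gene sets (e.g. two functional categories)
gene_set_a = np.arange(0, 30)
gene_set_b = np.arange(30, 80)

def pairwise_dist(expr, genes):
    """Euclidean distance between patients, restricted to one gene set."""
    sub = expr[:, genes]
    diff = sub[:, None, :] - sub[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

Da = pairwise_dist(expr, gene_set_a)
Db = pairwise_dist(expr, gene_set_b)
```

Da and Db are both valid patient-by-patient distance matrices, yet they disagree; each annotation-derived gene set therefore induces its own clustering of the same patients, which is exactly the degree of freedom the method exploits and then filters by significance.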

High-dimensional multiclass classification with applications to cancer diagnosis
Cited by 1 (0 self)
Abstract:
Probabilistic classifiers are introduced and it is shown that the only regular linear probabilistic classifier with convex risk is multinomial regression. Penalized empirical risk minimization is introduced and used to construct supervised learning methods for probabilistic classifiers. A sparse group lasso penalized approach to high-dimensional multinomial classification is presented. On different real data examples it is found that this approach clearly outperforms the multinomial lasso in terms of error rate and the features included in the model. An efficient coordinate descent algorithm is developed and its convergence is established. This algorithm is implemented in the msgl R package. Examples of high-dimensional multiclass problems are studied, in particular multiclass classification based on gene expression measurements. One such example is the clinically important problem of identifying the primary tumor site of liver metastases; this particular problem is studied in detail. In order to adjust for the liver contamination found in biopsies of metastases, a computational contamination model is developed. The contamination model is presented in a domain adaptation framework and a simulation-based domain adaptation strategy is presented. It is shown that the presented computational contamination approach …
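The multinomial-lasso baseline that the sparse group lasso is compared against can be sketched with scikit-learn's l1-penalized multinomial regression on synthetic data. The msgl sparse group lasso itself, which can zero out a whole feature's coefficients across all classes at once, is not reproduced here:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n, p, k = 300, 50, 3
y = rng.integers(0, k, size=n)
X = rng.normal(size=(n, p))
X[:, 0] += y   # only the first two features carry class signal
X[:, 1] -= y

# l1-penalized multinomial regression: each coefficient is shrunk
# individually, so a feature can stay active for one class only
clf = LogisticRegression(penalty="l1", solver="saga", C=0.1, max_iter=5000)
clf.fit(X, y)
sparsity = float(np.mean(clf.coef_ == 0))
```

With a strong penalty most of the k x p coefficient matrix is exactly zero while the informative features survive; the group penalty adds structure on top of this by tying each feature's k coefficients together.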
Probabilistic Models for Melodic Prediction
2008
Abstract:
Submitted for publication (IDIAP-RR 08-50). Chord progressions are the building blocks from which tonal music is constructed. The choice of a particular representation for chords has a strong impact on statistical modeling of the dependence between chord symbols and the actual sequences of notes in polyphonic music. Melodic prediction is used in this paper as a benchmark task to evaluate the quality of four chord representations using two probabilistic model architectures derived from Input/Output Hidden Markov Models (IOHMMs).
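The base model family named above can be made concrete with the scaled forward recursion for a plain discrete HMM, which computes the log-likelihood a model of this family assigns to a symbol sequence (an IOHMM additionally conditions transitions and emissions on an input sequence, e.g. the chords). The tiny numeric example is invented:

```python
import numpy as np

def hmm_forward_loglik(pi, A, B, obs):
    """Scaled forward algorithm for a discrete HMM.

    pi  : (S,)  initial state probabilities
    A   : (S,S) transitions, A[i, j] = P(s_{t+1}=j | s_t=i)
    B   : (S,V) emissions,   B[i, v] = P(o_t=v | s_t=i)
    obs : sequence of observed symbol indices
    Returns log P(obs).
    """
    alpha = pi * B[:, obs[0]]
    c = alpha.sum()
    loglik = np.log(c)
    alpha = alpha / c
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # propagate, then weight by emission
        c = alpha.sum()
        loglik += np.log(c)            # accumulate the scaling factors
        alpha = alpha / c
    return loglik

# One-state sanity check: every symbol has probability 0.5,
# so a length-3 sequence has log-likelihood 3 * log(0.5)
pi = np.array([1.0])
A = np.array([[1.0]])
B = np.array([[0.5, 0.5]])
ll = hmm_forward_loglik(pi, A, B, [0, 1, 0])
```

In the melodic-prediction setting the observed symbols would be melody notes, and the choice of chord representation determines the input sequence an IOHMM conditions this recursion on.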