Fast Binary Feature Selection with Conditional Mutual Information
Journal of Machine Learning Research, 2004
Cited by 101 (1 self)
We propose in this paper a very fast feature selection technique based on conditional mutual information.
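As a rough illustration of how conditional mutual information can drive feature selection, the sketch below greedily picks the feature whose information about the label is largest even after conditioning on each feature already chosen. The max-min scoring rule, the function names, and the plug-in estimator are assumptions made for this example; the abstract does not spell out the paper's procedure, and this naive version makes no attempt at the speed the paper claims.

```python
from collections import Counter
from math import log

def mutual_information(xs, ys, zs=None):
    """Plug-in estimate of I(X;Y|Z) in nats from discrete samples (Z is optional)."""
    n = len(xs)
    if zs is None:
        zs = [0] * n  # a constant Z reduces the formula to plain I(X;Y)
    c_xyz = Counter(zip(xs, ys, zs))
    c_xz = Counter(zip(xs, zs))
    c_yz = Counter(zip(ys, zs))
    c_z = Counter(zs)
    total = 0.0
    for (x, y, z), c in c_xyz.items():
        total += (c / n) * log(c * c_z[z] / (c_xz[(x, z)] * c_yz[(y, z)]))
    return total

def greedy_cmi_selection(features, labels, k):
    """Select k feature indices with a greedy max-min CMI rule (assumed here)."""
    # score[i] tracks min over selected m of I(Y; X_i | X_m); it starts at I(Y; X_i).
    score = [mutual_information(f, labels) for f in features]
    selected = []
    for _ in range(min(k, len(features))):
        remaining = [i for i in range(len(features)) if i not in selected]
        best = max(remaining, key=lambda i: score[i])
        selected.append(best)
        for i in remaining:
            if i != best:
                cmi = mutual_information(features[i], labels, features[best])
                score[i] = min(score[i], cmi)
    return selected

# Toy data: feature 1 duplicates feature 0, feature 2 carries new information.
toy_features = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
toy_labels = [0, 1, 2, 3]
print(greedy_cmi_selection(toy_features, toy_labels, 2))
```

On the toy data the duplicate feature is skipped because, once its twin is selected, it adds no information about the label.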
Mutual Information in Learning Feature Transformations
In Proceedings of the 17th International Conference on Machine Learning, 2000
Cited by 48 (8 self)
We present feature transformations useful for exploratory data analysis or for pattern recognition. Transformations are learned from example data sets by maximizing the mutual information between transformed data and their class labels. We make use of Rényi's quadratic entropy, and we extend the work of Principe et al. to mutual information between continuous multidimensional variables and discrete-valued class labels.
Feature Selection by Maximum Marginal Diversity: Optimality and Implications for Visual Recognition
Submitted, 2002
Cited by 26 (7 self)
We address the question of feature selection in the context of visual recognition. It is shown that, besides being efficient from a computational standpoint, the infomax principle is nearly optimal in the minimum Bayes error sense. The concept of marginal diversity is introduced, leading to a generic principle for feature selection (the principle of maximum marginal diversity) of extreme computational simplicity. The relationships between infomax and the maximization of marginal diversity are identified, uncovering the existence of a family of classification procedures for which near-optimal (in the Bayes error sense) feature selection does not require combinatorial search. Examination of this family in light of recent studies on the statistics of natural images suggests that visual recognition problems are a subset of it.
Automatic feature selection in neuroevolution
In Genetic and Evolutionary Computation Conference, 2005
Cited by 22 (9 self)
Feature selection is the process of finding the set of inputs to a machine learning algorithm that will yield the best performance. Developing a way to solve this problem automatically would make current machine learning methods much more useful. Previous efforts to automate feature selection rely on expensive meta-learning or are applicable only when labeled training data is available. This paper presents a novel method called FS-NEAT which extends the NEAT neuroevolution method to automatically determine the right set of inputs for the networks it evolves. By learning the network’s inputs, topology, and weights simultaneously, FS-NEAT addresses the feature selection problem without relying on meta-learning or labeled data. Initial experiments in a line orientation task demonstrate that FS-NEAT can learn networks with fewer inputs and better performance than traditional NEAT. Furthermore, it outperforms traditional NEAT even when the feature set does not contain extraneous features because it searches for networks in a lower-dimensional space.
A Statistic to Estimate the Variance of the Histogram-Based Mutual Information Estimator Based on . . .
1999
Cited by 18 (3 self)
In the case of two signals with independent pairs of observations (x, y), a statistic to estimate the variance of the histogram-based mutual information estimator has been derived earlier. We present such a statistic for dependent pairs. Deriving it requires a reliable statistic to estimate the variance of the sample mean in the case of dependent observations. We derive and discuss this statistic and a statistic to estimate the variance of the mutual information estimator. These statistics are validated by simulations. © 1999 Elsevier Science B.V. All rights reserved.
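For readers unfamiliar with the estimator whose variance is studied here, the following is a minimal sketch of a histogram-based (plug-in) mutual information estimate for paired scalar observations. The equal-width binning, the default bin count, and the function name are assumptions chosen for illustration; the variance statistic derived in the paper is not reproduced.

```python
from collections import Counter
from math import log

def histogram_mutual_information(xs, ys, bins=8):
    """Plug-in MI estimate (nats) from an equal-width 2-D histogram of paired samples."""
    def bin_index(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant signal
        # Clamp so that the maximum value lands in the last bin.
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = bin_index(xs), bin_index(ys)
    n = len(xs)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # Sum of p(i,j) * log(p(i,j) / (p(i) p(j))) over occupied cells.
    return sum((c / n) * log(c * n / (px[i] * py[j])) for (i, j), c in joint.items())

# A signal paired with itself versus a pair with no dependence.
print(histogram_mutual_information(list(range(8)), list(range(8)), bins=4))
print(histogram_mutual_information([0, 0, 1, 1], [0, 1, 0, 1], bins=2))
```

The estimate depends on the bin count and on the sample size, which is one reason a variance statistic for it is of practical interest.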
Feature Selection with Neural Networks
Behaviormetrika, 1998
Cited by 15 (0 self)
Features gathered from the observation of a phenomenon are not all equally informative: some of them may be noisy, correlated or irrelevant. Feature selection aims at selecting a feature set that is relevant for a given task. This problem is complex and remains an important issue in many domains. In the field of neural networks, feature selection has been studied for the last ten years and classical as well as original methods have been employed. This paper is a review of neural network approaches to feature selection. We first briefly introduce baseline statistical methods used in regression and classification. We then describe families of methods which have been developed specifically for neural networks. Representative methods are then compared on different test problems. Keywords: Feature Selection, Subset Selection, Variable Sensitivity, Sequential Search. Authors: Philippe Leray and Patrick Gallinari.
Automatic Determination of Optimal Network Topologies Based on Information Theory and Evolution
IEEE Proceedings of the 23rd EUROMICRO Conference, 1997
Cited by 7 (2 self)
We present a new approach to determine the optimal topology of multilayer perceptrons for a given learning task based on information theory and evolution. Our method exploits the mutual information of the input-output relation to sort the units into a list with respect to their information content. Embedded in an evolutionary algorithm, a mutation operator is proposed which removes or adds input units from given networks based on their ranking. The power of the approach is demonstrated on several benchmarks. We conclude that using an evolutionary algorithm as a framework in conjunction with intelligent mutation operators is concurrently the most efficient optimization technique with regard to network size and performance as well as scalability. 1 Introduction. A nontrivial task in the application of neural networks is the determination of the appropriate level of complexity of the model fitted to the given data. Following the general principle of Occam's Razor, we should choose the sim...
Effective Input Variable Selection for Function Approximation
Cited by 7 (5 self)
Input variable selection is a key preprocessing step in any I/O modelling problem. Normally, better generalization performance is obtained when unneeded parameters coming from irrelevant or redundant variables are eliminated. Information theory provides a robust theoretical framework for performing input variable selection thanks to the concept of mutual information. Nevertheless, for continuous variables, it is usually a more difficult task to determine the mutual information between the input variables and the output variable than for classification problems. This paper presents a modified approach for variable selection for continuous variables adapted from a previous approach for classification problems, making use of a mutual information estimator based on the k-nearest neighbors.
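The abstract does not say which k-nearest-neighbour estimator is used. As one plausible illustration, the sketch below assumes the Kraskov-Stögbauer-Grassberger form for two scalar variables, with max-norm distances in the joint space. The function names, the default k, and the brute-force neighbour search are choices made for this example only.

```python
import random

EULER_GAMMA = 0.5772156649015329

def digamma_int(n):
    """Digamma at a positive integer n: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))

def knn_mutual_information(xs, ys, k=3):
    """k-nearest-neighbour MI estimate in nats for two scalar variables (assumed KSG form)."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        # Max-norm distances from point i to every other point in the joint space.
        dists = sorted(
            max(abs(xs[i] - xs[j]), abs(ys[i] - ys[j])) for j in range(n) if j != i
        )
        eps = dists[k - 1]  # distance to the k-th nearest neighbour
        # Count points strictly closer than eps in each marginal space.
        nx = sum(1 for j in range(n) if j != i and abs(xs[i] - xs[j]) < eps)
        ny = sum(1 for j in range(n) if j != i and abs(ys[i] - ys[j]) < eps)
        total += digamma_int(nx + 1) + digamma_int(ny + 1)
    return digamma_int(k) + digamma_int(n) - total / n

# Synthetic check: an output that depends on the input versus one that does not.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y_dependent = [v + 0.05 * rng.gauss(0, 1) for v in x]
y_independent = [rng.gauss(0, 1) for _ in range(200)]
mi_dependent = knn_mutual_information(x, y_dependent)
mi_independent = knn_mutual_information(x, y_independent)
print(mi_dependent, mi_independent)
```

Because no binning is involved, an estimator of this kind can be applied directly to continuous inputs and outputs, which is the setting the abstract describes.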
Spectrophotometric Variable Selection By Mutual Information
2004
Cited by 6 (3 self)
Spectrophotometric data often comprise a great number of numerical components or variables that can be used in calibration models. When a large number of such variables are incorporated into a particular model, many difficulties arise, and it is often necessary to reduce the number of spectral variables. This paper proposes an incremental (forward-backward) procedure, initiated by using an entropy-based criterion (mutual information) to choose the first variable. The advantages of the method are discussed; results in quantitative chemical analysis by spectrophotometry show the improvements obtained with respect to traditional and nonlinear calibration models.
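A forward-backward search of the kind mentioned above might look like the following sketch. It assumes the first variable has already been chosen (for instance, as the one with the highest mutual information with the target) and that a hypothetical `score(subset)` callback returns higher values for better subsets, such as a negated validation error. Neither the callback nor the stopping rule comes from the paper.

```python
def forward_backward_selection(n_vars, first, score, max_vars):
    """Greedy forward-backward search starting from one pre-selected variable."""
    selected = [first]
    best = score(selected)
    improved = True
    while improved:
        improved = False
        # Forward step: add the single variable that raises the score the most.
        candidates = [v for v in range(n_vars) if v not in selected]
        if candidates and len(selected) < max_vars:
            v = max(candidates, key=lambda c: score(selected + [c]))
            s = score(selected + [v])
            if s > best:
                selected, best, improved = selected + [v], s, True
        # Backward step: drop any variable whose removal raises the score.
        for v in list(selected):
            if len(selected) > 1:
                reduced = [u for u in selected if u != v]
                s = score(reduced)
                if s > best:
                    selected, best, improved = reduced, s, True
    return selected

# Hypothetical score: variables 1 and 3 are useful, and every variable has a small cost.
def toy_score(subset):
    return len(set(subset) & {1, 3}) - 0.1 * len(subset)

print(forward_backward_selection(4, first=1, score=toy_score, max_vars=4))
print(forward_backward_selection(4, first=0, score=toy_score, max_vars=4))
```

The second call starts from an uninformative variable, which the backward step later discards. Every accepted change strictly increases the score, so the loop terminates.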
Mutual Information Methods for Evaluating Dependence Among Outputs in Learning Machines
2001
Cited by 5 (4 self)
The evaluation of dependence among output errors of multi-input multi-output learning machines can help us in designing well-behaved systems, highlighting hidden interactions among their internal components that can add noise to the learning process. By estimating the relations between performances and dependence among output errors, we can compare different models of learning machines in order to select the ones best suited to a particular problem. We distinguish between dependence among outputs and dependence among output errors, and we propose measures based on mutual information for evaluating both these types of dependence. Global measures of dependence between outputs and output errors, together with mutual information error matrices for evaluating specific dependences between each pair of outputs, are presented. We propose a statistical test of hypothesis for evaluating the difference of the dependence among outputs and output errors between different learning machine...