Wrappers for Feature Subset Selection (1997) [592 citations — 3 self]

Abstract:

In the feature subset selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus its attention, while ignoring the rest. To achieve the best possible performance with a particular learning algorithm on a particular training set, a feature subset selection method should consider how the algorithm and the training set interact. We explore the relation between optimal feature subset selection and relevance. Our wrapper method searches for an optimal feature subset tailored to a particular algorithm and a domain. We study the strengths and weaknesses of the wrapper approach and show a series of improved designs. We compare the wrapper approach to induction without feature subset selection and to Relief, a filter approach to feature subset selection. Significant improvement in accuracy is achieved for some datasets for the two families of induction algorithms used: decision trees and Naive-Bayes.
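
The wrapper idea in the abstract, scoring candidate feature subsets by running the learning algorithm itself, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes greedy forward selection as the search strategy, a small categorical Naive-Bayes learner with Laplace smoothing, and k-fold cross-validation as the accuracy estimate. All function names (`train_nb`, `predict_nb`, `cv_accuracy`, `wrapper_forward_selection`) and the toy dataset are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(rows, labels, feats):
    """Fit a categorical Naive-Bayes model using only the feature indices in feats."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature, class) -> counts of observed values
    values = defaultdict(set)            # feature -> set of values seen in training
    for row, y in zip(rows, labels):
        for f in feats:
            value_counts[(f, y)][row[f]] += 1
            values[f].add(row[f])
    return class_counts, value_counts, values


def predict_nb(model, row, feats):
    """Return the class with the highest smoothed log-posterior score."""
    class_counts, value_counts, values = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c in sorted(class_counts):  # sorted so ties break deterministically
        n = class_counts[c]
        score = math.log(n / total)
        for f in feats:
            # Laplace (add-one) smoothing over the values seen for this feature
            score += math.log((value_counts[(f, c)][row[f]] + 1) / (n + len(values[f])))
        if score > best_score:
            best, best_score = c, score
    return best


def cv_accuracy(rows, labels, feats, k=5):
    """Estimate accuracy of the learner on a feature subset by k-fold cross-validation."""
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(rows)) if i % k != fold]
        test = [i for i in range(len(rows)) if i % k == fold]
        model = train_nb([rows[i] for i in train], [labels[i] for i in train], feats)
        correct += sum(predict_nb(model, rows[i], feats) == labels[i] for i in test)
    return correct / len(rows)


def wrapper_forward_selection(rows, labels, k=5):
    """Greedy forward search: add the feature that most improves estimated accuracy.

    Assumption: simple hill-climbing that stops when no single addition helps.
    """
    n_features = len(rows[0])
    selected = []
    best_acc = cv_accuracy(rows, labels, selected, k)
    while True:
        candidates = [(cv_accuracy(rows, labels, selected + [f], k), f)
                      for f in range(n_features) if f not in selected]
        if not candidates:
            break
        acc, f = max(candidates, key=lambda t: (t[0], -t[1]))  # ties: lowest index
        if acc <= best_acc:
            break
        selected.append(f)
        best_acc = acc
    return sorted(selected), best_acc


# Toy data (hypothetical): feature 0 determines the label, features 1 and 2 are irrelevant.
rows = [(i % 2, (i // 2) % 2, (i // 4) % 2) for i in range(40)]
labels = [r[0] for r in rows]
selected, accuracy = wrapper_forward_selection(rows, labels)
print(selected, accuracy)
```

Because the subset is scored with the same learner that will later be trained on it, the search is tailored to that learner. Swapping `train_nb` and `predict_nb` for a decision-tree learner would leave the search loop unchanged.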

Citations

5180 Genetic Algorithms – Goldberg - 1989
3011 Pattern Classification and Scene Analysis – Duda, Hart - 1973
2573 Classification and Regression Trees – Breiman, Friedman, et al. - 1984
2526 Induction of decision trees – Quinlan - 1986
2227 UCI repository of machine learning databases – Blake, Merz
2210 Artificial Intelligence: A Modern Approach – Russell, Norvig - 1995
1921 Genetic Programming: On the Programming of Computers by Means of Natural Selection – Koza - 1992
1565 Bagging predictors – Breiman - 1996
1205 A decision-theoretic generalization of on-line learning and an application to boosting – Freund, Schapire - 1997
787 Instance-based Learning Algorithms – Aha, Kibler, et al. - 1991
781 Probability inequalities for sums of bounded random variables – Hoeffding - 1963
638 UCI repository of machine learning databases. For information contact ml-repository@ics.uci.edu – Murphy, Aha - 1994
538 C4.5: Programs for Machine Learning – Quinlan - 1993
508 Neural networks and the bias/variance dilemma – Geman, Bienenstock, et al. - 1992
499 Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm – Littlestone - 1988
477 Irrelevant features and the subset selection problem – John, Kohavi, et al. - 1994
457 The strength of weak learnability – Schapire - 1990
438 The weighted majority algorithm – Littlestone, Warmuth - 1994
427 The perceptron: A probabilistic model for information storage and organization in the brain – Rosenblatt - 1958
412 Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology – Holland - 1992
392 Perceptrons: An Introduction to Computational Geometry – Minsky, Papert - 1969
366 A study of cross-validation and bootstrap for accuracy estimation and model selection – Kohavi - 1995
322 Neural network ensembles, cross validation, and active learning – Krogh, Vedelsby - 1995
304 Supervised and unsupervised discretization of continuous features – Dougherty, Kohavi, et al. - 1995
294 Boosting a Weak Learning Algorithm by Majority – Freund - 1995
265 Rough sets – Pawlak - 1982
247 Learning in embedded systems – Kaelbling - 1993
242 An analysis of Bayesian classifiers – Langley, Iba, et al. - 1992
234 Beyond independence: Conditions for the optimality of the simple Bayesian classifier – Domingos, Pazzani - 1996
228 How to use expert advice – Cesa-Bianchi, Freund, et al. - 1997
217 A practical approach to feature selection – Kira, Rendell - 1992
202 Solving time-dependent planning problems – Boddy, Dean - 1989
196 Estimating attributes: Analysis and extensions of Relief – Kononenko - 1994
180 The feature selection problem: Traditional methods and a new algorithm – Kira, Rendell - 1992
175 Training a 3-node neural network is NP-complete – Blum
169 Learning with many irrelevant features – Almuallim, Dietterich - 1991
163 Greedy attribute selection – Caruana, Freitag - 1994
163 Induction of selective Bayesian classifiers – Langley, Sage - 1994
157 Models of incremental concept formation – Gennari, Langley, et al. - 1989
150 The MONK's problems - a performance comparison of different learning algorithms – Thrun - 1991
139 A branch and bound algorithm for feature subset selection – Narendra, Fukunaga - 1977
134 Estimating probabilities: A crucial task in machine learning – Cestnik - 1990
131 Bias plus variance decomposition for zero-one loss functions – Kohavi, Wolpert - 1996
130 Constructing optimal binary decision trees is NP-complete – Hyafil, Rivest - 1976
128 Data mining using MLC++: A machine learning library – Kohavi, Sommerfield, et al. - 1996
121 Tolerating noisy, irrelevant and novel attributes in instance-based learning algorithms – Aha - 1992
116 Prototype and Feature Selection by Sampling and Random Mutation Hill Climbing Algorithms – Skalak - 1994
115 The Estimation of Probabilities: An Essay on Modern Bayesian Methods – Good - 1965
105 Efficient Algorithms for Minimizing Cross Validation Error – Moore, Lee - 1994
96 Learning Classification Trees – Buntine - 1992