Results 1 - 10
of
150
Wrappers for Feature Subset Selection
- AIJ SPECIAL ISSUE ON RELEVANCE
, 1997
"... In the feature subset selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus its attention, while ignoring the rest. To achieve the best possible performance with a particular learning algorithm on a particular training set, a ..."
Abstract
-
Cited by 1569 (3 self)
- Add to MetaCart
In the feature subset selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus its attention, while ignoring the rest. To achieve the best possible performance with a particular learning algorithm on a particular training set, a feature subset selection method should consider how the algorithm and the training set interact. We explore the relation between optimal feature subset selection and relevance. Our wrapper method searches for an optimal feature subset tailored to a particular algorithm and a domain. We study the strengths and weaknesses of the wrapper approach andshow a series of improved designs. We compare the wrapper approach to induction without feature subset selection and to Relief, a filter approach to feature subset selection. Significant improvement in accuracy is achieved for some datasets for the two families of induction algorithms used: decision trees and Naive-Bayes.
Irrelevant Features and the Subset Selection Problem
- MACHINE LEARNING: PROCEEDINGS OF THE ELEVENTH INTERNATIONAL
, 1994
"... We address the problem of finding a subset of features that allows a supervised induction algorithm to induce small high-accuracy concepts. We examine notions of relevance and irrelevance, and show that the definitions used in the machine learning literature do not adequately partition the features ..."
Abstract
-
Cited by 757 (26 self)
- Add to MetaCart
We address the problem of finding a subset of features that allows a supervised induction algorithm to induce small high-accuracy concepts. We examine notions of relevance and irrelevance, and show that the definitions used in the machine learning literature do not adequately partition the features into useful categories of relevance. We present definitions for irrelevance and for two degrees of relevance. These definitions improve our understanding of the behavior of previous subset selection algorithms, and help define the subset of features that should be sought. The features selected should depend not only on the features and the target concept, but also on the induction algorithm. We describe a method for feature subset selection using cross-validation that is applicable to any induction algorithm, and discuss experiments conducted with ID3 and C4.5 on artificial and real datasets.
Selection of relevant features and examples in machine learning
- ARTIFICIAL INTELLIGENCE
, 1997
"... In this survey, we review work in machine learning on methods for handling data sets containing large amounts of irrelevant information. We focus on two key issues: the problem of selecting relevant features, and the problem of selecting relevant examples. We describe the advances that have been mad ..."
Abstract
-
Cited by 606 (2 self)
- Add to MetaCart
In this survey, we review work in machine learning on methods for handling data sets containing large amounts of irrelevant information. We focus on two key issues: the problem of selecting relevant features, and the problem of selecting relevant examples. We describe the advances that have been made on these topics in both empirical and theoretical work in machine learning, and we present a general framework that we use to compare different methods. We close with some challenges for future work in this area.
Mining high-speed data streams
, 2000
"... Categories and Subject ���������� � �¨�������������������������¦���¦����������¡¤�� ¡ � ¡����������������¦¡¤����§�£���� ..."
Abstract
-
Cited by 402 (10 self)
- Add to MetaCart
(Show Context)
Categories and Subject ���������� � �¨�������������������������¦���¦����������¡¤�� ¡ � ¡����������������¦¡¤����§�£����
Correlation-based feature selection for machine learning
, 1998
"... A central problem in machine learning is identifying a representative set of features from which to construct a classification model for a particular task. This thesis addresses the problem of feature selection for machine learning through a correlation based approach. The central hypothesis is that ..."
Abstract
-
Cited by 318 (3 self)
- Add to MetaCart
(Show Context)
A central problem in machine learning is identifying a representative set of features from which to construct a classification model for a particular task. This thesis addresses the problem of feature selection for machine learning through a correlation based approach. The central hypothesis is that good feature sets contain features that are highly correlated with the class, yet uncorrelated with each other. A feature evaluation formula, based on ideas from test theory, provides an operational definition of this hypothesis. CFS (Correlation based Feature Selection) is an algorithm that couples this evaluation formula with an appropriate correlation measure and a heuristic search strategy. CFS was evaluated by experiments on artificial and natural datasets. Three machine learning algorithms were used: C4.5 (a decision tree learner), IB1 (an instance based learner), and naive Bayes. Experiments on artificial datasets showed that CFS quickly identifies and screens irrelevant, redundant, and noisy features, and identifies relevant features as long as their relevance does not strongly depend on other features. On natural domains, CFS typically eliminated well over half the features. In most cases, classification accuracy using the reduced feature set equaled or bettered accuracy using the complete feature set.
Feature Selection for Classification
- Intelligent Data Analysis
, 1997
"... Feature selection has been the focus of interest for quite some time and much work has been done. With the creation of huge databases and the consequent requirements for good machine learning techniques, new problems arise and novel approaches to feature selection are in demand. This survey is a com ..."
Abstract
-
Cited by 299 (9 self)
- Add to MetaCart
(Show Context)
Feature selection has been the focus of interest for quite some time and much work has been done. With the creation of huge databases and the consequent requirements for good machine learning techniques, new problems arise and novel approaches to feature selection are in demand. This survey is a comprehensive overview of many existing methods from the 1970's to the present. It identifies four steps of a typical feature selection method, and categorizes the different existing methods in terms of generation procedures and evaluation functions, and reveals hitherto unattempted combinations of generation procedures and evaluation functions. Representative methods are chosen from each category for detailed explanation and discussion via example. Benchmark datasets with different characteristics are used for comparative study. The strengths and weaknesses of different methods are explained. Guidelines for applying feature selection methods are given based on data types and domain characteris...
Toward integrating feature selection algorithms for classification and clustering
- IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
, 2005
"... This paper introduces concepts and algorithms of feature selection, surveys existing feature selection algorithms for classification and clustering, groups and compares different algorithms with a categorizing framework based on search strategies, evaluation criteria, and data mining tasks, reveals ..."
Abstract
-
Cited by 267 (21 self)
- Add to MetaCart
(Show Context)
This paper introduces concepts and algorithms of feature selection, surveys existing feature selection algorithms for classification and clustering, groups and compares different algorithms with a categorizing framework based on search strategies, evaluation criteria, and data mining tasks, reveals unattempted combinations, and provides guidelines in selecting feature selection algorithms. With the categorizing framework, we continue our efforts toward building an integrated system for intelligent feature selection. A unifying platform is proposed as an intermediate step. An illustrative example is presented to show how existing feature selection algorithms can be integrated into a meta algorithm that can take advantage of individual algorithms. An added advantage of doing so is to help a user employ a suitable algorithm without knowing details of each algorithm. Some real-world applications are included to demonstrate the use of feature selection in data mining. We conclude this work by identifying trends and challenges of feature selection research and development.
Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey
- Data Mining and Knowledge Discovery
, 1997
"... Decision trees have proved to be valuable tools for the description, classification and generalization of data. Work on constructing decision trees from data exists in multiple disciplines such as statistics, pattern recognition, decision theory, signal processing, machine learning and artificial ne ..."
Abstract
-
Cited by 224 (1 self)
- Add to MetaCart
(Show Context)
Decision trees have proved to be valuable tools for the description, classification and generalization of data. Work on constructing decision trees from data exists in multiple disciplines such as statistics, pattern recognition, decision theory, signal processing, machine learning and artificial neural networks. Researchers in these disciplines, sometimes working on quite different problems, identified similar issues and heuristics for decision tree construction. This paper surveys existing work on decision tree construction, attempting to identify the important issues involved, directions the work has taken and the current state of the art. Keywords: classification, tree-structured classifiers, data compaction 1. Introduction Advances in data collection methods, storage and processing technology are providing a unique challenge and opportunity for automated data exploration techniques. Enormous amounts of data are being collected daily from major scientific projects e.g., Human Genome...
Greedy Attribute Selection
- In Proceedings of the Eleventh International Conference on Machine Learning
, 1994
"... Many real-world domains bless us with a wealth of attributes to use for learning. This blessing is often a curse: most inductive methods generalize worse given too many attributes than if given a good subset of those attributes. We examine this problem for two learning tasks taken from a calendar s ..."
Abstract
-
Cited by 218 (2 self)
- Add to MetaCart
(Show Context)
Many real-world domains bless us with a wealth of attributes to use for learning. This blessing is often a curse: most inductive methods generalize worse given too many attributes than if given a good subset of those attributes. We examine this problem for two learning tasks taken from a calendar scheduling domain. We show that ID3/C4.5 generalizes poorly on these tasks if allowed to use all available attributes. We examine five greedy hillclimbing procedures that search for attribute sets that generalize well with ID3/C4.5. Experiments suggest hillclimbing in attribute space can yield substantial improvements in generalization performance. We present a caching scheme that makes attribute hillclimbing more practical computationally. We also compare the results of hillclimbing in attribute space with FOCUS and RELIEF on the two tasks. 1 INTRODUCTION As machine learning is applied to real-world tasks, difficulties arise that do not occur in simpler textbook experiments. One such diffic...
Locally Weighted Learning for Control
, 1996
"... Lazy learning methods provide useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of complex systems. This paper surveys ways in which locally weighted learning, a type of lazy learning, has been applied by us to control tasks. We ex ..."
Abstract
-
Cited by 194 (18 self)
- Add to MetaCart
(Show Context)
Lazy learning methods provide useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of complex systems. This paper surveys ways in which locally weighted learning, a type of lazy learning, has been applied by us to control tasks. We explain various forms that control tasks can take, and how this affects the choice of learning paradigm. The discussion section explores the interesting impact that explicitly remembering all previous experiences has on the problem of learning to control.