Irrelevant Features and the Subset Selection Problem
- Machine Learning: Proceedings of the Eleventh International, 1994
Cited by 515 (22 self)
We address the problem of finding a subset of features that allows a supervised induction algorithm to induce small high-accuracy concepts. We examine notions of relevance and irrelevance, and show that the definitions used in the machine learning literature do not adequately partition the features into useful categories of relevance. We present definitions for irrelevance and for two degrees of relevance. These definitions improve our understanding of the behavior of previous subset selection algorithms, and help define the subset of features that should be sought. The features selected should depend not only on the features and the target concept, but also on the induction algorithm. We describe a method for feature subset selection using cross-validation that is applicable to any induction algorithm, and discuss experiments conducted with ID3 and C4.5 on artificial and real datasets.
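The idea of selecting features by cross-validating the induction algorithm itself can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say which search strategy is used, so greedy forward selection is swapped in, and a toy lookup-table learner (`table_learner`, a hypothetical name) stands in for ID3 or C4.5. Any function with the same `learner(rows, labels) -> predict` shape could be passed instead.

```python
from collections import Counter, defaultdict
from itertools import product


def table_learner(rows, labels):
    """Toy induction algorithm (assumption): majority label per feature tuple."""
    votes = defaultdict(Counter)
    for row, label in zip(rows, labels):
        votes[row][label] += 1
    default = Counter(labels).most_common(1)[0][0]
    table = {key: c.most_common(1)[0][0] for key, c in votes.items()}
    return lambda row: table.get(row, default)


def project(rows, subset):
    """Keep only the feature columns listed in `subset`."""
    return [tuple(row[i] for i in subset) for row in rows]


def cv_accuracy(rows, labels, subset, learner, k):
    """k-fold cross-validated accuracy of `learner` restricted to `subset`."""
    data = project(rows, subset)
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(data)) if i % k != fold]
        test = [i for i in range(len(data)) if i % k == fold]
        model = learner([data[i] for i in train], [labels[i] for i in train])
        correct += sum(model(data[i]) == labels[i] for i in test)
    return correct / len(data)


def forward_select(rows, labels, learner, k=3):
    """Greedy forward search: add a feature only if it raises the CV estimate."""
    selected = []
    best = cv_accuracy(rows, labels, selected, learner, k)
    while True:
        candidates = [f for f in range(len(rows[0])) if f not in selected]
        scored = [(cv_accuracy(rows, labels, selected + [f], learner, k), f)
                  for f in candidates]
        if not scored:
            break
        # Ties are broken in favour of the lowest feature index.
        score, feature = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best:
            break
        selected.append(feature)
        best = score
    return sorted(selected), best


# Synthetic data: four binary features; only features 0 and 1 determine the label.
rows = list(product([0, 1], repeat=4)) * 3
labels = [2 * r[0] + r[1] for r in rows]
selected, score = forward_select(rows, labels, table_learner)
print(selected, score)
```

Because the subset is scored by running the learner, the chosen features depend on the induction algorithm as well as on the data, which is the point the abstract makes. A greedy search of this kind can miss features that are only useful jointly (for example, an XOR target), so it should be read as one possible search strategy, not a complete one.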
Inferring Dependencies from Relations: A Conceptual Clustering Approach
- Computational Intelligence, 1999
Cited by 3 (1 self)
In this paper we consider two related types of data dependencies that can hold in a relation: conjunctive implication rules between attribute-value pairs, and functional dependencies. We present a conceptual clustering approach that can be used, with some small modifications, for inferring a cover for both types of dependencies. The approach consists of two steps. First, a particular clustered representation of the relation, called a concept (or Galois) lattice, is built; then a cover is extracted from that lattice. The main emphasis of this paper is on the second step. We study the computational complexity of the proposed approach and present an experimental comparison with other methods that confirms its validity. The results of the experiments show that our algorithm for extracting implication rules from concept lattices clearly outperforms an earlier algorithm, and suggest that the overall lattice-based approach to inferring functional dependencies from relations can be seen as an alternative to traditional methods.
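To make the two steps concrete, here is a minimal sketch of the underlying notions: the Galois connection between tuples and attribute-value pairs, a naive enumeration of the concepts, and a check of whether a conjunctive implication holds. It is not the paper's cover-extraction algorithm; the function names and the tiny example relation are hypothetical, and the enumeration is the simple "close under intersection of object intents" construction.

```python
def intent(context, objects):
    """Attribute-value pairs shared by every object (all pairs if none given)."""
    result = set().union(*context.values())
    for obj in objects:
        result &= context[obj]
    return frozenset(result)


def extent(context, attrs):
    """Objects that have every attribute-value pair in `attrs`."""
    return frozenset(o for o, a in context.items() if attrs <= a)


def closure(context, attrs):
    """Galois closure: everything implied by `attrs` in this relation."""
    return intent(context, extent(context, attrs))


def concepts(context):
    """All (extent, intent) pairs, built by intersecting object intents."""
    intents = {frozenset(set().union(*context.values()))}
    for attrs in context.values():
        # The comprehension is fully evaluated before the set is updated.
        intents |= {frozenset(attrs) & known for known in intents}
    pairs = [(extent(context, i), i) for i in intents]
    return sorted(pairs, key=lambda c: (len(c[1]), sorted(c[1])))


def implication_holds(context, premise, conclusion):
    """premise -> conclusion holds iff conclusion lies in closure(premise)."""
    return set(conclusion) <= closure(context, set(premise))


# Hypothetical relation, one set of attribute-value pairs per tuple.
context = {
    "t1": {"color=red", "shape=round", "size=small"},
    "t2": {"color=red", "shape=round", "size=large"},
    "t3": {"color=blue", "shape=square", "size=small"},
}

print(len(concepts(context)))
print(implication_holds(context, {"color=red"}, {"shape=round"}))
print(implication_holds(context, {"size=small"}, {"color=red"}))
```

In this example every tuple with `color=red` also has `shape=round`, so that implication holds, while `size=small` does not determine the colour. A cover, in the sense of the abstract, would be a reduced set of such implications from which all the others follow; computing one efficiently is the part this sketch leaves out.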

