Very simple classification rules perform well on most commonly used datasets
- Machine Learning, 1993
Cited by 386 (9 self)
The classification rules induced by machine learning systems are judged by two criteria: their classification accuracy on an independent test set (henceforth "accuracy"), and their complexity. The relationship between these two criteria is, of course, of keen interest to the machine learning community. There are in the literature some indications that very simple rules may achieve surprisingly high accuracy on many datasets. For example, Rendell occasionally remarks that many real world datasets have "few peaks (often just one)" and so are "easy to learn" (Rendell & Seshu, 1990, p.256). Similarly, Shavlik et al. (1991) report that, with certain qualifications, "the accuracy of the perceptron is hardly distinguishable from the more complicated learning algorithms" (p.134). Further evidence is provided by studies of pruning methods (e.g. Buntine & Niblett, 1992; Clark & Niblett, 1989; Mingers, 1989), where accuracy is rarely seen to decrease as pruning becomes more severe (for example, see Table 1). This is so even when rules are pruned to the extreme, as happened with the "Err-comp" pruning method in Mingers (1989). This method produced the most accurate decision trees, and in four of the five domains studied these trees had only 2 or 3 leaves.
Very Simple Classification Rules Perform Well
- Machine Learning, 1993
This paper reports the results of experiments measuring the performance of very simple rules on the datasets commonly used in machine learning research. The specific kind of rules examined in this paper, called "1-rules", are rules that classify an object on the basis of a single attribute (i.e. they are 1-level decision trees). Section 2 describes a system, called 1R, whose input is a set of training examples and whose output is a 1-rule. In an experimental comparison involving 16 commonly used datasets, 1R's rules are only a few percentage points less accurate, on most of the datasets, than the decision trees produced by C4 (Quinlan, 1986). Section 3 examines possible improvements to 1R's criterion for selecting rules. It defines an upper bound, called 1R*, on the accuracy that such improvements can produce. 1R* turns out to be very similar to the accuracy of C4's decision trees. This result has two implications. First, it indicates that simple modifications to 1R might produce a system competitive with C4, although more fundamental modifications are required in order to outperform C4. Second, this result suggests that it may be possible to use the performance of 1-rules to predict the performance of the more complex hypotheses produced by standard learning systems.
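As an illustration of what a 1-rule learner might look like, here is a minimal Python sketch. It is not the 1R system described in the paper: it assumes every attribute is nominal, uses training-set accuracy as the selection criterion, and ignores continuous attributes and missing values. The names `learn_one_rule` and `predict`, and the toy weather data, are hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict

def learn_one_rule(rows, labels):
    """Pick the single attribute whose value-to-class mapping is most
    accurate on the training data (simplified, nominal attributes only)."""
    # Fallback class for attribute values never seen in training.
    default_class = Counter(labels).most_common(1)[0][0]
    best = None
    for attr in range(len(rows[0])):
        # Count class frequencies for each value of this attribute.
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[attr]][label] += 1
        # Each value predicts its majority class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        correct = sum(c.most_common(1)[0][1] for c in counts.values())
        if best is None or correct > best[0]:
            best = (correct, attr, rule)
    _, attr, rule = best
    return attr, rule, default_class

def predict(model, row):
    """Classify a row by looking at the one selected attribute."""
    attr, rule, default_class = model
    return rule.get(row[attr], default_class)

# Toy data (hypothetical): columns are (outlook, temperature).
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"),
        ("rainy", "mild"), ("overcast", "hot")]
labels = ["no", "no", "yes", "yes", "yes"]

model = learn_one_rule(rows, labels)
print(model[0], predict(model, ("sunny", "mild")))
```

On this toy data the outlook column classifies all five training rows correctly, so it is the attribute selected; an unseen outlook value falls back to the overall majority class.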

