## Pruning Algorithms for Rule Learning (1997)

Citations: 43 (15 self)

### BibTeX

```bibtex
@MISC{Fürnkranz97pruningalgorithms,
  author = {Johannes Fürnkranz},
  title  = {Pruning Algorithms for Rule Learning},
  year   = {1997}
}
```

### Abstract

Pre-pruning and post-pruning are two standard techniques for handling noise in decision tree learning. Pre-pruning deals with noise during learning, while post-pruning addresses the problem after an overfitting theory has been learned. We first review several adaptations of pre- and post-pruning techniques for separate-and-conquer rule learning algorithms and discuss some of their fundamental problems. The primary goal of this paper is to show how these problems can be solved with two new algorithms that combine and integrate pre- and post-pruning.
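As a concrete illustration of the separate-and-conquer setting the abstract refers to, the sketch below shows a generic covering loop with a crude pre-pruning cutoff. It is not a reconstruction of the paper's two algorithms; the rule representation, the helper names, and the `min_coverage` threshold are all invented for this example.

```python
# Illustrative separate-and-conquer (covering) sketch; all names and the
# min_coverage pre-pruning cutoff are invented for this example, not taken
# from the paper. A rule is a list of (attribute, value) tests; examples
# are dicts mapping attributes to values.

def covers(rule, example):
    """A rule covers an example if every (attribute, value) test matches."""
    return all(example.get(attr) == val for attr, val in rule)

def grow_rule(positives, negatives, min_coverage=2):
    """Greedily add tests until no negative examples are covered.
    min_coverage acts as a crude pre-pruning criterion: stop specializing
    once the rule would cover too few positives."""
    rule, pos, neg = [], list(positives), list(negatives)
    while neg:
        candidates = {(a, v) for ex in pos for a, v in ex.items()} - set(rule)
        if not candidates:
            break
        # choose the test that keeps the most positive examples covered
        best = max(candidates,
                   key=lambda t: sum(1 for ex in pos if covers([t], ex)))
        new_pos = [ex for ex in pos if covers([best], ex)]
        if len(new_pos) < min_coverage:  # pre-pruning: refuse to overfit
            break
        rule.append(best)
        pos = new_pos
        neg = [ex for ex in neg if covers([best], ex)]
    return rule, pos

def separate_and_conquer(positives, negatives):
    """Learn one rule at a time, then "separate" the covered positives."""
    theory, pos = [], list(positives)
    while pos:
        rule, covered = grow_rule(pos, negatives)
        if not rule or not covered:
            break  # could not build a useful rule for the remaining examples
        theory.append(rule)
        pos = [ex for ex in pos if ex not in covered]
    return theory
```

Post-pruning would instead let the rule-growing phase overfit and simplify the finished theory on held-out data afterwards; combining the two phases is the direction the abstract describes.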

### Citations

4934 | C4.5: Programs for Machine Learning - Quinlan - 1993 |
Citation Context: ...difference between the maximum and the minimum accuracy encountered), and the run-time of the algorithm. The results of C4.5, a decision tree learning system with extensive noise-handling capabilities (Quinlan 1993), are taken from the experiments performed in (Holte 1993) and are meant as an indicator of the performance of state-of-the-art decision tree learning algorithms on these data sets. A short look show...

3909 | Classification and Regression Trees - Breiman, Friedman, et al. - 1984 |

3354 | Induction of Decision Trees - Quinlan - 1986 |
Citation Context: ...order to increase its predictive accuracy on unseen data. Post-pruning approaches have been commonly used in the decision tree learning algorithms CART (Breiman, Friedman, Olshen, and Stone 1984), ID3 (Quinlan 1987) and ASSISTANT (Niblett and Bratko 1986). An overview and comparison of various approaches can be found in (Mingers 1989) and (Esposito, Malerba, and Semeraro 1993). 4.1 Reduced Error Pruning The mos...

1160 | Modeling by shortest data description - Rissanen - 1983 |
Citation Context: ...commonly used among them are: • Encoding Length Restriction: This heuristic used in the Inductive Logic Programming algorithm Foil (Quinlan 1990) is based on the Minimum Description Length principle (Rissanen 1978). It tries to avoid learning complicated rules that cover only a few examples by making sure that the number of bits that are needed to encode the current clause is less than the number of bits neede...
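The encoding-length restriction sketched in this context can be written down schematically: a clause is admitted only if encoding it costs fewer bits than simply pointing out the examples it covers. The exact encoding used in Foil differs; here `clause_bits` is assumed to be computed elsewhere by the caller.

```python
import math

def bits_to_indicate(covered, total):
    """Bits needed to identify which `covered` of `total` examples a
    clause explains: log2 of the number of such subsets."""
    return math.log2(math.comb(total, covered))

def passes_encoding_length_restriction(clause_bits, covered, total):
    """Schematic MDL-style check: keep the clause only if encoding it is
    cheaper than explicitly listing the examples it covers. clause_bits
    is a placeholder for the cost of encoding the clause's literals."""
    return clause_bits <= bits_to_indicate(covered, total)
```

A complicated clause (many bits) that covers only a few examples fails the check, which is exactly the overfitting case the heuristic guards against.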

976 | Fast effective rule induction - Cohen - 1995 |
Citation Context: ...is entirely independent from the basic learning algorithm. Other pruning and stopping criteria can further improve the performance and eliminate weaknesses. For instance, it has been pointed out in (Cohen 1995) that accuracy estimates for low-coverage rules will have a high variance and therefore I-REP is likely to stop prematurely and to over-generalize in domains that are susceptible to the Small Disjunc...

853 | Learning Logical Definitions from Relations - Quinlan - 1990 |
Citation Context: ...Learning, Inductive Logic Programming 1 Introduction Separate-and-conquer rule-learning systems have gained in popularity through the recent success of the Inductive Logic Programming algorithm Foil (Quinlan 1990). We will analyze different pruning methods for this type of inductive rule learning algorithm and discuss some of their problems. The main contribution of this paper are two new algorithms: Top-Down...

747 | The CN2 induction algorithm - Clark, Niblett - 1989 |

438 | Very simple classification rules perform well on most commonly used datasets - Holte - 1993 |
Citation Context: ...Data Sets We have also experimented with data sets from the UCI repository of Machine Learning databases that have previously been used to compare propositional learning algorithms. The appendix of (Holte 1993) gives a summary of the results achieved by various algorithms on some of the most commonly used data sets of the UCI repository and a short description of these sets. We selected 9 of them for our e...

324 | Rule induction with CN2: Some recent improvements - Clark, Boswell - 1991 |

317 | The MultiPurpose Incremental Learning System AQ15 and its Testing Applications to Three Medical Domains - Michalski, Mozetic, et al. - 1986 |

308 | Learning Efficient Classification Procedures and their Application to Chess Endgame - Quinlan |
Citation Context: ...chalski 1980; Michalski, Mozetic, Hong, and Lavrac 1986). CN2 (Clark and Niblett 1989; Clark and Boswell 1991) combined AQ's covering strategy with the greedy information-based test selection of ID3 (Quinlan 1983), which yielded a powerful rule learning algorithm. The term separate-and-conquer has been coined in (Pagallo and Haussler 1990) in the context of learning decision lists. Finally, separate-and-conqu...

221 | Learning from noisy examples - Angluin, Laird - 1988 |

202 | Boolean feature discovery in empirical learning - Pagallo, Haussler - 1990 |

166 | An empirical comparison of pruning methods for decision tree induction - Mingers - 1989 |
Citation Context: ...ree learning algorithms CART (Breiman, Friedman, Olshen, and Stone 1984), ID3 (Quinlan 1987) and ASSISTANT (Niblett and Bratko 1986). An overview and comparison of various approaches can be found in (Mingers 1989) and (Esposito, Malerba, and Semeraro 1993). 4.1 Reduced Error Pruning The most common among these methods is Reduced Error Pruning (REP). This simple algorithm has been adapted from decision tree le...
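The Reduced Error Pruning scheme named in this context — simplify as long as accuracy on a separate pruning set does not drop — can be sketched for a single rule as follows. The rule representation and accuracy measure are illustrative assumptions, not the algorithm as specified in the paper.

```python
def prune_rule(rule, prune_pos, prune_neg):
    """REP-style sketch for one rule: greedily drop the final test while
    accuracy on a held-out pruning set does not decrease.
    A rule is a list of (attribute, value) tests; examples are dicts."""
    def accuracy(r):
        # classify "covered" as positive, "not covered" as negative
        tp = sum(1 for ex in prune_pos if all(ex.get(a) == v for a, v in r))
        fp = sum(1 for ex in prune_neg if all(ex.get(a) == v for a, v in r))
        tn = len(prune_neg) - fp
        return (tp + tn) / (len(prune_pos) + len(prune_neg))

    best = list(rule)
    while len(best) > 1 and accuracy(best[:-1]) >= accuracy(best):
        best = best[:-1]  # shorter rule is at least as accurate: prune
    return best
```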

143 | Concept learning and the problem of small disjuncts - Holte, Acker, et al. - 1989 |

122 | Incremental reduced error pruning - Fürnkranz, Widmer - 1994 |

119 | Overfitting avoidance as bias - Schaffer - 1993 |

71 | The application of inductive logic programming to finite element mesh design - Dolšak, Muggleton - 1992 |

68 | Learning nonrecursive definitions of relations with LINUS - Lavrač, Džeroski, et al. - 1991 |

62 | An Investigation of Noise-Tolerant Relational Concept Learning Algorithms - Brunk, Pazzani - 1991 |

61 | Induction of logic programs: Foil and related systems - Quinlan, Cameron-Jones - 1995 |

55 | Learning Decision Rules in Noisy Domains - Niblett, Bratko - 1987 |

52 | An experimental comparison of human and machine learning formalisms - Muggleton, Bain, et al. - 1989 |

51 | Efficient Pruning Methods for Separate-and-Conquer Rule Learning Systems (IJCAI-93) - Cohen - 1993 |
Citation Context: ...ns is commonly evaluated on a separate set of training examples that have not been seen during learning. Post-pruning algorithms include Reduced Error Pruning (REP) (Brunk and Pazzani 1991) and Grow (Cohen 1993). Both have been shown to be very effective in noise-handling. However, they are also inefficient, because they waste time by learning an overfitting concept description and subsequently pruning a si...

33 | On overfitting avoidance as bias - Wolpert |

30 | Reduced Complexity Rule Induction - Weiss, Indurkhya - 1991 |

24 | Decision tree pruning as a search in the state space - Esposito, Malerba, et al. - 1993 |

23 | Fossil: A robust relational learner - Fürnkranz - 1993 |

13 | The minimum description length principle and categorical theories - Quinlan - 1994 |
Citation Context: ...design is the selection of an optimal number of finite elements on the edges of the structure. Several authors have tried ILP methods on this problem (Dolšak and Muggleton 1992; Džeroski and Bratko 1992; Quinlan 1994). The available background knowledge consists of an attribute-based description of the edges and of topological relations between the edges. The setup of our experiments was the same as in (Quinlan 1...

13 | Handling noise in Inductive Logic Programming - Džeroski, Bratko |

12 | Pattern Recognition as Rule-Guided Inference - Michalski - 1980 |
Citation Context: ...algorithms try to construct rules with the so-called separate-and-conquer strategy. This method has its roots in the early days of Machine Learning in the covering algorithm of the famous AQ family (Michalski 1980; Michalski, Mozetic, Hong, and Lavrac 1986). CN2 (Clark and Niblett 1989; Clark and Boswell 1991) combined AQ's covering strategy with the greedy information-based test selection of ID3 (Quinlan 1983...

7 | Top-Down Pruning in Relational Learning - Fürnkranz - 1994 |

4 | The complexity of Cohen's Grow method (unpublished manuscript) - Cameron-Jones - 1994 |
Citation Context: ...xperiments in (Cohen 1993) show. However, the asymptotic time complexity of the Grow post-pruning method is still above the complexity of the initial rule growing phase, as has recently been shown in (Cameron-Jones 1994). The explanation for the speedup that can be gained with the top-down strategy is that it starts from the empty theory, which in many noisy domains is much closer to the final theory than the overfi...

4 | Planung und statistische Auswertung von Experimenten (8th ed.) - Mittenecker - 1977 |
Citation Context: ...outperformed (5%, sometimes 1%) all other algorithms. 9 We have used a range test which can be used to quickly determine significant differences between mean values for small (N < 20) sample sizes (Mittenecker 1977). For N = 10 the value of L = |x̄1 − x̄2| / (R1 + R2) has to be > 0.152 for a significance level of 5% and > 0.210 for a significance level of 1%. (The x̄i are mean values and the Ri are ranges. Both can be fo...
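The range test quoted in this context compares the difference of the two means against the sum of the two sample ranges. A minimal sketch, with the statistic and the N = 10 critical values (0.152 at the 5% level, 0.210 at 1%) taken as reconstructed from the context above rather than re-derived:

```python
def range_test_statistic(sample1, sample2):
    """L = |mean1 - mean2| / (range1 + range2). For N = 10 the context
    quotes critical values of 0.152 (5% level) and 0.210 (1% level)."""
    m1 = sum(sample1) / len(sample1)
    m2 = sum(sample2) / len(sample2)
    r1 = max(sample1) - min(sample1)
    r2 = max(sample2) - min(sample2)
    return abs(m1 - m2) / (r1 + r2)
```

Given accuracy results of two algorithms over the same runs, a value of L above the quoted threshold would indicate a significant difference at that level.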

4 | First order learning, zeroth order data - Cameron-Jones, Quinlan - 1993 |

4 | A tight integration of pruning and learning (extended abstract) - Fürnkranz - 1995 |

2 | The application of Inductive Logic Programming to finite-element mesh design - Dolšak, B. - 1992 |

2 | Handling noise in inductive logic programming - unknown authors - 1991 |

2 | Fossil: A robust relational learner - Fürnkranz, J. - 1994 |

2 | Top-down pruning in relational learning - Fürnkranz, J. - 1994 |

2 | Incremental Reduced Error Pruning - Fürnkranz, J. - 1994 |

1 | Finite element mesh design: An engineering domain for ILP application - Dolšak, B. - 1994 |

1 | A tight integration of pruning and learning - Fürnkranz, J. - 1995 |

1 | A tight integration of pruning and learning (extended abstract) - Fürnkranz, J. - 1995 |

1 | The complexity of batch approaches to reduced error rule set induction - Cameron-Jones - 1996 |

1 | A tight integration of pruning and learning (Technical Report OEFAI-TR-95-03) - Fürnkranz - 1995 |

1 | Separate-and-conquer rule learning (Technical Report OEFAI-TR-96-25) - Fürnkranz - 1996 |