Pessimistic Decision Tree Pruning Based on Tree Size (1997)
Abstract:
In this work we develop a new criterion for pessimistic decision tree pruning. Our method is theoretically sound and is based on theoretical concepts such as uniform convergence and the Vapnik-Chervonenkis dimension. We show that our criterion is very well motivated from the theory side and performs very well in practice. The accuracy of the new criterion is comparable to that of the current method used in C4.5.

1 Introduction

The phenomenon of overfitting the data is well known in machine learning. It refers to the case in which the learned hypothesis is fitted so closely to the training examples that its generalization ability suffers. Overfitting usually occurs when the class of hypotheses used is as complex as the given training sample. For this reason we would like, in many cases, to limit the hypothesis we generate to be "less complex" than the training sample. In decision trees the overfitting phenomenon can occur when the size of the tree is too lar...
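
To make the idea of size-based pessimistic pruning concrete, here is a minimal Python sketch. It is not the criterion developed in the paper, whose exact form is not given in this excerpt. The `Node` class, the `pessimistic_error` function, the square-root penalty, and the constant `c` are all assumptions chosen only for illustration. The sketch replaces a subtree with a leaf whenever the leaf's penalized error estimate is no worse than the subtree's, where the penalty grows with the number of nodes.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical tree node; not a structure taken from the paper."""
    n: int            # number of training examples reaching this node (must be > 0)
    leaf_errors: int  # training errors if this node were turned into a leaf
    children: List["Node"] = field(default_factory=list)


def size(node: Node) -> int:
    """Number of nodes in the subtree rooted at `node`."""
    return 1 + sum(size(child) for child in node.children)


def subtree_errors(node: Node) -> int:
    """Training errors made by the subtree, summed over its leaves."""
    if not node.children:
        return node.leaf_errors
    return sum(subtree_errors(child) for child in node.children)


def pessimistic_error(errors: int, n: int, tree_size: int, c: float = 1.0) -> float:
    """Empirical error rate plus an ASSUMED size-dependent penalty.

    The sqrt(tree_size / n) term only mimics the general shape of a
    uniform-convergence bound; it is not the paper's formula.
    """
    return errors / n + c * math.sqrt(tree_size / n)


def prune(node: Node, c: float = 1.0) -> Node:
    """Bottom-up pruning: collapse a subtree into a leaf when that does not
    increase the pessimistic error estimate."""
    node.children = [prune(child, c) for child in node.children]
    if not node.children:
        return node
    as_leaf = pessimistic_error(node.leaf_errors, node.n, 1, c)
    as_tree = pessimistic_error(subtree_errors(node), node.n, size(node), c)
    if as_leaf <= as_tree:
        node.children = []  # the split does not pay for its added size
    return node
```

With these assumed numbers, a split that reduces training errors only from 10 to 9 out of 100 examples is pruned, while a split that reduces them from 40 to 0 is kept. The larger the tree, the more error reduction it must deliver to justify its size.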

