Results 1–2 of 2
On the Boosting Ability of Top-Down Decision Tree Learning Algorithms
 In Proceedings of the Twenty-Eighth Annual ACM Symposium on the Theory of Computing, 1996
Abstract

Cited by 89 (6 self)
We analyze the performance of top-down algorithms for decision tree learning, such as those employed by the widely used C4.5 and CART software packages. Our main result is a proof that such algorithms are boosting algorithms. By this we mean that if the functions used to label the internal nodes of the decision tree can weakly approximate the unknown target function, then the top-down algorithms we study will amplify this weak advantage to build a tree achieving any desired level of accuracy. The bounds we obtain for this amplification show an interesting dependence on the splitting criterion function G used by the top-down algorithm. More precisely, if the functions used to label the internal nodes have error 1/2 − γ as approximations to the target function, then for the splitting criteria used by CART and C4.5, trees of size (1/ε)^O(1/(γ²ε²)) and (1/ε)^O(log(1/ε)/γ²) (respectively) suffice to drive the error below ε. Thus, small constant advantage over...
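The splitting criterion G the abstract refers to is a function of the fraction q of positive examples reaching a node; CART's criterion is the Gini index and C4.5's is binary entropy. A minimal sketch of these criteria and of how a top-down grower scores a candidate split (function names are ours, not from the paper):

```python
import math

def gini(q):
    # CART's Gini impurity for a binary label: G(q) = 2q(1 - q)
    return 2.0 * q * (1.0 - q)

def entropy(q):
    # C4.5's binary entropy: G(q) = -q log2 q - (1 - q) log2 (1 - q)
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1.0 - q) * math.log2(1.0 - q)

def split_score(G, q_left, q_right, w_left):
    # Top-down learners score a candidate split by the weighted average
    # impurity of the two children; the split minimizing this is chosen.
    return w_left * G(q_left) + (1.0 - w_left) * G(q_right)
```

Both criteria peak at q = 1/2 (a maximally mixed node) and vanish at pure nodes; the paper's size bounds arise from how quickly each G decreases as splits with a weak advantage drive the children's q away from 1/2.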
Applying the Weak Learning Framework to Understand and Improve C4.5
 In Proceedings of the Thirteenth International Conference on Machine Learning
, 1996
Abstract

Cited by 47 (5 self)
this paper is to push this interaction further in light of these recent developments. In particular, we perform experiments suggested by the formal results for AdaBoost and C4.5 within the weak learning framework. We concentrate on two particularly intriguing issues. First, the theoretical boosting results for top-down decision tree algorithms such as C4.5 [12] suggest that a new splitting criterion may result in trees that are smaller and more accurate than those obtained using the usual information gain. We confirm this suggestion experimentally. Second, a superficial interpretation of the theoretical results suggests that AdaBoost should vastly outperform C4.5. This is not the case in practice, and we argue through experimental results that the theory must be understood in terms of a measure of a boosting algorithm's behavior called its advantage sequence. We compare the advantage sequences for...