## Decision Trees For Classification: A Review And Some New Results

Citations: 1 (0 self)

### BibTeX

```bibtex
@MISC{Kothari_decisiontrees,
  author = {Ravi Kothari and Ming Dong},
  title  = {Decision Trees For Classification: A Review And Some New Results},
  year   = {}
}
```

### Abstract

Introduction: Top-down induction of decision trees is a simple and powerful method of inferring classification rules from a set of labeled examples [1]. Each node of the tree implements a decision rule that splits the examples into two or more partitions. New nodes are created to handle each of the partitions, and a node is considered terminal (a leaf node) based on a stopping criterion. This standard approach to decision tree construction thus corresponds to a top-down greedy algorithm that makes locally optimal decisions at each node. Decision trees have two advantages over many other classification methods. The first is that the sequence of decisions made from the root node to the eventual labeling of a test input is easy to follow. This gives them an intuitive appeal that other methods of classification, such as...
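The greedy top-down procedure described in the abstract can be sketched in a few lines of code. This is a minimal illustrative implementation, not the paper's: the Gini impurity criterion, the axis-aligned threshold splits, the depth limit, and the toy dataset are all assumptions made for the example.

```python
# Minimal sketch of top-down greedy decision tree induction: each node picks
# the locally optimal axis-aligned split, recursing until a stopping
# criterion (purity or maximum depth) makes the node a leaf.
from collections import Counter

def gini_impurity(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Greedy search over (attribute, threshold) minimizing weighted impurity."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            left = [yi for xi, yi in zip(X, y) if xi[j] <= t]
            right = [yi for xi, yi in zip(X, y) if xi[j] > t]
            if not left or not right:
                continue
            score = (len(left) * gini_impurity(left)
                     + len(right) * gini_impurity(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best

def grow(X, y, depth=0, max_depth=3):
    if len(set(y)) == 1 or depth == max_depth:   # stopping criterion -> leaf
        return Counter(y).most_common(1)[0][0]
    split = best_split(X, y)
    if split is None:
        return Counter(y).most_common(1)[0][0]
    _, j, t = split
    L = [(xi, yi) for xi, yi in zip(X, y) if xi[j] <= t]
    R = [(xi, yi) for xi, yi in zip(X, y) if xi[j] > t]
    return (j, t,
            grow([x for x, _ in L], [c for _, c in L], depth + 1, max_depth),
            grow([x for x, _ in R], [c for _, c in R], depth + 1, max_depth))

def predict(node, x):
    """Follow the decision rules from the root to a leaf label."""
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if x[j] <= t else right
    return node

X = [[1.0], [2.0], [10.0], [11.0]]
y = ["a", "a", "b", "b"]
tree = grow(X, y)
print(predict(tree, [1.5]), predict(tree, [10.5]))  # -> a b
```

Because each split is chosen locally, the resulting tree is not guaranteed to be globally optimal, which is what motivates the look-ahead and pruning methods surveyed below.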

### Citations

7342 | Genetic Algorithms and - Goldberg, Holland - 1988 |

4934 | C4.5: Programs for Machine Learning - Quinlan - 1993 |

3909 | Classification and Regression Trees - Breiman, Friedman, et al. - 1984 |

Citation Context: ...omparing the obtained values of the frequency of a class because of the split to the a priori frequency of the class [4,5]. More specifically,

$$\chi^2 = \sum_{k=1}^{C} \sum_{v=1}^{V} \frac{\left( N^{(v)}_{\omega_k} - \tilde{N}^{(v)}_{\omega_k} \right)^2}{\tilde{N}^{(v)}_{\omega_k}} \qquad (3)$$

where $\tilde{N}^{(v)}_{\omega_k} = (N^{(v)}/N)\,N_{\omega_k}$ denotes the a priori frequency. Clearly, a larger value of $\chi^2$ indicates that the split is more homogeneous, i.e., has a greater frequency of instances from a partic...
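The chi-squared split statistic described in the context above can be computed from a table of per-partition class counts. The helper below is a hypothetical illustration (function and variable names are mine): observed counts $N^{(v)}_{\omega_k}$ are compared with the a priori frequencies $(N^{(v)}/N)\,N_{\omega_k}$.

```python
# Sketch of the chi-squared split statistic (Eq. 3 in the cited context):
# counts[v][k] = number of class-k instances in partition v of the split.
def chi_squared_split(counts):
    N = sum(sum(row) for row in counts)
    class_totals = [sum(row[k] for row in counts) for k in range(len(counts[0]))]
    chi2 = 0.0
    for row in counts:
        N_v = sum(row)
        for k, observed in enumerate(row):
            expected = (N_v / N) * class_totals[k]   # a priori frequency
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# A perfectly homogeneous split scores higher than one matching the priors.
print(chi_squared_split([[5, 0], [0, 5]]))   # -> 10.0
print(chi_squared_split([[3, 3], [2, 2]]))   # -> 0.0
```

Note the assumption that every class occurs somewhere in the node (so no expected frequency is zero); a production version would guard that division.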

3354 | Induction of Decision Trees - Quinlan - 1986 |

Citation Context: ...al is to construct a decision tree which upon construction can "reliably" assign a test input to a specific class. At a particular node in the tree, let there be $N$ training examples represented by $(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(N)}, y^{(N)})$, where $x^{(i)}$ is a vector of $n$ attributes and $y^{(i)} \in \Omega$ is the class label corresponding to the input $x^{(i)}$. Of these $N$ examples, $N_{\omega_k}$ belong to cla...

2868 | UCI Repository of Machine Learning Databases - Merz, Murphy - 1996 |

2534 | An Introduction to the Bootstrap - Efron, Tibshirani - 1993 |

1966 | Genetic Algorithms + Data Structure = Evolution Programs - Michalewicz - 1992 |

1160 | Modeling by shortest data description - Rissanen - 1983 |

805 | Bootstrap methods: another look at the jackknife - Efron - 1979 |

721 | Cross-Validatory Choices and Assessment of Statistical Prediction (with Discussion) - Stone - 1974 |

443 | Computer and Robot Vision - Haralick, Shapiro - 1992 |

311 | Generating Fuzzy Rules by Learning from Examples - Wang, Mendel - 1992 |

167 | An Empirical Comparison of Selection Measures for Decision Tree Induction - Mingers - 1989 |

Citation Context: ...rror complexity measure is removed. 3.2 Minimum Error Based Pruning: minimum error based pruning [25] is based on the following equation for computing the expected error rate,

$$E_{me} = \frac{N - N_{\omega_k} + C - 1}{N + C} \qquad (6)$$

where $N$ is the number of instances at a given node, and $N_{\omega_k}$ is the number of instances of the dominant class. The expected error rate of the unpruned tree is computed by computing the expected erro...
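The expected error rate of Eq. (6) drives a simple pruning test: collapse a node to a leaf when its own expected error is no worse than the weighted expected error of its children. The sketch below is illustrative; the names and the example counts are assumptions, not taken from the paper.

```python
# Sketch of minimum error based pruning (Eq. 6 in the cited context).
def expected_error(N, N_dominant, C):
    """E_me = (N - N_k + C - 1) / (N + C) for a node with N instances,
    N_dominant of them in the majority class, and C classes."""
    return (N - N_dominant + C - 1) / (N + C)

C = 2
# Expected error if the node is collapsed to a leaf (20 instances, 15 majority):
parent = expected_error(20, 15, C)
# Expected error of keeping the split, weighted by how many instances
# reach each child (10 each):
children = (10 / 20) * expected_error(10, 9, C) \
         + (10 / 20) * expected_error(10, 6, C)
print(parent, children, "prune" if parent <= children else "keep")
```

With these illustrative counts the pruned leaf has lower expected error than the subtree, so the split would be removed.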

166 | An empirical comparison of pruning methods for decision tree induction - Mingers - 1989 |

Citation Context: ...nces of class $\omega_j$ and class $\omega_k$ respectively in the $v$-th partition, and $f(\cdot)$ is an indicator function which is 1 if $\|x^{(l)} - x^{(m)}\| \le r$. The overall class co-occurrence matrix is simply,

$$A = \sum_{v=1}^{V} A^{(v)} \qquad (8)$$

In the preferred case $A$ would become strongly diagonal. With increasing confusion $A$ becomes less and less diagonally dominant. The classifiability can then be expressed as, $L = \sum_{i=1}^{C} A_{ii} \,\big/\, \sum_{i=1}^{C} \sum\ldots$

105 |
Learning internal representations by back-propagating errors
- Rumelhart, Hinton, et al.
- 1986
(Show Context)
Citation Context ... a decision tree which upon construction can \reliably" assign a test input to a specic class. At a particular node in the tree, let there be N training examples represented by, (x (1) ; y (1) );=-= (x (2)-=- ; y (2) ); : : : ; (x (N) ; y (N) ) where, x (i) is a vector of n attributes and y (i) 2 is the class label corresponding to the input x (i) . Of these N examples, N!k belong to class ! k . P k N!k =... |

97 | A further comparison of splitting rules for decision-tree induction - Buntine, Niblett - 1992 |

Citation Context: ...ithin a circular neighborhood of radius $r$ of an instance of class $\omega_j$, i.e.,

$$A^{(v)}_{jk} = \sum_{l=1}^{N^{(v)}_{\omega_j}} \sum_{m=1}^{N^{(v)}_{\omega_k}} f\!\left(x^{(l)}, x^{(m)}\right) \qquad (7)$$

[running footer: kothariLNPR, submitted to World Scientific on June 30, 2000] where $x^{(l)}$ and $x^{(m)}$ are instances of class $\omega_j$ and class $\omega_k$ respectively in the $v$-th partition, and $f(\cdot)$ is an indicator function which is 1 if $\|x^{(l)} - x^{(m)}\| \le r$. The overall class co-occurrence mat...
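The class co-occurrence construction of Eqs. (7)-(8) in the contexts above, and the classifiability ratio that the truncated Eq. (9) appears to define (diagonal mass of $A$ over total mass), can be sketched as follows. The 1-D points, the radius, and all names are illustrative assumptions.

```python
# Sketch of the class co-occurrence matrix (Eqs. 7-8): A[j][k] counts
# pairs of class-j / class-k instances lying within distance r of each
# other; a strongly diagonal A means the classes are well separated.
def cooccurrence(points, labels, classes, r):
    A = [[0] * len(classes) for _ in classes]
    for j, cj in enumerate(classes):
        for k, ck in enumerate(classes):
            A[j][k] = sum(
                1
                for p, lp in zip(points, labels) if lp == cj
                for q, lq in zip(points, labels) if lq == ck and abs(p - q) <= r
            )
    return A

def classifiability(A):
    """Diagonal mass over total mass of A (the apparent form of Eq. 9)."""
    diag = sum(A[i][i] for i in range(len(A)))
    total = sum(sum(row) for row in A)
    return diag / total

pts = [0.0, 0.1, 5.0, 5.1]
labs = ["a", "a", "b", "b"]
A = cooccurrence(pts, labs, ["a", "b"], r=0.5)
print(A, classifiability(A))  # -> [[4, 0], [0, 4]] 1.0
```

Here the two classes are far apart relative to $r$, so $A$ is purely diagonal and the classifiability is 1; overlapping classes would push mass off the diagonal and lower the ratio.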

96 | The attribute selection problem in decision tree generation - Fayyad, Irani - 1992 |

85 | Oversearching and layered search in empirical learning - Quinlan, Cameron-Jones - 1995 |

55 | Learning Decision Rules in Noisy Domains - Niblett, Bratko - 1987 |

52 | Lookahead and pathology in decision tree induction - Murthy, Salzberg - 1995 |

43 | Pruning algorithms for rule learning - Fürnkranz - 1997 |

32 | An exact probability metric for decision tree splitting and stopping - Martin - 1997 |

18 | Experience in the use of an inductive system in knowledge engineering - Hart - 1984 |

Citation Context: ...ove approximates the $\chi^2$ distribution. 2.3 GINI Index of Diversity Based Node Splitting: the GINI index of diversity is based on,

$$D(x_j) = \frac{1}{N}\left[ \sum_{k=1}^{C} \sum_{v=1}^{V} \frac{\left( N^{(v)}_{\omega_k} \right)^2}{N^{(v)}} - \sum_{k=1}^{C} \frac{N_{\omega_k}^2}{N} \right] \qquad (4)$$

Typically we would like a node to be "pure", i.e. have instances of a single class. Similar to the decrease in entropy (or gain in information) used in the information gain based node splitting metho...
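The GINI-based split measure of Eq. (4) in the context above can be evaluated from the same per-partition count table used for the chi-squared statistic. This is a hypothetical helper with names of my choosing; `counts[v][k]` holds $N^{(v)}_{\omega_k}$.

```python
# Sketch of the GINI index of diversity split measure (Eq. 4 in the cited
# context): D = (1/N) [ sum_{k,v} (N_vk)^2 / N_v  -  sum_k (N_k)^2 / N ].
def gini_gain(counts):
    N = sum(sum(row) for row in counts)
    class_totals = [sum(row[k] for row in counts) for k in range(len(counts[0]))]
    split_term = sum(row[k] ** 2 / sum(row)
                     for row in counts for k in range(len(row)))
    prior_term = sum(t ** 2 / N for t in class_totals)
    return (split_term - prior_term) / N

# A pure split scores higher than a split that merely mirrors the priors,
# mirroring the behaviour of information gain.
print(gini_gain([[5, 0], [0, 5]]))   # -> 0.5
print(gini_gain([[3, 3], [2, 2]]))   # -> 0.0
```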

17 | A combined non-parametric approach to feature selection and binary decision tree design - Rounds - 1980 |

11 | S.: Improving Greedy Algorithms by Lookahead-Search - Sarkar, Chakrabarti, et al. - 1994 |

7 | An investigation on the conditions of pruning an induced decision tree - Kim, Koehler - 1994 |

7 | Continuous ID3 Algorithm with Fuzzy entropy measures - Cios, M - 1996 |

5 | Heuristic search for model structure - Elder - 1995 |

5 | An iterative growing and pruning algorithm for classification tree design - Gelfand, Ravishankar, et al. - 1991 |

4 | On the optimization of fuzzy decision trees. Fuzzy Sets and Systems - Wang, Chen, et al. |

Citation Context: ...$\big/ \sum_j A_{ij}$ (9). Assuming a linear discriminant is used as the decision rule at each node, the algorithm for texture based look-ahead for decision tree induction can then be based on maximizing,

$$J = G + \lambda L \qquad (10)$$

where $G$ is similar to that defined in Eq. (1), with the exception that, since a linear discriminant is proposed, $G$ is not a function of a particular attribute; $\lambda$ in Eq. (10) is a Lagrange parameter and...

4 | A survey of decision tree classifier methodology - Safavian, Landgrebe - 1991 |

3 | Expert systems - experiments with rule induction - Mingers - 1986 |

Citation Context: ...sub-trees which have minimal impact on the measure. 3.1 Error Complexity Based Pruning: one popular method of pruning is the so-called error complexity based pruning [3]. It is based on,

$$E_{er} = \frac{\tilde{R} - R}{L} \qquad (5)$$

where $R$ denotes the error rate (probability of error) of the unpruned tree, $\tilde{R}$ is the error rate after a node is removed, and $L$ is the number of leaf nodes in the sub-tree of the node being evaluat...
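Eq. (5) in the context above is simply the increase in error rate from removing a node, normalized by the number of leaves removed; subtrees with the smallest per-leaf cost are the first candidates for pruning. The helper and the numbers below are illustrative assumptions.

```python
# Sketch of the error-complexity pruning measure (Eq. 5 in the cited
# context): E_er = (R~ - R) / L, the increase in error rate per leaf
# removed when a node's subtree is collapsed.
def error_complexity(R_pruned, R_unpruned, n_leaves):
    return (R_pruned - R_unpruned) / n_leaves

# A 4-leaf subtree whose removal raises the error rate from 0.10 to 0.12
# costs 0.005 per leaf; subtrees with minimal cost are pruned first.
print(error_complexity(0.12, 0.10, 4))
```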

2 | A growth algorithm for neural networks - Golea, Marchand - 1990 |

2 | Prediction error: The bias/variance decomposition, methods of minimization, and estimation - Kothari - 2000 |

1 | Tree Classifier With Multilayer Perceptron Feature Extraction - Gelfand, Guo - 1991 |

1 | Theory and practice of decision tree induction - Kim, Koehler |

1 | Don't care values in induction - Diamantidis, Giakoumakis - 1996 |