## Feature Minimization within Decision Trees (1996)

Venue: Computational Optimization and Applications

Citations: 14 (2 self)

### BibTeX

@ARTICLE{Bredensteiner96featureminimization,
  author = {Erin J. Bredensteiner and Kristin P. Bennett},
  title = {Feature Minimization within Decision Trees},
  journal = {Computational Optimization and Applications},
  year = {1996},
  volume = {10},
  pages = {10--111}
}

### Abstract

Decision trees for classification can be constructed using mathematical programming. Within decision tree algorithms, the feature minimization problem is to construct accurate decisions using as few features or attributes within each decision as possible. Feature minimization is an important aspect of data mining since it helps identify what attributes are important and helps produce accurate and interpretable decision trees. In feature minimization with bounded accuracy, we minimize the number of features using a given misclassification error tolerance. This problem can be formulated as a parametric bilinear program and is shown to be NP-complete. A parametric Frank-Wolfe method is used to solve the bilinear subproblems. The resulting minimization algorithm produces more compact, accurate, and interpretable trees. This procedure can be applied to many different error functions. Formulations and results for two error functions are given. One method, FM RLP-P, dramatically reduced the number of features of one dataset from 147 to 2 while maintaining an 83.6% testing accuracy. Computational results compare favorably with the standard univariate decision tree method, C4.5, as well as with linear programming methods of tree construction.

Key Words: Data mining, machine learning, feature minimization, decision trees, bilinear programming.

Knowledge Discovery and Data Mining Group, Department of Mathematical Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180. Email bredee@rpi.edu, bennek@rpi.edu. Telephone (518) 276-6899. FAX (518) 276-4824. This material is based on research supported by National Science Foundation Grant 949427.
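To make the idea of feature minimization with bounded accuracy concrete, the following is a minimal sketch of the problem statement only: find the smallest set of features for a single linear decision whose training error stays within a given tolerance. It is not the paper's method. The bilinear programming formulation and the parametric Frank-Wolfe solver are replaced here by brute-force subset enumeration and a nearest-centroid decision, purely for illustration. The names `centroid_error` and `min_features` and the toy data are hypothetical.

```python
from itertools import combinations


def centroid_error(X, y, feats):
    """Training error rate of a nearest-centroid decision restricted to `feats`.

    Assumption: a nearest-centroid rule stands in for the paper's
    error functions; it is used only because it is simple and linear.
    """
    def mean(rows):
        return [sum(r[j] for r in rows) / len(rows) for j in feats]

    pos = mean([x for x, t in zip(X, y) if t == 1])
    neg = mean([x for x, t in zip(X, y) if t == -1])
    wrong = 0
    for x, t in zip(X, y):
        d_pos = sum((x[j] - p) ** 2 for j, p in zip(feats, pos))
        d_neg = sum((x[j] - n) ** 2 for j, n in zip(feats, neg))
        pred = 1 if d_pos <= d_neg else -1
        wrong += pred != t
    return wrong / len(X)


def min_features(X, y, tol):
    """Return the first smallest feature subset whose error is <= tol.

    Assumption: exhaustive search over subsets, which is exponential in
    the number of features and only feasible for tiny examples.
    """
    n = len(X[0])
    for k in range(1, n + 1):
        for feats in combinations(range(n), k):
            err = centroid_error(X, y, feats)
            if err <= tol:
                return feats, err
    return None


# Hypothetical toy data: only the last feature separates the two classes.
X = [[5.0, 0.2, 1.0], [1.0, 0.9, 1.2], [3.0, 0.4, 0.9],
     [4.0, 0.3, 3.0], [2.0, 0.8, 3.2], [1.5, 0.5, 2.9]]
y = [1, 1, 1, -1, -1, -1]

tight = min_features(X, y, tol=0.1)   # strict tolerance
loose = min_features(X, y, tol=0.4)   # relaxed tolerance
```

With the strict tolerance the search has to pick the separating feature, while the relaxed tolerance lets it accept an earlier, noisier feature. This shows how the error tolerance acts as the parameter that trades accuracy against the number of features. The exponential cost of the enumeration is in line with the hardness result stated in the abstract and motivates an optimization-based approach.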