Genetic Programming for Data Classification: Partitioning the Search Space
In Proceedings of the 2004 Symposium on Applied Computing (ACM SAC’04), 2004
Cited by 5 (1 self)
When Genetic Programming is used to evolve decision trees for data classification, search spaces tend to become extremely large. We present several methods using techniques from the field of machine learning to refine and thereby reduce the search space sizes for decision tree evolvers. We will show that these refinement methods improve the classification performance of our algorithms.
Detecting and Pruning Introns for Faster Decision Tree Evolution
Cited by 2 (0 self)
We show how the understandability and speed of genetic programming classification algorithms can be improved without affecting the classification accuracy. By analyzing the evolved decision trees, we can remove the nonessential parts, called introns, from the discovered decision trees. Since the resulting trees contain only useful information, they are smaller and easier to understand. Moreover, by using these pruned decision trees in a fitness cache, we can significantly reduce the number of unnecessary fitness calculations.
Evolving Fuzzy Decision Trees for Data Classification
representation which is similar to the standard decision tree representation used by algorithms like C4.5. The function set (internal nodes) of a full atomic tree consists of atoms. Each atom is syntactically a predicate of the form (variable i operator value), where operator is a compare operator (e.g., and > for continuous attributes, = for nominal or Boolean attributes). In the leaf nodes we have a class assignment of the form (class := C), where C is a category selected from the domain of the variable to be predicted. A full atomic tree classifies an instance I by traversing the tree from root to leaf node. In each non-leaf node an atom is evaluated. If the result is true the right branch is traversed, else the left branch is taken. This is done for all internal nodes until a leaf node containing a class assignment node is reached resulting in the classification of the instance. In our fuzzy GP system the numerical valued attributes are clustered into a specified number of clust
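The traversal rule described above (evaluate the atom, go right if it is true and left otherwise, stop at a class assignment) can be sketched as follows. This is an illustrative sketch only: the `Atom` and `Leaf` classes, the `classify` function, and the example attributes are hypothetical names, and the fuzzy clustering step is not modeled.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    label: str  # class assignment (class := C)


@dataclass
class Atom:
    attribute: str                        # the variable being tested
    op: Callable[[object, object], bool]  # comparison operator
    value: object                         # value compared against
    left: "Node"                          # taken when the atom is false
    right: "Node"                         # taken when the atom is true


Node = Union[Leaf, Atom]


def classify(node: Node, instance: dict) -> str:
    """Traverse from the root to a leaf and return the leaf's class."""
    while isinstance(node, Atom):
        result = node.op(instance[node.attribute], node.value)
        node = node.right if result else node.left
    return node.label


# Hypothetical example tree with one continuous and one Boolean attribute.
tree = Atom(
    "age", operator.gt, 30,
    left=Leaf("young"),
    right=Atom("smoker", operator.eq, True,
               left=Leaf("low"), right=Leaf("high")),
)

print(classify(tree, {"age": 45, "smoker": True}))   # high
print(classify(tree, {"age": 20, "smoker": True}))   # young
```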
Inducing Diverse Decision Forests with Genetic Programming
2005
This paper presents an algorithm for induction of ensembles of decision trees, also referred to as decision forests. In order to achieve high expressiveness, the induced trees are multivariate, with various, possibly user-defined tests in their internal nodes. Strongly typed genetic programming is used to evolve the structure of the tests. Special attention is given to the problem of diversity of the constructed forest. An approach is proposed that explicitly encourages the induction algorithm to produce a different tree in each run, each representing an alternative description of the data. It is shown that forests constructed this way have significantly reduced classification error even for small forest sizes, compared to other ensemble methods. Classification accuracy is also compared to other recent methods on several real-world datasets.

