## Data Mining with an Ant Colony Optimization Algorithm (2002)

### Download Links

- [www.cs.kent.ac.uk]
- [www.cs.ukc.ac.uk]
- [sci2s.ugr.es]
- DBLP

### Other Repositories/Bibliography

Venue: IEEE Transactions on Evolutionary Computation

Citations: 93 (13 self)

### BibTeX

```bibtex
@ARTICLE{Parpinelli02datamining,
  author  = {Rafael S. Parpinelli and Heitor S. Lopes and Alex A. Freitas},
  title   = {Data Mining with an Ant Colony Optimization Algorithm},
  journal = {IEEE Transactions on Evolutionary Computation},
  year    = {2002},
  volume  = {6},
  pages   = {321--332}
}
```

### Abstract

This work proposes an algorithm for data mining called Ant-Miner (Ant Colony-based Data Miner). The goal of Ant-Miner is to extract classification rules from data. The algorithm is inspired by both research on the behavior of real ant colonies and some data mining concepts and principles. We compare the performance of Ant-Miner with CN2, a well-known data mining algorithm for classification, on six public-domain data sets. The results provide evidence that: (a) Ant-Miner is competitive with CN2 with respect to predictive accuracy; and (b) the rule lists discovered by Ant-Miner are considerably simpler (smaller) than those discovered by CN2.

Index Terms: Ant Colony Optimization, data mining, knowledge discovery, classification.

### Citations

9231 | Elements of Information Theory
- Cover, Thomas
- 1990

Citation context: ... heuristic function that is an estimate of the quality of this term, with respect to its ability to improve the predictive accuracy of the rule. This heuristic function is based on Information Theory [7]. More precisely, the value of h_ij for term_ij involves a measure of the entropy (or amount of information) associated with that term. For each term_ij of the form A_i = V_ij – where A_i is the i-th attribu...
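The entropy measure mentioned in this context can be sketched in Python. This is a minimal illustration; the function name and the per-class-count input are my own assumptions, not code from the paper:

```python
import math

def term_entropy(class_counts):
    """Entropy (in bits) of the class distribution among training cases
    that satisfy a given term A_i = V_ij. Lower entropy means the term
    discriminates better among classes, so Ant-Miner assigns it a
    higher heuristic value."""
    total = sum(class_counts)
    entropy = 0.0
    for count in class_counts:
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy

# A term whose covered cases all share one class has zero entropy:
print(term_entropy([10, 0, 0]))  # 0.0
# A term splitting cases evenly over k classes has entropy log2(k):
print(term_entropy([5, 5]))      # 1.0
```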

1084 | Swarm Intelligence: From Natural to Artificial Systems
- Bonabeau, Dorigo, et al.
- 1999

Citation context: ...embers of the colony. However, the tasks performed by different insects are related to each other in such a way that the colony, as a whole, is capable of solving complex problems through cooperation [2]. Important, survival-related problems such as selecting and picking up materials, finding and storing food, which require sophisticated planning, are solved by insect colonies without any kind of sup...

936 | The Ant System: Optimization by a colony of cooperating agents
- Dorigo, Maniezzo, et al.
- 1996

Citation context: ...g intuitively comprehensible for the user, as long as the number of discovered rules and the number of terms in rule antecedents are not large. To the best of our knowledge, the use of ACO algorithms [9] [10] [11] for discovering classification rules, in the context of data mining, is a research area still unexplored. Actually, the only ant algorithm developed for data mining that we are aware of is ...

858 | C4.5: Programs for Machine Learning
- Quinlan
- 1993

Citation context: ...ng to term_ij the highest possible predictive power. The heuristic function used by Ant-Miner, the entropy measure, is the same kind of heuristic function used by decision-tree algorithms such as C4.5 [19]. The main difference between decision trees and Ant-Miner, with respect to the heuristic function, is that in decision trees the entropy is computed for an attribute as a whole, since an entire attri...

807 | The CN2 induction algorithm
- Clark, Niblett
- 1989

Citation context: ...ments reported here, the actual number of ants per iteration was around 1500, rather than 3000. C. Comparing Ant-Miner with CN2 We have evaluated the performance of Ant-Miner by comparing it with CN2 [5] [6], a well-known classification-rule discovery algorithm. In essence, CN2 searches for a rule list in an incremental fashion. It discovers one rule at a time. Each time it discovers a rule it adds t...

715 | From data mining to knowledge discovery: an overview
- Fayyad, Piatetsky-Shapiro, et al.
- 1996

Citation context: ...tistics and databases. We emphasize that in data mining – unlike for example in classical statistics – the goal is to discover knowledge that is not only accurate but also comprehensible for the user [12] [13]. Comprehensibility is important whenever discovered knowledge will be used for supporting a decision made by a human user. After all, if discovered knowledge is not comprehensible for the user, ...

380 | Computer systems that learn
- Weiss, Kulikowski
- 1991

Citation context: ... carried out across two criteria, namely the predictive accuracy of the discovered rule lists and their simplicity. Predictive accuracy was measured by a well-known 10-fold cross-validation procedure [24]. In essence, each data set is divided into 10 mutually exclusive and exhaustive partitions and the algorithm is run once for each partition. Each time a different partition is used as the test set an...
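The 10-fold partitioning described in this context can be sketched as follows. This is a minimal illustration; the paper does not specify how cases are assigned to partitions, so the round-robin assignment here is an assumption:

```python
def ten_fold_partitions(n_cases, k=10):
    """Split case indices into k mutually exclusive and exhaustive
    folds. Each fold serves once as the test set while the remaining
    k-1 folds form the training set."""
    return [list(range(i, n_cases, k)) for i in range(k)]

folds = ten_fold_partitions(100)
# Every case appears in exactly one fold:
assert sorted(i for fold in folds for i in fold) == list(range(100))
```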

373 | Ant algorithms for discrete optimization
- Dorigo, Caro, et al.

Citation context: ...lgorithm. Therefore, the first step in designing a data mining algorithm is to define which task the algorithm will address. In this paper we propose an Ant Colony Optimization (ACO) algorithm [10] [11] for the classification task of data mining. In this task the goal is to assign each case (object, record, or instance) to one class, out of a set of predefined classes, based on the values of some at...

341 | Rule induction with CN2: Some recent improvements
- Clark, Boswell
- 1991

Citation context: ...s reported here, the actual number of ants per iteration was around 1500, rather than 3000. C. Comparing Ant-Miner with CN2 We have evaluated the performance of Ant-Miner by comparing it with CN2 [5] [6], a well-known classification-rule discovery algorithm. In essence, CN2 searches for a rule list in an incremental fashion. It discovers one rule at a time. Each time it discovers a rule it adds that ...
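The incremental search summarized in this context follows the classic sequential-covering pattern: discover one rule, append it to the list, remove the cases it covers, and repeat. This skeleton is an illustration only; `find_best_rule`, the `covers` method, and `min_cases` are hypothetical placeholders, not names from CN2 or the paper:

```python
def sequential_covering(cases, find_best_rule, min_cases=5):
    """Skeleton of an incremental rule-list search: each discovered
    rule is appended to the list, and the cases it covers are removed
    before the next rule is searched for."""
    rule_list = []
    remaining = list(cases)
    while len(remaining) > min_cases:
        rule = find_best_rule(remaining)
        if rule is None:
            break
        rule_list.append(rule)
        remaining = [c for c in remaining if not rule.covers(c)]
    return rule_list
```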

158 | A conservation law for generalization performance
- Schaffer
- 1994

Citation context: ...cerning predictive accuracy. The fact that rule pruning reduces predictive accuracy in some data sets is not surprising. It stems from the fact that rule pruning is a form of inductive bias [20] [21] [22], and any inductive bias is suitable for some data sets and unsuitable for others. The results concerning the simplicity of the discovered rule list are reported in the third and fourth columns of Tab...

154 | Generating Production Rules from Decision Trees
- Quinlan
- 1987

Citation context: ...than a longer one. As soon as the current ant completes the construction of its rule, the rule pruning procedure is called. The strategy for the rule pruning procedure is similar to that suggested by [18], but the rule quality criteria used in the two procedures are very different. The basic idea is to iteratively remove one term at a time from the rule while this process improves the quality of the r...
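The pruning idea in this context (iteratively remove one term at a time while quality improves) can be sketched as a greedy loop. This is an illustration under assumptions: `quality` stands in for a rule-quality function such as Eq. (5), and details of Ant-Miner's actual procedure (e.g. re-selecting the rule consequent after each removal) are omitted:

```python
def prune_rule(terms, quality):
    """Greedy rule pruning: at each step, tentatively drop each single
    term, keep the drop that most improves rule quality, and stop as
    soon as no single-term removal helps."""
    best = list(terms)
    best_q = quality(best)
    while len(best) > 1:
        candidates = [best[:i] + best[i + 1:] for i in range(len(best))]
        cand = max(candidates, key=quality)
        if quality(cand) > best_q:
            best, best_q = cand, quality(cand)
        else:
            break
    return best
```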

126 | Overfitting avoidance as bias
- Schaffer
- 1993

Citation context: ..., concerning predictive accuracy. The fact that rule pruning reduces predictive accuracy in some data sets is not surprising. It stems from the fact that rule pruning is a form of inductive bias [20] [21] [22], and any inductive bias is suitable for some data sets and unsuitable for others. The results concerning the simplicity of the discovered rule list are reported in the third and fourth columns o...

113 | Error-based and entropy-based discretization of continuous features; in: Proceedings of 2nd International Conference on Knowledge Discovery in Databases
- Kohavi, Sahami
- 1996

Citation context: ...les referring only to categorical attributes. Therefore, continuous attributes have to be discretized in a preprocessing step. This discretization was performed by the C4.5-Disc discretization method [15]. This method simply uses the very well-known C4.5 algorithm [19] for discretizing continuous attributes. In essence, for each attribute to be discretized it is extracted, from the training set, a red...

76 | Mining Very Large Databases with Parallel Processing
- Freitas, Lavington
- 1998

Citation context: ...cs and databases. We emphasize that in data mining – unlike for example in classical statistics – the goal is to discover knowledge that is not only accurate but also comprehensible for the user [12] [13]. Comprehensibility is important whenever discovered knowledge will be used for supporting a decision made by a human user. After all, if discovered knowledge is not comprehensible for the user, he/sh...

56 | Trading accuracy for simplicity in decision trees
- Bohanec, Bratko
- 1994

Citation context: ...ve accuracy. Actually, there are several classification-rule discovery algorithms that were explicitly designed to improve rule set simplicity, even at the expense of reducing the predictive accuracy [1] [3] [4]. D. The Effect of Pruning In order to analyze the influence of rule pruning in the overall Ant-Miner algorithm, Ant-Miner was also run without rule pruning. All the other procedures of Ant-Mi...

54 | Understanding the Crucial Role of Attribute Interaction in Data Mining
- Freitas
- 2001

Citation context: ...discussion about why evolutionary algorithms tend to cope better with attribute interactions than greedy, local search-based decision tree and rule induction algorithms, the reader is referred to [8] [14]. C. Rule Pruning Rule pruning is a commonplace technique in data mining [3]. As mentioned earlier, the main goal of rule pruning is to remove irrelevant terms that might have been unduly included in ...

51 | ACO algorithms for the traveling salesman problem
- Stützle, Dorigo
- 1999

Citation context: ... This process can be described as a loop of positive feedback, in which the probability that an ant chooses a path is proportional to the number of ants that have already passed by that path [9] [11] [23]. When an established path between a food source and the ants’ nest is disturbed by the presence of an object, ants soon will try to go around the obstacle. Firstly, each ant can choose to go around t...
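The positive-feedback path choice described in this context amounts to roulette-wheel selection over pheromone levels. This sketch is an illustration, not code from the paper or from any specific ACO implementation:

```python
import random

def choose_path(pheromone, rng=random):
    """Roulette-wheel choice: the probability of picking path i is
    proportional to its pheromone level, producing the positive
    feedback loop in which heavily used paths attract more ants."""
    total = sum(pheromone)
    r = rng.random() * total
    cumulative = 0.0
    for i, tau in enumerate(pheromone):
        cumulative += tau
        if r <= cumulative:
            return i
    return len(pheromone) - 1  # guard against floating-point drift
```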

42 | Simplifying decision trees: A survey
- Breslow, Aha
- 1997

Citation context: ...te interactions than greedy, local search-based decision tree and rule induction algorithms, the reader is referred to [8] [14]. C. Rule Pruning Rule pruning is a commonplace technique in data mining [3]. As mentioned earlier, the main goal of rule pruning is to remove irrelevant terms that might have been unduly included in the rule. Rule pruning potentially increases the predictive power of the ...

41 | For every generalization action, is there really an equal and opposite reaction? Analysis of the conservation law for generalization performance
- Rao, Gordon, et al.
- 1995

Citation context: ...rmful, concerning predictive accuracy. The fact that rule pruning reduces predictive accuracy in some data sets is not surprising. It stems from the fact that rule pruning is a form of inductive bias [20] [21] [22], and any inductive bias is suitable for some data sets and unsuitable for others. The results concerning the simplicity of the discovered rule list are reported in the third and fourth colu...

37 | The ant colony optimization meta-heuristic; in: New Ideas in Optimization
- Dorigo, Caro
- 1999

Citation context: ...ing algorithm. Therefore, the first step in designing a data mining algorithm is to define which task the algorithm will address. In this paper we propose an Ant Colony Optimization (ACO) algorithm [10] [11] for the classification task of data mining. In this task the goal is to assign each case (object, record, or instance) to one class, out of a set of predefined classes, based on the values of so...

35 | Discovering Interesting Patterns for Investment Decision Making with GLOWER – A Genetic Learner Overlaid With Entropy Reduction; in: Data Mining and Knowledge Discovery 4(4)
- Dhar, Chou, et al.
- 2000

Citation context: ...ral discussion about why evolutionary algorithms tend to cope better with attribute interactions than greedy, local search-based decision tree and rule induction algorithms, the reader is referred to [8] [14]. C. Rule Pruning Rule pruning is a commonplace technique in data mining [3]. As mentioned earlier, the main goal of rule pruning is to remove irrelevant terms that might have been unduly include...

19 | Overpruning large decision trees
- Catlett
- 1991

Citation context: ...acy. Actually, there are several classification-rule discovery algorithms that were explicitly designed to improve rule set simplicity, even at the expense of reducing the predictive accuracy [1] [3] [4]. D. The Effect of Pruning In order to analyze the influence of rule pruning in the overall Ant-Miner algorithm, Ant-Miner was also run without rule pruning. All the other procedures of Ant-Miner, as ...

14 | An evolutionary approach to simulate cognitive feedback learning in medical domain; in: Genetic algorithms and fuzzy logic systems: soft computing perspectives, World Scientific
- Lopes, Coutinho, et al.
- 1998

Citation context: ...probability of term_ij being chosen by other ants in the future in proportion to the quality of the rule. The quality of a rule, denoted by Q, is computed by the formula Q = sensitivity · specificity [16], defined as Q = (TP / (TP + FN)) · (TN / (FP + TN)) (Eq. 5), where: TP (true positives) is the number of cases covered by the rule that have the class predicted by the rule; FP (false positives) is the number of...
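The rule-quality formula in this context (Q = sensitivity × specificity) can be computed directly from the confusion-matrix counts. A minimal sketch; the zero-denominator guards are an added assumption, not part of Eq. (5):

```python
def rule_quality(tp, fp, fn, tn):
    """Q = sensitivity * specificity, i.e.
    (TP / (TP + FN)) * (TN / (FP + TN))."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (fp + tn) if (fp + tn) else 0.0
    return sensitivity * specificity

# A rule covering 8 of 10 positive cases while wrongly covering
# 2 of 20 negative cases: sensitivity 0.8 * specificity 0.9 ≈ 0.72
print(rule_quality(tp=8, fp=2, fn=2, tn=18))
```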

4 | On data clustering with artificial ants; in: Data Mining with Evolutionary Algorithms
- Monmarché
- 1999

Citation context: ...lassification rules, in the context of data mining, is a research area still unexplored. Actually, the only ant algorithm developed for data mining that we are aware of is an algorithm for clustering [17], which is a data mining task very different from the classification task addressed in this paper. We believe that the development of ACO algorithms for data mining is a promising research area, due t...