## A Survey of Evolutionary Algorithms for Clustering

Citations: | 6 - 0 self |

### BibTeX

@MISC{Hruschka_asurvey,

author = {Eduardo R. Hruschka and Ricardo J. G. B. Campello and Alex A. Freitas},

title = {A Survey of Evolutionary Algorithms for Clustering},

year = {}

}

### Abstract

This paper surveys evolutionary algorithms designed for clustering tasks. It aims to reflect the profile of the area by giving most attention to the subjects that have received the most emphasis in the literature. Accordingly, most of the paper is devoted to partitional algorithms that look for hard clusterings of data, though overlapping (i.e., soft and fuzzy) approaches are also covered. The paper is original in two main respects. First, it provides an up-to-date overview that is fully devoted to evolutionary algorithms for clustering, is not limited to any particular kind of evolutionary approach, and covers advanced topics such as multi-objective and ensemble-based evolutionary clustering. Second, it provides a taxonomy that highlights important aspects of evolutionary data clustering, among others: fixed or variable number of clusters; cluster-oriented or non-oriented operators; context-sensitive or context-insensitive operators; guided or unguided operators; binary, integer, or real encodings; and centroid-based, medoid-based, label-based, tree-based, or graph-based representations. A number of references are provided that describe applications of evolutionary algorithms for clustering in different domains, such as image processing, computer security, and bioinformatics. The paper ends by addressing some important issues and open questions that can be the subject of future research.

Index Terms: evolutionary algorithms, clustering, applications.

### Citations

8835 | Introduction to algorithms - Cormen, Leiserson, et al. - 1990 |

8606 | Maximum likelihood from incomplete data via the EM algorithm
- Dempster, Laird, et al.
- 1977
Citation Context ...tackling the corresponding clustering problem. Alternatively, the reader may think about using conventional clustering algorithms for fixed k, such as k-means [101][72], EM (Expectation Maximization) =-=[34]-=-[61], and SOM (Self-Organized Maps) [17][62] algorithms. However, these prototype-based algorithms are quite sensitive to initialization of prototypes and may get stuck at sub-optimal solutions. Thi... |

7797 |
Genetic Algorithms
- Goldberg
- 1989
Citation Context ...a and Bezdek [85] make use of this encoding approach, which allows the evolutionary search to be performed by means of those classical GA operators originally developed to manipulate binary genotypes =-=[54]-=-[105]. However, the use of such classical operators usually suffers from serious drawbacks in the specific context of evolutionary clustering, as will be further discussed in Section II.A.2.a. There i... |

3944 |
Neural Networks: A Comprehensive Foundation
- Haykin
- 1998
Citation Context ...m. Alternatively, the reader may think about using conventional clustering algorithms for fixed k, such as k-means [101][72], EM (Expectation Maximization) [34][61], and SOM (Self-Organized Maps) [17]=-=[62]-=- algorithms. However, these prototype-based algorithms are quite sensitive to initialization of prototypes and may get stuck at sub-optimal solutions. This is a well-known problem, which becomes more... |

2268 |
The Art of Computer Programming
- Knuth
- 1998
Citation Context ...evolutionary algorithms and will be freely interchanged in this paper. 4 Actually, the nearest neighbor search can be performed in asymptotic logarithmic time by exploiting the Delaunay triangulation =-=[81]-=-, which is the dual of the Voronoi diagram – e.g., see [98]. However, to the best of our only for data sets with many attributes. When the number of attributes n is not large, the advantage of the mat... |

2251 |
Algorithms for Clustering Data
- Jain, Dubes
- 1988
Citation Context ...egories (clusters) to describe a data set according to similarities among its objects [75][40]. The applicability of clustering is manifold, ranging from market segmentation [17] and image processing =-=[72]-=- through document categorization and web mining [102]. An application field that has shown to be particularly promising for clustering techniques is bioinformatics [7][13][129]. Indeed, the importance... |

1568 |
An Introduction to Genetic Algorithms
- Mitchell
- 1996
Citation Context ...d Bezdek [85] make use of this encoding approach, which allows the evolutionary search to be performed by means of those classical GA operators originally developed to manipulate binary genotypes [54]=-=[105]-=-. However, the use of such classical operators usually suffers from serious drawbacks in the specific context of evolutionary clustering, as will be further discussed in Section II.A.2.a. There is an ... |

1437 |
Finding Groups in Data: An Introduction to Cluster Analysis
- Kaufman, Rousseeuw
- 1990
Citation Context ...hms, clustering, applications. I. INTRODUCTION Clustering is a task whose goal is to determine a finite set of categories (clusters) to describe a data set according to similarities among its objects =-=[75]-=-[40]. The applicability of clustering is manifold, ranging from market segmentation [17] and image processing [72] through document categorization and web mining [102]. An application field that has s... |

1341 |
Pattern Recognition with Fuzzy Objective Function Algorithms
- Bezdek
- 1981
Citation Context ...stering is broad in scope and includes areas such as pattern classification, image segmentation, document categorization, data visualization, and dynamic systems identification, just to mention a few =-=[16]-=-[64][5][32]. Most of the research on evolutionary algorithms for overlapping clustering has focused on algorithms that evolve fuzzy partitions of data. In this context, many authors have proposed evol... |

1181 | A density-based algorithm for discovering clusters in large spatial databases with noise
- Ester, Kriegel, et al.
- 1996
Citation Context ...ion corresponding to a cluster) that are separated by low-density regions. Density-based clustering methods usually have the advantage of being flexible enough to discover clusters of arbitrary shape =-=[38]-=-. An evolutionary algorithm – more precisely, an Estimation of Distribution Algorithm – using a density-based fitness function is described in [31]. In this algorithm, the fitness function is essentia... |

1165 |
Multi-Objective Optimization Using Evolutionary Algorithms
- Deb
- 2001
Citation Context ...s to different objectives, which often requires many runs of the algorithm to try to "optimize" the weight values, etc. These drawbacks are extensively discussed in the literature – see e.g. [22] and =-=[33]-=-. A more principled solution consists of developing a truly multi-objective evolutionary algorithm for clustering, i.e. an algorithm with the main characteristic of using a multiobjective function fol... |

946 |
The Elements of
- Hastie, Tibshirani, et al.
- 2002
Citation Context ...ling the corresponding clustering problem. Alternatively, the reader may think about using conventional clustering algorithms for fixed k, such as k-means [101][72], EM (Expectation Maximization) [34]=-=[61]-=-, and SOM (Self-Organized Maps) [17][62] algorithms. However, these prototype-based algorithms are quite sensitive to initialization of prototypes and may get stuck at sub-optimal solutions. This is... |

638 | Tabu Search
- Glover
Citation Context ...solutions by standard processes of reproduction and variation, and an external population that exploits good solutions by elitism. Pan and Cheng [113] adopt a selection procedure based on Tabu search =-=[53]-=-. As previously mentioned in Section II.A.4, the advantages and disadvantages of traditional selection mechanisms are well-known in the evolutionary computation literature and, as far as we know, ther... |

555 |
Evolutionary Algorithms for Solving Multi-Objective Problems
- Coello, Veldhuizen, et al.
Citation Context ...ght values to different objectives, which often requires many runs of the algorithm to try to "optimize" the weight values, etc. These drawbacks are extensively discussed in the literature – see e.g. =-=[22]-=- and [33]. A more principled solution consists of developing a truly multi-objective evolutionary algorithm for clustering, i.e. an algorithm with the main characteristic of using a multiobjective fun... |

421 | Cluster ensembles - A knowledge reuse framework for combining multiple partitions
- Strehl, Ghosh
- 2002
Citation Context ...ion tries to keep together objects found together in most of the individual partitions [49]. The graph-based functions look for a consensus partition using partitioning techniques employed for graphs =-=[124]-=-. The functions based on mutual information maximize the mutual information between the labels of the initial partitions and the labels of the consensus partition. The voting function, after labeling ... |
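The idea in the context above, keeping together objects that are found together in most of the individual partitions, can be illustrated with a pairwise co-occurrence matrix. The sketch below is only an illustration under stated assumptions: the function name `co_association` and the normalization by the number of partitions are choices made here, not details taken from the survey or from [49].

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of objects shares a cluster.

    `partitions` is a list of label lists, one label per object.
    The name and the normalization are assumptions made for this sketch.
    """
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        if len(labels) != n:
            raise ValueError("all partitions must label the same objects")
        for a in range(n):
            for b in range(n):
                # Label values are arbitrary per partition; only equality matters.
                if labels[a] == labels[b]:
                    counts[a][b] += 1
    total = len(partitions)
    return [[c / total for c in row] for row in counts]


# Three partitions of four objects; cluster labels need not correspond.
example = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 1, 1]]
matrix = co_association(example)
```

Because only label equality within each partition is compared, this representation sidesteps the label-correspondence problem between partitions. A consensus partition could then be sought by clustering the objects with this matrix as a similarity measure.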

332 |
A Machine Learning Approach
- Baldi
- 2001
Citation Context ...ation [17] and image processing [72] through document categorization and web mining [102]. An application field that has shown to be particularly promising for clustering techniques is bioinformatics =-=[7]-=-[13][129]. Indeed, the importance of clustering gene-expression data measured with the aid of microarray and other related technologies has grown fast and persistently over the past recent years [74][... |

332 |
A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters
- Dunn
- 1974
Citation Context ...tion matrix for the data set X = {x1, …, xN}, and zm is the center of the mth cluster. The authors report some experiments in which I(k) provides better results than the Davies-Bouldin [28] and Dunn’s =-=[36]-=- indexes commonly used as relative validity criteria for clustering. Nevertheless, in a later work [11], the authors turn to use a fitness function based on the Davies-Bouldin (DB) index. A variant of ... |

328 |
A cluster separation measure
- Davies, Bouldin
- 1979
Citation Context ...k x N is a partition matrix for the data set X = {x1, …, xN}, and zm is the center of the mth cluster. The authors report some experiments in which I(k) provides better results than the Davies-Bouldin =-=[28]-=- and Dunn’s [36] indexes commonly used as relative validity criteria for clustering. Nevertheless, in a later work [11], the authors turn to use a fitness function based on the Davies-Bouldin (DB) inde... |

321 |
An examination of procedures for determining the number of clusters in a data set
- Milligan, Cooper
- 1985
Citation Context ...ually and strictly categorized as cluster-oriented or object-oriented. 3) Fitness Function: In principle, any relative clustering validity criterion (e.g. see Jain and Dubes [72]; Milligan and Cooper =-=[104]-=-; Halkidi et al. [55]; Handl et al. [60]) that is nonmonotonic with the number of clusters can be potentially used as a fitness function for an evolutionary algorithm designed to optimize the number o... |

261 | Survey of clustering algorithms
- Xu, Wunsch
- 2005
Citation Context ...ifying the degree of dissimilarity among objects, in such a way that more similar objects have lower dissimilarity values [73]. Several dissimilarity measures can be employed for clustering tasks [72]=-=[132]-=-. Each measure has its bias and comes with its own advantages and drawbacks. Therefore, each one may be more or less suitable to a given analysis or application scenario. Indeed, it is well-known that... |

236 | Some methods for classification and analysis of multivariate observations - MacQueen |

206 |
A dendrite method for cluster analysis
- Calinski, Harabasz
- 1974
Citation Context ...unctions for evolutionary clustering algorithms are reviewed. Cole [23], Cowgill et al. [26], and Casillas et al. [21] use as fitness function the Calinski and Harabasz Variance Ratio Criterion (VRC) =-=[18]-=-, which is defined as $\mathrm{VRC} = \dfrac{\operatorname{trace}(B)/(k-1)}{\operatorname{trace}(W)/(N-k)}$ (6), where B and W are the between-cluster and the pooled within-cluster sums of squares (covariance) matrices, respectively. The terms N and... |
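As a rough illustration of how the Variance Ratio Criterion quoted above could serve as a fitness value for a hard partition, the following sketch computes trace B and trace W directly from cluster centroids. It assumes Euclidean geometry and a label-based encoding; the function name `variance_ratio_criterion` and the handling of the degenerate zero-scatter case are assumptions made here, not part of the survey.

```python
def variance_ratio_criterion(points, labels):
    """VRC = [trace(B) / (k - 1)] / [trace(W) / (N - k)] for a hard partition.

    `points` is a list of equal-length numeric tuples, `labels` one label per point.
    """
    n = len(points)
    dim = len(points[0])
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("VRC is defined only for 2 <= k < N")

    grand_mean = [sum(p[d] for p in points) / n for d in range(dim)]
    trace_b = 0.0  # between-cluster sum of squares
    trace_w = 0.0  # pooled within-cluster sum of squares
    for members in clusters.values():
        size = len(members)
        centroid = [sum(p[d] for p in members) / size for d in range(dim)]
        trace_b += size * sum((c - g) ** 2 for c, g in zip(centroid, grand_mean))
        trace_w += sum((x - c) ** 2 for p in members for x, c in zip(p, centroid))

    if trace_w == 0.0:
        # Assumption for this sketch: treat zero within-cluster scatter as unbounded fitness.
        return float("inf")
    return (trace_b / (k - 1)) / (trace_w / (n - k))


data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
compact_split = variance_ratio_criterion(data, [0, 0, 1, 1])  # groups by x coordinate
poor_split = variance_ratio_criterion(data, [0, 1, 0, 1])     # groups by y coordinate
```

Higher values indicate compact, well-separated clusters, so an evolutionary algorithm would maximize this quantity; here the split along the x coordinate scores far higher than the split along y.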

201 | Data Clustering
- Jain, Murty, et al.
- 1999
Citation Context ...lapping approaches are also covered in the manuscript. It is important to stress that comprehensive surveys on clustering have been previously published, such as the outstanding papers by Jain et al. =-=[73]-=-, Jiang et al. [74], and Xu and Wunsch II [132], just to mention a few. Nevertheless, to the best of the authors’ knowledge, none has been fully devoted to evolutionary approaches. It is worth mention... |

200 |
Validity measure for fuzzy clustering
- Xie, Beni
- 1991
Citation Context ...i-objective evolutionary algorithm that performs fuzzy clustering. There are two objectives being simultaneously optimized. One of them is Jm defined in (5). The other is the well-known Xie-Beni index =-=[131]-=-, which is essentially a ratio of a global measure of intra-cluster variation divided by a local measure of cluster separation – namely, the distance between the two closest clusters. Note that the nu... |
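The ratio described in the context above can be sketched as follows. The exact form used in the surveyed work is not shown in the excerpt, so this sketch assumes the commonly cited form of the index: membership-weighted squared distances to the cluster centers, divided by N times the smallest squared distance between two centers. The function name `xie_beni`, the default fuzzifier `m=2.0`, and the argument layout are assumptions made here.

```python
def xie_beni(points, centers, memberships, m=2.0):
    """Xie-Beni-style index: lower values suggest compact, well-separated clusters.

    `memberships[i][j]` is the degree to which point j belongs to cluster i.
    The formula and the default m are assumptions for this sketch.
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    n = len(points)
    k = len(centers)
    if k < 2:
        raise ValueError("at least two clusters are required")

    # Global measure of intra-cluster variation.
    variation = sum(
        memberships[i][j] ** m * squared_distance(points[j], centers[i])
        for i in range(k)
        for j in range(n)
    )
    # Local measure of separation: the two closest cluster centers.
    separation = min(
        squared_distance(centers[i], centers[l])
        for i in range(k)
        for l in range(i + 1, k)
    )
    return variation / (n * separation)


points = [(0.0,), (1.0,), (9.0,), (10.0,)]
centers = [(0.5,), (9.5,)]
crisp = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
score = xie_beni(points, centers, crisp)
```

Unlike the Variance Ratio Criterion, this index is minimized, so it would enter a multi-objective evolutionary algorithm as a cost rather than as a fitness to be maximized.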

193 | On clustering validation techniques
- Halkidi, Batistakis, et al.
- 2001
Citation Context ...egorized as cluster-oriented or object-oriented. 3) Fitness Function: In principle, any relative clustering validity criterion (e.g. see Jain and Dubes [72]; Milligan and Cooper [104]; Halkidi et al. =-=[55]-=-; Handl et al. [60]) that is nonmonotonic with the number of clusters can be potentially used as a fitness function for an evolutionary algorithm designed to optimize the number of clusters. Such crit... |

147 | Toward integrating feature selection algorithms for classification and clustering
- Liu, Yu
Citation Context ...thms that are not capable of automatically distinguishing between relevant and irrelevant features for the clustering process can benefit from a number of feature selection techniques (e.g. see [78], =-=[90]-=- and references therein), as a preprocessing procedure. Some of those techniques are additionally endowed with the ability to remove redundant features (which may still impact the clustering process e... |

133 |
Genetic Algorithms and Grouping Problems
- Falkenauer
- 1998
Citation Context ...erspective, clustering can be formally considered as a particular kind of NP-hard grouping problem =-=[43]-=-. This has stimulated the search for efficient approximation algorithms, including not only the use of ad hoc heuristics for particular classes or instances of problems, but also the use of general-pu... [Page header: To appear in IEEE Transactions on Systems, Man, and Cybernetics - Part C: Applications and Reviews] |

121 |
Fuzzy Modeling for Control
- Babuška
- 1998
Citation Context ...is broad in scope and includes areas such as pattern classification, image segmentation, document categorization, data visualization, and dynamic systems identification, just to mention a few [16][64]=-=[5]-=-[32]. Most of the research on evolutionary algorithms for overlapping clustering has focused on algorithms that evolve fuzzy partitions of data. In this context, many authors have proposed evolutionar... |

89 | Cluster analysis for gene expression data: a survey
- Jiang, Tang, et al.
- 2004
Citation Context ...s [7][13][129]. Indeed, the importance of clustering gene-expression data measured with the aid of microarray and other related technologies has grown fast and persistently over the past recent years =-=[74]-=-[60]. Clustering techniques can be broadly divided into three main types [72]: overlapping (so-called non-exclusive), partitional, and hierarchical. The last two are related to each E. R. Hruschka, R.... |

84 |
On cluster validity for the Fuzzy c-means Model
- Pal, Bezdek
- 1995
Citation Context ... designed to optimize the number of clusters. Such criteria have been extensively investigated and, despite the well-known fact that their particular features make their performance problem dependent =-=[112]-=-, some of them have shown satisfactory results in several different application scenarios. In the sequel, a number of relative validity criteria that have been used as fitness functions for evolutiona... |

80 |
Fuzzy Cluster Analysis: Methods for Classification, Data Analysis and Image Recognition
- Höppner, Klawonn, et al.
- 1999
Citation Context ...erlapping algorithms produce data partitions that can be soft (each object fully belongs to one or more clusters) [40] or fuzzy (each object belongs to one or more clusters to different degrees) [118]=-=[64]-=-. In spite of the type of algorithm (partitional, hierarchical or overlapping), the main goal of clustering is maximizing both the homogeneity within each cluster and the heterogeneity among different... |

69 | Clustering with a genetically optimized approach
- Hall, Özyurt, et al.
- 1999
Citation Context ...is context, many authors have proposed evolutionary algorithms to solve fuzzy clustering problems for which the number of clusters is known or set in advance by the user [56][57][80][15][134][130][37]=-=[58]-=-[91]. However, as previously discussed in the introductory section, the optimal number of clusters is usually unknown in advance. For this reason, more recent papers have proposed to optimize both the... |

64 | A mixture model for clustering ensembles
- Topchy, Jain, et al.
- 2004
Citation Context ...either the label of the class (classification) or the desired value (regression). A formal definition of cluster ensemble is given by Topchy et al. =-=[126]-=-. Given a set of P partitions Π = {π1, π2, …, πP} of a data set resulting from several applications of one or more clustering algorithms, the goal is to look for a final partition (consensus part... |

62 | Feature selection in unsupervised learning via evolutionary search
- Kim, Street, et al.
- 2000
Citation Context ...algorithms that are not capable of automatically distinguishing between relevant and irrelevant features for the clustering process can benefit from a number of feature selection techniques (e.g. see =-=[78]-=-, [90] and references therein), as a preprocessing procedure. Some of those techniques are additionally endowed with the ability to remove redundant features (which may still impact the clustering pro... |

60 |
Evolutionary Computation 1: Basic Algorithms and Operators
- Back, Fogel, et al.
- 2000
Citation Context ...presented by an individual, where the density of a cluster is simply the number of objects in the cluster divided by the size of the region defining that cluster. 4) Selection: Proportional selection =-=[6]-=- has been used by several authors (e.g., Krovi [84]; Lucasius et al. [96]; Murthy and Chowdhury [107]; Estivill-Castro and Murray [39]; Fränti et al. [48]; Maulik and Bandyopadhyay [100]; Kivijärvi et... |

60 |
Missing value estimation for DNA microarray gene expression data: local least squares imputation
- Kim, Golub, et al.
- 2005
Citation Context ...owing important remarks: (i) Evolutionary clustering algorithms that are not capable of automatically handling incomplete data sets can benefit from a number of imputation techniques (e.g., [127][110]=-=[77]-=-), as a preprocessing procedure. In addition, if the proportion of missing values is low, just the known values may be enough for computing unbiased pairwise (dis)similarity measures. (ii) Evolutionar... |

60 |
Numerical methods for fuzzy clustering
- Ruspini
- 1970
Citation Context ...e. Overlapping algorithms produce data partitions that can be soft (each object fully belongs to one or more clusters) [40] or fuzzy (each object belongs to one or more clusters to different degrees) =-=[118]-=-[64]. In spite of the type of algorithm (partitional, hierarchical or overlapping), the main goal of clustering is maximizing both the homogeneity within each cluster and the heterogeneity among diffe... |

59 | Combining multiple clusterings using evidence accumulation
- Fred, Jain
- 2005
Citation Context ...e, more sophisticated strategies are needed in order to combine partitions found by different algorithms or different runs of the same algorithm in a consensus partition. According to Fred and Jain =-=[49]-=-, the partition obtained by the combination of the initial partitions should be consistent or agree in some way with them, be robust to small variations in these partitions, and be consistent with ext... |

58 |
Genetic K-means algorithm
- Krishna, Murty
- 1999
Citation Context ...ms for which the number of clusters (k) is known or set up a priori (e.g., Bandyopadhyay and Maulik [10]; Estivill-Castro and Murray [39]; Fränti et al. [48]; Kivijärvi et al. [79]; Krishna and Murty =-=[83]-=-; Krovi [84]; Bezdek et al. [14]; Kuncheva and Bezdek [85]; Lu et al. [95][94]; Lucasius et al. [96]; Maulik and Bandyopadhyay [100]; Merz and Zell [103]; Murthy and Chowdhury [107]; Scheunders [121];... |

50 | Genetic algorithm-based clustering technique
- Maulik, Bandyopadhyay
- 2000
Citation Context ...y [39]; Fränti et al. [48]; Kivijärvi et al. [79]; Krishna and Murty [83]; Krovi [84]; Bezdek et al. [14]; Kuncheva and Bezdek [85]; Lu et al. [95][94]; Lucasius et al. [96]; Maulik and Bandyopadhyay =-=[100]-=-; Merz and Zell [103]; Murthy and Chowdhury [107]; Scheunders [121]; Sheng and Liu [122]). Cole [23] reviews and empirically assesses a number of such genetic algorithms for clustering published up to... |

48 |
Data Mining with Neural Networks
- Bigus
- 1996
Citation Context ...ermine a finite set of categories (clusters) to describe a data set according to similarities among its objects [75][40]. The applicability of clustering is manifold, ranging from market segmentation =-=[17]-=- and image processing [72] through document categorization and web mining [102]. An application field that has shown to be particularly promising for clustering techniques is bioinformatics [7][13][12... |

43 | PESA-II: Region-based selection in evolutionary multiobjective optimization
- Corne, Jerram, et al.
- 2001
Citation Context ...2] also mention the use of a (µ+λ)-like deterministic/elitist selection. The evolutionary algorithm for multi-objective clustering proposed by Handl and Knowles [59] is based on the PESA-II algorithm =-=[25]-=-, whose selection principles rely on the inte... [Footnote 8: The intermediate sub-steps and the corresponding formulae have been omitted here for the sake of compactness. Please refer to [97] for further details.] |

28 | Pattern recognition techniques in microarray data analysis: a survey - Valafar - 2002 |

27 |
In search of optimal clusters using genetic algorithms
- Murthy, Chowdhury
- 1996
Citation Context ...; Krishna and Murty [83]; Krovi [84]; Bezdek et al. [14]; Kuncheva and Bezdek [85]; Lu et al. [95][94]; Lucasius et al. [96]; Maulik and Bandyopadhyay [100]; Merz and Zell [103]; Murthy and Chowdhury =-=[107]-=-; Scheunders [121]; Sheng and Liu [122]). Cole [23] reviews and empirically assesses a number of such genetic algorithms for clustering published up to 1997. It is intuitive to think of algorithms tha... |

27 | A Genetic C-means Clustering Algorithm Applied to Image Quantization
- Scheunders
- 1997
Citation Context ...y [83]; Krovi [84]; Bezdek et al. [14]; Kuncheva and Bezdek [85]; Lu et al. [95][94]; Lucasius et al. [96]; Maulik and Bandyopadhyay [100]; Merz and Zell [103]; Murthy and Chowdhury [107]; Scheunders =-=[121]-=-; Sheng and Liu [122]). Cole [23] reviews and empirically assesses a number of such genetic algorithms for clustering published up to 1997. It is intuitive to think of algorithms that assume a fixed n... |

26 |
Genetic clustering for automatic evolution of clusters and application to image classification
- Bandyopadhyay, Maulik
Citation Context ...red. Evolutionary algorithms aimed at optimizing the number of clusters (k) and the corresponding partitions are described in the works by Cole [23], Cowgill et al. [26], Bandyopadhyay and Maulik [12]=-=[11]-=-, Hruschka and Ebecken [65], Casillas et al. [21], Hruschka et al. [69][70][68], Ma et al. [97], Alves et al. [2], Tseng and Yang [128], Naldi and de Carvalho [108], Handl and Knowles [59], and Pan an... |

22 |
An evolutionary approach to multiobjective clustering
- Handl, Knowles
Citation Context ... Maulik [12][11], Hruschka and Ebecken [65], Casillas et al. [21], Hruschka et al. [69][70][68], Ma et al. [97], Alves et al. [2], Tseng and Yang [128], Naldi and de Carvalho [108], Handl and Knowles =-=[59]-=-, and Pan and Cheng [113]. Falkenauer [43] describes a high-level paradigm (metaheuristic) that can be adapted to deal with grouping problems broadly defined, showing that it is useful for several app... |

21 |
Clustering with evolution strategies
- Babu, Murty
- 1994
Citation Context ...n be broadly divided into two main categories. The first (and most representative) one is composed of algorithms that encode and evolve prototypes for the FCM algorithm or for one of its variants [37]=-=[4]-=-[91][115][80][15][56][57][58][99][111][89]. In this case, the prototypes are encoded and manipulated using essentially the same techniques already discussed in Section II. Essentially, the only differ... |

19 |
A genetic approach to the automatic clustering problem
- Tseng, Yang
- 2001
Citation Context ...s by Cole [23], Cowgill et al. [26], Bandyopadhyay and Maulik [12][11], Hruschka and Ebecken [65], Casillas et al. [21], Hruschka et al. [69][70][68], Ma et al. [97], Alves et al. [2], Tseng and Yang =-=[128]-=-, Naldi and de Carvalho [108], Handl and Knowles [59], and Pan and Cheng [113]. Falkenauer [43] describes a high-level paradigm (metaheuristic) that can be adapted to deal with grouping problems broad... |

18 | A consensus framework for integrating distributed clusterings under limited knowledge sharing
- Ghosh, Strehl, et al.
- 2002
Citation Context ...e correspondence of labels for different partitions is not simple. In spite of the difficulty associated with this issue, there are several works investigating the ensemble of partitions [49][86][124]=-=[51]-=-[76][133][42]. However, only the last two works use a genetic algorithm for producing a clustering ensemble. Another of these works [76] uses genetic algorithms as one of the individual clustering alg... |