## Performance Evaluation of Some Clustering Algorithms and Validity Indices (2002)

Venue: IEEE Transactions on Pattern Analysis and Machine Intelligence

Citations: 59 (1 self)

### BibTeX

@ARTICLE{Maulik02performanceevaluation,

author = {Ujjwal Maulik and Sanghamitra Bandyopadhyay},

title = {Performance Evaluation of Some Clustering Algorithms and Validity Indices},

journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},

year = {2002},

volume = {24},

pages = {1650--1654}

}

### Abstract

In this article, we evaluate the performance of three clustering algorithms (hard K-Means, single linkage, and a simulated annealing (SA) based technique) in conjunction with four cluster validity indices: the Davies-Bouldin index, Dunn’s index, the Calinski-Harabasz index, and a recently developed index I. Based on a relation between the index I and Dunn’s index, a lower bound on the value of the former is theoretically estimated in order to get a unique hard K-partition when the data set has distinct substructures. The effectiveness of the different validity indices and clustering methods in automatically evolving the appropriate number of clusters is demonstrated experimentally for both artificial and real-life data sets, with the number of clusters varying from two to ten. Once the appropriate number of clusters is determined, the SA-based clustering technique is used to partition the data into that number of clusters.
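As a rough illustration of the selection procedure the abstract describes, the sketch below scores candidate partitions for several values of K with the Davies-Bouldin index and keeps the K with the lowest score (lower Davies-Bouldin values indicate more compact, better separated clusters). It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy data, and the hand-made candidate partitions are all hypothetical, and the within-cluster scatter is taken as the mean Euclidean distance to the centroid. In practice the partitions would come from K-means, single linkage, or the SA-based method.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def davies_bouldin(points, labels):
    """Davies-Bouldin index of a hard partition (lower is better).

    Assumption: scatter S_i is the mean Euclidean distance of the
    members of cluster i to its centroid. Needs at least two clusters
    with distinct centroids.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    keys = sorted(clusters)
    centers = {k: centroid(clusters[k]) for k in keys}
    scatter = {
        k: sum(math.dist(p, centers[k]) for p in clusters[k]) / len(clusters[k])
        for k in keys
    }
    total = 0.0
    for i in keys:
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in keys
            if j != i
        )
    return total / len(keys)


def select_k(points, partitions):
    """Return (best K, {K: score}) where the best K minimizes the index."""
    scores = {k: davies_bouldin(points, labels) for k, labels in partitions.items()}
    return min(scores, key=scores.get), scores


# Hypothetical toy data: three well-separated groups of four points each.
POINTS = [
    (0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (2.0, 2.0),
    (10.0, 0.0), (10.0, 2.0), (12.0, 0.0), (12.0, 2.0),
    (5.0, 10.0), (5.0, 12.0), (7.0, 10.0), (7.0, 12.0),
]

# Hypothetical candidate labelings for K = 2, 3, 4.
PARTITIONS = {
    2: [0] * 8 + [1] * 4,                # first two groups merged
    3: [0] * 4 + [1] * 4 + [2] * 4,      # one cluster per group
    4: [0, 0, 1, 1] + [2] * 4 + [3] * 4, # first group split in two
}

best_k, scores = select_k(POINTS, PARTITIONS)
print(best_k, scores)
```

On this toy data the three-cluster labeling receives the lowest score, so K = 3 is selected. The other three indices in the paper could be swapped in by replacing `davies_bouldin` and, for indices where larger is better, using `max` instead of `min`.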

### Citations

3764 | Optimization by simulated annealing
- Kirkpatrick, Gelatt, et al.
- 1983
Citation Context ...conjunction with three clustering algorithms, viz., the well-known K-means and single linkage algorithms [1], [2], as well as a recently developed simulated annealing (SA) [13], [14] based clustering scheme. The number of clusters is varied from $K_{\min}$ to $K_{\max}$ for K-means and the simulated annealing-based clustering algorithms, while, for single linkage algorithm (which incor...

2251 | Algorithms for Clustering Data
- Jain, Dubes
- 1988
Citation Context ...Terms: Unsupervised classification, Euclidean distance, K-Means algorithm, single linkage algorithm, validity index, simulated annealing. 1 INTRODUCTION. The purpose of any clustering technique [1], [2], [3], [4], [5] is to evolve a $K \times n$ partition matrix $U(X)$ of a data set $X$ ($X = \{x_1, x_2, \ldots, x_n\}$) in $\mathbb{R}^N$, representing its partitioning into a number, say $K$, of clusters ($C_1, C_2, \ldots, C_K$). The partition mat...

648 | Applied Multivariate Statistical Analysis
- Johnson, Wichern
- 1992
Citation Context ...r class one is $[0, 2] \times [0, 2] \times [0, 2] \times \cdots$ (10 times) and that for class two is $[1, 3] \times [0, 2] \times [0, 2] \times \cdots$ (9 times). Two real-life data sets considered are Crude_Oil and Cancer. Crude_Oil is an overlapping data [17] having 56 data points, five features, and three classes. The nine-dimensional Wisconsin breast cancer data (Cancer) (http://www.ics.uci.edu/~mlearn/MLRepository.html) is used for the purpose of demon...

332 | A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters
- Dunn
- 1974
Citation Context ...egies in [7]. Some more clustering algorithms may be found in [8], [9]. In this paper, we aim to evaluate the performance of four validity indices, namely, the Davies-Bouldin index [10], Dunn’s index [11], Calinski-Harabasz index [12], and a recently developed index I, in ... Author footnote: U. Maulik is with the Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, TX 76019. E-mai...

328 | A cluster separation measure
- Davies, Bouldin
- 1979
Citation Context ...nitialization strategies in [7]. Some more clustering algorithms may be found in [8], [9]. In this paper, we aim to evaluate the performance of four validity indices, namely, the Davies-Bouldin index [10], Dunn’s index [11], Calinski-Harabasz index [12], and a recently developed index I, in ... Author footnote: U. Maulik is with the Department of Computer Science and Engineering, University of Texas at Arlington, Arlingt...

321 | An examination of procedures for determining the number of clusters in a data set
- Milligan, Cooper
- 1985
Citation Context ... and 2) how real or good is the clustering itself. That is, whatever the clustering method may be, one has to determine the number of clusters and also the goodness or validity of the clusters formed [6]. The measure of validity of the clusters should be such that it will be able to impose an ordering of the clusters in terms of its goodness. In other words, if $U_1, U_2, \ldots, U_m$ is m partitions of X and ...

300 | How Many Clusters? Which Clustering Method? Answers via Model-Based Cluster Analysis
- Fraley
- 1998
Citation Context ...sing only hierarchical clustering algorithms. Meila and Heckerman provide a comparison of some clustering methods and initialization strategies in [7]. Some more clustering algorithms may be found in [8], [9]. In this paper, we aim to evaluate the performance of four validity indices, namely, the Davies-Bouldin index [10], Dunn’s index [11], Calinski-Harabasz index [12], and a recently developed index...

262 | Pattern Recognition Principles
- Tou, Gonzalez
- 1974
Citation Context ...Index Terms: Unsupervised classification, Euclidean distance, K-Means algorithm, single linkage algorithm, validity index, simulated annealing. 1 INTRODUCTION. The purpose of any clustering technique [1], [2], [3], [4], [5] is to evolve a $K \times n$ partition matrix $U(X)$ of a data set $X$ ($X = \{x_1, x_2, \ldots, x_n\}$) in $\mathbb{R}^N$, representing its partitioning into a number, say $K$, of clusters ($C_1, C_2, \ldots, C_K$). The partitio...

206 | A dendrite method for cluster analysis
- Calinski, Harabasz
- 1974
Citation Context ...ering algorithms may be found in [8], [9]. In this paper, we aim to evaluate the performance of four validity indices, namely, the Davies-Bouldin index [10], Dunn’s index [11], Calinski-Harabasz index [12], and a recently developed index I, in ... Author footnote: U. Maulik is with the Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, TX 76019. E-mail: maulik@cse.uta.edu. S. B...

200 | Validity measure for fuzzy clustering
- Xie, Beni
- 1991
Citation Context ...e with and balance each other critically. The power $p$ is used to control the contrast between the different cluster configurations. In this article, we have taken $p = 2$. Xie and Beni defined an index [15] that is a ratio of the compactness of the fuzzy K-partition of a data set to its separation $s$. Mathematically, the Xie-Beni (XB) index may be formulated as: $XB = \sum_{k=1}^{K} \sum_{j=1}^{n} u_{kj}^{2} \lVert x_j - z_k \rVert^{2}$ divided by $n \min_{i}$...
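The excerpt is cut off before the denominator of the XB formula. The sketch below assumes the standard Xie-Beni definition, in which the sum of squared, membership-weighted distances is divided by n times the minimum squared distance between two distinct cluster centers. The function name, argument layout, and toy data are hypothetical and are not taken from the paper.

```python
import math


def xie_beni(points, centers, memberships):
    """Xie-Beni index (lower is better), under the assumed standard form.

    points[j] is x_j, centers[k] is z_k, and memberships[k][j] is u_kj
    (0 or 1 for a hard partition, values in [0, 1] for a fuzzy one).
    Needs at least two distinct centers.
    """
    n = len(points)
    k_count = len(centers)
    # Numerator: sum over clusters and points of u_kj^2 * ||x_j - z_k||^2.
    compactness = sum(
        memberships[k][j] ** 2 * math.dist(points[j], centers[k]) ** 2
        for k in range(k_count)
        for j in range(n)
    )
    # Separation: smallest squared distance between two distinct centers.
    min_sep_sq = min(
        math.dist(centers[a], centers[b]) ** 2
        for a in range(k_count)
        for b in range(a + 1, k_count)
    )
    return compactness / (n * min_sep_sq)


# Hypothetical toy data: two tight pairs of points, ten units apart.
POINTS = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
CENTERS = [(0.0, 1.0), (10.0, 1.0)]
HARD_U = [
    [1, 1, 0, 0],  # memberships in cluster 0
    [0, 0, 1, 1],  # memberships in cluster 1
]

print(xie_beni(POINTS, CENTERS, HARD_U))
```

Here every point lies one unit from its center and the centers are ten units apart, so the numerator is 4 and the denominator is 4 × 100.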

128 | Well-separated clusters and optimal fuzzy partitions. J. Cybern.
- Dunn
- 1974
Citation Context ... PK k1 nk min k n min, so, XB I n : It is proven in [15] that XB : Therefore, 1 2 D I 1 2 D min : (10) n Evidently, index I becomes arbitrarily large as D grows without bound. It has been proven in [16] that, if D > 1, the hard K-partition is unique. Therefore, if the data sets have a distinct substructure and the clustering algorithm found it, then the corresponding I min n . 4 EXPERIMENTAL RESULTS ...

90 | A robust competitive clustering algorithm with applications in computer vision
- Frigui, Krishnapuram
- 1999

83 | An experimental comparison of several clustering and initialization methods
- Meila, Heckerman
- 1998
Citation Context ...s containing distinct nonoverlapping clusters while using only hierarchical clustering algorithms. Meila and Heckerman provide a comparison of some clustering methods and initialization strategies in [7]. Some more clustering algorithms may be found in [8], [9]. In this paper, we aim to evaluate the performance of four validity indices, namely, the Davies-Bouldin index [10], Dunn’s index [11], Calins...

69 | Clustering with a genetically optimized approach
- Hall, Özyurt, Bezdek
- 1999
Citation Context ...only hierarchical clustering algorithms. Meila and Heckerman provide a comparison of some clustering methods and initialization strategies in [7]. Some more clustering algorithms may be found in [8], [9]. In this paper, we aim to evaluate the performance of four validity indices, namely, the Davies-Bouldin index [10], Dunn’s index [11], Calinski-Harabasz index [12], and a recently developed index I, i...

28 | Genetic algorithm-based clustering technique
- Maulik, Bandyopadhyay
Citation Context ...ised classification, Euclidean distance, K-Means algorithm, single linkage algorithm, validity index, simulated annealing. 1 INTRODUCTION. The purpose of any clustering technique [1], [2], [3], [4], [5] is to evolve a $K \times n$ partition matrix $U(X)$ of a data set $X$ ($X = \{x_1, x_2, \ldots, x_n\}$) in $\mathbb{R}^N$, representing its partitioning into a number, say $K$, of clusters ($C_1, C_2, \ldots, C_K$). The partition matrix $U(X)$ may be...

14 | Partitional clustering using simulated annealing with probabilistic redistribution
- Bandyopadhyay, Maulik, et al.
- 2001
Citation Context ...conjunction with three clustering algorithms, viz., the well-known K-means and single linkage algorithms [1], [2], as well as a recently developed simulated annealing (SA) [13], [14] based clustering scheme. The number of clusters is varied from $K_{\min}$ to $K_{\max}$ for K-means and the simulated annealing-based clustering algorithms, while, for single linkage algorithm (which incorporate...