## Parallel Algorithms for Hierarchical Clustering (1995)

### Download Links

- www.ece.nwu.edu
- robotics.jpl.nasa.gov
- DBLP

### Other Repositories/Bibliography

Venue: Parallel Computing

Citations: 84 (1 self)

### BibTeX

@ARTICLE{Olson95parallelalgorithms,
  author  = {Clark F. Olson},
  title   = {Parallel Algorithms for Hierarchical Clustering},
  journal = {Parallel Computing},
  year    = {1995},
  volume  = {21},
  pages   = {1313--1325}
}

### Abstract

Hierarchical clustering is a common method used to determine clusters of similar data points in multidimensional spaces. O(n²) algorithms are known for this problem [3, 4, 10, 18]. This paper reviews important results for sequential algorithms and describes previous work on parallel algorithms for hierarchical clustering. Parallel algorithms to perform hierarchical clustering using several distance metrics are then described. Optimal PRAM algorithms using n/log n processors are given for the average link, complete link, centroid, median, and minimum variance metrics. Optimal butterfly and tree algorithms using n/log n processors are given for the centroid, median, and minimum variance metrics. Optimal asymptotic speedups are achieved for the best practical algorithm to perform clustering using the single link metric on an n/log n processor PRAM, butterfly, or tree. Keywords: hierarchical clustering, pattern analysis, parallel algorithm, butterfly network, PRAM algorithm.
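As a concrete reference point for the agglomerative procedure the abstract summarizes, the following is a deliberately naive, O(n³)-time sequential sketch of clustering under the single link metric (merge the two clusters whose closest pair of points is nearest). The function names and structure are illustrative, not taken from the paper's algorithms:

```python
import itertools

def single_link_clustering(points, dist):
    """Naive agglomerative clustering, single link metric.
    Repeatedly merges the two clusters whose closest pair of points is
    nearest, recording each merge; the merge list encodes the hierarchy."""
    clusters = [[p] for p in points]
    merges = []                                   # (cluster_a, cluster_b, distance)
    while len(clusters) > 1:
        best = None
        for (ia, a), (ib, b) in itertools.combinations(enumerate(clusters), 2):
            # single link distance: minimum over all cross-cluster point pairs
            d = min(dist(p, q) for p in a for q in b)
            if best is None or d < best[0]:
                best = (d, ia, ib)
        d, ia, ib = best
        merges.append((clusters[ia], clusters[ib], d))
        clusters[ia] = clusters[ia] + clusters[ib]
        del clusters[ib]
    return merges
```

For example, clustering the 1-D points 0, 1, and 10 first merges {0} with {1} at distance 1, then merges {0, 1} with {10} at distance 9. The paper's contribution is doing this kind of work in O(n log n) parallel time, not the O(n³) loop above.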

### Citations

2544 | The Design and Analysis of Computer Algorithms - Aho, Hopcroft, et al. - 1974 |

1596 | A note on two problems in connexion with graphs
- Dijkstra
- 1959
Citation Context ...eue) and the operations of decreasing the value of a key can be performed in O(1) time. The use of this data structure allows the parallel implementation of Dijkstra's minimum spanning tree algorithm [8] (also known as Prim's algorithm) in O(n log n) time using m/(n log n) processors, where n is the number of nodes in the graph and m is the number of edges. Since the clustering problem must consider th... |

1365 | Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes - Leighton - 1992 |

889 |
A note on two problems in connection with graphs
- Dijkstra
- 1959
Citation Context ...ed in O(log n) time and decreasing the value of a key to be performed in O(1) time. The use of this data structure allows the parallel implementation of Dijkstra's minimal spanning tree algorithm [5] in O(n log n) time using m/(n log n) processors on a PRAM, where n is the number of vertices in the graph and m is the number of edges. Bruynooghe [2] describes a parallel implementation of the nearest... |

631 | Generalizing the Hough transform to detect arbitrary shapes - Ballard - 1981 |

573 |
Shortest connection networks and some generalizations
- Prim
- 1957
Citation Context ...ing using the single link metric and the minimum variance metric on a SIMD array processor. They have implemented parallel versions of the SLINK algorithm [18], Prim's minimal spanning tree algorithm [13], and Ward's minimum variance method [19]. Their parallel implementations of the SLINK algorithm and Ward's minimum variance algorithm do not decrease the O(n²) time required by the serial implement... |

473 |
On the Shortest Spanning Subtree of a Graph and the Traveling Salesman Problem
- Kruskal
Citation Context ...hms for hierarchical clustering using the single link metric on an n-node hypercube and an n-node butterfly. Their algorithms are parallel implementations of Kruskal's minimal spanning tree algorithm [7] and run in O(n log n) time on the hypercube and O(n log² n) on the butterfly, but in fact the algorithms appear to have a fatal flaw causing incorrect operation. In their algorithm, each processor s... |
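For context on the snippet above: Kruskal's algorithm greedily takes the cheapest remaining edge that joins two different components, tracked with a union-find structure. A compact sequential sketch (illustrative only, not the cited parallel implementation):

```python
def kruskal(n, edges):
    """Kruskal's minimal spanning tree over vertices 0..n-1.
    edges is a list of (weight, u, v) tuples; returns the chosen MST edges."""
    parent = list(range(n))

    def find(x):
        # find the component root, compressing the path as we go
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):          # cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                       # joins two components: keep it
            parent[ru] = rv
            mst.append((w, u, v))
    return mst
```

The sequential bottleneck is the edge sort plus near-linear union-find work; the parallel difficulty discussed in the excerpt is coordinating which components have merged across processors.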

192 |
On constructing minimum spanning trees in k-dimensional space and related problems
- Yao
- 1982
Citation Context ...uire the same computational complexity since the minimal spanning tree can easily be transformed into the cluster hierarchy. While o(n²) algorithms exist to find the Euclidean minimal spanning tree [20], these algorithms are impractical when the dimensionality of the cluster space d > 2. Figure 2 gives a practical algorithm for the single link metric. Computing arrays storing each D(i, j) and N(i) r... |

142 |
Mathematical Theory of Connecting Networks and Telephone Traffic
- Benes
- 1965
Citation Context ... consider O(n²) possible permutations (corresponding to the n(n − 1)/2 pairs of clusters we could merge), we can compute deterministic O(log n) time routing schedules for each of them off-line [1]. These schedules are then indexed by the numbers of the clusters that are merged. Thus, we have an efficient parallel algorithm for general algorithms on a butterfly network, but we now require compu... |

122 |
A Survey of Recent Advances in Hierarchical Clustering Algorithms
- Murtagh
- 1983
Citation Context ...3 clarko@cs.cornell.edu Abstract Hierarchical clustering is a common method used to determine clusters of similar data points in multidimensional spaces. O(n²) algorithms are known for this problem [3, 4, 10, 18]. This paper reviews important results for sequential algorithms and describes previous work on parallel algorithms for hierarchical clustering. Parallel algorithms to perform hierarchical clustering ... |

111 |
SLINK: An optimally efficient algorithm for the single link cluster method. The Computer Journal 16(1):30–34
- Sibson
- 1973
Citation Context ...3 clarko@cs.cornell.edu Abstract Hierarchical clustering is a common method used to determine clusters of similar data points in multidimensional spaces. O(n²) algorithms are known for this problem [3, 4, 10, 18]. This paper reviews important results for sequential algorithms and describes previous work on parallel algorithms for hierarchical clustering. Parallel algorithms to perform hierarchical clustering ... |

107 |
A general theory of classificatory sorting strategies: I. Hierarchical systems
- Lance, Williams
- 1966
Citation Context ...distances from each point to the center of its cluster that would be caused by agglomerating the clusters. Useful clustering metrics can usually be described using the Lance-Williams updating formula [8]. The distance from the new cluster i + j to any other cluster k is given by: d(i+j, k) = a(i)·d(i, k) + a(j)·d(j, k) + b·d(i, j) + c·|d(i, k) − d(j, k)|. Table 1 gives the coefficients in the Lanc... |
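The Lance-Williams recurrence quoted above can be made concrete. The paper's Table 1 is not reproduced on this page, so the coefficients below are the standard textbook values (with centroid, median, and Ward's minimum variance stated for squared Euclidean distances); treat them as an assumption, not a quotation:

```python
def lw_coefficients(metric, ni, nj, nk):
    """Standard Lance-Williams coefficients (a_i, a_j, b, c) for merging
    clusters i and j (sizes ni, nj) and updating the distance to cluster k
    (size nk). Centroid/median/Ward assume squared Euclidean distances."""
    if metric == "single":
        return 0.5, 0.5, 0.0, -0.5
    if metric == "complete":
        return 0.5, 0.5, 0.0, 0.5
    if metric == "average":
        return ni / (ni + nj), nj / (ni + nj), 0.0, 0.0
    if metric == "centroid":
        s = ni + nj
        return ni / s, nj / s, -ni * nj / s**2, 0.0
    if metric == "median":
        return 0.5, 0.5, -0.25, 0.0
    if metric == "ward":
        s = ni + nj + nk
        return (ni + nk) / s, (nj + nk) / s, -nk / s, 0.0
    raise ValueError(f"unknown metric: {metric}")

def lw_update(metric, dik, djk, dij, ni, nj, nk):
    """d(i+j, k) = a_i*d(i,k) + a_j*d(j,k) + b*d(i,j) + c*|d(i,k) - d(j,k)|."""
    ai, aj, b, c = lw_coefficients(metric, ni, nj, nk)
    return ai * dik + aj * djk + b * dij + c * abs(dik - djk)
```

A quick sanity check of the single and complete link rows: with d(i, k) = 3 and d(j, k) = 5, the single link update yields 3 (the minimum) and the complete link update yields 5 (the maximum), as those metrics require.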

102 | Three-dimensional model matching from an unconstrained viewpoint - Thompson, Mundy - 1987 |

84 | Object recognition and localization via pose clustering - STOCKMAN - 1987 |

80 |
Efficient algorithms for agglomerative hierarchical clustering methods
- Day, Edelsbrunner
- 1984
Citation Context ...3 clarko@cs.cornell.edu Abstract Hierarchical clustering is a common method used to determine clusters of similar data points in multidimensional spaces. O(n²) algorithms are known for this problem [3, 4, 10, 18]. This paper reviews important results for sequential algorithms and describes previous work on parallel algorithms for hierarchical clustering. Parallel algorithms to perform hierarchical clustering ... |

79 |
Relaxed heaps: an alternative to Fibonacci heaps with applications to parallel computation
- Driscoll
- 1988
Citation Context ... from the new cluster to each of the other clusters. If this step is added to their algorithms in a straightforward manner, the times required by their algorithms increase to O(n²). Driscoll et al. [6] have described a useful data structure called the relaxed heap and they have shown how it can be applied to the parallel computation of minimal spanning trees. The relaxed heap is a data structure fo... |

72 |
Multidimensional Clustering Algorithms. Heidelberg and Vienna
- Murtagh
- 1985
Citation Context ...− d(j, k)|. Table 1 gives the coefficients in the Lance-Williams updating formula for the metrics described above. O(n²) time algorithms exist to perform clustering using each of these metrics [3, 4, 11, 18]. Any metric that can be described by the Lance-Williams updating formula can be performed in O(n² log n) time [3]. This paper reviews several important sequential algorithms and discusses previous w... |

56 |
Hierarchical grouping to optimize an objective function
- Ward
- 1963
Citation Context ...minimum variance metric on a SIMD array processor. They have implemented parallel versions of the SLINK algorithm [18], Prim's minimal spanning tree algorithm [13], and Ward's minimum variance method [19]. Their parallel implementations of the SLINK algorithm and Ward's minimum variance algorithm do not decrease the O(n²) time required by the serial implementation, but a significant constant factor ... |

56 | Pose determination of a three-dimensional object using triangle pairs - Linnainmaa, Harwood, et al. - 1988 |

42 | An efficient algorithm for a complete link method. The Computer Journal 20(4):364–366 - Defays - 1977 |

34 | Programming a hypercube multicomputer
- Ranka, Won, et al.
- 1988
Citation Context ...arallel algorithms for hierarchical clustering. In addition, there has been much recent work on parallelizing partitional clustering algorithms (another popular type of clustering). See, for example, [14, 16, 21]. Rasmussen and Willett [15] discuss parallel implementations of clustering using the single link metric and the minimum variance metric on a SIMD array processor. They have implemented parallel versi... |

24 | Optimal expected time algorithms for closest point problems - Bentley, Weide, et al. - 1980 |

16 | Time and space efficient pose clustering - Olson - 1994 |

15 |
Parallel clustering algorithms
- Li, Fang
- 1989
Citation Context ... perform clustering efficiently in the general case. achieved. Their parallel implementation of Prim's minimal spanning tree algorithm achieves O(n log n) time with sufficient processors. Li and Fang [9] describe algorithms for hierarchical clustering using the single link metric on an n-node hypercube and an n-node butterfly. Their algorithms are parallel implementations of Kruskal's minimal spannin... |

11 |
Algorithm 76. Hierarchical clustering using the minimum spanning tree
- Rohlf
- 1973
Citation Context ..., the loop requires O(log n) time to find the minimum A(j) and thus O(n log n) time overall on a butterfly or tree. The minimal spanning tree can then easily be transformed into the cluster hierarchy [11, 17]. Figure 9 shows how the trees are built in these algorithms. 4.2 Centroid and median metrics The single link PRAM algorithm can also be used for the centroid and median metrics with small modificatio... |

8 |
Efficiency of hierarchic agglomerative clustering using the ICL distributed array processor
- Willett
- 1989
Citation Context ...clustering. In addition, there has been much recent work on parallelizing partitional clustering algorithms (another popular type of clustering). See, for example, [14, 16, 21]. Rasmussen and Willett [15] discuss parallel implementations of clustering using the single link metric and the minimum variance metric on a SIMD array processor. They have implemented parallel versions of the SLINK algorithm [... |

6 |
A General Theory of Classificatory Sorting Strategies, Computer Journal
- Lance, Williams
- 1967
Citation Context ...um of squared distances from each point to the center of its cluster caused by agglomerating the clusters. Useful clustering metrics can usually be described using the Lance-Williams updating formula [12]: d(i+j, k) = a(i)·d(i, k) + a(j)·d(j, k) + b·d(i, j) + c·|d(i, k) − d(j, k)|. So, the distance between a new cluster (combining the two previous clusters i and j) and the cluster k is a function of the prev... |

5 | A Probabilistic Minimum Spanning Tree Algorithm, IBM Research Report C6502 - Rohlf - 1977 |

4 |
Parallel squared error clustering on hypercube arrays
- Rivera, Ismail, et al.
- 1990
Citation Context ...arallel algorithms for hierarchical clustering. In addition, there has been much recent work on parallelizing partitional clustering algorithms (another popular type of clustering). See, for example, [14, 16, 21]. Rasmussen and Willett [15] discuss parallel implementations of clustering using the single link metric and the minimum variance metric on a SIMD array processor. They have implemented parallel versi... |

4 |
Parallel fuzzy clustering on fixed size hypercube SIMD computers
- Zapata, Rivera, et al.
- 1989
Citation Context ...arallel algorithms for hierarchical clustering. In addition, there has been much recent work on parallelizing partitional clustering algorithms (another popular type of clustering). See, for example, [14, 16, 21]. Rasmussen and Willett [15] discuss parallel implementations of clustering using the single link metric and the minimum variance metric on a SIMD array processor. They have implemented parallel versi... |

4 | Expected-time complexity results for hierarchic clustering algorithms which use cluster centres - Murtagh - 1983 |

3 | Classification ascendante hiérarchique des grands ensembles de données: un algorithme rapide fondé sur la construction des voisinages réductibles, Les Cahiers de l’Analyse des Données 3 - Bruynooghe - 1978 |

3 |
Time and space efficient pose clustering
- Olson
- 1994
Citation Context ...duction Clustering of multi-dimensional data is required in many fields. For example, in model-based object recognition, pose clustering is used to determine possible locations of an object in an image [25, 26, 19]. Some possible methods of clustering data are: 1. Hierarchical Clustering: These methods start with each point being considered a cluster and recursively combine pairs of clusters (subsequently updat... |

2 |
Parallel implementation of fast clustering algorithms
- Bruynooghe
- 1989
Citation Context ...tion of Dijkstra's minimal spanning tree algorithm [5] in O(n log n) time using m/(n log n) processors on a PRAM, where n is the number of vertices in the graph and m is the number of edges. Bruynooghe [2] describes a parallel implementation of the nearest neighbors clustering algorithm suitable for a parallel supercomputer. At each step, this algorithm dispatches tasks to determine the nearest neighbo... |

2 |
SLINK: An optimally efficient algorithm for the single link cluster method
- Sibson
- 1973
Citation Context ...is a common method used to determine clusters of similar data points in multi-dimensional spaces. O(n²) algorithms, where n is the number of points to cluster, have long been known for this problem [24, 7, 6]. This paper discusses parallel algorithms to perform hierarchical clustering using various distance metrics. I describe O(n) time algorithms for clustering using the single link, average link, comple... |

1 |
The Design and Analysis of Computer Algorithms
- Aho, Hopcroft, et al.
- 1974
Citation Context ...r each cluster to keep track of which clusters are closest to each. Since we only need to know which cluster is the closest at each step, this can easily be implemented using a heap (see, for example, [1]). We can create a priority queue in O(n) time on a single processor, but we must now update each priority queue after each agglomeration, a step that takes O(log n) time on a PRAM. This algorithm thu... |
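The heap-based priority-queue bookkeeping this excerpt describes is the same idea behind the classic binary-heap form of Prim's algorithm. A sequential sketch using Python's `heapq` (illustrative only; it uses lazy deletion of stale entries in place of an explicit decrease-key operation, since `heapq` does not provide one):

```python
import heapq

def prim_mst_weight(adj):
    """Total weight of a minimal spanning tree of a connected graph.
    adj maps each vertex to a list of (weight, neighbor) pairs.
    The heap holds candidate crossing edges; stale entries for vertices
    already in the tree are skipped on extraction (lazy decrease-key)."""
    start = next(iter(adj))
    seen = {start}
    heap = list(adj[start])
    heapq.heapify(heap)
    total = 0
    while heap and len(seen) < len(adj):
        w, v = heapq.heappop(heap)        # cheapest candidate edge
        if v in seen:
            continue                      # stale entry: endpoint already in tree
        seen.add(v)
        total += w
        for edge in adj[v]:
            heapq.heappush(heap, edge)
    return total
```

Lazy deletion keeps the heap simple at the cost of holding up to O(m) entries; structures like the relaxed heap mentioned above exist precisely to make decrease-key cheap and parallelizable.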

1 |
Classification ascendante hiérarchique des grands ensembles de données: un algorithme rapide fondé sur la construction des voisinages réductibles. Les Cahiers de l'Analyse des Données, III:7–33
- Bruynooghe
- 1978
Citation Context ...e reducibility property. An O(n²) algorithm using nearest neighbor chains is given next. To produce exact results, this algorithm requires that the distance metric satisfies the reducibility property [5]. The reducibility property requires that if the following distance constraints hold for clusters i, j, and k for some distance ρ: d(i, j) < ρ, d(i, k) > ρ, d(j, k) > ρ, then we must have for the agglomerated c... |