Parallel Algorithms for Hierarchical Clustering
- Parallel Computing
, 1995
Cited by 69 (1 self)
Hierarchical clustering is a common method used to determine clusters of similar data points in multidimensional spaces. O(n^2) algorithms are known for this problem [3, 4, 10, 18]. This paper reviews important results for sequential algorithms and describes previous work on parallel algorithms for hierarchical clustering. Parallel algorithms to perform hierarchical clustering using several distance metrics are then described. Optimal PRAM algorithms using n log n processors are given for the average link, complete link, centroid, median, and minimum variance metrics. Optimal butterfly and tree algorithms using n log n processors are given for the centroid, median, and minimum variance metrics. Optimal asymptotic speedups are achieved for the best practical algorithm to perform clustering using the single link metric on an n log n processor PRAM, butterfly, or tree. Keywords. Hierarchical clustering, pattern analysis, parallel algorithm, butterfly network, PRAM algorithm.
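As a point of reference for the abstract above, agglomerative clustering with the single link metric can be sketched in a few lines. This is a deliberately naive sequential version, not one of the O(n^2) or parallel algorithms the paper describes; the function name `single_link` and the stopping parameter `k` are assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def single_link(points, k):
    """Merge the two closest clusters repeatedly until only k clusters remain.

    Naive illustration: each round rescans every pair of clusters, so this is
    far slower than the O(n^2) algorithms cited in the abstract.
    """
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Single link: the distance between two clusters is the smallest
        # distance between any point of one and any point of the other.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: min(
                dist(a, b) for a in clusters[ij[0]] for b in clusters[ij[1]]
            ),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so index j is still valid
        del clusters[j]
    return clusters


example = single_link([(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)], k=2)
print(example)
```

Swapping the inner `min` for `max` would give the complete link metric; the other metrics named in the abstract need per-cluster summaries (centroids, variances) rather than pairwise point distances.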
Time and Space Efficient Pose Clustering
- In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
, 1994
Cited by 12 (6 self)
This paper shows that the pose clustering method of object recognition can be decomposed into small subproblems without loss of accuracy. Randomization can then be used to limit the number of subproblems that need to be examined to achieve accurate recognition. These techniques are used to decrease the computational complexity of pose clustering. The clustering step is formulated as an efficient tree search of the pose space. This method requires little memory since not many poses are clustered at a time. Analysis shows that pose clustering is not inherently more sensitive to noise than other methods of generating hypotheses. Finally, experiments on real and synthetic data are presented.

1 Introduction

Model-based object recognition systems determine which objects appear in images using a catalog of object models and estimate their positions and orientations (poses) relative to the camera. This paper examines methods of improving the efficiency of the pose clustering method of object ...
Computer Vision Algorithms on Reconfigurable Logic Arrays
- IEEE TRANS. ON PARALLEL AND DISTRIBUTED SYSTEMS
, 1999
Cited by 11 (1 self)
Computer vision algorithms are natural candidates for high-performance computing due to their inherent parallelism and intense computational demands. For example, a simple 3 x 3 convolution on a 512 x 512 gray scale image at 30 frames per second requires 67.5 million multiplications and 60 million additions to be performed in one second. Computer vision tasks can be classified into three categories based on their computational complexity and communication complexity: low-level, intermediate-level and high-level. Special-purpose hardware provides better performance than general-purpose hardware for all three levels of vision tasks. With recent advances in very large scale integration (VLSI) technology, an application specific integrated circuit (ASIC) can provide the best performance in terms of total execution time. However, the long design cycle time, high development cost and inflexibility of dedicated hardware deter the design of ASICs. In contrast, field programmable gate arrays (FPGAs) support lower design verification time and easier design adaptability at a lower cost. Hence, FPGAs with an array of reconfigurable logic blocks can be very useful compute elements. FPGA-based custom computing machines are ...
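The operation counts quoted in the abstract can be checked with a short calculation. The sketch below assumes that every pixel, including border pixels, receives the full 3 x 3 kernel, and that summing nine products costs eight additions. Under those assumptions the quoted figures match exactly when "million" is read as 2^20 rather than 10^6; that reading is an inference from the arithmetic, not something the abstract states.

```python
WIDTH, HEIGHT = 512, 512
KERNEL_TAPS = 3 * 3
FRAMES_PER_SECOND = 30

pixels_per_second = WIDTH * HEIGHT * FRAMES_PER_SECOND

# One multiplication per kernel tap per output pixel.
multiplications = pixels_per_second * KERNEL_TAPS
# Summing the nine products of one output pixel takes eight additions.
additions = pixels_per_second * (KERNEL_TAPS - 1)

BINARY_MILLION = 2 ** 20  # assumption: "million" in the abstract means 2^20

print(multiplications, multiplications / BINARY_MILLION)  # 70778880 67.5
print(additions, additions / BINARY_MILLION)              # 62914560 60.0
```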
Improving the Orthogonal Range Search k-windows Algorithm
- In Proceedings of the 14th IEEE International Conference on Tools with Artificial Intelligence
, 2002
Cited by 7 (7 self)
Clustering, that is, the partitioning of a set of patterns into disjoint and homogeneous meaningful groups (clusters), is a fundamental process in the practice of science. k-windows is an efficient clustering algorithm that reduces the number of patterns that need to be examined for similarity, using a windowing technique. It is based on a well-known spatial data structure, the range tree, which allows fast range searches. From a theoretical standpoint, the k-windows algorithm has a lower time complexity than other well-known clustering algorithms. Moreover, it achieves high quality clustering results. However, it seems that it would not be directly applicable in high-dimensional practical settings due to the superlinear space requirements of the range tree. In this paper we present an improvement of the k-windows algorithm, aimed at this problem, that is based on a different solution to the orthogonal range search problem.
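The orthogonal range search problem named in the abstract asks for all points that fall inside an axis-aligned window. The sketch below shows the query itself using a linear-space index sorted on the first coordinate. It is neither a range tree nor the alternative structure the paper proposes (the abstract does not say which one that is); the class name `SortedRangeIndex` and its methods are hypothetical.

```python
from bisect import bisect_left, bisect_right


class SortedRangeIndex:
    """Linear-space index: binary search on one axis, then filter the rest."""

    def __init__(self, points):
        self.points = sorted(points)  # tuples sort by first coordinate first
        self.first_coords = [p[0] for p in self.points]

    def query(self, low, high):
        """Return the points p with low[d] <= p[d] <= high[d] for every axis d."""
        start = bisect_left(self.first_coords, low[0])
        stop = bisect_right(self.first_coords, high[0])
        return [
            p
            for p in self.points[start:stop]
            if all(lo <= c <= hi for c, lo, hi in zip(p, low, high))
        ]


index = SortedRangeIndex([(1, 1), (2, 5), (3, 3), (4, 2), (6, 6)])
print(index.query(low=(2, 2), high=(4, 5)))  # [(2, 5), (3, 3), (4, 2)]
```

The trade-off mirrors the one the abstract raises: this index needs only linear space, but a query can still scan many candidates, whereas a range tree answers queries faster at the price of superlinear space.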
Parallelism in knowledge discovery techniques
- LNCS 2367: Applied Parallel Computing, 6th International Conference PARA’02
, 2002
Cited by 1 (0 self)
Knowledge discovery in databases or data mining is the semiautomated analysis of large volumes of data, looking for the relationships and knowledge that are implicit in large volumes of data and are 'interesting' in the sense of impacting an organization's practice. Data mining and knowledge discovery on large amounts of data can benefit from the use of parallel computers both to improve performance and quality of data selection. This paper presents and discusses different forms of parallelism that can be exploited in data mining techniques and algorithms. For the main data mining techniques, such as rule induction, clustering algorithms, decision trees, genetic algorithms, and neural networks, the possible ways to exploit parallelism are presented and discussed in detail. Finally, some promising research directions in the parallel data mining research area are outlined.
Vectorization and Parallelization of Clustering Algorithms
- VI Spanish Symposium on Pattern Recognition and Image Analysis
, 1995
Cited by 1 (0 self)
In this work we present a study on the parallelization of code segments that are typical of clustering algorithms. In order to approach this problem from a practical point of view we have considered the parallelization on the three types of architectures currently available from parallel system manufacturers: vector computers, shared memory multiprocessors and distributed memory multicomputers. We have selected the FC (Fuzzy Covariance) and AD (Affinity Decompositions) algorithms as representative of the different computational structures found in clustering algorithms. We present a comparative study of the results obtained from running these algorithms on three systems: VP2400/10, KSR-1 and AP1000.

1 Introduction

The automatic classification of data is one of the basic tasks in pattern recognition. Given its iterative nature and high computational cost (CPU time), the most adequate solution for its numerical treatment is to use concurrent techniques in order to reduce the execution ...
A Fast Parallel Clustering Algorithm for Large Spatial Databases
- Data Mining and Knowledge Discovery, 3, 263–290, © 1999 Kluwer Academic Publishers
, 1999
The clustering algorithm DBSCAN relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape as well as to distinguish noise. In this paper, we present PDBSCAN, a parallel version of this algorithm. We use the 'shared-nothing' architecture with multiple computers interconnected through a network. A fundamental component of a shared-nothing system is its distributed data structure. We introduce the dR*-tree, a distributed spatial index structure in which the data is spread among multiple computers and the indexes of the data are replicated on every computer. We implemented our method using a number of workstations connected via Ethernet (10 Mbit). A performance evaluation shows that PDBSCAN offers nearly linear speedup and has excellent scaleup and sizeup behavior.
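To make the density-based notion of clusters concrete, here is a minimal sequential sketch of the textbook DBSCAN procedure. It is not PDBSCAN and uses no dR*-tree: neighborhoods are found by brute force on a single machine. The parameter names `eps` and `min_pts` follow common DBSCAN usage and do not appear in the abstract above, so treat them, and the `NOISE` label, as assumptions.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)

    def neighbors(i):
        # Brute-force eps-neighborhood of point i (includes i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1  # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is also core, keep expanding
                queue.extend(reachable)
    return labels


data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11),
        (50, 50)]
print(dbscan(data, eps=1.5, min_pts=3))  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The `neighbors` call dominates the running time, which is why a spatial index matters for large databases; the abstract's contribution is distributing that index and the data across machines.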

