Parallel Algorithms for Hierarchical Clustering
- Parallel Computing
, 1995
Cited by 69 (1 self)
Hierarchical clustering is a common method used to determine clusters of similar data points in multidimensional spaces. O(n^2) algorithms are known for this problem [3, 4, 10, 18]. This paper reviews important results for sequential algorithms and describes previous work on parallel algorithms for hierarchical clustering. Parallel algorithms to perform hierarchical clustering using several distance metrics are then described. Optimal PRAM algorithms using n/log n processors are given for the average link, complete link, centroid, median, and minimum variance metrics. Optimal butterfly and tree algorithms using n/log n processors are given for the centroid, median, and minimum variance metrics. Optimal asymptotic speedups are achieved for the best practical algorithm to perform clustering using the single link metric on an n/log n processor PRAM, butterfly, or tree. Keywords. Hierarchical clustering, pattern analysis, parallel algorithm, butterfly network, PRAM algorithm. 1 In...
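As a point of reference for the abstract above, the following is a minimal sequential sketch of agglomerative clustering under the single link metric. It is a naive baseline written for illustration only, not the paper's O(n^2) or parallel algorithms; the function name `single_link` and the stopping parameter `k` are hypothetical.

```python
def single_link(points, k):
    """Naive agglomerative clustering with the single link metric.

    Repeatedly merges the two clusters whose closest members are nearest,
    until k clusters remain. Returns clusters as sorted lists of point indices.
    This brute-force version is far slower than the O(n^2) methods cited above.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    clusters = [[i] for i in range(len(points))]

    def dist2(a, b):
        # Squared Euclidean distance between points a and b (by index).
        return sum((x - y) ** 2 for x, y in zip(points[a], points[b]))

    def link(c1, c2):
        # Single link: distance between the closest pair across two clusters.
        return min(dist2(a, b) for a in c1 for b in c2)

    while len(clusters) > k:
        # Find the closest pair of clusters; ties resolve to the lowest indices.
        _, i, j = min(
            (link(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
    print(single_link(pts, 3))
```

Swapping `min` for `max` inside `link` would give the complete link metric; the other metrics named in the abstract need per-cluster summaries rather than pairwise point distances.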
The Generalized Dimension Exchange Method for Load Balancing in k-ary n-cubes and Variants
, 1995
Cited by 42 (9 self)
The Generalized Dimension Exchange (GDE) method is a fully distributed load balancing method that operates in a relaxation fashion for multicomputers with a direct communication network. It is parameterized by an exchange parameter λ that governs the splitting of load between a pair of directly connected processors during load balancing. An optimal λ would lead to the fastest convergence of the balancing process. Previous work has resulted in the optimal λ for the binary n-cubes. In this paper, we derive the optimal λ's for the k-ary n-cube network and its variants: the ring, the torus, the chain, and the mesh. We establish the relationships between the optimal convergence rates of the method when applied to these structures, and conclude that the GDE method favors high dimensional k-ary n-cubes. We also reveal the superiority of the GDE method over another relaxation-based method, the diffusion method. We further show through statistical simulations that the optimal λ's do speed up the GDE...
Analysis of The Generalized Dimension Exchange Method for Dynamic Load Balancing
- Journal of Parallel and Distributed Computing
, 1992
Cited by 40 (7 self)
The dimension exchange method is a distributed load balancing method for point-to-point networks. We add a parameter, called the exchange parameter, to the method to control the splitting of load between a pair of directly connected processors, and call this parameterized version the generalized dimension exchange (GDE) method. The rationale for introducing this parameter is that splitting the workload into equal halves does not necessarily lead to an optimal result (in terms of the convergence rate) for certain structures. We carry out an analysis of this new method, emphasizing its termination aspects and potential efficiency. Given a specific structure, one needs to determine a value for the exchange parameter that would lead to an optimal result. To this end, we first derive a necessary and sufficient condition for the termination of the method. We then show that equal splitting, proposed originally by others as a heuristic strategy, indeed yields optimal efficie...
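The role of the exchange parameter can be illustrated with a small simulation. The sketch below assumes a hypercube and the commonly stated pairwise update, in which each processor keeps a fraction (1 - lam) of its own load and takes a fraction lam of its partner's; the function name `gde_sweep` and the parameter name `lam` are hypothetical, and the paper's exact formulation may differ.

```python
def gde_sweep(loads, lam):
    """One sweep of dimension exchange over every dimension of a hypercube.

    loads: workload per processor; its length must be a power of two.
    lam:   exchange parameter; 0.5 corresponds to equal splitting.
    Returns the workloads after exchanging along each dimension once.
    """
    n = len(loads)
    if n == 0 or n & (n - 1):
        raise ValueError("number of processors must be a power of two")
    w = [float(x) for x in loads]
    dims = n.bit_length() - 1
    for d in range(dims):
        nxt = w[:]
        for i in range(n):
            j = i ^ (1 << d)  # partner of processor i along dimension d
            # Assumed update rule: keep (1 - lam) of own load, take lam of partner's.
            nxt[i] = (1 - lam) * w[i] + lam * w[j]
        w = nxt
    return w


if __name__ == "__main__":
    start = [8, 0, 0, 0, 0, 0, 0, 0]
    print(gde_sweep(start, 0.5))   # equal splitting
    print(gde_sweep(start, 0.25))  # unequal splitting
```

Under this update rule, equal splitting balances a hypercube in a single sweep because each dimension's exchange averages the two partners. Other values of `lam` leave a residual imbalance after one sweep and need further sweeps.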
Large-scale parallel data clustering
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1998
Cited by 29 (3 self)
Algorithmic enhancements are described that enable large computational reduction in mean square-error data clustering. These improvements are incorporated into a parallel data-clustering tool, P-CLUSTER, designed to execute on a network of workstations. Experiments involving the unsupervised segmentation of standard texture images were performed. For some data sets, a 96 percent reduction in computation was achieved. Index Terms: Data clustering, mean square error, data mining, image segmentation, parallel algorithm, network of workstations.
On Runtime Parallel Scheduling for Processor Load Balancing
- IEEE Trans. Parallel and Distributed Systems
, 1997
Cited by 22 (0 self)
Parallel scheduling is a new approach for load balancing. In parallel scheduling, all processors cooperate to schedule work. Parallel scheduling is able to accurately balance the load by using global load information at compile-time or runtime. It provides high-quality load balancing. This paper presents an overview of the parallel scheduling technique. Scheduling algorithms for tree, hypercube, and mesh networks are presented. These algorithms can fully balance the load and maximize locality. 1. Introduction. Static scheduling balances the workload before runtime and can be applied to problems with a predictable structure, which are called static problems. Dynamic scheduling performs scheduling activities concurrently at runtime and applies to problems with an unpredictable structure, which are called dynamic problems. Static scheduling utilizes the knowledge of problem characteristics to reach a well-balanced load [1, 2, 3, 4]. However, it is not able to balance the load for dynami...
Iterative Dynamic Load Balancing in Multicomputers
- Journal of the Operational Research Society
, 1994
Cited by 20 (3 self)
Dynamic load balancing in multicomputers can improve the utilization of processors and the efficiency of parallel computations through migrating workload across processors at runtime. We present a survey and critique of dynamic load balancing strategies that are iterative: workload migration is carried out through transferring processes across nearest neighbor processors. Iterative strategies have become prominent in recent years because of the increasing popularity of point-to-point interconnection networks for multicomputers. Key words: dynamic load balancing, multicomputers, optimization, queueing theory, scheduling. INTRODUCTION. Multicomputers are highly concurrent systems that are composed of many autonomous processors connected by a communication network [1, 2]. To improve the utilization of the processors, parallel computations in multicomputers require that processes be distributed to processors in such a way that the computational load is evenly spread among the processors...
Nearest Neighbor Algorithms for Load Balancing in Parallel Computers
, 1995
Cited by 18 (2 self)
With nearest neighbor load balancing algorithms, a processor makes balancing decisions based on localized workload information and manages workload migrations within its neighborhood. This paper compares two fairly well-known nearest neighbor algorithms, the dimension-exchange (DE, for short) and the diffusion (DF, for short) methods, and their several variants: the average dimension-exchange (ADE), the optimally-tuned dimension-exchange (ODE), the local average diffusion (ADF) and the optimally-tuned diffusion (ODF). The measures of interest are their efficiency in driving any initial workload distribution to a uniform distribution and their ability to control the growth of the variance among the processors' workloads. The comparison is made with respect to both one-port and all-port communication architectures and in consideration of various implementation strategies including synchronous/asynchronous invocation policies and static/dynamic random workload behaviors. It t...
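To make the diffusion idea concrete, the sketch below simulates a synchronous first-order diffusion step in which every processor moves a fixed fraction of each load difference to or from its direct neighbors. This is the textbook form of the scheme and is assumed here for illustration; it does not reproduce the ADF or ODF variants compared in the paper. The names `diffusion_step`, `ring`, and `alpha` are hypothetical.

```python
def ring(n):
    """Neighbor lists for a ring of n >= 3 processors."""
    return [[(i - 1) % n, (i + 1) % n] for i in range(n)]


def diffusion_step(loads, neighbors, alpha):
    """One synchronous diffusion step using only local information.

    Each processor i changes its load by alpha * (w_j - w_i) for every
    direct neighbor j. Neighbor lists must be symmetric so that the
    total workload is conserved.
    """
    return [
        w + alpha * sum(loads[j] - w for j in neighbors[i])
        for i, w in enumerate(loads)
    ]


if __name__ == "__main__":
    loads = [12, 0, 0, 0]
    topo = ring(4)
    for step in range(5):
        loads = diffusion_step(loads, topo, 0.25)
        print(step + 1, loads)
```

Unlike dimension exchange, which balances with one neighbor at a time, this step uses all neighbors at once, which loosely matches the all-port communication case mentioned in the abstract.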
Parallel Remapping Algorithms for Adaptive Problems
- Proc. of the Symp. on the Frontiers of Massively Parallel Computation
, 1995
Cited by 14 (3 self)
In this paper we present fast parallel algorithms for remapping a class of irregular and adaptive problems on coarse-grained distributed-memory machines. We show that the remapping of these applications, using simple index-based mapping algorithms, can be reduced to sorting a nearly sorted list of integers or merging an unsorted list of integers with a sorted list of integers. By using the algorithms we have developed, the remapping of these problems can be achieved at a fraction of the cost of mapping from scratch. Results of experiments performed on the CM-5 are presented.
Fast and Parallel Mapping Algorithms for Irregular Problems
- Journal of Supercomputing
, 1993
Cited by 14 (0 self)
In this paper we develop simple index-based graph partitioning techniques. We show that our methods are very fast, are easily parallelizable, and produce good quality mappings. These properties make them useful for parallelization of a number of irregular and adaptive applications. Index Terms: Mapping, Remapping, Parallel, Merging, Sorting. 1 Introduction. Parallelization of data-parallel programs on distributed-memory parallel computers requires careful attention to load balancing and reduction of communication to achieve good performance. For most regular and synchronous problems [13], mapping can be performed at the time of compilation by giving directives to decompose the data and its corresponding computations [8]. For irregular applications, achieving a good mapping is considerably more difficult; the nature of the irregularities may not be known at the time of compilation and can be derived only at runtime [7]. These applications can be represented as computational graphs...

