Sorting Large Data Sets on a Massively Parallel System
In Proceedings of the 6th IEEE Symposium on Parallel and Distributed Processing (SPDP), 1994
Cited by 11 (2 self)
This paper presents a performance study of many of today's popular parallel sorting algorithms. It is the first to present a comparative study on a large-scale MIMD system. The machine, a Parsytec GCel, contains 1024 processors connected as a two-dimensional grid. To justify the experimental results, we develop a theoretical model that predicts performance in terms of communication and computation times. The experiments agree closely with the theoretical model, provided that the edge congestion caused by the algorithms is predicted precisely. We compare Bitonicsort, Shearsort, Gridsort, Samplesort, and Radixsort. Experiments were performed using random instances according to a well-known benchmark problem. Results show that, for the machine we used, Bitonicsort performs best for smaller numbers of keys per processor (up to roughly 2048) and Samplesort outperforms all other methods for larger instances.
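To make the comparison above more concrete, the following is a minimal sketch of the general Samplesort idea, simulated sequentially in Python. It is not the implementation measured in the paper: the function name `sample_sort`, the parameters `p`, `oversample`, and `seed`, and the use of plain lists as stand-ins for processors are all assumptions made for illustration.

```python
import bisect
import random


def sample_sort(keys, p=4, oversample=8, seed=0):
    """Sequentially simulate sample sort over p hypothetical processors.

    Illustrative sketch only: each bucket stands in for one processor's
    local memory, and no real communication takes place.
    """
    keys = list(keys)
    if p < 2 or len(keys) < p:
        return sorted(keys)

    rng = random.Random(seed)

    # Step 1: draw a random sample and pick p - 1 evenly spaced splitters.
    sample_size = min(len(keys), p * oversample)
    sample = sorted(rng.sample(keys, sample_size))
    step = len(sample) / p
    splitters = [sample[int(i * step)] for i in range(1, p)]

    # Step 2: route every key to the bucket whose range contains it.
    # On a real machine this is the all-to-all communication phase.
    buckets = [[] for _ in range(p)]
    for k in keys:
        buckets[bisect.bisect_right(splitters, k)].append(k)

    # Step 3: each "processor" sorts locally; concatenating the buckets
    # in order yields the globally sorted sequence.
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))
    return result
```

Because almost all data movement happens in the single routing phase, this structure is plausibly why Samplesort is reported to do well once each processor holds many keys. The sketch does not model the grid network or edge congestion that the paper analyses.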
The Design and Analysis of Bulk-Synchronous Parallel Algorithms
1998
Cited by 9 (1 self)
The model of bulk-synchronous parallel (BSP) computation is an emerging paradigm of general-purpose parallel computing. This thesis presents a systematic approach to the design and analysis of BSP algorithms. We introduce an extension of the BSP model, called BSPRAM, which reconciles shared-memory style programming with efficient exploitation of data locality. The BSPRAM model can be optimally simulated by a BSP computer for a broad range of algorithms possessing certain characteristic properties: obliviousness, slackness, granularity. We use BSPRAM to design BSP algorithms for problems from three large, partially overlapping domains: combinatorial computation, dense matrix computation, graph computation. Some of the presented algorithms are adapted from known BSP algorithms (butterfly dag computation, cube dag computation, matrix multiplication). Other algorithms are obtained by application of established non-BSP techniques (sorting, randomised list contraction, Gaussian elimination without pivoting and with column pivoting, algebraic path computation), or use original techniques specific to the BSP model (deterministic list contraction, Gaussian elimination with nested block pivoting, communication-efficient multiplication of Boolean matrices, synchronisation-efficient shortest paths computation). The asymptotic BSP cost of each algorithm is established, along with its BSPRAM characteristics. We conclude by outlining some directions for future research.
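For readers unfamiliar with what "BSP cost" refers to, the standard cost model from the BSP literature can be written as follows. The abstract itself does not state the formula, so the notation here (w, h, g, l, S) is the conventional one and is an assumption about the thesis's exact symbols.

```latex
% Cost of one superstep: local computation w, an h-relation of
% communication, communication gap g, and synchronisation latency l.
\[
  T_{\mathrm{superstep}} = w + h \cdot g + l
\]
% Cost of a computation with S supersteps, where w_s and h_s are the
% computation and communication costs of superstep s.
\[
  T = W + H \cdot g + S \cdot l,
  \qquad W = \sum_{s=1}^{S} w_s,
  \qquad H = \sum_{s=1}^{S} h_s
\]
```

Under this model, a "communication-efficient" algorithm aims for small H and a "synchronisation-efficient" one aims for small S.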

