Results 1 -
4 of
4
Efficient Low-Contention Parallel Algorithms
- the 1994 ACM Symp. on Parallel Algorithms and Architectures
, 1994
"... The queue-read, queue-write (qrqw) parallel random access machine (pram) model permits concurrent reading and writing to shared memory locations, but at a cost proportional to the number of readers/writers to any one memory location in a given step. The qrqw pram model reflects the contention prope ..."
Abstract
-
Cited by 29 (11 self)
- Add to MetaCart
The queue-read, queue-write (qrqw) parallel random access machine (pram) model permits concurrent reading and writing to shared memory locations, but at a cost proportional to the number of readers/writers to any one memory location in a given step. The qrqw pram model reflects the contention properties of most commercially available parallel machines more accurately than either the well-studied crcw pram or erew pram models, and can be efficiently emulated with only logarithmic slowdown on hypercubetype non-combining networks. This paper describes fast, low-contention, work-optimal, randomized qrqw pram algorithms for the fundamental problems of load balancing, multiple compaction, generating a random permutation, parallel hashing, and distributive sorting. These logarithmic or sublogarithmic time algorithms considerably improve upon the best known erew pram algorithms for these problems, while avoiding the high-contention steps typical of crcw pram algorithms. An illustrative expe...
Ultra-fast expected time parallel algorithms
- Proc. of the 2nd SODA
, 1991
"... It has been shown previously that sorting n items into n locations with a polynomial number of processors requires Ω(log n/log log n) time. We sidestep this lower bound with the idea of Padded Sorting, or sorting n items into n + o(n) locations. Since many problems do not rely on the exact rank of s ..."
Abstract
-
Cited by 19 (3 self)
- Add to MetaCart
It has been shown previously that sorting n items into n locations with a polynomial number of processors requires Ω(log n/log log n) time. We sidestep this lower bound with the idea of Padded Sorting, or sorting n items into n + o(n) locations. Since many problems do not rely on the exact rank of sorted items, a Padded Sort is often just as useful as an unpadded sort. Our algorithm for Padded Sort runs on the Tolerant CRCW PRAM and takes Θ(log log n/log log log n) expected time using n log log log n/log log n processors, assuming the items are taken from a uniform distribution. Using similar techniques we solve some computational geometry problems, including Voronoi Diagram, with the same processor and time bounds, assuming points are taken from a uniform distribution in the unit square. Further, we present an Arbitrary CRCW PRAM algorithm to solve the Closest Pair problem in constant expected time with n processors regardless of the distribution of points. All of these algorithms achieve linear speedup in expected time over their optimal serial counterparts. 1 Research done while at the University of Michigan and supported by an AT&T Fellowship.
On parallel integer sorting
- Acta Informatica
, 1992
"... Abstract. We present an optimal algorithm for sorting n integers in the range [1,n c] (for any constant c) fortheEREW PRAM model where the word length is n ɛ, for any ɛ>0.Using this algorithm, the best known upper bound for integer sorting on the (O(log n) word length) EREW PRAM model is improved. I ..."
Abstract
-
Cited by 13 (5 self)
- Add to MetaCart
Abstract. We present an optimal algorithm for sorting n integers in the range [1,n c] (for any constant c) fortheEREW PRAM model where the word length is n ɛ, for any ɛ>0.Using this algorithm, the best known upper bound for integer sorting on the (O(log n) word length) EREW PRAM model is improved. In addition, a novel parallel range reduction algorithm which results in a near optimal randomized integer sorting algorithm is presented. For the case when the keys are uniformly distributed integers in an arbitrary range, we give an algorithm whose expected running time is optimal.
ERCW PRAMs and Optical Communication
- in Proceedings of the European Conference on Parallel Processing, EUROPAR ’96
, 1996
"... This paper presents algorithms and lower bounds for several fundamental problems on the Exclusive Read, Concurrent Write Parallel Random Access Machine (ERCW PRAM) and some results for unbounded fan-in, bounded fan-out (or `BFO') circuits. Our results for these two models are of importance because o ..."
Abstract
-
Cited by 4 (2 self)
- Add to MetaCart
This paper presents algorithms and lower bounds for several fundamental problems on the Exclusive Read, Concurrent Write Parallel Random Access Machine (ERCW PRAM) and some results for unbounded fan-in, bounded fan-out (or `BFO') circuits. Our results for these two models are of importance because of the close relationship of the ERCW model to the OCPC model, a model of parallel computing based on dynamically reconfigurable optical networks, and of BFO circuits to the OCPC model with limited dynamic reconfiguration ability. Topics: Parallel Algorithms, Theory of Parallel and Distributed Computing. This research was supported by Texas Advanced Research Projects Grant 003658480. (philmac@cs.utexas.edu) y This research was supported in part by Texas Advanced Research Projects Grants 003658480 and 003658386, and NSF Grant CCR 90-23059. (vlr@cs.utexas.edu) 1 Introduction In this paper we develop algorithms and lower bounds for fundamental problems on the Exclusive Read Concurrent Wri...

