Efficient Low-Contention Parallel Algorithms
, 1996
Cited by 34 (14 self)
The queue-read, queue-write (QRQW) parallel random access machine (PRAM) model permits concurrent reading and writing to shared memory locations, but at a cost proportional to the number of readers/writers to any one memory location in a given step. The QRQW PRAM model reflects the contention properties of most commercially available parallel machines more accurately than either the well-studied CRCW PRAM or EREW PRAM models, and can be efficiently emulated with only logarithmic slowdown on hypercube-type non-combining networks. This paper describes fast, low-contention, work-optimal, randomized QRQW PRAM algorithms for the fundamental problems of load balancing, multiple compaction, generating a random permutation, parallel hashing, and distributive sorting. These logarithmic or sublogarithmic time algorithms considerably improve upon the best known EREW PRAM algorithms for these problems, while avoiding the high-contention steps typical of CRCW PRAM algorithms. An illustrative experiment demonstrates the performance advantage of a new QRQW random permutation algorithm when compared with the popular EREW algorithm. Finally, this paper presents new randomized algorithms for integer sorting and general sorting.
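The cost rule in this abstract can be illustrated with a small sketch. It assumes a constant of proportionality of 1 and charges a step by the longest queue at any single location; the function name `qrqw_step_cost` and the list-of-addresses input format are hypothetical, chosen only for illustration, and the sketch ignores any per-processor work term the full model may include.

```python
from collections import Counter


def qrqw_step_cost(reads, writes):
    """Charge one step by the largest number of readers or writers
    queued at any single memory location (assumed constant factor 1).

    reads, writes: lists of memory addresses, one entry per access
    issued in this step (hypothetical input format).
    """
    read_contention = max(Counter(reads).values(), default=0)
    write_contention = max(Counter(writes).values(), default=0)
    # A step with no contention still takes one time unit.
    return max(1, read_contention, write_contention)


# Four processors reading distinct cells: unit cost, as on an EREW PRAM.
exclusive = qrqw_step_cost([0, 1, 2, 3], [])
# Three readers queue on cell 5, so the step is charged three units.
contended = qrqw_step_cost([5, 5, 5, 1], [2, 2])
```

Under this reading, an exclusive-access step costs the same as on an EREW PRAM, while a step in which k processors hit one cell costs k rather than the unit cost a CRCW PRAM would charge.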
Waste makes haste: tight bounds for loose parallel sorting
 in Proc. 33rd FOCS (IEEE, Los Alamitos, 1992) 628–637. Technical report TR MPII141, Max-Planck-Institut für Informatik (Saarbrücken
, 1992
Cited by 31 (6 self)
Conventional parallel sorting requires the n input keys to be output in an array of size n, and is known to take Ω(log n/log log n) time using any polynomial number of processors. The lower bound does not apply to the more "wasteful" convention of padded sorting, which requires the keys to be output in sorted order in an array of size (1 + o(1))n. We give very fast randomized CRCW PRAM algorithms for several padded-sorting problems. Applying only pairwise comparisons to the input and using kn processors, where 2 ≤ k ≤ n, we can padded-sort n keys in O(log n/log k) time with high probability (whp), which is the best possible (expected) run time for any comparison-based algorithm. We also show how to padded-sort n independent random numbers in O(log* n) time whp with O(n) work, which matches a recent lower bound, and how to padded-sort n integers in the range 1..n in constant time whp using n processors. If the integer sorting is required to be stable, we can still solve the problem in O(log log n/log k) time whp using kn processors, for any k with 2 ≤ k ≤ log n. The integer sorting results require the nonstandard OR PRAM; alternative implementations on standard PRAM variants run in O(log log n) time whp. As an application of our padded-sorting algorithms, we can solve approximate prefix summation problems of size n with O(n) work in constant time whp on the OR PRAM, and in O(log log n) time whp on standard PRAM variants.
Ultrafast expected time parallel algorithms
 Proc. of the 2nd SODA
, 1991
Cited by 20 (3 self)
It has been shown previously that sorting n items into n locations with a polynomial number of processors requires Ω(log n/log log n) time. We sidestep this lower bound with the idea of Padded Sorting, or sorting n items into n + o(n) locations. Since many problems do not rely on the exact rank of sorted items, a Padded Sort is often just as useful as an unpadded sort. Our algorithm for Padded Sort runs on the Tolerant CRCW PRAM and takes Θ(log log n/log log log n) expected time using n log log log n/log log n processors, assuming the items are taken from a uniform distribution. Using similar techniques we solve some computational geometry problems, including Voronoi Diagram, with the same processor and time bounds, assuming points are taken from a uniform distribution in the unit square. Further, we present an Arbitrary CRCW PRAM algorithm to solve the Closest Pair problem in constant expected time with n processors regardless of the distribution of points. All of these algorithms achieve linear speedup in expected time over their optimal serial counterparts. (Research done while at the University of Michigan and supported by an AT&T Fellowship.)
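The padded-sorting output convention can be sketched sequentially as follows. This is not the Tolerant CRCW PRAM algorithm from the paper; it is a plain bucketing illustration that assumes keys lie in [0, 1), and the function name `padded_sort_uniform`, the bucket-size choice, and the use of `None` for empty slots are all assumptions made for the example.

```python
import math


def padded_sort_uniform(keys):
    """Return the keys in sorted order inside a larger array, with None
    marking padding slots. Keys are assumed to lie in [0, 1)."""
    n = len(keys)
    if n == 0:
        return []
    # Assumed parameter choice: about log^2 n keys per bucket on average.
    per_bucket = max(1, int(math.log2(n + 1)) ** 2)
    num_buckets = max(1, n // per_bucket)
    buckets = [[] for _ in range(num_buckets)]
    for x in keys:
        # The min() guards against float rounding at the top of the range.
        buckets[min(num_buckets - 1, int(x * num_buckets))].append(x)
    # Every bucket gets a slab of the same size, so the start of a key's
    # slab depends only on its bucket index, not on the other keys.
    capacity = max(len(b) for b in buckets)
    out = []
    for b in buckets:
        b.sort()
        out.extend(b + [None] * (capacity - len(b)))
    return out


keys = [((i * 37) % 101) / 101 for i in range(101)]
padded = padded_sort_uniform(keys)
```

Dropping the `None` entries from `padded` gives the keys in sorted order. For uniformly distributed keys one would expect the slabs to be nearly full, so the output is only slightly longer than n; for skewed inputs this sketch can waste far more space.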
On parallel integer sorting
 Acta Informatica
, 1992
Cited by 15 (6 self)
We present an optimal algorithm for sorting n integers in the range [1, n^c] (for any constant c) for the EREW PRAM model where the word length is n^ε, for any ε > 0. Using this algorithm, the best known upper bound for integer sorting on the (O(log n) word length) EREW PRAM model is improved. In addition, a novel parallel range reduction algorithm which results in a near optimal randomized integer sorting algorithm is presented. For the case when the keys are uniformly distributed integers in an arbitrary range, we give an algorithm whose expected running time is optimal.
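For context on the input range [1, n^c], the standard sequential technique is a radix sort with c stable counting-sort passes in base n, which takes O(cn) time. The sketch below shows that sequential baseline only; it is not the EREW PRAM algorithm of the paper, and the function name `radix_sort_poly_range` is hypothetical.

```python
def radix_sort_poly_range(keys, c):
    """Sort integers in [1, n**c], where n = len(keys), using c stable
    counting-sort passes over base-n digits (least significant first)."""
    n = len(keys)
    if n <= 1:
        return list(keys)
    current = [k - 1 for k in keys]  # shift to [0, n**c - 1]
    for digit in range(c):
        div = n ** digit
        # counts[d + 1] first holds how many keys have digit value d.
        counts = [0] * (n + 1)
        for k in current:
            counts[(k // div) % n + 1] += 1
        # Prefix sums: counts[d] becomes the first output slot for digit d.
        for d in range(n):
            counts[d + 1] += counts[d]
        placed = [0] * n
        for k in current:  # left-to-right scan keeps each pass stable
            d = (k // div) % n
            placed[counts[d]] = k
            counts[d] += 1
        current = placed
    return [k + 1 for k in current]


sample = [64, 1, 17, 8, 9, 33, 2, 64]  # n = 8, c = 2, so keys lie in [1, 64]
sorted_sample = radix_sort_poly_range(sample, 2)
```

Because each pass is stable, ties on the higher digit keep the order established by the lower digit, which is what makes the least-significant-first order correct.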
ERCW PRAMs and Optical Communication
 in Proceedings of the European Conference on Parallel Processing, EUROPAR ’96
, 1996
Cited by 5 (2 self)
This paper presents algorithms and lower bounds for several fundamental problems on the Exclusive Read, Concurrent Write Parallel Random Access Machine (ERCW PRAM) and some results for unbounded fan-in, bounded fan-out (or 'BFO') circuits. Our results for these two models are of importance because of the close relationship of the ERCW model to the OCPC model, a model of parallel computing based on dynamically reconfigurable optical networks, and of BFO circuits to the OCPC model with limited dynamic reconfiguration ability. Topics: Parallel Algorithms, Theory of Parallel and Distributed Computing. This research was supported by Texas Advanced Research Projects Grant 003658480 (philmac@cs.utexas.edu), and in part by Texas Advanced Research Projects Grants 003658480 and 003658386, and NSF Grant CCR 9023059 (vlr@cs.utexas.edu).
Comments on Integer Sorting on Sum-CRCW
Given an array X of n elements from a restricted domain of integers [1, n], the integer sorting problem is the rearrangement of the n integers in ascending order. We study the first optimal deterministic sublogarithmic algorithm for integer sorting on the CRCW PRAM. We give two comments on the algorithm. The first is that the algorithm does not run in sublogarithmic time for every distribution of input data. The second is that the cost of the algorithm is not linear. We then modify the algorithm to be optimal in the sense of cost, with a restriction on the input data. Our modified algorithm has time complexity O(log n / log log n) using n log log n / log n Sum-CRCW processors. The algorithm also uses linear space.
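Reading the stated bounds as O(log n / log log n) time on n log log n / log n processors (an interpretation of the extracted text, not confirmed against the paper), the cost, taken as running time multiplied by processor count, works out to be linear, which is what "optimal in the sense of cost" would require:

```latex
\[
\underbrace{O\!\left(\frac{\log n}{\log\log n}\right)}_{\text{time}}
\;\times\;
\underbrace{\frac{n\,\log\log n}{\log n}}_{\text{processors}}
\;=\; O(n).
\]
```

The two logarithmic factors cancel exactly, so the total work matches the O(n) time of a sequential counting sort on keys in [1, n].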
A Note on Probabilistic Integer Sorting
We present a new probabilistic sequential algorithm for stable sorting n uniformly distributed keys in an arbitrary range. The algorithm runs in linear time with very high probability 1 − 2^(−Ω(n)) (the best previously known probability bound has been 1 − 2^(−Ω(n/(lg n lg lg n)))). We also describe an EREW PRAM version of the algorithm that sorts in O((n/p + lg p) lg n / lg(n/p + lg n)) time using p ≤ n processors within the same probability bound. Additionally, we present experimental results for the sequential algorithm that establish the practicality of our algorithm.
Probabilistic Integer Sorting
We introduce a probabilistic sequential algorithm for stable sorting n uniformly distributed keys in an arbitrary range. The algorithm runs in linear time and sorts all but a very small fraction 2^(−Ω(n)) of the input sequences; the best previously known bound was 2^(−Ω(n/(lg n lg lg n))). An EREW PRAM version of the sequential algorithm sorts in O((n/p + lg p) lg n / lg(n/p + lg n)) time using p ≤ n processors under the same probabilistic conditions. For a CRCW PRAM we improve upon the probabilistic bound of 2^(−Ω(n/(lg n lg lg n))) obtained by Rajasekaran and Sen to derive a 2^(−Ω(n lg lg n / lg n)) bound. Two architecture-independent parallel algorithms described under the framework of the Bulk-Synchronous Parallel model are also presented. For varying ratios of n/p they sort in optimal parallel computation time; the former algorithm sorts all but a 2^(−Ω(n)) fraction of the input sequences whereas the latter algorithm sorts all but an n^(−Ω(1)) fraction. Additionally, we present experi...