Results 1-10 of 26
Programming Parallel Algorithms
, 1996
Cited by 193 (9 self)
In the past 20 years there has been tremendous progress in developing and analyzing parallel algorithms. Researchers have developed efficient parallel algorithms to solve most problems for which efficient sequential solutions are known. Although some of these algorithms are efficient only in a theoretical framework, many are quite efficient in practice or have key ideas that have been used in efficient implementations. This research on parallel algorithms has not only improved our general understanding of parallelism but in several cases has led to improvements in sequential algorithms. Unfortunately there has been less success in developing good languages for programming parallel algorithms, particularly languages that are well suited for teaching and prototyping algorithms. There has been a large gap between languages
merging, and sorting in parallel models of computation
 in Proc. 14th Annual ACM Sympos. on Theory of Comput.
, 1982
Cited by 105 (3 self)
A variety of models have been proposed for the study of synchronous parallel computation. These models are reviewed and some prototype problems are studied further. Two classes of models are recognized, fixed connection networks and models based on a shared memory. Routing and sorting are prototype problems for the networks; in particular, they provide the basis for simulating the more powerful shared memory models. It is shown that a simple but important class of deterministic strategies (oblivious routing) is necessarily inefficient with respect to worst case analysis. Routing can be viewed as a special case of sorting, and the existence of an O(log n) sorting algorithm for some n processor fixed connection network has only recently been established by Ajtai, Komlos, and Szemeredi (“15th ACM Sympos. on Theory of Comput.,” Boston, Mass., 1983, pp. 1-9). If the more powerful class of shared memory models is considered then it is possible to simply achieve an O(log n log log n) sort via Valiant’s parallel merging algorithm, which it is shown can be implemented on certain models. Within a spectrum of shared memory models, it is shown that log log n is asymptotically optimal for n processors to merge two sorted lists containing n elements. © 1985 Academic Press, Inc.
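The merging result in the abstract above rests on the observation that each element's final position in the merged list can be computed independently of the others. The sketch below illustrates only that idea, in sequential Python, by cross-ranking with binary search. It is not Valiant's algorithm (which reaches O(log log n) parallel time). The function name `merge_by_ranking` is an assumption made for illustration.

```python
import bisect

def merge_by_ranking(a, b):
    """Merge two sorted lists by computing every element's output index directly.

    Each index depends only on the element's own position and its rank in the
    other list, so on a shared memory machine every loop iteration could be
    handled by a separate processor. Ties are broken in favour of list a.
    """
    out = [None] * (len(a) + len(b))
    for i, x in enumerate(a):
        # Elements of b strictly smaller than x precede it in the output.
        out[i + bisect.bisect_left(b, x)] = x
    for j, y in enumerate(b):
        # Elements of a smaller than or equal to y precede it in the output.
        out[j + bisect.bisect_right(a, y)] = y
    return out
```

Here each rank costs O(log n) comparisons because it uses binary search; the merging algorithm cited in the abstract organizes the ranking differently to get the faster bound.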
Improved Parallel Integer Sorting without Concurrent Writing
, 1992
Cited by 41 (4 self)
We show that $n$ integers in the range $1..n$ can be sorted stably on an EREW PRAM using $O(t)$ time and $O(n(\sqrt{\log n \log\log n} + (\log n)^2/t))$ operations, for arbitrary given $t \ge \log n \log\log n$, and on a CREW PRAM using $O(t)$ time and $O(n(\sqrt{\log n} + \log n / 2^{t/\log n}))$ operations, for arbitrary given $t \ge \log n$. In addition, we are able to sort $n$ arbitrary integers on a randomized CREW PRAM within the same resource bounds with high probability. In each case our algorithm is a factor of almost $\Theta(\sqrt{\log n})$ closer to optimality than all previous algorithms for the stated problem in the stated model, and our third result matches the operation count of the best previous sequential algorithm. We also show that $n$ integers in the range $1..m$ can be sorted in $O((\log n)^2)$ time with $O(n)$ operations on an EREW PRAM using a nonstandard word length of $O(\log n \log\log n \log m)$ bits, thereby greatly improving the upper bound on the word length necessary to sort integers with a linear t...
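For orientation, the problem in this abstract (stable sorting of integers in the range 1..n) has a simple sequential solution with a linear number of operations, namely counting sort. The sketch below shows that sequential baseline only. It is not the PRAM algorithm of the paper. The function name and the `key` parameter are assumptions made for illustration.

```python
def stable_counting_sort(items, key, n):
    """Stably sort items whose key(item) is an integer in 1..n.

    Uses O(len(items) + n) operations. Items with equal keys keep their
    original relative order, which is what "stable" means.
    """
    counts = [0] * (n + 2)
    for item in items:
        counts[key(item) + 1] += 1
    # After the prefix sums, counts[k] is the number of items with key < k,
    # i.e. the output index of the first item with key k.
    for k in range(1, n + 2):
        counts[k] += counts[k - 1]
    out = [None] * len(items)
    for item in items:
        k = key(item)
        out[counts[k]] = item
        counts[k] += 1
    return out
```

A usage example: `stable_counting_sort(records, key=lambda r: r[0], n=3)` sorts a list of `(key, payload)` tuples by their first component.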
An optimal O(log log n) time parallel string matching algorithm
 SIAM J. COMPUT
, 1990
Cited by 27 (11 self)
An optimal O(log log n) time parallel algorithm for string matching on CRCW-PRAM is presented. It improves previous results of [G] and [V].
On the Complexity of Finding the Chromatic Number of a Recursive Graph I: The Bounded Case
 Annals of Pure and Applied Logic
, 1989
Cited by 17 (10 self)
We classify functions in recursive graph theory in terms of how many queries to $K$ (or $\emptyset''$ or $\emptyset'''$) are required to compute them. We show that (1) binary search is optimal (in terms of the number of queries to $K$) for finding the chromatic number of a recursive graph and that no set of Turing degree less than $\mathbf{0}'$ will suffice, (2) determining if a recursive graph has a finite chromatic number is $\Sigma_2$-complete, and (3) binary search is optimal (in terms of the number of queries to $\emptyset'''$) for finding the recursive chromatic number of a recursive graph and that no set of Turing degree less than $\mathbf{0}'''$ will suffice. We also explore how much help queries to a weaker set may provide. Some of our results have analogues in terms of asking p questions at a time, but some do not. In particular, (p + 1)-ary search is not always optimal for finding the chromatic number of a recursive graph. Most of our results are also true for highly recursive graphs, though there are some interesting differenc...
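The query-counting idea in this abstract can be illustrated on a finite graph. The sketch below assumes a known upper bound on the chromatic number and treats "is the graph k-colourable?" as a single oracle query, then counts how many queries binary search makes. The brute-force `is_k_colorable` is only a stand-in for the oracle. In the paper the graphs are recursive (infinite but computable) and the queries go to K, which no program can decide. All names here are assumptions made for illustration.

```python
from itertools import product

def is_k_colorable(edges, num_vertices, k):
    """Stand-in oracle: try every assignment of k colours (exponential time)."""
    for colouring in product(range(k), repeat=num_vertices):
        if all(colouring[u] != colouring[v] for u, v in edges):
            return True
    return False

def chromatic_number(edges, num_vertices, bound):
    """Binary search for the least k in 1..bound such that the graph is k-colourable.

    Assumes the chromatic number is at most bound.
    Returns (k, number of oracle queries made).
    """
    lo, hi, queries = 1, bound, 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if is_k_colorable(edges, num_vertices, mid):
            hi = mid        # mid colours suffice, so the answer is at most mid
        else:
            lo = mid + 1    # mid colours do not suffice
    return lo, queries
```

With a bound of 4 the search always makes 2 queries, matching the ceiling of log2(4).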
Parallelism and Locality in Priority Queues
 In Sixth IEEE Symposium on Parallel and Distributed Processing
, 1994
Cited by 15 (0 self)
We explore two ways of incorporating parallelism into priority queues. The first is to speed up the execution of individual priority operations so that they can be performed one operation per time step, unlike sequential implementations which require $O(\log N)$ time steps per operation for an $N$ element heap. We give an optimal parallel implementation that uses a linear array of $O(\log N)$ processors. Second, we consider parallel operations on the priority queue. We show that using a $d$-dimensional array (constant $d$) of $P$ processors we can insert or delete the smallest $P$ elements from a heap in time $O(P^{1/d} \log^{1-1/d} P)$, where the number of elements in the heap is assumed to be polynomial in $P$. We also show a matching lower bound, based on communication complexity arguments, for a range of deterministic implementations. Finally, using randomization, we show that the time can be reduced to the optimal $O(P^{1/d})$ time with high probability.
1 Introduction Much of the theoret...
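As a point of comparison for the batch operation described in this abstract, the sequential way to delete the smallest P elements from a binary heap is P successive pops, each costing O(log N). The sketch below shows only that sequential baseline, using Python's standard `heapq` module. It does not model the processor arrays of the paper, and the function name is an assumption made for illustration.

```python
import heapq

def delete_smallest(heap, p):
    """Remove and return the p smallest elements of a heapq-style heap, in order.

    Runs p pops of O(log N) each. If the heap holds fewer than p elements,
    all of them are removed.
    """
    return [heapq.heappop(heap) for _ in range(min(p, len(heap)))]
```

The input list must already satisfy the heap invariant, for example after a call to `heapq.heapify`.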
Tight Comparison Bounds On The Complexity Of Parallel Sorting
, 1987
Cited by 12 (3 self)
The problem of sorting $n$ elements using $p$ processors in a parallel comparison model is considered. Lower and upper bounds which imply that for $p \ge n$, the time complexity of this problem is $\Theta\left(\frac{\log n}{\log(1 + p/n)}\right)$ are presented. This complements [AKS83] in settling the problem since the AKS sorting network established that for $p \le n$ the time complexity is $\Theta\left(\frac{n \log n}{p}\right)$. To prove the lower bounds we show that to achieve $k \le \log n$ parallel time, we need $\Omega(n^{1+1/k})$ processors.
1. Introduction Apparently, there is no problem in Computer Science which received more attention than sorting. [Kn73], for instance, found that existing computers devote approximately a quarter of their time to sorting. The advent of parallel computers stimulated intensive research of the sorting with respect to various models of parallel computation. Extensive lists of references which recorded this activity are given in [Ak85], [BHe86] and [Th83]. Most of the fastest serial and paral...
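A short worked evaluation may help in reading the bound for $p \ge n$. It takes the time complexity to be $\Theta(\log n / \log(1 + p/n))$, plugs in two processor counts, and assumes $1 \le k \le \log n$ in the second case. The symbol $T(n,p)$ is introduced here only for illustration.

```latex
\[
T(n,p) = \Theta\!\left(\frac{\log n}{\log(1 + p/n)}\right), \qquad p \ge n .
\]
\[
p = n:\qquad \log(1 + p/n) = \log 2
\;\Longrightarrow\; T(n,n) = \Theta(\log n).
\]
\[
p = n^{1+1/k}:\qquad \log\!\left(1 + n^{1/k}\right) = \Theta\!\left(\tfrac{1}{k}\log n\right)
\;\Longrightarrow\; T\!\left(n, n^{1+1/k}\right) = \Theta(k).
\]
```

The first case agrees with the AKS bound $\Theta(n \log n / p)$ at $p = n$, and the second agrees with the statement that sorting in $k$ parallel steps needs $\Omega(n^{1+1/k})$ processors.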
Sample Sort on Meshes
 Techn. Rep. MPI-I-95-1-012, Max-Planck-Institut für Informatik
, 1997
Cited by 8 (7 self)
In this paper various algorithms for sorting on processor networks are considered. We focus on meshes, but the results can be generalized easily to other decomposable architectures. We consider the k-k sorting problem in which every PU initially holds k packets. We present well-known randomized and deterministic splitter-based sorting algorithms. We come up with a new deterministic sorting algorithm which performs much better than previous ones. The number of routing steps is reduced by a refined deterministic splitter selection. Hereby deterministic sorting might become competitive with randomized sorting in practice.
1 Introduction 1.1 Problem and Machine Meshes. One of the most thoroughly investigated interconnection schemes for parallel computation is the $n \times n$ mesh, in which $n^2$ processing units, PUs, are connected by a two-dimensional grid of communication links. Its immediate generalizations are d-dimensional $n \times \cdots \times n$ meshes. While meshes have ...
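The splitter-based approach mentioned in this abstract can be sketched in a few lines: draw a sample, pick evenly spaced splitters from it, send every element to the bucket its splitters define, and sort the buckets independently. The sketch below is a sequential simulation with randomized splitter selection. It does not implement the paper's refined deterministic selection or any mesh routing. The function name and the `oversampling` and `seed` parameters are assumptions made for illustration.

```python
import bisect
import random

def sample_sort(data, num_buckets, oversampling=4, seed=0):
    """Sort a list by splitter-based bucketing (sequential simulation).

    In a parallel setting each bucket would be owned by one processing unit
    and the per-bucket sorts would run concurrently.
    """
    if num_buckets <= 1 or len(data) <= num_buckets:
        return sorted(data)
    rng = random.Random(seed)
    sample_size = min(len(data), num_buckets * oversampling)
    sample = sorted(rng.sample(data, sample_size))
    # Choose num_buckets - 1 evenly spaced splitters from the sorted sample.
    step = len(sample) / num_buckets
    splitters = [sample[int(i * step)] for i in range(1, num_buckets)]
    buckets = [[] for _ in range(num_buckets)]
    for x in data:
        # bisect_right returns the index of the bucket whose range contains x.
        buckets[bisect.bisect_right(splitters, x)].append(x)
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))
    return result
```

A larger `oversampling` value tends to give more evenly sized buckets at the cost of a larger sample to sort.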
Efficient String Algorithmics
, 1992
Cited by 8 (6 self)
Problems involving strings arise in many areas of computer science and have numerous practical applications. We consider several problems from a theoretical perspective and provide efficient algorithms and lower bounds for these problems in sequential and parallel models of computation. In the sequential setting, we present new algorithms for the string matching problem improving the previous bounds on the number of comparisons performed by such algorithms. In parallel computation, we present tight algorithms and lower bounds for the string matching problem, for finding the periods of a string, for detecting squares and for finding initial palindromes.
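One of the problems named in this abstract, finding the periods of a string, has a standard sequential solution through the prefix function (the failure function of Knuth-Morris-Pratt matching). The sketch below shows that textbook method, not the parallel algorithms of the thesis. Here p is a period of s when s[i] == s[i + p] for every valid i, and the function names are assumptions made for illustration.

```python
def prefix_function(s):
    """pi[i] = length of the longest proper border of s[:i + 1].

    A border is a string that is both a prefix and a suffix.
    """
    pi = [0] * len(s)
    for i in range(1, len(s)):
        k = pi[i - 1]
        while k > 0 and s[i] != s[k]:
            k = pi[k - 1]
        if s[i] == s[k]:
            k += 1
        pi[i] = k
    return pi

def periods(s):
    """Return all periods of s in increasing order (len(s) is always one)."""
    n = len(s)
    if n == 0:
        return []
    pi = prefix_function(s)
    result = []
    k = pi[-1]
    # Every border of length k corresponds to the period n - k.
    while k > 0:
        result.append(n - k)
        k = pi[k - 1]
    result.append(n)
    return result
```

The smallest entry of the returned list is the period usually meant when one speaks of "the period" of a string.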