Programming Parallel Algorithms
, 1996
Cited by 193 (9 self)
In the past 20 years there has been tremendous progress in developing and analyzing parallel algorithms. Researchers have developed efficient parallel algorithms to solve most problems for which efficient sequential solutions are known. Although some of these algorithms are efficient only in a theoretical framework, many are quite efficient in practice or have key ideas that have been used in efficient implementations. This research on parallel algorithms has not only improved our general understanding of parallelism but in several cases has led to improvements in sequential algorithms. Unfortunately there has been less success in developing good languages for programming parallel algorithms, particularly languages that are well suited for teaching and prototyping algorithms. There has been a large gap between languages
Efficient Low-Contention Parallel Algorithms
 the 1994 ACM Symp. on Parallel Algorithms and Architectures
, 1994
Cited by 30 (11 self)
The queue-read, queue-write (QRQW) parallel random access machine (PRAM) model permits concurrent reading and writing to shared memory locations, but at a cost proportional to the number of readers/writers to any one memory location in a given step. The QRQW PRAM model reflects the contention properties of most commercially available parallel machines more accurately than either the well-studied CRCW PRAM or EREW PRAM models, and can be efficiently emulated with only logarithmic slowdown on hypercube-type non-combining networks. This paper describes fast, low-contention, work-optimal, randomized QRQW PRAM algorithms for the fundamental problems of load balancing, multiple compaction, generating a random permutation, parallel hashing, and distributive sorting. These logarithmic or sublogarithmic time algorithms considerably improve upon the best known EREW PRAM algorithms for these problems, while avoiding the high-contention steps typical of CRCW PRAM algorithms. An illustrative expe...
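The cost rule in this abstract can be illustrated with a small sketch. It is a simplification, not the paper's formal definition: reads and writes are lumped together, local computation is ignored, and the names `max_contention` and `step_cost` are hypothetical.

```python
from collections import Counter


def max_contention(addresses):
    """Largest number of accesses aimed at any single memory location."""
    return max(Counter(addresses).values(), default=0)


def step_cost(addresses, model):
    """Time charged for one step's shared-memory accesses (simplified sketch)."""
    k = max_contention(addresses)
    if model == "erew":
        # Exclusive read, exclusive write: concurrent access is not allowed.
        if k > 1:
            raise ValueError("EREW forbids concurrent access to a location")
        return 1
    if model == "crcw":
        # Concurrent read, concurrent write: contention costs nothing.
        return 1
    if model == "qrqw":
        # Queue read, queue write: cost grows with the longest queue.
        return max(1, k)
    raise ValueError(f"unknown model: {model}")


# Hypothetical addresses touched by five processors in one step.
accesses = [7, 7, 7, 3, 5]
qrqw_cost = step_cost(accesses, "qrqw")  # three processors queue on address 7
crcw_cost = step_cost(accesses, "crcw")
```

Here the same step is charged 3 time units under the queue rule and 1 under CRCW, while it would be illegal under EREW.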
Sorting on a Parallel Pointer Machine with Applications to Set Expression Evaluation
 J. ACM
, 1989
Cited by 14 (5 self)
We present optimal algorithms for sorting on parallel CREW and EREW versions of the pointer machine model. Intuitively, one can view our methods as being based on a parallel mergesort using linked lists rather than arrays (the usual parallel data structure). We also show how to exploit the "locality" of our approach to solve the set expression evaluation problem, a problem with applications to database querying and logic programming, in $O(\log n)$ time using $O(n)$ processors. Interestingly, this is an asymptotic improvement over what seems possible using previous techniques. Categories and Subject Descriptors: E.1 [Data Structures]: arrays, lists; F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Algorithms and Problems: sorting and searching. General Terms: Algorithms, Theory, Verification. Additional Key Words and Phrases: parallel algorithms, PRAM, pointer machine, linking automaton, expression evaluation, mergesort, cascade merging. 1 Introduction One of the primar...
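To make the "mergesort using linked lists rather than arrays" intuition concrete, here is a minimal sequential sketch. It is not the paper's parallel cascade-merging algorithm; it only shows that merging works by relinking pointers, with no array indexing. The names `Node`, `merge`, `merge_sort`, `from_list`, and `to_list` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    next: Optional["Node"] = None


def merge(a, b):
    """Merge two sorted lists by relinking nodes; no keys are copied."""
    dummy = tail = Node(0)
    while a is not None and b is not None:
        if a.key <= b.key:
            tail.next, a = a, a.next
        else:
            tail.next, b = b, b.next
        tail = tail.next
    tail.next = a if a is not None else b
    return dummy.next


def merge_sort(head):
    """Sequential linked-list mergesort (baseline sketch, not parallel)."""
    if head is None or head.next is None:
        return head
    # Find the midpoint with slow/fast pointers, then cut the list in two.
    slow, fast = head, head.next
    while fast is not None and fast.next is not None:
        slow, fast = slow.next, fast.next.next
    second, slow.next = slow.next, None
    return merge(merge_sort(head), merge_sort(second))


def from_list(keys):
    head = None
    for key in reversed(keys):
        head = Node(key, head)
    return head


def to_list(head):
    out = []
    while head is not None:
        out.append(head.key)
        head = head.next
    return out


sorted_keys = to_list(merge_sort(from_list([5, 1, 4, 2, 3])))
```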
An Optimal Randomized Logarithmic Time Connectivity Algorithm for the EREW PRAM
, 1996
Cited by 12 (1 self)
Improving a long chain of works, we obtain a randomised EREW PRAM algorithm for finding the connected components of a graph $G = (V, E)$ with $n$ vertices and $m$ edges in $O(\log n)$ time using an optimal number of $O((m + n)/\log n)$ processors. The result returned by the algorithm is always correct. The probability that the algorithm will not complete in $O(\log n)$ time is $o(n^{-c})$ for any $c > 0$. 1 Introduction Finding the connected components of an undirected graph is perhaps the most basic algorithmic graph problem. While the problem is trivial in the sequential setting, it seems that elaborate methods should be used to solve the problem efficiently in the parallel setting. A considerable number of researchers investigated the complexity of the problem in various parallel models including, in particular, various members of the PRAM family. In this work we consider the EREW PRAM model, the weakest member of this family, and obtain, for the first time, a parallel connectivity algorith...
Randomized Parallel List Ranking For Distributed Memory Multiprocessors
, 1996
Cited by 12 (6 self)
We present a randomized parallel list ranking algorithm for distributed memory multiprocessors, using a BSP-like model. We first describe a simple version which requires, with high probability, $\log(3p) + \log\ln(n) = \tilde{O}(\log p + \log\log n)$ communication rounds (h-relations with $h = \tilde{O}(n/p)$) and $\tilde{O}(n/p)$ local computation. We then outline an improved version which requires, with high probability, only $r \le (4k + 6)\log(\tfrac{2}{3}p) + 8 = \tilde{O}(k \log p)$ communication rounds, where $k = \min\{\, i \ge 0 \mid \ln^{(i+1)} n \le (\tfrac{2}{3}p)^{2i+1} \,\}$. Note that $k < \ln^{*}(n)$ is an extremely small number. For $n \le 10^{10^{100}}$ and $p \ge 4$, the value of $k$ is at most 2. Hence, for a given number of processors, $p$, the number of communication rounds required is, for all practical purposes, independent of $n$. For $n \le 1{,}500{,}000$ and $4 \le p \le 2048$, the number of communication rounds in our algorithm is bounded, with high probability, by 78, but the actual number of communication rounds observed so far is 25 in the worst case. Fo...
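The quantity k and the round bound can be evaluated numerically. The sketch below rests on two assumptions that the extracted text does not confirm: the relation symbols lost in extraction are read as "at least" and "at most" (k is the smallest i ≥ 0 whose (i+1)-fold iterated natural logarithm of n is at most (2p/3)^(2i+1)), and the logarithm in the round bound is taken base 2. The names `iterated_ln`, `k_value`, and `round_bound` are hypothetical.

```python
import math


def iterated_ln(n, times):
    """Apply the natural logarithm `times` times to n."""
    x = float(n)
    for _ in range(times):
        x = math.log(x)
    return x


def k_value(n, p):
    """Smallest i >= 0 with ln^(i+1)(n) <= (2p/3)^(2i+1) (assumed reading)."""
    if n < 1 or p < 4:
        raise ValueError("sketch assumes n >= 1 and p >= 4")
    i = 0
    while iterated_ln(n, i + 1) > (2 * p / 3) ** (2 * i + 1):
        i += 1
    return i


def round_bound(n, p):
    """(4k + 6) * log2(2p/3) + 8; the base-2 logarithm is an assumption."""
    return (4 * k_value(n, p) + 6) * math.log2(2 * p / 3) + 8


k_small_p = k_value(1_500_000, 4)     # ln ln n is about 2.66, below (8/3)^3
k_large_p = k_value(1_500_000, 2048)  # ln n is about 14.2, below 2p/3
worst = max(round_bound(1_500_000, p) for p in range(4, 2049))
```

Under these assumptions k is 1 for p = 4 and 0 for p = 2048 when n = 1,500,000, and the computed bound stays below the figure of 78 quoted in the abstract over the whole range of p.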
Optimal randomized EREW PRAM algorithms for finding spanning forests
 J. Algorithms
, 2000
Cited by 10 (1 self)
We present the first randomized $O(\log n)$ time and $O(m+n)$ work EREW PRAM algorithm for finding a spanning forest of an undirected graph $G = (V, E)$ with $n$ vertices and $m$ edges. Our algorithm is optimal with respect to time, work and space. As a consequence we get optimal randomized EREW PRAM algorithms for other basic connectivity problems such as finding a bipartite partition, finding bridges and biconnected components, finding Euler tours in Eulerian graphs, finding an ear decomposition, finding an open ear decomposition, finding a strong orientation, and finding an st-numbering.
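For reference, the problem itself (not the paper's parallel algorithm) has a short sequential solution. The sketch below uses a union-find structure with path halving; the function name `spanning_forest` and the vertex numbering 0..n-1 are assumptions made for illustration.

```python
def spanning_forest(n, edges):
    """Return a spanning forest of an undirected graph on vertices 0..n-1.

    Sequential baseline sketch using union-find; an edge is kept exactly
    when it joins two components that were previously separate.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    forest = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((u, v))
    return forest


# Hypothetical graph: a triangle, one extra edge, and an isolated vertex 5.
forest = spanning_forest(6, [(0, 1), (1, 2), (2, 0), (3, 4)])
```

The forest has n minus (number of components) edges; here that is 6 - 3 = 3.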
N-body Simulation I: Fast Algorithms for Potential Field Evaluation and Trummer's Problem
, 1996
Cited by 7 (5 self)
In this paper, we describe a new approximation algorithm for the n-body problem. The algorithm is a nontrivial modification of the fast multipole method that works in both two and three dimensions. Due to the equivalence between the two-dimensional n-body problem and Trummer's problem, our algorithm also gives the fastest known approximation algorithm for Trummer's problem. Let $A$ be the sum of the absolute values of the particle charges in the n-body problem under consideration (or the sum of the masses if the simulation is gravitational). To approximate the particle potentials with error bound $\epsilon$, we let $p = \lceil \log(A/\epsilon) \rceil$ and give complexity bounds in terms of $p$. Note that, under reasonable assumptions on the particle charges, if we desire the output to be accurate to $b$ bits, then $p = \Theta(b)$. In two dimensions, our algorithm runs in time $O(n \log^2 p)$, which is a substantial improvement over the previous best algorithm which requires $\Theta(np \log p)$ time. We also apply our new ...
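A small worked example of the parameter p may help. The sketch assumes the logarithm is base 2, which matches the remark about b bits of accuracy but is not stated in the abstract; the function name `precision_parameter` and the sample charges are hypothetical.

```python
import math


def precision_parameter(charges, eps):
    """p = ceil(log2(A / eps)), with A the sum of absolute charges.

    The base-2 logarithm is an assumption made for this sketch.
    """
    total = sum(abs(q) for q in charges)
    return math.ceil(math.log2(total / eps))


# Hypothetical input: A = 1 + 3 + 4 = 8 and eps = 2**-10, so A/eps = 2**13.
p = precision_parameter([1.0, -3.0, 4.0], 2.0 ** -10)
```

With these values p comes out as 13, roughly the number of bits needed to represent the potentials to the requested accuracy.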
Thinking in parallel: Some basic data-parallel algorithms and techniques
 In use as class notes since
, 1993
Cited by 7 (1 self)
Copyright 1992-2009, Uzi Vishkin. These class notes reflect the theoretical part in the Parallel
Optimal Algorithms for the Single and Multiple Vertex Updating Problems of a Minimum Spanning Tree
 Algorithmica
, 1996
Cited by 4 (0 self)
The vertex updating problem for a minimum spanning tree (MST) is defined as follows: Given a graph $G = (V, E_G)$ and an MST $T$ for $G$, find a new MST for $G$ to which a new vertex $z$ has been added along with weighted edges that connect $z$ with the vertices of $G$. We present a set of rules that produce simple optimal parallel algorithms that run in $O(\lg n)$ time using $n/\lg n$ EREW PRAM processors, where $n = |V|$. These algorithms employ any valid tree-contraction schedule that can be produced within the stated resource bounds. These rules can also be used to derive simple linear-time sequential algorithms for the same problem. The previously best known parallel result was a rather complicated algorithm that used $n$ processors in the more powerful CREW PRAM model. Furthermore, we show how our solution can be used to solve the multiple vertex updating problem: Update a given MST when $k$ new vertices are introduced simultaneously. This problem is solved in $O(\lg k \cdot \lg n)$ parallel time using ...
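The problem statement can be illustrated with a simple sequential baseline. It is not the paper's rule set and is not linear time: it relies on the standard cycle-property fact that some MST of the enlarged graph uses only edges of the old tree T plus the new edges at z, and runs Kruskal's algorithm on that small edge set. The name `update_mst` and the edge encodings are assumptions made for this sketch.

```python
def update_mst(n, tree_edges, new_edges):
    """MST after adding vertex z = n to a graph on vertices 0..n-1.

    tree_edges: (weight, u, v) triples forming an MST T of the old graph.
    new_edges:  (weight, v) pairs, each an edge between z and old vertex v.
    Baseline sketch: Kruskal's algorithm on T plus the edges at z.
    """
    z = n
    candidates = sorted(list(tree_edges) + [(w, z, v) for w, v in new_edges])
    parent = list(range(n + 1))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    result = []
    for w, u, v in candidates:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            result.append((w, u, v))
    return result


# Hypothetical instance: old MST is the path 0-1-2 with weights 1 and 5;
# the new vertex 3 is joined to 0 (weight 2) and to 2 (weight 3).
new_tree = update_mst(3, [(1, 0, 1), (5, 1, 2)], [(2, 0), (3, 2)])
```

In this instance the old weight-5 edge is displaced, because reaching vertex 2 through z costs less.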
A Contraction Procedure for Planar Directed Graphs
 Proc. 4th Annual ACM Symposium on Parallel Algorithms and Architectures
, 1992
Cited by 4 (0 self)
We show that testing reachability in a planar DAG can be performed in parallel in $O(\log n \log^{*} n)$ time ($O(\log n)$ time using randomization) using $O(n)$ processors. In general we give a paradigm for contracting a planar DAG to a point and then expanding it back. This paradigm is developed from a property of planar directed graphs we refer to as the Poincaré index formula. Using this new paradigm we then "overlay" our application in a fashion similar to parallel tree contraction [MR85, MR89]. We also discuss some of the changes needed to extend the reduction procedure to work for general planar digraphs. Using the strongly-connected components algorithm of Kao [Kao91] we can compute multiple-source reachability for general planar digraphs in $O(\log^3 n)$ time using $O(n)$ processors. This improves the results of Kao and Klein [KK90] who showed that this problem could be performed in $O(\log^5 n)$ time using $O(n)$ processors. This work represents initial results of an effort to develop effi...