Parallel Algorithmic Techniques for Combinatorial Computation
 Ann. Rev. Comput. Sci
, 1988
this paper and supplied many helpful comments. This research was supported in part by NSF grants DCR8511713, CCR8605353, and CCR8814977, and by DARPA contract N0003984C0165.
On Parallel Hashing and Integer Sorting
, 1991
The problem of sorting n integers from a restricted range [1..m], where m is superpolynomial in n, is considered. An o(n log n) randomized algorithm is given. Our algorithm takes O(n log log m) expected time and O(n) space. (Thus, for $m = n^{\mathrm{polylog}(n)}$, where log log m = O(log log n), we have an O(n log log n) algorithm.) The algorithm is parallelizable, and the resulting parallel algorithm achieves optimal speedup. Some features of the algorithm make us believe that it is relevant for practical applications. A result of independent interest is a parallel hashing technique. The expected construction time is logarithmic using an optimal number of processors, and searching for a value takes O(1) time in the worst case. This technique enables a drastic reduction of space requirements at the price of using randomness. Applicability of the technique is demonstrated for the parallel sorting algorithm and for some parallel string matching algorithms. The parallel sorting algorithm is designed for a strong and nonstandard mo...
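The abstract does not spell out the hashing construction, so the following is only an illustrative, sequential sketch of one classical way to get a static table with O(1) worst-case search and O(n) space: two-level (FKS-style) hashing. It is not the paper's parallel technique. The names `build` and `contains`, the prime `P`, and the retry threshold `4 * n` are assumptions made for this example.

```python
import random

P = (1 << 61) - 1  # assumed prime modulus; keys must be integers in [0, P)

def make_hash(size, rng):
    """Draw h(x) = ((a*x + b) mod P) mod size from a universal family."""
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)
    return lambda x: ((a * x + b) % P) % size

def build(keys, seed=0):
    """Build a two-level static table for distinct integer keys."""
    rng = random.Random(seed)
    keys = list(set(keys))
    n = max(1, len(keys))
    # Level 1: retry until the squared bucket sizes sum to O(n),
    # which bounds the total space of the second level.
    while True:
        h1 = make_hash(n, rng)
        buckets = [[] for _ in range(n)]
        for k in keys:
            buckets[h1(k)].append(k)
        if sum(len(b) ** 2 for b in buckets) <= 4 * n:
            break
    # Level 2: a bucket of size s gets its own table of size s*s,
    # redrawn until no two keys of the bucket collide.
    second = []
    for bucket in buckets:
        size = len(bucket) ** 2
        while True:
            h2 = make_hash(size, rng) if size else None
            slots = [None] * size
            collided = False
            for k in bucket:
                i = h2(k)
                if slots[i] is not None:
                    collided = True
                    break
                slots[i] = k
            if not collided:
                break
        second.append((h2, slots))
    return h1, second

def contains(table, x):
    """Search with exactly two hash evaluations and one comparison."""
    h1, second = table
    h2, slots = second[h1(x)]
    return bool(slots) and slots[h2(x)] == x
```

Construction is randomized (the retry loops succeed after a constant expected number of draws), while each search does a fixed amount of work regardless of the key set, which is the space-for-randomness trade the abstract describes.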
Simulation of PRAM Models on Meshes
 Nordic Journal on Computing, 2(1):51
, 1994
We analyze the complexity of simulating a PRAM (parallel random access machine) on a mesh structured distributed memory machine. By utilizing suitable algorithms for randomized hashing, routing in a mesh, and sorting in a mesh, we prove that simulation of a PRAM on a $\sqrt{N} \times \sqrt{N}$ (or $\sqrt[3]{N} \times \sqrt[3]{N} \times \sqrt[3]{N}$) mesh is possible with $O(\sqrt{N})$ (respectively $O(\sqrt[3]{N})$) delay with high probability and a relatively small constant. Furthermore, with more sophisticated simulations further speedups are achieved; experiments show delays as low as $\sqrt{N} + o(\sqrt{N})$ (respectively $\sqrt[3]{N} + o(\sqrt[3]{N})$) per N PRAM processors. These simulations compare quite favorably with PRAM simulations on butterfly and hypercube. 1 Introduction. PRAM (Parallel Random Access Machine) is an abstract model of computation. It consists of N processors, each of which may have some local memory and registers, and a global shared memory of size m. A step of a PRAM is often seen to consist of...
Some Topics in Parallel Computation and Branching Programs
, 1995
Some Topics in Parallel Computation and Branching Programs, by Rakesh Kumar Sinha. Chairperson of the Supervisory Committee: Professor Paul Beame, Department of Computer Science and Engineering. There are two parts of this thesis: the first part gives two constructions of branching programs; the second part contains three results on models of parallel machines. The branching program model has turned out to be very useful for understanding the computational behavior of problems. In addition, several restrictions of branching programs, for example ordered binary decision diagrams, have proven to be successful data structures in several VLSI design and verification applications. We construct a branching program of $o(n \log^3 n)$ nodes for computing any threshold function on n variables and a branching program of $o(n \log^4 n)$ nodes for determining the sum of n variables modulo a fixed divisor. These are improvements over constructions of size $\Theta(n^{3/2})$ due to Lupanov [Lup65]. The second p...
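To make the branching program model concrete, here is a minimal sketch of the naive layered construction for a threshold function, in which each node records how many variables have been read and how many ones have been seen (capped at the threshold k). This uses O(nk) nodes and is not the thesis's $o(n \log^3 n)$ construction; the function names and the node encoding `(i, c)` are assumptions made for illustration.

```python
def threshold_bp(n, k):
    """Layered branching program for [x_1 + ... + x_n >= k].

    Internal node (i, c): i variables read so far, c ones seen (capped at k).
    edges[(i, c)] = (successor if x_{i+1} = 0, successor if x_{i+1} = 1).
    Nodes (n, c) are sinks; a sink accepts iff c >= k.
    """
    edges = {}
    for i in range(n):
        for c in range(min(i, k) + 1):
            edges[(i, c)] = ((i + 1, c), (i + 1, min(c + 1, k)))
    return edges

def evaluate(edges, n, k, bits):
    """Follow one root-to-sink path, reading each variable once in order."""
    node = (0, 0)
    for i in range(n):
        node = edges[node][bits[i]]
    return node[1] >= k
```

For example, `evaluate(threshold_bp(4, 2), 4, 2, [1, 0, 0, 1])` follows the path (0,0), (1,1), (2,1), (3,1), (4,2) and accepts. Because every path reads the variables in the same fixed order, this program is also an ordered binary decision diagram of the kind the abstract mentions.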
Removing Ramsey Theory: Lower Bounds With Smaller Domain Size
 Theoret. Comput. Sci
Boppana [B89] proves a lower bound separating the PRIORITY and the COMMON PRAM models that is optimal to within a constant factor. However, an essential ingredient in his proof is a problem with an enormously large input domain. In this paper, I achieve the same lower bound with the improvement that it applies even when the computational problem is defined on a much more reasonably sized input domain. My new techniques provide a greater understanding of the partial information a processor learns about the input. In addition, I define a new measure of the dependency that a function has on a variable and develop new set theoretic techniques to replace the use of Ramsey theory (which had forced the domain size to be large). 1 Introduction. Ramsey Theory has been extremely useful in proving lower bounds for problems defined on huge input domains (e.g., [B89]). Given a fixed algorithm, the input domain is restricted so that the given algorithm, when run on the restricted domain, falls wit...