Doubly Logarithmic Communication Algorithms for Optical Communication Parallel Computers
 In Proceedings of the 5th Annual ACM Symposium on Parallel Algorithms and Architectures
, 1994
Cited by 39 (5 self)
In this paper we consider the problem of interprocessor communication on parallel computers that have optical communication networks. We consider the Completely Connected Optical Communication Parallel Computer (OCPC), which has a completely connected optical network, and the Mesh of Optical Buses Parallel Computer (MOBPC), which has a mesh of optical buses as its communication network. The particular communication problem that we study is that of realizing an h-relation. In this problem, each processor has at most h messages to send and at most h messages to receive. It is clear that any 1-relation can be realized in one communication step on an OCPC. However, the best previously known p-processor OCPC algorithm for realizing an arbitrary h-relation for h > 1 requires Θ(h + log p) expected communication steps. (This algorithm is due to Valiant and is based on earlier work of Anderson and Miller.) Valiant's algorithm is optimal only for h = Ω(log p) and it is an op...
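The h-relation condition in the abstract above can be checked mechanically. The following is a minimal sketch, not code from the paper: the function names and the representation of messages as (source, destination) pairs are assumptions made for illustration.

```python
from collections import Counter


def relation_degree(messages):
    """Return the smallest h for which `messages` forms an h-relation.

    `messages` is an iterable of (source, destination) processor pairs
    (a hypothetical representation chosen for this sketch).
    """
    messages = list(messages)
    sent = Counter(src for src, _ in messages)
    received = Counter(dst for _, dst in messages)
    # h must bound both the heaviest sender and the heaviest receiver.
    return max(max(sent.values(), default=0),
               max(received.values(), default=0))


def is_h_relation(messages, h):
    """True if every processor sends at most h and receives at most h messages."""
    return relation_degree(messages) <= h
```

For example, a cyclic pattern such as `[(0, 1), (1, 2), (2, 0)]` has degree 1, so it is a 1-relation of the kind the abstract says an OCPC can realize in one communication step.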
The Random Adversary: A Lower-Bound Technique for Randomized Parallel Algorithms
 in Proc. of the 3rd SODA (ACM
, 1997
Cited by 5 (1 self)
The random-adversary technique is a general method for proving lower bounds on randomized parallel algorithms. The bounds apply to the number of communication steps, and they apply regardless of the processors' instruction sets, the lengths of messages, etc. This paper introduces the random-adversary technique and shows how it can be used to obtain lower bounds on randomized parallel algorithms for load balancing, compaction, padded sorting, and finding Hamiltonian cycles in random graphs. Using the random-adversary technique, we obtain the first lower bounds for randomized parallel algorithms which are provably faster than their deterministic counterparts (specifically, for load balancing and related problems). Key words. parallel algorithms, parallel computation, PRAM model, randomized parallel algorithms, expected time, lower bounds, load balancing. AMS subject classifications. 68Q10, 68Q22, 68Q25. PII. ...
ERCW PRAMs and Optical Communication
 in Proceedings of the European Conference on Parallel Processing, EUROPAR ’96
, 1996
Cited by 4 (2 self)
This paper presents algorithms and lower bounds for several fundamental problems on the Exclusive Read, Concurrent Write Parallel Random Access Machine (ERCW PRAM) and some results for unbounded fan-in, bounded fan-out (or 'BFO') circuits. Our results for these two models are of importance because of the close relationship of the ERCW model to the OCPC model, a model of parallel computing based on dynamically reconfigurable optical networks, and of BFO circuits to the OCPC model with limited dynamic reconfiguration ability. Topics: Parallel Algorithms, Theory of Parallel and Distributed Computing. This research was supported by Texas Advanced Research Projects Grant 003658480 (philmac@cs.utexas.edu). This research was supported in part by Texas Advanced Research Projects Grants 003658480 and 003658386, and NSF Grant CCR 9023059 (vlr@cs.utexas.edu). 1 Introduction. In this paper we develop algorithms and lower bounds for fundamental problems on the Exclusive Read Concurrent Wri...
Simple Fast Parallel Hashing by Oblivious Execution
 AT&T Bell Laboratories
, 1994
Cited by 4 (2 self)
A hash table is a representation of a set in a linear-size data structure that supports constant-time membership queries. We show how to construct a hash table for any given set of n keys in O(lg lg n) parallel time with high probability, using n processors on a weak version of a CRCW PRAM. Our algorithm uses a novel approach of hashing by "oblivious execution" based on probabilistic analysis to circumvent the parity lower bound barrier at the near-logarithmic time level. The algorithm is simple and is sketched by the following:
1. Partition the input set into buckets by a random polynomial of constant degree.
2. For t := 1 to O(lg lg n) do
   (a) Allocate M_t memory blocks, each of size K_t.
   (b) Let each bucket select a block at random, and try to injectively map its keys into the block using a random linear function. Buckets that fail carry on to the next iteration.
The crux of the algorithm is a careful a priori selection of the parameters M_t and K_t. The algorithm uses only O(lg lg...
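The two steps sketched in the abstract above can be imitated sequentially to see how the pieces fit together. The code below is only an illustrative sketch under stated assumptions, not the authors' algorithm: it runs on one processor, and the prime `P`, the polynomial degree 3, the schedules `M_t = 2n` and `K_t = 2^(t+1)`, the round cap, and all function names are hypothetical choices. The abstract says the real M_t and K_t are selected carefully and does not give them.

```python
import random

P = 2_147_483_647  # prime 2^31 - 1; assumption: every key lies in [0, P)


def random_polynomial(degree, rng):
    """Random polynomial over Z_P, evaluated with Horner's rule."""
    coeffs = [rng.randrange(P) for _ in range(degree + 1)]

    def evaluate(x):
        acc = 0
        for c in coeffs:
            acc = (acc * x + c) % P
        return acc

    return evaluate


def build_hash_table(keys, seed=0, max_rounds=40):
    """Sequential imitation of the two-step sketch; returns a membership test."""
    rng = random.Random(seed)
    keys = sorted(set(keys))
    n = max(1, len(keys))

    # Step 1: partition the keys into buckets with a constant-degree polynomial.
    poly = random_polynomial(3, rng)
    buckets = {}
    for key in keys:
        buckets.setdefault(poly(key) % n, []).append(key)

    placed = {}  # bucket id -> (a, c, K_t, cells)
    active = sorted(buckets)
    # Step 2: each round uses fresh blocks; parameters depend only on n and t.
    for t in range(1, max_rounds + 1):
        if not active:
            break
        M_t = 2 * n          # hypothetical number of blocks, fixed in advance
        K_t = 2 ** (t + 1)   # hypothetical block size, fixed in advance
        choice = {b: rng.randrange(M_t) for b in active}
        load = {}
        for block in choice.values():
            load[block] = load.get(block, 0) + 1
        still_active = []
        for b in active:
            a, c = rng.randrange(1, P), rng.randrange(P)
            cells = {((a * key + c) % P) % K_t: key for key in buckets[b]}
            injective = len(cells) == len(buckets[b])
            # A bucket succeeds only if it is alone in its block and its
            # random linear function maps its keys to distinct cells.
            if load[choice[b]] == 1 and injective:
                placed[b] = (a, c, K_t, cells)
            else:
                still_active.append(b)
        active = still_active
    if active:
        raise RuntimeError("some buckets were not placed; try another seed")

    def contains(key):
        entry = placed.get(poly(key) % n)
        if entry is None:
            return False
        a, c, K_t, cells = entry
        return cells.get(((a * key + c) % P) % K_t) == key

    return contains
```

A membership query evaluates one polynomial and one linear function, which matches the constant-time lookup the abstract describes. Blocks are stored as dictionaries here so that the growing `K_t` of this sketch does not allocate large arrays.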
Uniform Circuits and Exclusive Read PRAMs
 In Proc. of the 11th FST&TCS, number 560 in LNCS
, 1991
Cited by 3 (0 self)
CRCW-PRAMs can be characterized in terms of unbounded fan-in circuits. We introduce the notion of SELECT-gates. Combining this with the concept of an unambiguous circuit we are able to give a circuit equivalent of EREW-PRAMs, thus answering an open question of [SV84]. Moreover, circuits with SELECT-gates characterize CRCW-, CREW-, ERCW-, and EREW-PRAMs in a uniform manner. Introduction. Parallel random access machines (PRAMs) and uniform circuit families are very important models in parallel complexity theory. Other important models are alternating Turing machines and auxiliary pushdown automata (see e.g. [Ruz80, Coo85]). NC, the class of efficiently parallel solvable problems, can be characterized in terms of all these models. For the design of efficient parallel algorithms, PRAMs are the favored model, while circuits are often used to achieve more theoretical results, such as lower bounds (see e.g. [Has86]). NC is in fact a hierarchy of classes that contains in particula...
Doubly Logarithmic Time Parallel Sorting
Cited by 1 (1 self)
Recently, attempts have been made to separate the problem of parallel sorting from that of list ranking, in order to get around the well-known Ω(log n / log log n) lower bound. These approaches have been of two kinds: chain sorting and padded sorting. Here we present nearly optimal, comparison-based padded sorting algorithms that run in average-case time O(1/ε² + (1/ε) log log n) using n^(1+ε) processors and O(n^(1+ε)) space on a Common CRCW PRAM. From these results, algorithms for chain sorting within the same time and processor bounds can be easily obtained. Using a similar approach, we also give an O(1) average-case time, comparison-based algorithm for finding the largest of n items using a linear number of processors. The algorithm for finding the maximum runs on a Common CRCW PRAM using only n^(3/4) cells of shared memory. Finally, we obtain randomised algorithms for these problems that run on Common/Tolerant CRCW PRAMs, and also satisfy the above...
An Optical Simulation of Shared Memory
 In Proceedings of the 6th Annual ACM Symposium on Parallel Algorithms and Architectures
, 1994
We present a work-optimal randomized algorithm for simulating a shared memory machine (PRAM) on an optical communication parallel computer (OCPC). The OCPC model is motivated by the potential of optical communication for parallel computation. The memory of an OCPC is divided into modules, one module per processor. Each memory module only services a request on a time step if it receives exactly one memory request. Our algorithm simulates each step of an (n lg lg n)-processor EREW PRAM on an n-processor OCPC in O(lg lg n) expected delay. (The probability that the delay is longer than this is at most n^(-α) for any constant α.) The best previous simulation, due to Valiant, required Θ(lg n) expected delay. 1 Introduction. The huge bandwidth of the optical medium makes it possible to use optics to build communication networks of very high degree. Eshaghian [8, 9] first studied the computational aspects of parallel architectures with complete optical interconnection networks. The ...
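The servicing rule stated in the abstract above (a memory module services a request on a time step only if it receives exactly one request) can be modelled in a few lines. This is a minimal sketch of that rule alone, not of the simulation algorithm. The function name `ocpc_step` and the dictionary representation of requests are assumptions made for illustration.

```python
from collections import defaultdict


def ocpc_step(requests):
    """Model one time step of the OCPC servicing rule.

    `requests` maps each requesting processor to its target module
    (hypothetical representation). Returns (served, failed): `served` maps
    processors whose request was serviced to their module, and `failed`
    lists processors that must retry in a later step.
    """
    by_module = defaultdict(list)
    for proc, module in requests.items():
        by_module[module].append(proc)

    # A module services a request only when exactly one request arrives.
    served = {procs[0]: module
              for module, procs in by_module.items()
              if len(procs) == 1}
    failed = sorted(p for p in requests if p not in served)
    return served, failed
```

With `ocpc_step({0: 5, 1: 5, 2: 7})`, processors 0 and 1 collide at module 5 and fail, while processor 2 is served by module 7. Contention of this kind is what a simulation on the OCPC has to resolve.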
SIMPLE AND WORK-EFFICIENT PARALLEL ALGORITHMS FOR THE MINIMUM SPANNING TREE PROBLEM
 PARALLEL PROCESSING LETTERS
Two simple and work-efficient parallel algorithms for the minimum spanning tree problem are presented. Both algorithms perform O(m log n) work. The first algorithm runs in O(log² n) time on an EREW PRAM, while the second algorithm runs in O(log n) time on a Common CRCW PRAM.