A simple linear time algorithm for computing a (2k − 1)-spanner of O(n^{1+1/k}) size in weighted graphs
In Proceedings of the 30th International Colloquium on Automata, Languages and Programming, 2003
Cited by 28 (5 self)
... ) edges are required in the worst case for any (2k − 1)-spanner, which has been proved for k = 1, 2, 3, 5. There exist polynomial time algorithms that can construct spanners whose size matches this conjectured lower bound, and the best known algorithm takes O(mn^{1/k}) expected running time. In this paper, we present an extremely simple linear time randomized algorithm that computes a (2k − 1)-spanner of size matching the conjectured lower bound. An important feature of our algorithm is its local approach. Unlike all the previous algorithms, which require computation of shortest paths, the new algorithm merely explores the edges in the neighborhood of a vertex or a group of vertices. This feature leads to simple external-memory and parallel algorithms for computing sparse spanners, whose running times are optimal up to logarithmic factors.
Efficient PRAM Simulation on a Distributed Memory Machine
In Proceedings of the Twenty-Fourth ACM Symposium on Theory of Computing, 1992
Cited by 13 (0 self)
We present algorithms for the randomized simulation of a shared memory machine (PRAM) on a Distributed Memory Machine (DMM). In a PRAM, memory conflicts occur only through concurrent access to the same cell, whereas the memory of a DMM is divided into modules, one for each processor, and concurrent accesses to the same module create a conflict. The delay of a simulation is the time needed to simulate a parallel memory access of the PRAM. Any general simulation of an m-processor PRAM on an n-processor DMM will necessarily have delay at least m/n. A randomized simulation is called time-processor optimal if the delay is O(m/n) with high probability. Using a novel simulation scheme based on hashing, we obtain a time-processor optimal simulation with delay O(log log(n) log*(n)). The best previous simulations use a simpler scheme based on hashing and have much larger delay: Θ(log(n) / log log(n)) for the simulation of an n-processor PRAM on an n-processor DMM, and Θ(log(n)) in the case ...
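To make the notion of delay concrete, here is a minimal Python sketch of one simulated memory step under the simplest possible scheme: a single random linear hash function assigns each memory cell to a module, and the delay of the step is taken to be the largest number of distinct cells requested from any one module. This is not the scheme of the paper. The function name `step_delay`, the modulus `P`, and the choice of hash family are all assumptions made for illustration.

```python
import random
from collections import Counter

P = (1 << 61) - 1  # prime modulus for the hash family (assumption)

def step_delay(addresses, n_modules, seed=0):
    """Delay of one simulated step under a single random linear hash.

    Hypothetical helper: cells are mapped to modules by
    x -> ((a*x + b) mod P) mod n_modules, and the delay is the maximum
    number of distinct requested cells that land in the same module.
    """
    rng = random.Random(seed)
    a, b = rng.randrange(1, P), rng.randrange(P)
    # Concurrent accesses to the same cell are counted once, since only
    # accesses to the same *module* are treated as conflicts here.
    distinct_cells = set(addresses)
    load = Counter((a * x + b) % P % n_modules for x in distinct_cells)
    return max(load.values(), default=0)

if __name__ == "__main__":
    m, n = 64, 8
    print("delay:", step_delay(range(m), n), "lower bound m/n:", m // n)
```

With m distinct cells and n modules, the pigeonhole principle forces the reported delay to be at least m/n, which matches the lower bound stated in the abstract.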
Parallel Shortest Path for Arbitrary Graphs
In EUROPAR: Parallel Processing, 6th International EURO-PAR Conference. LNCS, 2000
Cited by 11 (4 self)
In spite of intensive research, no work-efficient parallel algorithm for the single source shortest path problem is known which works in sublinear time for arbitrary directed graphs with non-negative edge weights. We present an algorithm that improves this situation for graphs where the ratio d_c/Δ between the maximum weight of a shortest path d_c and a "safe step width" Δ is not too large. We show how such a step width can be found efficiently and give several graph classes which meet the above condition, such that our parallel shortest path algorithm runs in sublinear time and uses linear work. The new algorithm is even faster than a previous one which only works for random graphs with random edge weights [10]. On those graphs our new approach is faster by a factor of (log n / log log n) and achieves an expected time bound of O(log^2 n) using linear work.
1 Introduction
The single source shortest path problem (SSSP) is a fundamental and well-studied combinatorial optimizati...
Contention Resolution in Hashing Based Shared Memory Simulations
Cited by 9 (3 self)
In this paper we study the problem of simulating shared memory on the Distributed Memory Machine (DMM). Our approach uses multiple copies of shared memory cells, distributed among the memory modules of the DMM via universal hashing. Thus the main problem is to design strategies that resolve contention at the memory modules. Developing ideas from random graphs and very fast randomized algorithms, we present new simulation techniques that enable us to improve the previously best results exponentially. In particular, we show that an n-processor CRCW PRAM can be simulated by an n-processor DMM with delay O(log log log n log* n), with high probability. Next we show a general technique that can be used to turn these simulations into time-processor optimal ones when the simulated machine is an EREW PRAM. We obtain a time-processor optimal simulation of an (n log log log n log* n)-processor EREW PRAM on an n-processor DMM with O(log log log n log* n) delay. When a CRCW PRAM with (n...
Translating a planar object to maximize point containment
In Proc. 10th Annu. European Sympos. Algorithms, Lecture Notes Comput. Sci, 2002
"... Abstract. Let C be a compact set in R ..."
Retrieval of scattered information by EREW, CREW, and CRCW PRAMs
In Proc. 3rd Scand. Workshop on Alg. Theory, 1992
Cited by 5 (1 self)
The k-compaction problem arises when k out of n cells in an array are non-empty and the contents of these cells must be moved to the first k locations in the array. Parallel algorithms for k-compaction have obvious applications in processor allocation and load balancing; k-compaction is also an important subroutine in many recently developed parallel algorithms. We show that any EREW PRAM that solves the k-compaction problem requires Ω(√(log n)) time, even if the number of processors is arbitrarily large and k = 2. On the CREW PRAM, we show that every n-processor algorithm for the k-compaction problem requires Ω(log log n) time, even if k = 2. Finally, we show that O(log k) time can be achieved on the ROBUST PRAM, a very weak CRCW PRAM model.
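As an illustration of the problem statement only, the following Python sketch performs k-compaction sequentially using prefix sums over occupancy flags. It is not one of the PRAM algorithms analysed in the paper. The function name `compact` and the use of `None` as the empty marker are assumptions made for this example.

```python
from itertools import accumulate

def compact(cells, empty=None):
    """Move the non-empty cells to the front of a new array.

    Returns (compacted_array, k), where k is the number of non-empty
    cells. Each item's destination is its rank among the non-empty
    cells, obtained from an inclusive prefix sum over 0/1 flags.
    """
    flags = [0 if c is empty else 1 for c in cells]
    ranks = list(accumulate(flags))      # inclusive prefix sums
    k = ranks[-1] if ranks else 0
    out = [empty] * len(cells)
    for i, c in enumerate(cells):
        if flags[i]:
            out[ranks[i] - 1] = c        # rank r goes to index r - 1
    return out, k

if __name__ == "__main__":
    print(compact([None, "a", None, None, "b", None]))
```

This version happens to preserve the relative order of the items, which the problem as stated above does not require.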
A Note on Reducing Parallel Model Simulations to Integer Sorting
1995
Cited by 4 (3 self)
We show that simulating a step of a Fetch&Add PRAM model on an EREW PRAM model can be made as efficient as integer sorting. In particular, we present several efficient reductions of the simulation problem to various integer sorting problems. By using some recent algorithms for integer sorting, we get simulation algorithms on CREW and EREW that take o(n lg n) operations, where n is the number of processors in the simulated CRCW machine. Previous simulations used Θ(n lg n) operations. Some of the more interesting simulation results are obtained by using a bootstrapping technique with a CRCW PRAM algorithm for hashing.
1 Introduction
The concurrent-read concurrent-write (CRCW) PRAM programmer's model is commonly used for designing parallel algorithms. On the other hand, the weaker exclusive-write PRAM models are sometimes considered closer to realization. Therefore, while it is more convenient to design algorithms for the stronger CRCW model, an extra effort is sometimes neede...
Simple Fast Parallel Hashing by Oblivious Execution
AT&T Bell Laboratories, 1994
Cited by 3 (2 self)
A hash table is a representation of a set in a linear size data structure that supports constant-time membership queries. We show how to construct a hash table for any given set of n keys in O(lg lg n) parallel time with high probability, using n processors on a weak version of a CRCW PRAM. Our algorithm uses a novel approach of hashing by "oblivious execution" based on probabilistic analysis to circumvent the parity lower bound barrier at the near-logarithmic time level. The algorithm is simple and is sketched by the following:
1. Partition the input set into buckets by a random polynomial of constant degree.
2. For t := 1 to O(lg lg n) do
   (a) Allocate M_t memory blocks, each of size K_t.
   (b) Let each bucket select a block at random, and try to injectively map its keys into the block using a random linear function. Buckets that fail carry on to the next iteration.
The crux of the algorithm is a careful a priori selection of the parameters M_t and K_t. The algorithm uses only O(lg lg...
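The two steps sketched in this abstract can be mimicked by a small sequential Python simulation. It is a rough sketch under loudly stated assumptions, not the authors' algorithm: the abstract does not give the parameters M_t and K_t, so the values used here (twice the number of active buckets, and twice the squared largest bucket size) are placeholders that do not give a linear-size table. The rule that a bucket also fails when it shares its block with another bucket is likewise an assumption. All names (`build`, `contains`, `random_poly`, `P`) are hypothetical.

```python
import random

P = (1 << 61) - 1  # prime modulus; keys are assumed to be integers in [0, P)

def random_poly(degree, rng):
    """Return x -> p(x) mod P for a random polynomial p of the given degree."""
    coeffs = [rng.randrange(P) for _ in range(degree + 1)]
    def h(x):
        acc = 0
        for c in coeffs:          # Horner evaluation
            acc = (acc * x + c) % P
        return acc
    return h

def build(keys, seed=0, max_rounds=64):
    rng = random.Random(seed)
    keys = sorted(set(keys))
    n = max(1, len(keys))
    # Step 1: partition the keys into buckets with a degree-3 polynomial.
    h = random_poly(3, rng)
    active = {}
    for key in keys:
        active.setdefault(h(key) % n, []).append(key)
    placed, table = {}, {}
    # Step 2: rounds in which every still-active bucket tries one block.
    for t in range(1, max_rounds + 1):
        if not active:
            break
        m_t = 2 * len(active)                                  # placeholder M_t
        k_t = 2 * max(len(b) for b in active.values()) ** 2    # placeholder K_t
        choice = {bid: rng.randrange(m_t) for bid in active}
        load = {}
        for blk in choice.values():
            load[blk] = load.get(blk, 0) + 1
        for bid in list(active):
            blk = choice[bid]
            a, b = rng.randrange(1, P), rng.randrange(P)
            slots = {(a * key + b) % P % k_t: key for key in active[bid]}
            # Success: the bucket is alone in its block and the map is injective.
            if load[blk] == 1 and len(slots) == len(active[bid]):
                placed[bid] = (t, blk, a, b, k_t)
                for slot, key in slots.items():
                    table[(t, blk, slot)] = key
                del active[bid]
    if active:
        raise RuntimeError("some buckets were not placed; try another seed")
    return h, n, placed, table

def contains(structure, key):
    """Membership query: one bucket lookup and one slot lookup."""
    h, n, placed, table = structure
    info = placed.get(h(key) % n)
    if info is None:
        return False
    t, blk, a, b, k_t = info
    return table.get((t, blk, (a * key + b) % P % k_t)) == key
```

A query evaluates the bucket polynomial once and the bucket's linear function once, which corresponds to the constant-time membership test the abstract describes. The sketch runs the buckets one after another, whereas in the paper they act in parallel.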
Tight Bounds for Parallel Randomized Load Balancing
Computing Research Repository, 1992
Cited by 2 (0 self)
We explore the fundamental limits of distributed balls-into-bins algorithms, i.e., algorithms where balls act in parallel, as separate agents. This problem was introduced by Adler et al., who showed that non-adaptive and symmetric algorithms cannot reliably perform better than a maximum bin load of Θ(log log n / log log log n) within the same number of rounds. We present an adaptive symmetric algorithm that achieves a bin load of two in log* n + O(1) communication rounds using O(n) messages in total. Moreover, larger bin loads can be traded in for smaller time complexities. We prove a matching lower bound of (1 − o(1)) log* n on the time complexity of symmetric algorithms that guarantee small bin loads at an asymptotically optimal message complexity of O(n). The essential preconditions of the proof are (i) a limit of O(n) on the total number of messages sent by the algorithm and (ii) anonymity of bins, i.e., the port numberings of balls are not globally consistent. In order to show that our technique indeed yields tight bounds, we provide for each assumption an algorithm violating it, in turn achieving a constant maximum bin load in constant time. As an application, we consider the following problem. Given a fully connected graph of n nodes, where each node needs to send and receive up to n messages, and in each round each node may send one message over each link, deliver all messages as quickly as possible to their destinations. We give a simple and robust algorithm of time complexity O(log* n) for this task and provide a generalization to the case where all nodes initially hold arbitrary sets of messages. Completing the picture, we give a less practical, but asymptotically optimal algorithm terminating within O(1) rounds. All these bounds hold with high probability.
Efficient Parallel Computing with Memory Faults
Cited by 1 (1 self)
In this paper we show two results on PRAMs with a constant fraction of memory faults. First we show how to preprocess (i.e. connect a constant fraction of processors into a binary tree) a faulty EREW PRAM with n / log n processors and O(n) memory cells in O(log n) time. The preprocessing is a basic step of simulations from [7,9,17]. Our algorithm, together with the results from [17], gives the first fully work-optimal randomized simulations of EREW on EREW with faults with logarithmic overhead. In the second part of this paper, we consider the CRCW PRAM with memory faults. We show that (after O(log n)-time preprocessing) any algorithm for an O(n)-processor PRAM can be simulated with optimal work in O(log n) time on CRCW with memory faults. The simulation improves the result of [7]. All simulations assume static faults, i.e. that the errors are determined before the computation starts and that no new errors occur during the computation.
1 Introduction
The increasing complexity of multiproc...

