Results 1 - 10 of 17
The Power of Two Random Choices: A Survey of Techniques and Results
- in Handbook of Randomized Computing, 2000
Cited by 79 (2 self)
To motivate this survey, we begin with a simple problem that demonstrates a powerful fundamental idea. Suppose that n balls are thrown into n bins, with each ball choosing a bin independently and uniformly at random. Then the maximum load, or the largest number of balls in any bin, is approximately $\log n / \log\log n$ with high probability. Now suppose instead that the balls are placed sequentially, and each ball is placed in the least loaded of $d \ge 2$ bins chosen independently and uniformly at random. Azar, Broder, Karlin, and Upfal showed that in this case, the maximum load is $\log\log n / \log d + \Theta(1)$ with high probability [ABKU99]. The important implication of this result is that even a small amount of choice can lead to drastically different results in load balancing. Indeed, having just two random choices (i.e.,...
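The contrast between one random choice and d random choices can be checked empirically. The following is a minimal simulation sketch, not code from the survey; the function name `simulate_loads`, the seed, and the value of n are assumptions chosen for illustration, and ties between equally loaded bins are broken by taking the first candidate.

```python
import random


def simulate_loads(n, d, rng):
    """Throw n balls into n bins sequentially.

    Each ball samples d bins independently and uniformly at random
    (with replacement) and is placed in the least loaded of them.
    Returns the final list of bin loads.
    """
    loads = [0] * n
    for _ in range(n):
        candidates = [rng.randrange(n) for _ in range(d)]
        # min() returns the first candidate among equally loaded bins.
        target = min(candidates, key=lambda b: loads[b])
        loads[target] += 1
    return loads


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only so that runs are repeatable
    n = 20_000
    for d in (1, 2, 3):
        print(f"d={d}: maximum load {max(simulate_loads(n, d, rng))}")
```

With d = 1 this is the classical single-choice process; with d = 2 or more, the observed maximum load should typically be noticeably smaller, in line with the bounds quoted in the abstract.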
Packet Routing In Fixed-Connection Networks: A Survey
- 1998
Cited by 26 (3 self)
We survey routing problems on fixed-connection networks. We consider many aspects of the routing problem and provide known theoretical results for various communication models. We focus on (partial) permutation, k-relation routing, routing to random destinations, dynamic routing, isotonic routing, fault tolerant routing, and related sorting results. We also provide a list of unsolved problems and numerous references.
Efficient hashing with lookups in two memory accesses
- in: 16th SODA, ACM-SIAM
Cited by 13 (2 self)
The study of hashing is closely related to the analysis of balls and bins. Azar et al. [1] showed that if, instead of using a single hash function, we randomly hash a ball into two bins and place it in the smaller of the two, then this dramatically lowers the maximum load on bins. This leads to the concept of two-way hashing, where the largest bucket contains $O(\log\log n)$ balls with high probability. The hash lookup will now search in both the buckets an item hashes to. Since an item may be placed in one of two buckets, we could potentially move an item after it has been initially placed to reduce maximum load. Using this fact, we present a simple, practical hashing scheme that maintains a maximum load of 2, with high probability, while achieving high memory utilization. In fact, with n buckets, even if the space for two items is pre-allocated per bucket, as may be desirable in hardware implementations, more than n items can be stored, giving a high memory utilization. Assuming truly random hash functions, we prove the following properties for our hashing scheme.
• Each lookup takes two random memory accesses, and reads at most two items per access.
• Each insert takes $O(\log n)$ time and up to $\log\log n + O(1)$ moves, with high probability, and constant time in expectation.
• Maintains 83.75% memory utilization, without requiring dynamic allocation during inserts.
We also analyze the trade-off between the number of moves performed during inserts and the maximum load on a bucket. By performing at most h moves, we can maintain a maximum load of $O\left(\frac{\log\log n}{h \log(\log\log n / h)}\right)$. So, even by performing one move, we achieve a better bound than by performing no moves at all.
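The basic two-way hashing idea described at the start of this abstract (insert into the emptier of two candidate buckets, look up in both) can be sketched as follows. This is only an illustration under stated assumptions: it omits the paper's item moves and fixed two-slot buckets, and the class name `TwoChoiceTable`, its methods, and the salted use of Python's built-in `hash` are hypothetical stand-ins for truly random hash functions.

```python
import random


class TwoChoiceTable:
    """Set of keys stored with two-way hashing (no item moves).

    Assumption: two salted calls to Python's built-in hash() stand in
    for the two independent, truly random hash functions.
    """

    def __init__(self, num_buckets, seed=0):
        rng = random.Random(seed)
        self.num_buckets = num_buckets
        self.salts = (rng.getrandbits(64), rng.getrandbits(64))
        self.buckets = [[] for _ in range(num_buckets)]

    def _candidates(self, key):
        # The two buckets this key may live in (they can coincide).
        return [hash((salt, key)) % self.num_buckets for salt in self.salts]

    def contains(self, key):
        # A lookup inspects only the two candidate buckets.
        return any(key in self.buckets[b] for b in self._candidates(key))

    def insert(self, key):
        if self.contains(key):
            return
        first, second = self._candidates(key)
        # Place the key in the less loaded bucket; ties go to the first.
        if len(self.buckets[first]) <= len(self.buckets[second]):
            self.buckets[first].append(key)
        else:
            self.buckets[second].append(key)

    def max_load(self):
        return max(len(bucket) for bucket in self.buckets)


if __name__ == "__main__":
    table = TwoChoiceTable(10_000)
    for key in range(10_000):
        table.insert(key)
    print("maximum bucket load:", table.max_load())
```

Moving items between their two candidate buckets after insertion, which is how the paper brings the maximum load down to 2, is deliberately left out of this sketch.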
Contention Resolution in Hashing Based Shared Memory Simulations
Cited by 9 (3 self)
In this paper we study the problem of simulating shared memory on the Distributed Memory Machine (DMM). Our approach uses multiple copies of shared memory cells, distributed among the memory modules of the DMM via universal hashing. Thus the main problem is to design strategies that resolve contention at the memory modules. Developing ideas from random graphs and very fast randomized algorithms, we present new simulation techniques that enable us to improve the previously best results exponentially. In particular, we show that an n-processor CRCW PRAM can be simulated by an n-processor DMM with delay $O(\log\log\log n \, \log^* n)$, with high probability. Next we show a general technique that can be used to turn these simulations into time-processor optimal ones when the simulated machine is an EREW PRAM. We obtain a time-processor optimal simulation of an $(n \log\log\log n \, \log^* n)$-processor EREW PRAM on an n-processor DMM with $O(\log\log\log n \, \log^* n)$ delay. When a CRCW PRAM with (n...
On the effectiveness of D-BSP as a bridging model of parallel computation
- In Proc. of the Int. Conference on Computational Science, LNCS 2074, 2001
Cited by 8 (3 self)
This paper surveys and places into perspective a number of results concerning the D-BSP (Decomposable Bulk Synchronous Parallel) model of computation, a variant of the popular BSP model proposed by Valiant in the early nineties. D-BSP captures part of the proximity structure of the computing platform, modeling it by suitable decompositions into clusters, each characterized by its own bandwidth and latency parameters. Quantitative evidence is provided that, when modeling realistic parallel architectures, D-BSP achieves higher effectiveness and portability than BSP, without significantly affecting the ease of use. It is also shown that D-BSP avoids some of the shortcomings of BSP which motivated the definition of other variants of the model. Finally, the paper discusses how the aspects of network proximity incorporated in the model allow for better management of network congestion and bank contention when supporting a shared-memory abstraction in a distributed-memory environment.
Shared-Memory Simulations on a Faulty-Memory DMM
- 1996
Cited by 6 (1 self)
this paper are synchronous, and the time performance is our major efficiency criterion. We consider a DMM with faulty memory words, otherwise everything is assumed to be operational. In particular the communication between the processors and the MUs is reliable, and a processor may always attempt to obtain an access to any MU, and, having been granted it, may access any memory word in it, even if all of them are faulty. The only restriction on the distribution of faults among memory words is that their total number is bounded from above by a fraction of the total number of memory words in all the MUs. In particular, some MUs may contain only operational cells, some only faulty cells, and some mixed cells. This report presents fast simulations of the PRAM on a DMM with faulty memory.
Simulating shared memory in real time: On the computation power of reconfigurable meshes
- in Proceedings of the 2nd IEEE Workshop on Reconfigurable Architectures, 1995
Cited by 6 (1 self)
We consider randomized simulations of shared memory on a distributed memory machine (DMM) where the n processors and the n memory modules of the DMM are connected via a reconfigurable architecture. We first present a randomized simulation of a CRCW PRAM on a reconfigurable DMM having a complete reconfigurable interconnection. It guarantees delay $O(\log^* n)$, with high probability. Next we study a reconfigurable mesh DMM (RM-DMM). Here the n processors and n modules are connected via an $n \times n$ reconfigurable mesh. It was already known that an $n \times m$ reconfigurable mesh can simulate in constant time an n-processor CRCW PRAM with shared memory of size m. In this paper we present a randomized step-by-step simulation of a CRCW PRAM with arbitrarily large shared memory on an RM-DMM. It guarantees constant delay with high probability, i.e., it simulates in real time. Finally we prove a lower bound showing that size $\Omega(n^2)$ for the reconfigurable mesh is necessary for real-time simulations. © 1997 Academic Press. Supported by DFG-Graduiertenkolleg "Parallele Rechnernetzwerke in der Produktionstechnik".
The Complexity of Deterministic PRAM Simulation on Distributed Memory Machines
- 1997
Cited by 5 (5 self)
In this paper we present lower and upper bounds for the deterministic simulation of a Parallel Random Access Machine (PRAM) with n processors and m variables on a Distributed Memory Machine (DMM) with $p \le n$ processors. The bounds are expressed as a function of the redundancy r of the scheme (i.e., the number of copies used to represent each PRAM variable in the DMM), and become tight for any m polynomial in n and $r = \Theta(1)$.
Constructive, Deterministic Implementation of Shared Memory on Meshes
- SIAM Journal on Computing
Cited by 3 (3 self)
This paper describes a scheme to implement a shared address space of size m on an n-node mesh, with m polynomial in n, where each mesh node hosts a processor and a memory module. At the core of the simulation is a Hierarchical Memory Organization Scheme (HMOS), which governs the distribution of the shared variables, each replicated into multiple copies, among the memory modules, through a cascade of bipartite graphs. Based on the expansion properties of such graphs, we devise a protocol that accesses any n-tuple of shared variables in worst-case time $O(n^{1/2+\eta})$, for any constant $\eta > 0$, using $O(1/\eta^{1.59})$ copies per variable, or in worst-case time $O(n^{1/2} \log n)$, using $O(\log^{1.59} n)$ copies per variable. In both cases the access time is close to the natural $O(\sqrt{n})$ lower bound imposed by the network diameter. A key feature of the scheme is that it can be made fully constructive when m is not too ...
Constant Thinning Protocol for Routing h-Relations in Complete Networks
- 1998
Cited by 2 (2 self)
We propose a simple protocol, called the constant thinning protocol, for routing in a complete network under the OCPC assumption, analyze it, and compare it with some other routing protocols.

