Results 1  10
of
66
Randomized Algorithms
, 1995
"... Randomized algorithms, once viewed as a tool in computational number theory, have by now found widespread application. Growth has been fueled by the two major benefits of randomization: simplicity and speed. For many applications a randomized algorithm is the fastest algorithm available, or the simp ..."
Abstract

Cited by 1923 (39 self)
 Add to MetaCart
Randomized algorithms, once viewed as a tool in computational number theory, have by now found widespread application. Growth has been fueled by the two major benefits of randomization: simplicity and speed. For many applications a randomized algorithm is the fastest algorithm available, or the simplest, or both. A randomized algorithm is an algorithm that uses random numbers to influence the choices it makes in the course of its computation. Thus its behavior (typically quantified as running time or quality of output) varies from
Direct BulkSynchronous Parallel Algorithms
 JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING
, 1992
"... We describe a methodology for constructing parallel algorithms that are transportable among parallel computers having different numbers of processors, different bandwidths of interprocessor communication and different periodicity of global synchronisation. We do this for the bulksynchronous paralle ..."
Abstract

Cited by 171 (27 self)
 Add to MetaCart
We describe a methodology for constructing parallel algorithms that are transportable among parallel computers having different numbers of processors, different bandwidths of interprocessor communication and different periodicity of global synchronisation. We do this for the bulksynchronous parallel (BSP) model, which abstracts the characteristics of a parallel machine into three numerical parameters p, g, and L, corresponding to processors, bandwidth, and periodicity respectively. The model differentiates memory that is local to a processor from that which is not, but, for the sake of universality, does not differentiate network proximity. The advantages of this model in supporting shared memory or PRAM style programming have been treated elsewhere. Here we emphasise the viability of an alternative direct style of programming where, for the sake of efficiency the programmer retains control of memory allocation. We show that optimality to within a multiplicative factor close to one ca...
merging, and sorting in parallel models of computation
 in “Proc. 14th Annual ACM Sympos. on Theory of Cornput
, 1982
"... A variety of models have been proposed for the study of synchronous parallel computation. These models are reviewed and some prototype problems are studied further. Two classes of models are recognized, fixed connection networks and models based on a shared memory. Routing and sorting are prototype ..."
Abstract

Cited by 106 (3 self)
 Add to MetaCart
(Show Context)
A variety of models have been proposed for the study of synchronous parallel computation. These models are reviewed and some prototype problems are studied further. Two classes of models are recognized, fixed connection networks and models based on a shared memory. Routing and sorting are prototype problems for the networks; in particular, they provide the basis for simulating the more powerful shared memory models. It is shown that a simple but important class of deterministic strategies (oblivious routing) is necessarily inefficient with respect to worst case analysis. Routing can be viewed as a special case of sorting, and the existence of an O(log n) sorting algorithm for some n processor fixed connection network has only recently been established by Ajtai, Komlos, and Szemeredi (“15th ACM Sympos. on Theory of Cornput., ” Boston, Mass., 1983, pp. l9). If the more powerful class of shared memory models is considered then it is possible to simply achieve an O(log n loglog n) sort via Valiant’s parallel merging algorithm, which it is shown can be implemented on certain models. Within a spectrum of shared memory models, it is shown that loglogn is asymptotically optimal for n processors to merge two sorted lists containing n elements. 0 1985 Academic Press, Inc.
The Power of Two Random Choices: A Survey of Techniques and Results
 in Handbook of Randomized Computing
, 2000
"... ITo motivate this survey, we begin with a simple problem that demonstrates a powerful fundamental idea. Suppose that n balls are thrown into n bins, with each ball choosing a bin independently and uniformly at random. Then the maximum load, or the largest number of balls in any bin, is approximately ..."
Abstract

Cited by 102 (2 self)
 Add to MetaCart
ITo motivate this survey, we begin with a simple problem that demonstrates a powerful fundamental idea. Suppose that n balls are thrown into n bins, with each ball choosing a bin independently and uniformly at random. Then the maximum load, or the largest number of balls in any bin, is approximately log n= log log n with high probability. Now suppose instead that the balls are placed sequentially, and each ball is placed in the least loaded of d 2 bins chosen independently and uniformly at random. Azar, Broder, Karlin, and Upfal showed that in this case, the maximum load is log log n= log d + (1) with high probability [ABKU99]. The important implication of this result is that even a small amount of choice can lead to drastically different results in load balancing. Indeed, having just two random choices (i.e.,...
Randomized routing and sorting on fixedconnection networks
 JOURNAL OF ALGORITHMS
, 1994
"... This paper presents a general paradigm for the design of packet routing algorithms for fixedconnection networks. Its basis is a randomized online algorithm for scheduling any set of N packets whose paths have congestion c on any boundeddegree leveled network with depth L in O(c + L + log N) steps ..."
Abstract

Cited by 89 (13 self)
 Add to MetaCart
(Show Context)
This paper presents a general paradigm for the design of packet routing algorithms for fixedconnection networks. Its basis is a randomized online algorithm for scheduling any set of N packets whose paths have congestion c on any boundeddegree leveled network with depth L in O(c + L + log N) steps, using constantsize queues. In this paradigm, the design of a routing algorithm is broken into three parts: (1) showing that the underlying network can emulate a leveled network, (2) designing a path selection strategy for the leveled network, and (3) applying the scheduling algorithm. This strategy yields randomized algorithms for routing and sorting in time proportional to the diameter for meshes, butterflies, shuffleexchange graphs, multidimensional arrays, and hypercubes. It also leads to the construction of an areauniversal network: an Nnode network with area Θ(N) that can simulate any other network of area O(N) with slowdown O(log N).
Models of Machines and Computation for Mapping in Multicomputers
, 1993
"... It is now more than a quarter of a century since researchers started publishing papers on mapping strategies for distributing computation across the computation resource of multiprocessor systems. There exists a large body of literature on the subject, but there is no commonlyaccepted framework ..."
Abstract

Cited by 80 (1 self)
 Add to MetaCart
It is now more than a quarter of a century since researchers started publishing papers on mapping strategies for distributing computation across the computation resource of multiprocessor systems. There exists a large body of literature on the subject, but there is no commonlyaccepted framework whereby results in the field can be compared. Nor is it always easy to assess the relevance of a new result to a particular problem. Furthermore, changes in parallel computing technology have made some of the earlier work of less relevance to current multiprocessor systems. Versions of the mapping problem are classified, and research in the field is considered in terms of its relevance to the problem of programming currently available hardware in the form of a distributed memory multiple instruction stream multiple data stream computer: a multicomputer.
Randomized Routing on FatTrees
 Advances in Computing Research
, 1996
"... Fattrees are a class of routing networks for hardwareefficient parallel computation. This paper presents a randomized algorithm for routing messages on a fattree. The quality of the algorithm is measured in terms of the load factor of a set of messages to be routed, which is a lower bound on the ..."
Abstract

Cited by 51 (11 self)
 Add to MetaCart
(Show Context)
Fattrees are a class of routing networks for hardwareefficient parallel computation. This paper presents a randomized algorithm for routing messages on a fattree. The quality of the algorithm is measured in terms of the load factor of a set of messages to be routed, which is a lower bound on the time required to deliver the messages. We show that if a set of messages has load factor on a fattree with n processors, the number of delivery cycles (routing attempts) that the algorithm requires is O(+lg n lg lg n) with probability 1 \Gamma O(1=n). The best previous bound was O( lg n) for the offline problem in which the set of messages is known in advance. In the context of a VLSI model that equates hardware cost with physical volume, the routing algorithm can be used to demonstrate that fattrees are universal routing networks. Specifically, we prove that any routing network can be efficiently simulated by a fattree of comparable hardware cost. 1 Introduction Fattrees constitute...
Fast Algorithms for BitSerial Routing on a Hypercube
, 1991
"... In this paper, we describe an O(log N)bitstep randomized algorithm for bitserial message routing on a hypercube. The result is asymptotically optimal, and improves upon the best previously known algorithms by a logarithmic factor. The result also solves the problem of online circuit switching in ..."
Abstract

Cited by 37 (10 self)
 Add to MetaCart
In this paper, we describe an O(log N)bitstep randomized algorithm for bitserial message routing on a hypercube. The result is asymptotically optimal, and improves upon the best previously known algorithms by a logarithmic factor. The result also solves the problem of online circuit switching in an O(1)dilated hypercube (i.e., the problem of establishing edgedisjoint paths between the nodes of the dilated hypercube for any onetoone mapping). Our algorithm is adaptive and we show that this is necessary to achieve the logarithmic speedup. We generalize the BorodinHopcroft lower bound on oblivious routing by proving that any randomized oblivious algorithm on a polylogarithmic degree network requires at least \Omega\Gammaast 2 N= log log N) bit steps with high probability for almost all permutations. 1 Introduction Substantial effort has been devoted to the study of storeandforward packet routing algorithms for hypercubic networks. The fastest algorithms are randomized, and c...
An optical simulation of shared memory
, 1994
"... We present a workoptimal randomized algorithm for simulating a shared memory machine (pram) on an optical communication parallel computer (ocpc). The ocpc model is motivated by the potential of optical communication for parallel computation. The memory of an ocpc is divided into modules, one module ..."
Abstract

Cited by 35 (3 self)
 Add to MetaCart
We present a workoptimal randomized algorithm for simulating a shared memory machine (pram) on an optical communication parallel computer (ocpc). The ocpc model is motivated by the potential of optical communication for parallel computation. The memory of an ocpc is divided into modules, one module per processor. Each memory module only services a request on a timestep if it receives exactly one memory request. Our algorithm simulates each step of an n lg lg nprocessor erew pram on an nprocessor ocpc in O(lg lg n) expected delay. (The probability that the delay is longer than this is at most n; for any constant.) The best previous simulation, due to Valiant, required (lg n) expected delay.