Results 11 - 20 of 152
Parallel Randomized Load Balancing
In Symposium on Theory of Computing. ACM, 1995
Cited by 51 (8 self)
It is well known that after placing n balls independently and uniformly at random into n bins, the fullest bin holds Θ(log n / log log n) balls with high probability. Recently, Azar et al. analyzed the following: randomly choose d bins for each ball, and then sequentially place each ball in the least full of its chosen bins [2]. They show that the fullest bin contains only log log n / log d + Θ(1) balls with high probability. We explore extensions of this result to parallel and distributed settings. Our results focus on the tradeoff between the amount of communication and the final load. Given r rounds of communication, we provide lower bounds on the maximum load of Ω((log n / log log n)^(1/r)) for a wide class of strategies. Our results extend to the case where the number of rounds is allowed to grow with n. We then demonstrate parallelizations of the sequential strategy presented in Azar et al. that achieve loads within a constant factor of the lower bound for two ...
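The sequential d-choice strategy described in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `allocate`, its parameters, and the decision to sample the d candidate bins with replacement are illustrative choices, not taken from the paper.

```python
import random


def allocate(n_balls, n_bins, d, seed=0):
    """Place balls one at a time; each ball goes to the least full of d random bins.

    Assumption: the d candidate bins are drawn uniformly with replacement.
    With d = 1 this reduces to the classical single-choice process.
    """
    rng = random.Random(seed)
    loads = [0] * n_bins
    for _ in range(n_balls):
        candidates = [rng.randrange(n_bins) for _ in range(d)]
        # Ties are broken in favour of the first sampled candidate.
        target = min(candidates, key=loads.__getitem__)
        loads[target] += 1
    return loads


if __name__ == "__main__":
    n = 20_000
    for d in (1, 2, 3):
        print(f"d={d}: max load {max(allocate(n, n, d, seed=1))}")
```

Running the script for d = 1, 2, 3 should show the maximum load dropping sharply between one and two choices, in line with the Θ(log n / log log n) versus log log n / log d + Θ(1) bounds quoted above.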
Balanced Allocations: The Heavily Loaded Case
2006
Cited by 51 (6 self)
We investigate balls-into-bins processes allocating m balls into n bins based on the multiple-choice paradigm. In the classical single-choice variant each ball is placed into a bin selected uniformly at random. In a multiple-choice process each ball can be placed into one out of d ≥ 2 randomly selected bins. It is known that in many scenarios having more than one choice for each ball can improve the load balance significantly. Formal analyses of this phenomenon prior to this work considered mostly the lightly loaded case, that is, when m ≈ n. In this paper we present the first tight analysis in the heavily loaded case, that is, when m ≫ n rather than m ≈ n. The best previously known results for the multiple-choice processes in the heavily loaded case were obtained using majorization by the single-choice process. This yields an upper bound on the maximum load of m/n + O(√(m ln n / n)) with high probability. We show, however, that the multiple-choice processes are fundamentally different from the single-choice variant in that they have “short memory.” The great consequence of this property is that the deviation of the multiple-choice processes from the optimal allocation (that is, the allocation in which each bin has either ⌊m/n⌋ or ⌈m/n⌉ balls) does not increase with the number of balls as in the case of the single-choice process. In particular, we investigate the allocation obtained by two different multiple-choice allocation schemes,
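The claim that the deviation from the optimal allocation does not grow with m can be checked empirically. The sketch below tracks the gap (maximum load minus the average load m/n) at several checkpoints as balls keep arriving. The function name `gap_trace`, the checkpoint list, and the greedy least-loaded rule with candidates drawn with replacement are assumptions made for illustration; the abstract does not specify which two schemes the paper analyzes.

```python
import random


def gap_trace(n_bins, balls_per_bin_steps, d, seed=0):
    """Return max load minus average load after each checkpoint.

    `balls_per_bin_steps` is an increasing list such as [10, 100, 1000];
    checkpoint k is reached once k * n_bins balls have been placed.
    """
    rng = random.Random(seed)
    loads = [0] * n_bins
    placed = 0
    gaps = []
    for k in balls_per_bin_steps:
        while placed < k * n_bins:
            choices = [rng.randrange(n_bins) for _ in range(d)]
            best = min(choices, key=loads.__getitem__)
            loads[best] += 1
            placed += 1
        gaps.append(max(loads) - placed / n_bins)
    return gaps


if __name__ == "__main__":
    steps = [10, 100, 1000]
    print("single choice:", gap_trace(50, steps, d=1, seed=3))
    print("two choices:  ", gap_trace(50, steps, d=2, seed=3))
```

With a single choice the gap is expected to keep growing roughly like √(m ln n / n), while with two choices it should stay small at every checkpoint.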
Fast hash table lookup using extended Bloom filter: an aid to network processing
In ACM SIGCOMM, 2005
On the Analysis of Randomized Load Balancing Schemes
In Proceedings of the 9th Annual ACM Symposium on Parallel Algorithms and Architectures, 1998
Cited by 48 (7 self)
It is well known that simple randomized load balancing schemes can balance load effectively while incurring only a small overhead, making such schemes appealing for practical systems. In this paper, we provide new analyses for several such dynamic randomized load balancing schemes. Our work extends a previous analysis of the supermarket model, a model that abstracts a simple, efficient load balancing scheme in the setting where jobs arrive at a large system of parallel processors. In this model, customers arrive at a system of n servers as a Poisson stream of rate λn, λ < 1, with service requirements exponentially distributed with mean 1. Each customer chooses d servers independently and uniformly at random from the n servers, and is served according to the First In First Out (FIFO) protocol at the choice with the fewest customers. For the supermarket model, it has been shown that using d = 2 choices yields an exponential improvement in the expected time a customer spends in the syst...
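The supermarket model described above can be simulated as a continuous-time Markov chain. The following is a rough sketch, not the paper's analysis: the function name `mean_time_in_system`, the finite `horizon`, the empty initial state, and the use of uniformization plus Little's law to estimate the mean time in system are all assumptions introduced here.

```python
import random


def mean_time_in_system(n, lam, d, horizon=200.0, seed=0):
    """Estimate the mean time a customer spends in the supermarket model.

    Uniformization: events occur at total rate n * (lam + 1). An event is an
    arrival with probability lam / (lam + 1); otherwise a uniformly chosen
    server completes a job if it has one (rate-1 exponential service).
    Because service is memoryless, queue lengths alone describe the state.
    """
    rng = random.Random(seed)
    queues = [0] * n
    total = 0        # customers currently in the system
    area = 0.0       # integral of `total` over time
    t = 0.0
    event_rate = n * (lam + 1.0)
    while t < horizon:
        dt = rng.expovariate(event_rate)
        area += total * dt
        t += dt
        if rng.random() < lam / (lam + 1.0):
            # Arrival: join the shortest of d uniformly chosen queues.
            choices = [rng.randrange(n) for _ in range(d)]
            best = min(choices, key=queues.__getitem__)
            queues[best] += 1
            total += 1
        else:
            # Potential departure at a uniformly chosen server.
            s = rng.randrange(n)
            if queues[s] > 0:
                queues[s] -= 1
                total -= 1
    # Little's law: mean time = (time-average number in system) / arrival rate.
    return area / t / (lam * n)


if __name__ == "__main__":
    for d in (1, 2):
        print(f"d={d}: {mean_time_in_system(100, 0.9, d, seed=2):.2f}")
```

Because the run starts from an empty system and stops at a finite horizon, the estimates are biased low, more so for d = 1 at high λ. The qualitative effect, a much shorter time in system with d = 2, should still be visible.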
A Stochastic Process on the Hypercube with Applications to Peer-to-Peer Networks (Extended Abstract)
In Proceedings STOC, 2003
Cited by 48 (2 self)
Micah Adler, Department of Computer Science, University of Massachusetts, Amherst, MA 01003-4610, USA, micah@cs.umass.edu. Eran Halperin, Science Institute and Computer Science Division, University of California, Berkeley, CA 94720.
The Content-Addressable Network D2B
2003
Cited by 45 (1 self)
A content-addressable network (CAN) is a distributed lookup table that can be used to implement peer-to-peer (P2P) systems. A CAN allows the discovery and location of data and/or resources, identified by keys, in a distributed network (e.g., the Internet), in the absence of a centralized server or any hierarchical organization. Several networks have been recently described in the literature, and some of them have led to the development of experimental systems. We present a new CAN, called d2b. Its main characteristics are simplicity, provability, and scalability. d2b allows the number of nodes n to vary between 1 and |K|, where K is the set of keys managed by the network. In terms of performance, any join or leave of a user implies a constant expected number of link modifications, and, with high probability (w.h.p.), at most O(log n) link modifications.
A Generic Scheme for Building Overlay Networks in Adversarial Scenarios
In Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS), 2003
Cited by 45 (5 self)
This paper presents a generic scheme for a central, yet untackled issue in dynamic overlay networks: maintaining stability over a long lifetime and against malicious adversaries. The generic scheme maintains desirable properties of the underlying structure, including low diameter and an efficient routing mechanism, as well as balanced node dispersal. These desired properties are maintained in a decentralized manner without resorting to global updates or periodic stabilization protocols, even against an adaptive adversary that controls the arrival and departure of nodes.
Approximate Equilibria and Ball Fusion
Theory of Computing Systems, 2002
Cited by 45 (21 self)
We consider selfish routing over a network consisting of m parallel links through which n selfish users route their traffic trying to minimize their own expected latency. We study the class of mixed strategies in which the expected latency through each link is at most a constant multiple of the optimum maximum latency had global regulation been available. For the case of uniform links it is known that all Nash equilibria belong to this class of strategies. We are interested in bounding the coordination ratio (or price of anarchy) of these strategies, defined as the worst-case ratio of the maximum (over all links) expected latency over the optimum maximum latency. The load balancing aspect of the problem immediately implies a lower bound of Ω(ln m / ln ln m) on the coordination ratio. We give a tight (up to a multiplicative constant) upper bound. To show the upper bound, we analyze a variant of the classical balls and bins problem, in which balls with arbitrary weights are placed into bins according to arbitrary probability distributions. At the heart of our approach is a new probabilistic tool that we call
Fast Concurrent Access to Parallel Disks
In 11th ACM-SIAM Symposium on Discrete Algorithms, 1999
Cited by 44 (11 self)
High performance applications involving large data sets require the efficient and flexible use of multiple disks. In an external memory machine with D parallel, independent disks, only one block can be accessed on each disk in one I/O step. This restriction leads to a load balancing problem that is perhaps the main inhibitor for adapting single-disk external memory algorithms to multiple disks. This paper shows that this problem can be solved efficiently using a combination of randomized placement, redundancy and an optimal scheduling algorithm. A buffer of O(D) blocks suffices to support efficient writing of arbitrary blocks if blocks are distributed uniformly at random to the disks (e.g., by hashing). If two randomly allocated copies of each block exist, N arbitrary blocks can be read within ⌈N/D⌉ + 1 I/O steps with high probability. In addition, the redundancy can be reduced from 2 to 1 + 1/r for any integer r. These results can be used to emulate the simple and powerful "single-disk multi-head" model of external computing [1] on the physically more realistic independent disk model [33] with small constant overhead. This is faster than a lower bound for deterministic emulation [3].
Interpreting Stale Load Information
IEEE Transactions on Parallel and Distributed Systems, 1999
Cited by 37 (0 self)
In this paper we examine the problem of balancing load in a large-scale distributed system when information about server loads may be stale. It is well known that sending each request to the machine with the apparent lowest load can behave badly in such systems, yet this technique is common in practice. Other systems use round-robin or random selection algorithms that entirely ignore load information or that only use a small subset of the load information. Rather than risk extremely bad performance on one hand or ignore the chance to use load information to improve performance on the other, we develop strategies that interpret load information based on its age. Through simulation, we examine several simple algorithms that use such load interpretation strategies under a range of workloads. Our experiments suggest that by properly interpreting load information, systems can (1) match the performance of the most aggressive algorithms when load information is fresh relative to the...

