Results 11-20 of 189
BALANCED ALLOCATIONS: THE HEAVILY LOADED CASE
, 2006
Cited by 57 (7 self)
We investigate balls-into-bins processes allocating m balls into n bins based on the multiple-choice paradigm. In the classical single-choice variant each ball is placed into a bin selected uniformly at random. In a multiple-choice process each ball can be placed into one out of d ≥ 2 randomly selected bins. It is known that in many scenarios having more than one choice for each ball can improve the load balance significantly. Formal analyses of this phenomenon prior to this work considered mostly the lightly loaded case, that is, when m ≈ n. In this paper we present the first tight analysis in the heavily loaded case, that is, when m ≫ n rather than m ≈ n. The best previously known results for the multiple-choice processes in the heavily loaded case were obtained using majorization by the single-choice process. This yields an upper bound of the maximum load of bins of m/n + O(√(m ln n / n)) with high probability. We show, however, that the multiple-choice processes are fundamentally different from the single-choice variant in that they have “short memory.” The great consequence of this property is that the deviation of the multiple-choice processes from the optimal allocation (that is, the allocation in which each bin has either ⌊m/n⌋ or ⌈m/n⌉ balls) does not increase with the number of balls as in the case of the single-choice process. In particular, we investigate the allocation obtained by two different multiple-choice allocation schemes,
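The contrast between the single-choice and multiple-choice processes described in this abstract can be explored with a small simulation. The following is a minimal sketch, not the paper's method: the function names `allocate` and `gap`, the greedy least-loaded rule, sampling the d bins with replacement, and the parameters m = 10,000 and n = 100 are all assumptions made for illustration.

```python
import random


def allocate(m, n, d, seed=0):
    """Place m balls into n bins; each ball goes to the least loaded of d random bins."""
    rng = random.Random(seed)
    loads = [0] * n
    for _ in range(m):
        # Assumption: the d candidate bins are drawn uniformly with replacement.
        candidates = [rng.randrange(n) for _ in range(d)]
        # Greedy rule (assumed): the first least-loaded candidate receives the ball.
        target = min(candidates, key=lambda b: loads[b])
        loads[target] += 1
    return loads


def gap(loads):
    """Deviation of the maximum load from the average load m/n."""
    return max(loads) - sum(loads) / len(loads)


if __name__ == "__main__":
    m, n = 10_000, 100  # heavily loaded regime: m is much larger than n
    for d in (1, 2):
        print(f"d={d}: gap above m/n = {gap(allocate(m, n, d, seed=1)):.1f}")
```

With d = 1 the function reduces to the classical single-choice process, so running it for d = 1 and d = 2 on the same m and n gives a rough feel for how the deviation from m/n differs between the two.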
A stochastic process on the hypercube with applications to peer-to-peer networks
 Proc. STOC 2003
Cited by 57 (2 self)
Consider the following stochastic process executed on a graph G = (V, E) whose nodes are initially uncovered. In each step, pick a node at random and if it is uncovered, cover it. Otherwise, if it has an uncovered neighbor, cover a random uncovered neighbor. Else, do nothing. This can be viewed as a structured coupon collector process. We show that for a large family of graphs, O(n) steps suffice to cover all nodes of the graph with high probability, where n is the number of vertices. Among these graphs are d-regular graphs with d = Ω(log n log log n), random d-regular graphs with d = Ω(log n) and the k-dimensional hypercube where n = 2^k. This process arises naturally in answering a question on load balancing in peer-to-peer networks. We consider a distributed hash table in which keys are partitioned across a set of processors, and we assume that the number of processors
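The covering process in this abstract is simple enough to simulate directly. Below is a minimal sketch on the k-dimensional hypercube, where nodes are the integers 0 to 2^k - 1 and neighbors differ in exactly one bit. The names `hypercube_neighbors` and `cover_steps` are hypothetical, and the code only counts steps until every node is covered; it does not reproduce the paper's analysis.

```python
import random


def hypercube_neighbors(node, k):
    """Neighbors of a node in the k-dimensional hypercube: flip one bit at a time."""
    return [node ^ (1 << bit) for bit in range(k)]


def cover_steps(k, seed=0):
    """Run the covering process until all 2**k nodes are covered; return the step count."""
    rng = random.Random(seed)
    n = 1 << k
    covered = [False] * n
    remaining = n
    steps = 0
    while remaining > 0:
        steps += 1
        v = rng.randrange(n)  # pick a node uniformly at random
        if not covered[v]:
            covered[v] = True
            remaining -= 1
            continue
        # The picked node is already covered: try a random uncovered neighbor.
        uncovered = [u for u in hypercube_neighbors(v, k) if not covered[u]]
        if uncovered:
            covered[rng.choice(uncovered)] = True
            remaining -= 1
        # Otherwise this step does nothing.
    return steps


if __name__ == "__main__":
    for k in (4, 8, 12):
        n = 1 << k
        print(f"k={k}: n={n}, steps={cover_steps(k, seed=1)}, steps/n={cover_steps(k, seed=1) / n:.2f}")
```

Printing steps/n for growing k is one way to check empirically whether the ratio stays bounded, as the O(n) claim suggests.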
Fast hash table lookup using extended Bloom filter: an aid to network processing
 In ACM SIGCOMM
, 2005
Parallel Randomized Load Balancing
 In Symposium on Theory of Computing. ACM
, 1995
Cited by 56 (8 self)
It is well known that after placing n balls independently and uniformly at random into n bins, the fullest bin holds Θ(log n / log log n) balls with high probability. Recently, Azar et al. analyzed the following: randomly choose d bins for each ball, and then sequentially place each ball in the least full of its chosen bins [2]. They show that the fullest bin contains only log log n / log d + Θ(1) balls with high probability. We explore extensions of this result to parallel and distributed settings. Our results focus on the tradeoff between the amount of communication and the final load. Given r rounds of communication, we provide lower bounds on the maximum load of Ω((log n / log log n)^(1/r)) for a wide class of strategies. Our results extend to the case where the number of rounds is allowed to grow with n. We then demonstrate parallelizations of the sequential strategy presented in Azar et al. that achieve loads within a constant factor of the lower bound for two ...
On the Analysis of Randomized Load Balancing Schemes
 In Proceedings of the 9th Annual ACM Symposium on Parallel Algorithms and Architectures
, 1998
Cited by 55 (7 self)
It is well known that simple randomized load balancing schemes can balance load effectively while incurring only a small overhead, making such schemes appealing for practical systems. In this paper, we provide new analyses for several such dynamic randomized load balancing schemes. Our work extends a previous analysis of the supermarket model, a model that abstracts a simple, efficient load balancing scheme in the setting where jobs arrive at a large system of parallel processors. In this model, customers arrive at a system of n servers as a Poisson stream of rate λn, λ < 1, with service requirements exponentially distributed with mean 1. Each customer chooses d servers independently and uniformly at random from the n servers, and is served according to the First In First Out (FIFO) protocol at the choice with the fewest customers. For the supermarket model, it has been shown that using d = 2 choices yields an exponential improvement in the expected time a customer spends in the syst...
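The supermarket model as stated here (Poisson arrivals of rate λn, exponential service with mean 1, d random choices per customer) can be simulated as a continuous-time Markov chain. The sketch below is an illustration under stated assumptions, not the paper's analysis: the function name `supermarket_mean_time`, the finite horizon, the empty initial state, and the use of Little's law to turn the time-averaged number of customers into an estimated time in system are all choices made for this example. Because service times are memoryless, the FIFO order inside each queue does not affect the queue lengths, so only the lengths are tracked.

```python
import random


def supermarket_mean_time(n, lam, d, horizon, seed=0):
    """Estimate the mean time in system for the supermarket model over [0, horizon]."""
    rng = random.Random(seed)
    queues = [0] * n   # number of customers at each server
    total = 0          # total customers currently in the system
    area = 0.0         # integral of `total` over time
    t = 0.0
    arrival_rate = lam * n
    while True:
        busy = [i for i, q in enumerate(queues) if q > 0]
        rate = arrival_rate + len(busy)  # each busy server completes at rate 1
        dt = rng.expovariate(rate)
        if t + dt >= horizon:
            area += total * (horizon - t)
            break
        area += total * dt
        t += dt
        if rng.random() * rate < arrival_rate:
            # Arrival: sample d servers (with replacement) and join the shortest queue.
            choices = [rng.randrange(n) for _ in range(d)]
            target = min(choices, key=lambda i: queues[i])
            queues[target] += 1
            total += 1
        else:
            # Departure from a uniformly chosen busy server.
            server = rng.choice(busy)
            queues[server] -= 1
            total -= 1
    # Little's law (approximate on a finite horizon): time = customers / arrival rate.
    return (area / horizon) / arrival_rate


if __name__ == "__main__":
    for d in (1, 2):
        est = supermarket_mean_time(n=50, lam=0.9, d=d, horizon=200.0, seed=1)
        print(f"d={d}: estimated mean time in system = {est:.2f}")
```

The estimate is biased by the empty start and the finite horizon, so it is best read as a qualitative comparison between d = 1 and d = 2 rather than as a steady-state value.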
Approximate Equilibria and Ball Fusion
 Theory of Computing Systems
, 2002
Cited by 54 (23 self)
We consider selfish routing over a network consisting of m parallel links through which n selfish users route their traffic trying to minimize their own expected latency. We study the class of mixed strategies in which the expected latency through each link is at most a constant multiple of the optimum maximum latency had global regulation been available. For the case of uniform links it is known that all Nash equilibria belong to this class of strategies. We are interested in bounding the coordination ratio (or price of anarchy) of these strategies defined as the worst-case ratio of the maximum (over all links) expected latency over the optimum maximum latency. The load balancing aspect of the problem immediately implies a lower bound Ω(ln m / ln ln m) of the coordination ratio. We give a tight (up to a multiplicative constant) upper bound. To show the upper bound, we analyze a variant of the classical balls and bins problem, in which balls with arbitrary weights are placed into bins according to arbitrary probability distributions. At the heart of our approach is a new probabilistic tool that we call
A generic scheme for building overlay networks in adversarial scenarios
 In Proc. Intl. Parallel and Distributed Processing Symp
, 2003
Cited by 52 (6 self)
This paper presents a generic scheme for a central, yet untackled issue in overlay dynamic networks: maintaining stability over long life and against malicious adversaries. The generic scheme maintains desirable properties of the underlying structure including low diameter, and efficient routing mechanism, as well as balanced node dispersal. These desired properties are maintained in a decentralized manner without resorting to global updates or periodic stabilization protocols even against an adaptive adversary that controls the arrival and departure of nodes.
Fast Concurrent Access to Parallel Disks
Cited by 50 (11 self)
High performance applications involving large data sets require the efficient and flexible use of multiple disks. In an external memory machine with D parallel, independent disks, only one block can be accessed on each disk in one I/O step. This restriction leads to a load balancing problem that is perhaps the main inhibitor for the efficient adaptation of single-disk external memory algorithms to multiple disks. We solve this problem for arbitrary access patterns by randomly mapping blocks of a logical address space to the disks. We show that a shared buffer of O(D) blocks suffices to support efficient writing. The analysis uses the properties of negative association to handle dependencies between the random variables involved. This approach might be of independent interest for probabilistic analysis in general. If two randomly allocated copies of each block exist, N arbitrary blocks can be read within ⌈N/D⌉ + 1 I/O steps with high probability. The redundancy can be further reduced from 2 to 1 + 1/r for any integer r without a big impact on reading efficiency. From the point of view of external memory models, these results rehabilitate Aggarwal and Vitter's "single-disk multihead" model [1] that allows access to D arbitrary blocks in each I/O step. This powerful model can be emulated on the physically more realistic independent disk model [2] with small constant overhead factors. Parallel disk external memory algorithms can therefore be developed in the multihead model first. The emulation result can then be applied directly or further refinements can be added.
The Content-Addressable Network D2B
, 2003
Cited by 49 (2 self)
A content-addressable network (CAN) is a distributed lookup table that can be used to implement peer-to-peer (P2P) systems. A CAN allows the discovery and location of data and/or resources, identified by keys, in a distributed network (e.g., Internet), in the absence of a centralized server or any hierarchical organization. Several networks have been recently described in the literature, and some of them have led to the development of experimental systems. We present a new CAN, called d2b. Its main characteristics are: simplicity, provability, and scalability. d2b allows the number of nodes n to vary between 1 and |K| where K is the set of keys managed by the network. In terms of performance, any join or leave of a user implies a constant expected number of link modifications, and, with high probability (w.h.p.), at most O(log n) link modifications.
Interpreting Stale Load Information
 IEEE Transactions on parallel and distributed systems
, 1999
Cited by 42 (0 self)
In this paper we examine the problem of balancing load in a large-scale distributed system when information about server loads may be stale. It is well known that sending each request to the machine with the apparent lowest load can behave badly in such systems, yet this technique is common in practice. Other systems use round-robin or random selection algorithms that entirely ignore load information or that only use a small subset of the load information. Rather than risk extremely bad performance on one hand or ignore the chance to use load information to improve performance on the other, we develop strategies that interpret load information based on its age. Through simulation, we examine several simple algorithms that use such load interpretation strategies under a range of workloads. Our experiments suggest that by properly interpreting load information, systems can (1) match the performance of the most aggressive algorithms when load information is fresh relative to the...