Results 1 - 7 of 7
Balanced Allocations: The Weighted Case, 2008
Cited by 7 (2 self)
We investigate balls-and-bins processes where m weighted balls are placed into n bins using the “power of two choices” paradigm, whereby a ball is inserted into the less loaded of two randomly chosen bins. The case where each of the m balls has unit weight has been studied extensively. In a seminal paper, Azar et al. [2] showed that when m = n the most loaded bin has Θ(log log n) balls with high probability. Surprisingly, the gap in load between the heaviest bin and the average bin does not increase with m and was shown by Berenbrink et al. [4] to be Θ(log log n) with high probability for arbitrarily large m. We generalize this result to the weighted case where balls have weights drawn from an arbitrary weight distribution. We show that as long as the weight distribution has finite second moment and satisfies a mild technical condition, the gap between the weight of the heaviest bin and the weight of the average bin is independent of the number of balls thrown. This is especially striking when considering heavy-tailed distributions such as Power-Law and Log-Normal distributions. In these cases, as more balls are thrown, heavier and heavier weights are encountered. Nevertheless, with high probability, the imbalance in the load distribution does not increase. Furthermore, if the fourth moment of the weight distribution is finite, the expected value of the gap is shown to be independent of the number of balls.
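To make the process in this abstract concrete, here is a minimal simulation sketch; it is not the authors' code. The function names `weighted_two_choice` and `gap`, the exponential weight distribution, and the values of n and m are all assumptions chosen for illustration. The abstract itself only requires a finite second moment plus a technical condition that it does not spell out.

```python
import random


def weighted_two_choice(n, m, draw_weight, rng):
    """Throw m weighted balls into n bins.

    Each ball samples two bins uniformly at random (with replacement)
    and is added to the one that currently carries less weight.
    """
    loads = [0.0] * n
    for _ in range(m):
        w = draw_weight(rng)
        i, j = rng.randrange(n), rng.randrange(n)
        target = i if loads[i] <= loads[j] else j
        loads[target] += w
    return loads


def gap(loads):
    """Weight of the heaviest bin minus the average bin weight."""
    return max(loads) - sum(loads) / len(loads)


# Assumed weight distribution: exponential with mean 1 (finite second moment).
def exp_weight(rng):
    return rng.expovariate(1.0)


rng = random.Random(0)  # fixed seed so the run is repeatable
for m in (1_000, 5_000, 20_000):
    loads = weighted_two_choice(100, m, exp_weight, rng)
    print(m, round(gap(loads), 2))
```

Under these assumptions, the printed gap would be expected to stay roughly flat as m grows, which is the behaviour the abstract describes. A single seeded run is only an illustration, not evidence for the theorem.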
Tight Bounds for Parallel Randomized Load Balancing, Computing Research Repository, 1992
Cited by 5 (0 self)
We explore the fundamental limits of distributed balls-into-bins algorithms, i.e., algorithms where balls act in parallel, as separate agents. This problem was introduced by Adler et al., who showed that non-adaptive and symmetric algorithms cannot reliably perform better than a maximum bin load of Θ(log log n / log log log n) within the same number of rounds. We present an adaptive symmetric algorithm that achieves a bin load of two in log* n + O(1) communication rounds using O(n) messages in total. Moreover, larger bin loads can be traded in for smaller time complexities. We prove a matching lower bound of (1 − o(1)) log* n on the time complexity of symmetric algorithms that guarantee small bin loads at an asymptotically optimal message complexity of O(n). The essential preconditions of the proof are (i) a limit of O(n) on the total number of messages sent by the algorithm and (ii) anonymity of bins, i.e., the port numberings of balls are not globally consistent. In order to show that our technique indeed yields tight bounds, we provide for each assumption an algorithm violating it, in turn achieving a constant maximum bin load in constant time. As an application, we consider the following problem. Given a fully connected graph of n nodes, where each node needs to send and receive up to n messages, and in each round each node may send one message over each link, deliver all messages as quickly as possible to their destinations. We give a simple and robust algorithm of time complexity O(log* n) for this task and provide a generalization to the case where all nodes initially hold arbitrary sets of messages. Completing the picture, we give a less practical, but asymptotically optimal algorithm terminating within O(1) rounds. All these bounds hold with high probability.
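The bounds in this abstract are stated in terms of log* n, the iterated logarithm: the number of times the logarithm must be applied before the value drops to 1 or below. The sketch below spells out that definition only; it does not implement the paper's algorithm. The helper name `log_star` and the choice of base 2 are assumptions made for illustration.

```python
import math


def log_star(n):
    """Iterated base-2 logarithm (base 2 is an assumption of this sketch).

    Counts how many times log2 must be applied to n until the
    result is at most 1.
    """
    count = 0
    x = n
    while x > 1:
        x = math.log2(x)
        count += 1
    return count


for n in (2, 16, 65536, 10**9, 10**80):
    print(n, log_star(n))
```

For every n in that list the value stays at 5 or below, which gives a sense of how few communication rounds a log* n + O(1) bound allows.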
Balanced Relay Allocation on Heterogeneous Unstructured Overlays
Cited by 4 (0 self)
Due to the increased usage of NAT boxes and firewalls, it has become harder for applications to establish direct connections seamlessly between two end-hosts. A recently adopted proposal to mitigate this problem is to use relay nodes, end-hosts that act as intermediary points to bridge connections. Efficiently selecting a relay node is not a trivial problem, especially in a large-scale unstructured overlay system where end-hosts are heterogeneous. In such an environment, heterogeneity among the relay nodes comes from the inherent differences in their capacities and from the way overlay networks are constructed. Despite this fact, good relay selection algorithms should effectively balance the aggregate load across the set of relay nodes. In this paper, we address this problem using algorithms based on the two random choices method. We first prove that the classic load-based algorithm can effectively balance the load even when relays are heterogeneous, and that its performance depends directly on relay heterogeneity. Second, we propose a utilization-based random choice algorithm to distribute load in order to balance relay utilization. Numerical evaluations through simulations illustrate the effectiveness of this algorithm, indicating that it might also yield provable performance (which we conjecture). Finally, we support our theoretical findings through simulations of various large-scale scenarios, with realistic relay heterogeneity.
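The difference between the load-based and utilization-based rules can be sketched as follows. The abstract does not give the exact selection rule, so this sketch assumes that utilization means load divided by capacity and that both rules sample two relays uniformly at random. The function name `assign_relays` and the capacity values are hypothetical.

```python
import random


def assign_relays(capacities, m, rng, by_utilization=True):
    """Assign m unit connections to relays using two random choices.

    Assumption of this sketch: "utilization" is load / capacity.
    With by_utilization=False the classic load-based rule is used.
    """
    n = len(capacities)
    loads = [0] * n

    def score(r):
        return loads[r] / capacities[r] if by_utilization else loads[r]

    for _ in range(m):
        a, b = rng.randrange(n), rng.randrange(n)
        chosen = a if score(a) <= score(b) else b
        loads[chosen] += 1
    return loads


# Hypothetical heterogeneous relay capacities.
capacities = [1, 1, 2, 4, 8, 16]
for flag in (False, True):
    loads = assign_relays(capacities, 3_000, random.Random(7), by_utilization=flag)
    utilization = [round(l / c, 1) for l, c in zip(loads, capacities)]
    print("utilization-based" if flag else "load-based", utilization)
```

In this toy setup the load-based rule evens out raw connection counts, while the utilization-based rule would be expected to even out load relative to capacity. This is a reading of the abstract, not a reproduction of the paper's experiments.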
The (1 + β)-Choice Process and Weighted Balls-into-Bins
Cited by 3 (0 self)
Suppose m balls are sequentially thrown into n bins where each ball goes into a random bin. It is well known that the gap between the load of the most loaded bin and the average is Θ(√(m log n / n)), for large m. If each ball goes to the lesser loaded of two random bins, this gap dramatically reduces to Θ(log log n) independent of m. Consider now the following “(1 + β)-choice” process for some parameter β ∈ (0, 1): each ball goes to a random bin with probability (1 − β) and the lesser loaded of two random bins with probability β. How does the gap for such a process behave? Suppose that the weight of each ball was drawn from a geometric distribution. How is the gap (now defined in terms of weight) affected? In this work, we develop general techniques for analyzing such balls-into-bins processes. Specifically, we show that for the (1 + β)-choice process above, the gap is Θ(log n/β), irrespective of m. Moreover the gap stays at Θ(log n/β) in the weighted case for a large class of weight distributions. No non-trivial explicit bounds were previously known in the weighted case, even for the 2-choice paradigm.
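The (1 + β)-choice process described in this abstract can be written down in a few lines. This is a minimal sketch of the unit-weight version only; the function name `one_plus_beta_choice` and the parameter values in the demo are assumptions, and it is not the authors' code.

```python
import random


def one_plus_beta_choice(n, m, beta, rng):
    """Unit-weight (1 + beta)-choice process.

    With probability beta a ball goes to the lesser loaded of two
    random bins; otherwise it goes to a single random bin.
    """
    loads = [0] * n
    for _ in range(m):
        i = rng.randrange(n)
        if rng.random() < beta:
            j = rng.randrange(n)
            if loads[j] < loads[i]:
                i = j
        loads[i] += 1
    return loads


n, m = 100, 20_000  # illustrative sizes, chosen arbitrarily
for beta in (0.0, 0.25, 0.5, 1.0):
    loads = one_plus_beta_choice(n, m, beta, random.Random(1))
    print(beta, max(loads) - m / n)  # gap: max load minus average
```

Here β = 0 reduces to the one-choice process and β = 1 to the two-choice process, so the printed gaps would be expected to shrink as β grows. The Θ(log n/β) bound is a statement about asymptotics that one small run cannot confirm.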
Balls and Bins with Structure: Balanced Allocations on Hypergraphs
Cited by 2 (0 self)
In the standard balls-and-bins model of balanced allocations, m balls are placed sequentially into n bins. Each ball chooses d uniform-random bins and is placed in the least loaded bin. It is well known that when d = log^{Θ(1)} n, after placing m = n balls, the maximum load (number of balls in a bin) is Θ(1) w.h.p. In this paper we show that as long as d = Ω(log n), independent random choices are not necessary to achieve a constant load balance: these choices may be structured in a very general way. Specifically, we allow each ball i to have an associated random set of bins B_i. We require that |B_i| = Ω(log n) and that bins are included in B_i with approximately the same probability; but the distributions of the B_i's are otherwise arbitrary, so that there may be correlations in the choice of bins. We show that this model captures structure important to two applications, nearby server selection and load balance in distributed hash tables.
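One way to picture a structured choice set B_i is a window of d consecutive bins on a ring: the bins in a window are fully correlated, yet every bin is included with the same probability d/n. The sketch below uses that window structure purely as a hypothetical example; the abstract allows far more general sets, and the function name `structured_allocation` and the demo sizes are assumptions.

```python
import math
import random


def structured_allocation(n, m, d, rng):
    """Place m balls into n bins with correlated choice sets.

    Hypothetical structure for illustration: ball i's set B_i is a
    window of d consecutive bins (wrapping around) starting at a
    uniformly random position. The ball goes to the least loaded
    bin in its window.
    """
    loads = [0] * n
    for _ in range(m):
        start = rng.randrange(n)
        candidates = [(start + k) % n for k in range(d)]
        target = min(candidates, key=loads.__getitem__)
        loads[target] += 1
    return loads


n = 1024
d = 2 * math.ceil(math.log2(n))  # a window size on the order of log n
loads = structured_allocation(n, n, d, random.Random(0))
print("max load:", max(loads))
```

With m = n balls and a window size on the order of log n, the maximum load in this toy model would be expected to stay small, in line with the constant-load claim; the constant 2 in the window size is an arbitrary choice.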
DISS. ETH NO. 19459 Synchronization and Symmetry Breaking in Distributed Systems, 2010
"... accepted on the recommendation of ..."
Kinesis: A New Approach to Replica Placement in Distributed Storage Systems
Kinesis is a novel data placement model for distributed storage systems. It exemplifies three design principles: structure (division of servers into a few failure-isolated segments), freedom of choice (freedom to allocate the best servers to store and retrieve data based on current resource availability), and scattered distribution (independent, pseudo-random spread of replicas in the system). These design principles enable storage systems to achieve balanced utilization of storage and network resources in the presence of incremental system expansions, failures of single and shared components, and skewed distributions of data size and popularity. In turn, this ability leads to significantly reduced resource provisioning costs, good user-perceived response times, and fast, parallelized recovery from independent and correlated failures. This article validates Kinesis through theoretical analysis, simulations, and experiments on a prototype implementation. Evaluations driven by real-world traces show that Kinesis can significantly outperform the widely used Chain replica-placement strategy in terms of resource requirements,