Tight Bounds for Parallel Randomized Load Balancing
Computing Research Repository, 1992
Cited by 18 (7 self)
We explore the fundamental limits of distributed balls-into-bins algorithms, i.e., algorithms where balls act in parallel, as separate agents. This problem was introduced by Adler et al., who showed that non-adaptive and symmetric algorithms cannot reliably perform better than a maximum bin load of Θ(log log n / log log log n) within the same number of rounds. We present an adaptive symmetric algorithm that achieves a bin load of two in log* n + O(1) communication rounds using O(n) messages in total. Moreover, larger bin loads can be traded in for smaller time complexities. We prove a matching lower bound of (1 − o(1)) log* n on the time complexity of symmetric algorithms that guarantee small bin loads at an asymptotically optimal message complexity of O(n). The essential preconditions of the proof are (i) a limit of O(n) on the total number of messages sent by the algorithm and (ii) anonymity of bins, i.e., the port numberings of balls are not globally consistent. To show that our technique indeed yields tight bounds, we provide, for each assumption, an algorithm that violates it and in turn achieves a constant maximum bin load in constant time. As an application, we consider the following problem. Given a fully connected graph of n nodes, where each node needs to send and receive up to n messages, and in each round each node may send one message over each link, deliver all messages as quickly as possible to their destinations. We give a simple and robust algorithm of time complexity O(log* n) for this task and provide a generalization to the case where all nodes initially hold arbitrary sets of messages. Completing the picture, we give a less practical, but asymptotically optimal algorithm terminating within O(1) rounds. All these bounds hold with high probability.
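To give a feel for how slowly the log* n round bound in this abstract grows, the following minimal sketch computes the iterated logarithm. The function name `log_star` and the choice of base 2 are assumptions made for illustration; the abstract does not fix a base, and changing it shifts the value only by a small additive amount.

```python
import math

def log_star(n: float) -> int:
    """Iterated logarithm: how often log2 must be applied until the value is <= 1.

    Base 2 is an assumption of this sketch, not something the abstract specifies.
    """
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count

if __name__ == "__main__":
    for n in (2, 16, 65536, 10**9, 10**80):
        print(n, log_star(n))
```

Even for n = 10^80 the value is only 5, so a bound of log* n + O(1) rounds is close to constant for any realistic input size.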
Balanced Allocations: The Weighted Case
2008
Cited by 15 (3 self)
We investigate balls-and-bins processes where m weighted balls are placed into n bins using the “power of two choices” paradigm, whereby a ball is inserted into the less loaded of two randomly chosen bins. The case where each of the m balls has unit weight has been studied extensively. In a seminal paper, Azar et al. [2] showed that when m = n the most loaded bin has Θ(log log n) balls with high probability. Surprisingly, the gap in load between the heaviest bin and the average bin does not increase with m and was shown by Berenbrink et al. [4] to be Θ(log log n) with high probability for arbitrarily large m. We generalize this result to the weighted case where balls have weights drawn from an arbitrary weight distribution. We show that as long as the weight distribution has finite second moment and satisfies a mild technical condition, the gap between the weight of the heaviest bin and the weight of the average bin is independent of the number of balls thrown. This is especially striking when considering heavy-tailed distributions such as Power-Law and Log-Normal distributions. In these cases, as more balls are thrown, heavier and heavier weights are encountered. Nevertheless, with high probability, the imbalance in the load distribution does not increase. Furthermore, if the fourth moment of the weight distribution is finite, the expected value of the gap is shown to be independent of the number of balls.
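The process described in this abstract can be explored with a small simulation. The sketch below is illustrative only: the function name `gap_after_throws`, the parameter names, the tie-breaking rule, and the exponential weight distribution are assumptions, not details taken from the paper.

```python
import random

def gap_after_throws(n, m, choices, seed=0, weight_sampler=None):
    """Throw m balls into n bins; return (heaviest bin weight) - (average bin weight).

    choices=1 is the plain random process, choices=2 the two-choice process.
    weight_sampler(rng) returns a ball weight; None means unit weights.
    """
    rng = random.Random(seed)
    loads = [0.0] * n
    for _ in range(m):
        weight = 1.0 if weight_sampler is None else weight_sampler(rng)
        candidates = [rng.randrange(n) for _ in range(choices)]
        # Place the ball in the currently lightest candidate (first one wins ties).
        target = min(candidates, key=loads.__getitem__)
        loads[target] += weight
    return max(loads) - sum(loads) / n

def exponential_weight(rng):
    # Hypothetical weight distribution with finite moments, chosen for illustration.
    return rng.expovariate(1.0)

if __name__ == "__main__":
    for m in (10_000, 50_000):
        print(m,
              gap_after_throws(100, m, 1),
              gap_after_throws(100, m, 2),
              gap_after_throws(100, m, 2, weight_sampler=exponential_weight))
```

In runs of this kind the single-choice gap typically grows with m, while the two-choice gap stays small, which is the qualitative behaviour the abstract describes.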
The (1 + β)-Choice Process and Weighted Balls-into-Bins
Cited by 10 (0 self)
Suppose m balls are sequentially thrown into n bins where each ball goes into a random bin. It is well known that the gap between the load of the most loaded bin and the average is Θ(√(m log n / n)), for large m. If each ball goes to the lesser loaded of two random bins, this gap dramatically reduces to Θ(log log n) independent of m. Consider now the following “(1 + β)-choice” process for some parameter β ∈ (0, 1): each ball goes to a random bin with probability (1 − β) and the lesser loaded of two random bins with probability β. How does the gap for such a process behave? Suppose that the weight of each ball was drawn from a geometric distribution. How is the gap (now defined in terms of weight) affected? In this work, we develop general techniques for analyzing such balls-into-bins processes. Specifically, we show that for the (1 + β)-choice process above, the gap is Θ(log n/β), irrespective of m. Moreover, the gap stays at Θ(log n/β) in the weighted case for a large class of weight distributions. No nontrivial explicit bounds were previously known in the weighted case, even for the 2-choice paradigm.
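A minimal sketch of the (1 + β)-choice process as stated in this abstract, for unit-weight balls. The function name `one_plus_beta_gap` and the tie-breaking rule are assumptions; β = 0 is included only as the single-choice baseline, although the abstract takes β from the open interval (0, 1).

```python
import random

def one_plus_beta_gap(n, m, beta, seed=0):
    """Simulate the (1 + beta)-choice process; return max load minus average load."""
    rng = random.Random(seed)
    loads = [0] * n
    for _ in range(m):
        target = rng.randrange(n)
        # With probability beta, draw a second bin and keep the lesser loaded one.
        if rng.random() < beta:
            other = rng.randrange(n)
            if loads[other] < loads[target]:
                target = other
        loads[target] += 1
    return max(loads) - m / n

if __name__ == "__main__":
    for beta in (0.0, 0.25, 0.5, 0.75):
        print(beta, one_plus_beta_gap(100, 50_000, beta))
```

Sweeping β in this way lets one compare the observed gap against the Θ(log n/β) behaviour reported in the abstract; the simulation itself proves nothing about the constants.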
Kinesis: A new approach to replica placement in distributed storage systems
ACM Transactions on Storage (TOS)
Cited by 8 (1 self)
Kinesis is a novel data placement model for distributed storage systems. It exemplifies three design principles: structure (division of servers into a few failure-isolated segments), freedom of choice (freedom to allocate the best servers to store and retrieve data based on current resource availability), and scattered distribution (independent, pseudo-random spread of replicas in the system). These design principles enable storage systems to achieve balanced utilization of storage and network resources in the presence of incremental system expansions, failures of single and shared components, and skewed distributions of data size and popularity. In turn, this ability leads to significantly reduced resource provisioning costs, good user-perceived response times, and fast, parallelized recovery from independent and correlated failures. This paper validates Kinesis through theoretical analysis, simulations, and experiments on a prototype implementation. Evaluations driven by real-world traces show that Kinesis can significantly outperform the widely used Chain replica-placement strategy in terms of resource requirements, end-to-end delay, and failure recovery.
Balanced Relay Allocation on Heterogeneous Unstructured Overlays
Cited by 5 (0 self)
Due to the increased usage of NAT boxes and firewalls, it has become harder for applications to establish direct connections seamlessly between two end-hosts. A recently adopted proposal to mitigate this problem is to use relay nodes, end-hosts that act as intermediary points to bridge connections. Efficiently selecting a relay node is not a trivial problem, especially in a large-scale unstructured overlay system where end-hosts are heterogeneous. In such an environment, heterogeneity among the relay nodes comes from the inherent differences in their capacities and from the way overlay networks are constructed. Despite this fact, good relay selection algorithms should effectively balance the aggregate load across the set of relay nodes. In this paper, we address this problem using algorithms based on the two random choices method. We first prove that the classic load-based algorithm can effectively balance the load even when relays are heterogeneous, and that its performance depends directly on relay heterogeneity. Second, we propose a utilization-based random choice algorithm to distribute load in order to balance relay utilization. Numerical evaluations through simulations illustrate the effectiveness of this algorithm, indicating that it might also yield provable performance (which we conjecture). Finally, we support our theoretical findings through simulations of various large-scale scenarios, with realistic relay heterogeneity.
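The difference between load-based and utilization-based selection can be sketched as follows. This is not the authors' algorithm: the abstract gives no details, so the definition of utilization as load divided by capacity, the uniform sampling of two relays, and every name below (`max_utilization`, `by_utilization`, the capacity values) are assumptions made for illustration.

```python
import random

def max_utilization(capacities, m, by_utilization, seed=0):
    """Assign m connections to relays with two random choices; return the peak utilization.

    by_utilization=False compares raw loads (the classic load-based rule).
    by_utilization=True compares load/capacity after adding the new connection
    (an assumed reading of "utilization-based").
    """
    rng = random.Random(seed)
    n = len(capacities)
    loads = [0] * n

    def score(relay):
        if by_utilization:
            return (loads[relay] + 1) / capacities[relay]
        return loads[relay]

    for _ in range(m):
        first, second = rng.randrange(n), rng.randrange(n)
        target = first if score(first) <= score(second) else second
        loads[target] += 1
    return max(load / cap for load, cap in zip(loads, capacities))

if __name__ == "__main__":
    capacities = [1, 2, 4, 8] * 25  # hypothetical heterogeneous relay capacities
    print("load-based peak utilization:       ", max_utilization(capacities, 10_000, False))
    print("utilization-based peak utilization:", max_utilization(capacities, 10_000, True))
```

With capacities this uneven, equalizing raw load leaves the weakest relays far more utilized than the strongest ones, which is the motivation the abstract gives for balancing utilization instead.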
Balls and Bins with Structure: Balanced Allocations on Hypergraphs
Cited by 5 (0 self)
In the standard balls-and-bins model of balanced allocations, m balls are placed sequentially into n bins. Each ball chooses d uniform random bins and is placed in the least loaded bin. It is well known that when d = log^{Θ(1)} n, after placing m = n balls, the maximum load (number of balls in a bin) is Θ(1) w.h.p. In this paper we show that as long as d = Ω(log n), independent random choices are not necessary to achieve a constant load balance: these choices may be structured in a very general way. Specifically, we allow each ball i to have an associated random set of bins B_i. We require that |B_i| = Ω(log n) and that bins are included in B_i with approximately the same probability; but the distributions of the B_i are otherwise arbitrary, so that there may be correlations in the choice of bins. We show that this model captures structure important to two applications, nearby server selection and load balance in distributed hash tables.
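One simple instance of correlated choices that fits the requirements stated in this abstract is a window of consecutive bins on a ring: every bin is included in a ball's set with the same probability, but the bins within a set are fully correlated. The sketch below assumes this particular structure, a window size of ⌈log2 n⌉, and the function names shown; none of these come from the paper.

```python
import math
import random

def max_load_structured(n, seed=0):
    """Place n balls; ball i may use a window B_i of ceil(log2 n) consecutive bins on a ring."""
    rng = random.Random(seed)
    window_size = max(1, math.ceil(math.log2(n)))
    loads = [0] * n
    for _ in range(n):
        start = rng.randrange(n)
        window = [(start + offset) % n for offset in range(window_size)]
        # Greedy rule: least loaded bin in the window (first one wins ties).
        target = min(window, key=loads.__getitem__)
        loads[target] += 1
    return max(loads)

def max_load_single_choice(n, seed=0):
    """Baseline: each ball goes to one uniformly random bin."""
    rng = random.Random(seed)
    loads = [0] * n
    for _ in range(n):
        loads[rng.randrange(n)] += 1
    return max(loads)

if __name__ == "__main__":
    n = 1024
    print("structured windows:", max_load_structured(n))
    print("single choice:     ", max_load_single_choice(n))
```

A ring window loosely resembles choosing among nearby servers or neighbouring nodes in a hash ring, the two application areas the abstract names, though the mapping here is only schematic.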
Efficient set-intersection with simulation-based security
In Journal of Cryptology, 2013
Cited by 1 (1 self)
We consider the problem of computing the intersection of private datasets of two parties, where the datasets contain lists of elements taken from a large domain. This problem has many applications for online collaboration. In this work we present protocols based on the use of homomorphic encryption and different hashing schemes for both the semi-honest and malicious environments. The protocol for the semi-honest environment is secure in the standard model, while the protocol for the malicious environment is secure in the random oracle model. Our protocols obtain linear communication and computation overhead. We further implement different variants of our semi-honest protocol. Our experiments show that the asymptotic overhead of the protocol is affected by different constants. (In particular, the degree of the polynomials evaluated by the protocol matters less than the number of polynomials that are evaluated.) As a result, the protocol variant with the best asymptotic overhead is not necessarily preferable for inputs of reasonable size.
DISS. ETH NO. 19459 Synchronization and Symmetry Breaking in Distributed Systems
2010
"... accepted on the recommendation of ..."
Tight Bounds for Parallel Randomized Load Balancing
Distributed Computing (manuscript)
Given a distributed system of n balls and n bins, how evenly can we distribute the balls to the bins, minimizing communication? The fastest non-adaptive and symmetric algorithm achieving a constant maximum bin load requires Θ(log log n) rounds, and any such algorithm running for r ∈ O(1) rounds incurs a bin load of Ω((log n / log log n)^{1/r}). In this work, we explore the fundamental limits of the general problem. We present a simple adaptive symmetric algorithm that achieves a bin load of 2 in log* n + O(1) communication rounds using O(n) messages in total. Our main result, however, is a matching lower bound of (1 − o(1)) log* n on the time complexity of symmetric algorithms that guarantee small bin loads. The essential preconditions of the proof are (i) a limit of O(n) on the total number of messages sent by the algorithm and (ii) anonymity of bins, i.e., the port numberings of balls need not be globally consistent. To show that our technique indeed yields tight bounds, we provide, for each assumption, an algorithm that violates it and in turn achieves a constant maximum bin load in constant time. An extended abstract of preliminary work appeared at STOC 2011 [24] and the corresponding article has been published on arXiv [23].