Balanced Allocations (Extended Abstract)
SIAM Journal on Computing, 1994
Cited by 19 (0 self)
Suppose that we sequentially place n balls into n boxes by putting each ball into a randomly chosen box. It is well known that when we are done, the fullest box has with high probability $(1 + o(1)) \ln n / \ln\ln n$ balls in it. Suppose instead that for each ball we choose two boxes at random and place the ball into the one which is less full at the time of placement. We show that with high probability, the fullest box contains only $\ln\ln n / \ln 2 + O(1)$ balls, exponentially less than before. Furthermore, we show that a similar gap exists in the infinite process, where at each step one ball, chosen uniformly at random, is deleted, and one ball is added in the manner above. We discuss consequences of this and related theorems for dynamic resource allocation, hashing, and online load balancing. 1 Introduction Suppose that we sequentially place n balls into n boxes by putting each ball into a randomly chosen box. Properties of this random allocation process have been extensively studied in ...
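The contrast between one and two random choices described in this abstract can be observed empirically. The following is a minimal simulation sketch, not the paper's analysis; the function name `max_load`, the `seed` parameter, and the value of n are assumptions chosen for illustration.

```python
import random

def max_load(n, choices, seed=0):
    """Throw n balls into n boxes; each ball samples `choices` boxes
    uniformly at random and goes into the least full of them."""
    rng = random.Random(seed)
    load = [0] * n
    for _ in range(n):
        candidates = [rng.randrange(n) for _ in range(choices)]
        # Pick the sampled box that currently holds the fewest balls.
        target = min(candidates, key=load.__getitem__)
        load[target] += 1
    return max(load)

if __name__ == "__main__":
    n = 100_000  # illustrative size, not taken from the paper
    print("one choice :", max_load(n, 1))
    print("two choices:", max_load(n, 2))
```

Running the script for several seeds should show the two-choice maximum staying noticeably below the one-choice maximum, in line with the bounds quoted above.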
Analyzing an Infinite Parallel Job Allocation Process
Cited by 12 (7 self)
In recent years the task of allocating jobs to servers has been studied with the "balls and bins" abstraction. Results in this area exploit the large decrease in maximum load that can be achieved by allowing each job (ball) a very small amount of choice in choosing its destination server (bin). The scenarios considered can be divided into two categories: sequential, where each job can be placed at a server before the next job arrives, and parallel, where the jobs arrive in large batches that must be dealt with simultaneously. Another, orthogonal, classification of load balancing scenarios is into fixed time and infinite. Fixed time processes are only analyzed for an interval of time that is known in advance, and for all such results thus far either the number of rounds or the total expected number of arrivals at each server is a constant. In the infinite case, there is an arrival process and a deletion process that are both defined over an infinite time line. In this pape...
Allocating Weighted Jobs in Parallel
1997
Cited by 12 (4 self)
It is well known that after placing $m \ge n$ balls independently and uniformly at random (i.u.r.) into n bins, the fullest bin contains $\Theta(\log n / \log\log n + m/n)$ balls, with high probability. It is also known (see [Ste96]) that a maximum load of $O(m/n)$ can be obtained for all $m \ge n$ if a ball is allocated in one (suitably chosen) of two (i.u.r.) bins. Stemann ([Ste96]) shows that r communication rounds suffice to guarantee a maximum load of $\max\{\sqrt[r]{\log n},\ O(m/n)\}$, with high probability. Adler et al. have shown in [ACMR95] that Stemann's protocol is optimal for constant r. In this paper we extend the above results in two directions: We generalize the lower bound to arbitrary $r \le \log\log n$. This implies that the result of Stemann's protocol is optimal for all r. Our main result is a generalization of Stemann's upper bound to weighted jobs: Let $W_A$ ($W_M$) denote the average (maximum) weight of the balls. Further let $\Delta = W_A / W_M$. Note that...
Balanced allocations with heterogeneous bins
In Proceedings of the Symposium on Parallel Algorithms and Architecture (SPAA
Cited by 8 (3 self)
Balls-into-bins processes are a useful and common abstraction for many load-balancing related problems. A well-known paradigm for load balancing in distributed or parallel servers is the "multiple choice paradigm", where an item (ball) is put in the least loaded of d uniformly chosen servers (bins). In many applications, however, the uniformity of the sampling probability is not guaranteed. If the system is heterogeneous or dynamic, it may be the case that some bins are sampled with a higher probability than others. We investigate the power of the multiple choice paradigm in the setting where bins are not sampled from the uniform distribution. Byers et al. [5] showed that a logarithmic imbalance in the sampling probability could be tolerated, as long as the number of balls is linear in the number of bins. We show that if the number of balls is much larger than the number of bins, this ceases to be the case. Given a probability distribution over bins, we prove tight upper and lower bounds for the number of choices needed in the 1-out-of-d scheme in order to maintain a balanced allocation when the number of items is arbitrarily high.
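To make the non-uniform setting concrete, here is a small sketch of the 1-out-of-d scheme when bins are sampled from an arbitrary distribution. It is an illustration only: the helper names `allocate` and `gap`, the choice to sample with replacement, and the particular skewed distribution (half the bins sampled twice as often) are assumptions, not details from the paper.

```python
import itertools
import random

def allocate(m, probs, d, seed=0):
    """Place m balls into len(probs) bins. Each ball samples d bins
    (with replacement) according to `probs` and joins the least loaded one."""
    rng = random.Random(seed)
    bins = range(len(probs))
    cum = list(itertools.accumulate(probs))  # cumulative weights, built once
    load = [0] * len(probs)
    for _ in range(m):
        candidates = rng.choices(bins, cum_weights=cum, k=d)
        load[min(candidates, key=load.__getitem__)] += 1
    return load

def gap(load):
    """Maximum load minus average load."""
    return max(load) - sum(load) / len(load)

if __name__ == "__main__":
    n = 100
    # Hypothetical skew: the first half of the bins is sampled twice as often.
    weights = [2.0] * (n // 2) + [1.0] * (n - n // 2)
    total = sum(weights)
    probs = [w / total for w in weights]
    for d in (1, 2, 3):
        print("d =", d, "gap =", gap(allocate(50 * n, probs, d)))
```

Increasing the number of balls per bin (the factor 50 here) while holding d fixed is one way to explore how the imbalance behaves as the number of items grows.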
The (1 + β)-Choice Process and Weighted Balls-into-Bins
Cited by 3 (0 self)
Suppose m balls are sequentially thrown into n bins where each ball goes into a random bin. It is well known that the gap between the load of the most loaded bin and the average is $\Theta(\sqrt{m \log n / n})$, for large m. If each ball goes to the lesser loaded of two random bins, this gap dramatically reduces to $\Theta(\log\log n)$, independent of m. Consider now the following "(1 + β)-choice" process for some parameter β ∈ (0, 1): each ball goes to a random bin with probability (1 − β) and the lesser loaded of two random bins with probability β. How does the gap for such a process behave? Suppose that the weight of each ball was drawn from a geometric distribution. How is the gap (now defined in terms of weight) affected? In this work, we develop general techniques for analyzing such balls-into-bins processes. Specifically, we show that for the (1 + β)-choice process above, the gap is $\Theta(\log n / \beta)$, irrespective of m. Moreover, the gap stays at $\Theta(\log n / \beta)$ in the weighted case for a large class of weight distributions. No non-trivial explicit bounds were previously known in the weighted case, even for the 2-choice paradigm.
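The (1 + β)-choice rule is simple enough to simulate directly. The sketch below covers the unweighted case only and measures the gap between the maximum and the average load; the function name `one_plus_beta_gap` and the parameter values are assumptions made for illustration.

```python
import random

def one_plus_beta_gap(m, n, beta, seed=0):
    """Throw m unit-weight balls into n bins. With probability beta a ball
    goes to the lesser loaded of two random bins, otherwise to one random bin.
    Returns the gap: maximum load minus the average load m / n."""
    rng = random.Random(seed)
    load = [0] * n
    for _ in range(m):
        i = rng.randrange(n)
        if rng.random() < beta:
            # Second sample; keep whichever bin is currently less loaded.
            j = rng.randrange(n)
            if load[j] < load[i]:
                i = j
        load[i] += 1
    return max(load) - m / n

if __name__ == "__main__":
    n, m = 1_000, 200_000  # illustrative sizes
    for beta in (0.0, 0.25, 0.5, 1.0):
        print("beta =", beta, "gap =", one_plus_beta_gap(m, n, beta))
```

Setting beta to 0.0 gives the single-choice process and 1.0 gives the two-choice process, so the same function brackets the two classical cases mentioned in the abstract.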
Optimal Speedup on a Low-Degree Multi-Core Parallel Architecture (LoPRAM)
2008
Cited by 3 (0 self)
Over the last five years, major microprocessor manufacturers have released plans for a rapidly increasing number of cores per microprocessor, with upwards of 64 cores by 2015. In this setting, a sequential RAM computer will no longer accurately reflect the architecture on which algorithms are being executed. In this paper we propose a model of low-degree parallelism (LoPRAM) which builds upon the RAM and PRAM models yet better reflects recent advances in parallel (multi-core) architectures. This model supports a high level of abstraction that simplifies the design and analysis of parallel programs. More importantly, we show that in many instances it naturally leads to work-optimal parallel algorithms via simple modifications to sequential algorithms.
Simple Competitive Request Scheduling Strategies
In 11th ACM Symposium on Parallel Architectures and Algorithms, 1999
Cited by 2 (0 self)
In this paper we study the problem of scheduling real-time requests in distributed data servers. We assume the time to be divided into time steps of equal length called rounds. During every round a set of requests arrives at the system, and every resource is able to fulfill one request per round. Every request specifies two (distinct) resources and requires access to one of them. Furthermore, every request has a deadline of d, i.e. a request that arrives in round t has to be fulfilled during round $t + d - 1$ at the latest. The number of requests which arrive during some round and the two alternative resources of every request are selected by an adversary. The goal is to maximize the number of requests that are fulfilled before their deadlines expire. We examine the scheduling problem in an online setting, i.e. new requests continuously arrive at the system, and we have to determine online an assignment of the requests to the resources in such a way that every resource has to fulfil...
Cost study of different pivoting strategies on the BSP Model
In this work the BSP (Bulk Synchronous Parallel) Model is considered as an effective parallel computing approach capable of producing efficient algorithms which can be executed on any parallel environment. The analysis is focused on the theoretical basis provided by the model to predict the running times of parallel algorithms on a given parallel architecture. Different parallel algorithms for performing Gaussian elimination have been considered, varying the distribution of data among the processors as well as the pivoting strategy used. A theoretical study of the cost of each one of these algorithms under the assumptions of the BSP model is reported and compared with the experimental results obtained on an IBM SP2 parallel system. 1 Introduction For 50 years, sequential computing has been the normal way of computation. There is a wide consensus in the belief that the success of sequential computing is due to the almost universal adoption of the von Neumann model [4]. The stability a...