Results 1-10 of 33
The Power of Two Random Choices: A Survey of Techniques and Results
In Handbook of Randomized Computing, 2000
Cited by 100 (2 self)
To motivate this survey, we begin with a simple problem that demonstrates a powerful fundamental idea. Suppose that n balls are thrown into n bins, with each ball choosing a bin independently and uniformly at random. Then the maximum load, or the largest number of balls in any bin, is approximately $\log n / \log\log n$ with high probability. Now suppose instead that the balls are placed sequentially, and each ball is placed in the least loaded of $d \ge 2$ bins chosen independently and uniformly at random. Azar, Broder, Karlin, and Upfal showed that in this case, the maximum load is $\log\log n / \log d + \Theta(1)$ with high probability [ABKU99]. The important implication of this result is that even a small amount of choice can lead to drastically different results in load balancing. Indeed, having just two random choices (i.e.,...
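The d-choice placement rule described in this abstract can be tried out with a small simulation. The following is a minimal sketch, not code from the survey: the function name `max_load`, the seed handling, and the choice to sample the d bins with replacement are assumptions made for illustration.

```python
import random


def max_load(n, d, seed=0):
    """Throw n balls into n bins; each ball joins the least loaded of d random bins."""
    rng = random.Random(seed)
    loads = [0] * n
    for _ in range(n):
        # Assumption: the d candidate bins are drawn independently, with replacement.
        choices = [rng.randrange(n) for _ in range(d)]
        # Ties go to the first minimal candidate in the list.
        target = min(choices, key=loads.__getitem__)
        loads[target] += 1
    return max(loads)


if __name__ == "__main__":
    n = 20000
    for d in (1, 2, 3):
        print("d =", d, "max load =", max_load(n, d))
```

With d = 1 this is plain random placement; comparing it against d = 2 for the same n gives a feel for the gap between the $\log n / \log\log n$ and $\log\log n / \log d$ regimes, although a single run at one value of n says nothing about the asymptotics.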
"Balls into bins" - A simple and tight analysis
Lecture Notes in Computer Science, 1998
Cited by 77 (2 self)
Suppose we sequentially throw m balls into n bins. It is a natural question to ask for the maximum number of balls in any bin. In this paper we shall derive sharp upper and lower bounds which are reached with high probability. We prove bounds for all values of m(n) >= n/polylog(n) by using the simple and well-known method of the first and second moment.
Parallel Randomized Load Balancing
In Symposium on Theory of Computing. ACM, 1995
Cited by 57 (8 self)
It is well known that after placing n balls independently and uniformly at random into n bins, the fullest bin holds $\Theta(\log n / \log\log n)$ balls with high probability. Recently, Azar et al. analyzed the following: randomly choose d bins for each ball, and then sequentially place each ball in the least full of its chosen bins [2]. They show that the fullest bin contains only $\log\log n / \log d + \Theta(1)$ balls with high probability. We explore extensions of this result to parallel and distributed settings. Our results focus on the tradeoff between the amount of communication and the final load. Given r rounds of communication, we provide lower bounds on the maximum load of $\Omega\left(\sqrt[r]{\log n / \log\log n}\right)$ for a wide class of strategies. Our results extend to the case where the number of rounds is allowed to grow with n. We then demonstrate parallelizations of the sequential strategy presented in Azar et al. that achieve loads within a constant factor of the lower bound for two ...
Fast Concurrent Access to Parallel Disks
Cited by 51 (12 self)
High performance applications involving large data sets require the efficient and flexible use of multiple disks. In an external memory machine with D parallel, independent disks, only one block can be accessed on each disk in one I/O step. This restriction leads to a load balancing problem that is perhaps the main inhibitor for the efficient adaptation of single-disk external memory algorithms to multiple disks. We solve this problem for arbitrary access patterns by randomly mapping blocks of a logical address space to the disks. We show that a shared buffer of O(D) blocks suffices to support efficient writing. The analysis uses the properties of negative association to handle dependencies between the random variables involved. This approach might be of independent interest for probabilistic analysis in general. If two randomly allocated copies of each block exist, N arbitrary blocks can be read within $\lceil N/D \rceil + 1$ I/O steps with high probability. The redundancy can be further reduced from 2 to $1 + 1/r$ for any integer r without a big impact on reading efficiency. From the point of view of external memory models, these results rehabilitate Aggarwal and Vitter's "single-disk multi-head" model [1] that allows access to D arbitrary blocks in each I/O step. This powerful model can be emulated on the physically more realistic independent disk model [2] with small constant overhead factors. Parallel disk external memory algorithms can therefore be developed in the multi-head model first. The emulation result can then be applied directly or further refinements can be added.
Optimizing Result Prefetching in Web Search Engines with Segmented Indices
In VLDB, 2001
Cited by 28 (2 self)
We study the process in which search engines with segmented indices serve queries. In particular, we investigate the number of result pages which search engines should prepare during the query processing phase. Search engine users have been observed to browse through very few pages of results for queries which they submit. This behavior of users suggests that prefetching many results upon processing an initial query is not efficient, since most of the prefetched results will not be requested by the user who initiated the search. However, a policy which abandons result prefetching in favor of retrieving just the first page of search results might not make optimal use of system resources either. We argue that for a certain behavior of users, engines should prefetch a constant number of result pages per query. We define a concrete query processing model for search engines with segmented indices, and analyze the cost of such prefetching policies. Based on these costs, we show how to determine the constant which optimizes the prefetching policy. Our results are mostly applicable to local index partitions of the inverted files, but are also applicable to processing of short queries in global index architectures.
The natural work-stealing algorithm is stable
In Proceedings of the 42nd IEEE Symposium on Foundations of Computer Science (FOCS), 2001
Cited by 27 (1 self)
In this paper we analyse a very simple dynamic work-stealing algorithm. In the work-generation model, there are n (work) generators. A generator-allocation function is simply a function from the n generators to the n processors. We consider a fixed, but arbitrary, distribution D over generator-allocation functions. During each time step of our process, a generator-allocation function h is chosen from D, and the generators are allocated to the processors according to h. Each generator may then generate a unit-time task which it inserts into the queue of its host processor. It generates such a task independently with probability λ. After the new tasks are generated, each processor removes one task from its queue and services it. For many choices of D, the work-generation model allows the load to become arbitrarily imbalanced, even when λ < 1. For example, D could be the point distribution containing a single function h which allocates all of the generators to just one processor. For this choice of D, the chosen processor receives around λn units of work at each step and services one. The natural work-stealing algorithm that we analyse is widely used in practical applications and works as follows. During each time step, each empty
On Balls and Bins with Deletions
In Proc. of RANDOM'98, 1998
Cited by 19 (1 self)
Microsystems. The views and conclusions contained here are those of the authors and should not be interpreted as necessarily representing the official policies or
Cuckoo hashing: Further analysis
2003
Cited by 17 (1 self)
We consider cuckoo hashing as proposed by Pagh and Rodler in 2001. We show that the expected construction time of the hash table is O(n) as long as the two open addressing tables are each of size at least $(1 + \epsilon)n$, where $\epsilon > 0$ and n is the number of data points. Slightly improved bounds are obtained for various probabilities and constraints. The analysis rests on simple properties of branching processes.
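For readers unfamiliar with the data structure being analysed, the two-table scheme can be sketched as follows. This is an illustrative sketch only, not code from the paper or from Pagh and Rodler: the class name `CuckooHashTable`, the `max_kicks` cutoff of 64, and the use of Python's built-in `hash` with a random salt in place of a proper hash family are all assumptions.

```python
import random


class CuckooHashTable:
    """Two-table cuckoo hashing sketch; names and parameters are illustrative."""

    def __init__(self, table_size, max_kicks=64, seed=0):
        self.table_size = table_size
        self.max_kicks = max_kicks  # assumed cutoff before a full rehash
        self.rng = random.Random(seed)
        self.tables = [[None] * table_size, [None] * table_size]
        self.salts = [self.rng.getrandbits(64) for _ in range(2)]

    def _slot(self, which, key):
        # Assumption: a salted built-in hash stands in for a real hash family.
        return hash((self.salts[which], key)) % self.table_size

    def contains(self, key):
        # A key can only live in one of its two candidate slots.
        return any(self.tables[t][self._slot(t, key)] == key for t in (0, 1))

    def insert(self, key):
        if key is None:
            raise ValueError("None is reserved for empty slots")
        if self.contains(key):
            return
        pending, which = key, 0
        for _ in range(self.max_kicks):
            slot = self._slot(which, pending)
            evicted = self.tables[which][slot]
            self.tables[which][slot] = pending
            if evicted is None:
                return
            # The evicted key moves on to its slot in the other table.
            pending, which = evicted, 1 - which
        self._rehash(pending)

    def _rehash(self, pending):
        # Too many evictions: draw new hash functions and reinsert every key.
        keys = [k for table in self.tables for k in table if k is not None]
        keys.append(pending)
        self.tables = [[None] * self.table_size, [None] * self.table_size]
        self.salts = [self.rng.getrandbits(64) for _ in range(2)]
        for k in keys:
            self.insert(k)
```

Choosing `table_size` to be at least $(1 + \epsilon)n$ for the n keys to be stored mirrors the size condition in the abstract; for example, `CuckooHashTable(table_size=1500)` leaves room for 1000 keys with $\epsilon = 0.5$.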
Load Balancing with Memory
In Proc. of the 43rd IEEE Symp. on Foundations of Computer Science (FOCS), 2002
Cited by 13 (2 self)
A standard load balancing model considers placing n balls into n bins by choosing d possible locations for each ball independently and uniformly at random and sequentially placing each in the least loaded of its chosen bins. It is well known that allowing just a small amount of choice (d = 2) greatly improves performance over random placement (d = 1). In this paper, we show that similar performance gains occur by introducing memory. We focus on the situation where each time a ball is placed, the least loaded of that ball's choices after placement is remembered and used as one of the possible choices for the next ball. For example, we show that when each ball gets just one random choice, but can also choose the best of the last ball's choices, the maximum number of balls in a bin is $\log\log n / (2 \log \phi) + O(1)$ with high probability, where $\phi = (1 + \sqrt{5})/2$ is the golden ratio. The asymptotic performance is therefore better with one random choice and one choice from memory than with two fresh random choices for each ball; the performance with memory asymptotically matches the asymmetric policy using two choices introduced by Vöcking. More generally, we find that a small amount of memory, like a small amount of choice, can dramatically improve the load balancing performance. We also investigate continuous time variations corresponding to queueing systems, where we find similar results.
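The memory rule in this abstract (one fresh random bin plus the bin remembered from the previous ball) can be simulated in a few lines. This is a rough sketch under stated assumptions rather than the authors' code: the function names are hypothetical, and initialising the memory with a random bin is an assumption, since the abstract does not say how the first ball's memory is set.

```python
import random


def one_choice_loads(n, seed=0):
    """Baseline: each of n balls goes to one uniformly random bin."""
    rng = random.Random(seed)
    loads = [0] * n
    for _ in range(n):
        loads[rng.randrange(n)] += 1
    return loads


def memory_loads(n, seed=0):
    """One fresh random choice per ball plus one remembered bin."""
    rng = random.Random(seed)
    loads = [0] * n
    remembered = rng.randrange(n)  # assumption: memory starts at a random bin
    for _ in range(n):
        candidates = (rng.randrange(n), remembered)
        # Place the ball in the less loaded of the two candidates.
        target = min(candidates, key=loads.__getitem__)
        loads[target] += 1
        # Remember the less loaded candidate after placement for the next ball.
        remembered = min(candidates, key=loads.__getitem__)
    return loads
```

Comparing `max(memory_loads(n))` with `max(one_choice_loads(n))` for moderately large n illustrates the qualitative gain from memory; it does not by itself confirm the $\log\log n / (2 \log \phi)$ constant.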
Allocating Weighted Jobs in Parallel
1997
Cited by 13 (4 self)
It is well known that after placing $m \ge n$ balls independently and uniformly at random (i.u.r.) into n bins, the fullest bin contains $\Theta(\log n / \log\log n + m/n)$ balls, with high probability. It is also known (see [Ste96]) that a maximum load of $O(m/n)$ can be obtained for all $m \ge n$ if a ball is allocated in one (suitably chosen) of two (i.u.r.) bins. Stemann ([Ste96]) shows that r communication rounds suffice to guarantee a maximum load of $\max\{\sqrt[r]{\log n}, O(m/n)\}$, with high probability. Adler et al. have shown in [ACMR95] that Stemann's protocol is optimal for constant r. In this paper we extend the above results in two directions: We generalize the lower bound to arbitrary $r \le \log\log n$. This implies that the result of Stemann's protocol is optimal for all r. Our main result is a generalization of Stemann's upper bound to weighted jobs: Let $W_A$ ($W_M$) denote the average (maximum) weight of the balls. Further let $\Delta = W_A / W_M$. Note that...