Results 1  10
of
152
Efficient erasure correcting codes
 IEEE Transactions on Information Theory
, 2001
"... Abstract—We introduce a simple erasure recovery algorithm for codes derived from cascades of sparse bipartite graphs and analyze the algorithm by analyzing a corresponding discretetime random process. As a result, we obtain a simple criterion involving the fractions of nodes of different degrees on ..."
Abstract

Cited by 254 (20 self)
 Add to MetaCart
Abstract—We introduce a simple erasure recovery algorithm for codes derived from cascades of sparse bipartite graphs and analyze the algorithm by analyzing a corresponding discretetime random process. As a result, we obtain a simple criterion involving the fractions of nodes of different degrees on both sides of the graph which is necessary and sufficient for the decoding process to finish successfully with high probability. By carefully designing these graphs we can construct for any given rate and any given real number a family of linear codes of rate which can be encoded in time proportional to ��@I A times their block length. Furthermore, a codeword can be recovered with high probability from a portion of its entries of length @IC A or more. The recovery algorithm also runs in time proportional to ��@I A. Our algorithms have been implemented and work well in practice; various implementation issues are discussed. Index Terms—Erasure channel, large deviation analysis, lowdensity paritycheck codes. I.
Accessing multiple mirror sites in parallel: using tornado codes tospeed up downloads
 in INFOCOM ’99.Eighteenth AnnualJoint Conference oftheIEEEComputer andCommunications Societies. Proceedings. IEEE,vol.1,NewYork,NY,USA,Mar. 21–25,1999,pp.275–283
"... Abstracr Mirror sites enable client requests to be serviced by any of a number of servers, reducing load at individual servers and dispersing network load. Typically, a client requests service from a single mirror site. We consider enabling a client to access a file from multiple mirror sites in pa ..."
Abstract

Cited by 159 (18 self)
 Add to MetaCart
Abstracr Mirror sites enable client requests to be serviced by any of a number of servers, reducing load at individual servers and dispersing network load. Typically, a client requests service from a single mirror site. We consider enabling a client to access a file from multiple mirror sites in parallel to speed up the download. To eliminate complex clientserver negotiations that a straightforward implementation of this approach would require, we develop a feedbackfree protocol based on erasure codes. We demonstrate that a protocol using fast Tornado codes can deliver dramatic speedups at the expense of transmitting a moderate number of additional packets into the network. Our scalable solution extends naturally to allow multiple clients to access data from multiple mirror sites simultaneously. Our approach applies naturally to wireless networks and satellite networks as well. I.
WebOS: Operating System Services for Wide Area Applications
"... In this paper, we demonstrate the power of providing a common set of Operating System services to widearea applications, including mechanisms for naming, persistent storage, remote process execution, resource management, authentication, and security. On a single machine, application developers can ..."
Abstract

Cited by 113 (16 self)
 Add to MetaCart
In this paper, we demonstrate the power of providing a common set of Operating System services to widearea applications, including mechanisms for naming, persistent storage, remote process execution, resource management, authentication, and security. On a single machine, application developers can rely on the local operating system to provide these abstractions. In the wide area, however, application developers are forced to build these abstractions themselves or to do without. This adhoc approach often results in individual programmers implementing nonoptimal solutions, wasting both programmer effort and system resources. To address these problems, we are building a system, WebOS, that provides basic operating systems services needed to build applications that are geographically distributed, highly available, incrementally scalable, and dynamically reconfigurable. Experience with a number of applications developed under WebOS indicates that it simplifies system development and improves resource utilization. In particular, we use WebOS to implement RentAServer to provide dynamic replication of overloaded Web services across the wide area in response to client demands.
The Power of Two Random Choices: A Survey of Techniques and Results
 in Handbook of Randomized Computing
, 2000
"... ITo motivate this survey, we begin with a simple problem that demonstrates a powerful fundamental idea. Suppose that n balls are thrown into n bins, with each ball choosing a bin independently and uniformly at random. Then the maximum load, or the largest number of balls in any bin, is approximately ..."
Abstract

Cited by 99 (2 self)
 Add to MetaCart
ITo motivate this survey, we begin with a simple problem that demonstrates a powerful fundamental idea. Suppose that n balls are thrown into n bins, with each ball choosing a bin independently and uniformly at random. Then the maximum load, or the largest number of balls in any bin, is approximately log n= log log n with high probability. Now suppose instead that the balls are placed sequentially, and each ball is placed in the least loaded of d 2 bins chosen independently and uniformly at random. Azar, Broder, Karlin, and Upfal showed that in this case, the maximum load is log log n= log d + (1) with high probability [ABKU99]. The important implication of this result is that even a small amount of choice can lead to drastically different results in load balancing. Indeed, having just two random choices (i.e.,...
How Asymmetry Helps Load Balancing
 Journal of the ACM
, 1999
"... This paper deals with balls and bins processes related to randomized load balancing, dynamic resource allocation, and hashing. Suppose ¡ balls have to be assigned to ¡ bins, where each ball has to be placed without knowledge about the distribution of previously placed balls. The goal is to achieve a ..."
Abstract

Cited by 82 (4 self)
 Add to MetaCart
This paper deals with balls and bins processes related to randomized load balancing, dynamic resource allocation, and hashing. Suppose ¡ balls have to be assigned to ¡ bins, where each ball has to be placed without knowledge about the distribution of previously placed balls. The goal is to achieve an allocation that is as even as possible so that no bin gets much more balls than the average. A well known and good solution for this problem is to choose ¢ possible locations for each ball at random, to look into each of these bins, and to place the ball into the least full among these bins. This class of algorithms has been investigated intensively in the past, but almost all previous analyses assume that the ¢ locations for each ball are chosen uniformly and independently at random from the set of all bins. We investigate whether a nonuniform and possibly dependent choice of the ¢
How Useful Is Old Information
 IEEE Transactions on Parallel and Distributed Systems
, 2000
"... AbstractÐWe consider the problem of load balancing in dynamic distributed systems in cases where new incoming tasks can make use of old information. For example, consider a multiprocessor system where incoming tasks with exponentially distributed service requirements arrive as a Poisson process, the ..."
Abstract

Cited by 80 (10 self)
 Add to MetaCart
AbstractÐWe consider the problem of load balancing in dynamic distributed systems in cases where new incoming tasks can make use of old information. For example, consider a multiprocessor system where incoming tasks with exponentially distributed service requirements arrive as a Poisson process, the tasks must choose a processor for service, and a task knows when making this choice the processor queue lengths from T seconds ago. What is a good strategy for choosing a processor in order for tasks to minimize their expected time in the system? Such models can also be used to describe settings where there is a transfer delay between the time a task enters a system and the time it reaches a processor for service. Our models are based on considering the behavior of limiting systems where the number of processors goes to infinity. The limiting systems can be shown to accurately describe the behavior of sufficiently large systems and simulations demonstrate that they are reasonably accurate even for systems with a small number of processors. Our studies of specific models demonstrate the importance of using randomness to break symmetry in these systems and yield important rules of thumb for system design. The most significant result is that only small amounts of queue length information can be extremely useful in these settings; for example, having incoming tasks choose the least loaded of two randomly chosen processors is extremely effective over a large range of possible system parameters. In contrast, using global information can actually degrade performance unless used carefully; for example, unlike most settings where the load information is current, having tasks go to the apparently least loaded server can significantly hurt performance. Index TermsÐLoad balancing, stale information, old information, queuing theory, large deviations. æ 1
balls into bins” — A simple and tight analysis
 Lecture Notes in Computer Science
, 1998
"... Abstract. Suppose we sequentially throw m balls into n bins. It is a natural question to ask for the maximum number of balls in any bin. In this paper we shall derive sharp upper and lower bounds which are reached with high probability. We prove bounds for all values of m(n) n=polylog(n) by using th ..."
Abstract

Cited by 76 (2 self)
 Add to MetaCart
Abstract. Suppose we sequentially throw m balls into n bins. It is a natural question to ask for the maximum number of balls in any bin. In this paper we shall derive sharp upper and lower bounds which are reached with high probability. We prove bounds for all values of m(n) n=polylog(n) by using the simple and wellknown method of the rst and second moment. 1
Using Multiple Hash Functions to Improve IP Lookups
 IN PROCEEDINGS OF IEEE INFOCOM
, 2000
"... High performance Internet routers require a mechanism for very efficient IP address lookups. Some techniques used to this end, such as binary search on levels, need to construct quickly a good hash table for the appropriate IP prefixes. In this paper we describe an approach for obtaining good hash ..."
Abstract

Cited by 68 (11 self)
 Add to MetaCart
High performance Internet routers require a mechanism for very efficient IP address lookups. Some techniques used to this end, such as binary search on levels, need to construct quickly a good hash table for the appropriate IP prefixes. In this paper we describe an approach for obtaining good hash tables based on using multiple hashes of each input key (which is an IP address). The methods we describe are fast, simple, scalable, parallelizable, and flexible. In particular, in instances where the goal is to have one hash bucket fit into a cache line, using multiple hashes proves extremely suitable. We provide a general analysis of this hashing technique and specifically discuss its application to binary search on levels.
Comparing Random Data Allocation and Data Striping in Multimedia Servers
 In ACM SIGMETRICS
, 2000
"... We compare performance of the RIO (Randomized I/O) Multimedia Storage Server which is based on random data allocation and block replication with traditional data striping techniques. We compare both approaches in terms of maximum supported data rate and stream cost. Data striping techniques in multi ..."
Abstract

Cited by 59 (1 self)
 Add to MetaCart
We compare performance of the RIO (Randomized I/O) Multimedia Storage Server which is based on random data allocation and block replication with traditional data striping techniques. We compare both approaches in terms of maximum supported data rate and stream cost. Data striping techniques in multimedia servers are often designed for restricted workloads. e.g. sequential access patterns with CBR (constant bit rate) requirements. On other hand, RIO is designed to support virtually any type of multimedia application, including VBR (variable bit rate) video or audio, and interactive applications with unpredictable access patterns, such as 3D interactive virtual worlds, interactive scientific visualizations, etc. Surprisingly, our results show that system performance with random data allocation is competitive and sometimes even better than with data striping techniques, for the workloads for which data striping is designed to work best; i.e. streams with sequential access patterns and CBR...
BALANCED ALLOCATIONS: THE HEAVILY LOADED CASE
, 2006
"... We investigate ballsintobins processes allocating m balls into n bins based on the multiplechoice paradigm. In the classical singlechoice variant each ball is placed into a bin selected uniformly at random. In a multiplechoice process each ball can be placed into one out of d ≥ 2 randomly selec ..."
Abstract

Cited by 58 (7 self)
 Add to MetaCart
We investigate ballsintobins processes allocating m balls into n bins based on the multiplechoice paradigm. In the classical singlechoice variant each ball is placed into a bin selected uniformly at random. In a multiplechoice process each ball can be placed into one out of d ≥ 2 randomly selected bins. It is known that in many scenarios having more than one choice for each ball can improve the load balance significantly. Formal analyses of this phenomenon prior to this work considered mostly the lightly loaded case, that is, when m ≈ n. In this paper we present the first tight analysis in the heavily loaded case, that is, when m ≫ n rather than m ≈ n. The best previously known results for the multiplechoice processes in the heavily loaded case were obtained using majorization by the singlechoice process. This yields an upper bound of the maximum load of bins of m/n + O ( √ m ln n/n) with high probability. We show, however, that the multiplechoice processes are fundamentally different from the singlechoice variant in that they have “short memory. ” The great consequence of this property is that the deviation of the multiplechoice processes from the optimal allocation (that is, the allocation in which each bin has either ⌊m/n ⌋ or ⌈m/n ⌉ balls) does not increase with the number of balls as in the case of the singlechoice process. In particular, we investigate the allocation obtained by two different multiplechoice allocation schemes,