Results 1–10 of 50
Efficient erasure correcting codes
 IEEE Transactions on Information Theory
, 2001
Cited by 254 (20 self)
Abstract—We introduce a simple erasure recovery algorithm for codes derived from cascades of sparse bipartite graphs and analyze the algorithm by analyzing a corresponding discrete-time random process. As a result, we obtain a simple criterion involving the fractions of nodes of different degrees on both sides of the graph which is necessary and sufficient for the decoding process to finish successfully with high probability. By carefully designing these graphs, we can construct, for any given rate R and any given real number ε, a family of linear codes of rate R which can be encoded in time proportional to ln(1/ε) times their block length. Furthermore, a codeword can be recovered with high probability from a portion of its entries of length (1 + ε)Rn or more. The recovery algorithm also runs in time proportional to n ln(1/ε). Our algorithms have been implemented and work well in practice; various implementation issues are discussed. Index Terms—Erasure channel, large deviation analysis, low-density parity-check codes.
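The recovery algorithm described in this abstract is a peeling process: any parity check with exactly one erased neighbor determines that neighbor, and each recovered symbol can unlock further checks. A minimal sketch of the idea in Python (the graph, symbols, and check values below are illustrative toys; the actual codes depend on carefully designed degree distributions):

```python
def peel_decode(checks, received):
    """Recover erased symbols by iteratively 'peeling' degree-one checks.

    checks: list of (neighbor_indices, xor_value) parity constraints,
            where xor_value is the XOR of the neighbors' symbols.
    received: list where received[i] is the symbol, or None if erased.
    Returns the (possibly only partially) recovered symbol list.
    """
    symbols = list(received)
    progress = True
    while progress:
        progress = False
        for neighbors, value in checks:
            erased = [i for i in neighbors if symbols[i] is None]
            if len(erased) == 1:  # this check pins down one unknown
                known_xor = 0
                for i in neighbors:
                    if symbols[i] is not None:
                        known_xor ^= symbols[i]
                symbols[erased[0]] = known_xor ^ value
                progress = True
    return symbols
```

Decoding succeeds exactly when the peeling never stalls with every remaining check covering two or more erasures, which is what the paper's degree-distribution criterion guarantees with high probability.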
A Brief History of Generative Models for Power Law and Lognormal Distributions
 INTERNET MATHEMATICS
Cited by 252 (7 self)
Recently, I became interested in a current debate over whether file size distributions are best modelled by a power law distribution or a lognormal distribution. In trying
The Power of Two Choices in Randomized Load Balancing
 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
, 1996
Cited by 201 (23 self)
Suppose that n balls are placed into n bins, each ball being placed into a bin chosen independently and uniformly at random. Then, with high probability, the maximum load in any bin is approximately log n / log log n. Suppose instead that each ball is placed sequentially into the least full of d bins chosen independently and uniformly at random. It has recently been shown that the maximum load is then only log log n / log d + O(1) with high probability. Thus giving each ball two choices instead of just one leads to an exponential improvement in the maximum load. This result demonstrates the power of two choices, and it has several applications to load balancing in distributed systems. In this thesis, we expand upon this result by examining related models and by developing techniques for stu...
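The balls-and-bins experiment above is easy to reproduce. A small simulation (function name and parameters are my own illustration) shows the d = 2 maximum load falling far below the d = 1 case:

```python
import random

def max_load(n, d, seed=0):
    """Throw n balls into n bins; each ball goes to the least full of
    d bins chosen independently and uniformly at random.
    Returns the maximum load over all bins."""
    rng = random.Random(seed)
    bins = [0] * n
    for _ in range(n):
        choices = [rng.randrange(n) for _ in range(d)]
        best = min(choices, key=lambda b: bins[b])
        bins[best] += 1
    return max(bins)
```

For n = 10000, a single random choice (d = 1) typically yields a maximum load of 6 or more, while d = 2 typically stays at 4 or below, matching the log n / log log n versus log log n / log 2 + O(1) contrast described above.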
How Useful Is Old Information?
 IEEE Transactions on Parallel and Distributed Systems
, 2000
Cited by 80 (10 self)
Abstract—We consider the problem of load balancing in dynamic distributed systems in cases where new incoming tasks can make use of old information. For example, consider a multiprocessor system where incoming tasks with exponentially distributed service requirements arrive as a Poisson process, the tasks must choose a processor for service, and a task knows when making this choice the processor queue lengths from T seconds ago. What is a good strategy for choosing a processor in order for tasks to minimize their expected time in the system? Such models can also be used to describe settings where there is a transfer delay between the time a task enters a system and the time it reaches a processor for service. Our models are based on considering the behavior of limiting systems where the number of processors goes to infinity. The limiting systems can be shown to accurately describe the behavior of sufficiently large systems, and simulations demonstrate that they are reasonably accurate even for systems with a small number of processors. Our studies of specific models demonstrate the importance of using randomness to break symmetry in these systems and yield important rules of thumb for system design. The most significant result is that only small amounts of queue length information can be extremely useful in these settings; for example, having incoming tasks choose the least loaded of two randomly chosen processors is extremely effective over a large range of possible system parameters. In contrast, using global information can actually degrade performance unless used carefully; for example, unlike most settings where the load information is current, having tasks go to the apparently least loaded server can significantly hurt performance. Index Terms—Load balancing, stale information, old information, queuing theory, large deviations.
On the Analysis of Randomized Load Balancing Schemes
 IN PROCEEDINGS OF THE 9TH ANNUAL ACM SYMPOSIUM ON PARALLEL ALGORITHMS AND ARCHITECTURES
, 1998
Cited by 55 (7 self)
It is well known that simple randomized load balancing schemes can balance load effectively while incurring only a small overhead, making such schemes appealing for practical systems. In this paper, we provide new analyses for several such dynamic randomized load balancing schemes. Our work extends a previous analysis of the supermarket model, a model that abstracts a simple, efficient load balancing scheme in the setting where jobs arrive at a large system of parallel processors. In this model, customers arrive at a system of n servers as a Poisson stream of rate λn, λ < 1, with service requirements exponentially distributed with mean 1. Each customer chooses d servers independently and uniformly at random from the n servers, and is served according to the First In First Out (FIFO) protocol at the choice with the fewest customers. For the supermarket model, it has been shown that using d = 2 choices yields an exponential improvement in the expected time a customer spends in the syst...
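In the limiting system the supermarket model has a well-known closed-form fixed point: the fraction of servers with at least k customers is λ^((d^k − 1)/(d − 1)), which reduces to the M/M/1 tail λ^k when d = 1. A short sketch evaluating that fixed point (a numerical illustration of the large-n limit, not the paper's analysis itself):

```python
def tail_fractions(lam, d, kmax=50):
    """s_k = long-run fraction of servers with at least k customers at
    the mean-field fixed point of the supermarket model; d = 1 is the
    ordinary M/M/1 tail lam**k."""
    if d == 1:
        return [lam ** k for k in range(kmax)]
    return [lam ** ((d ** k - 1) / (d - 1)) for k in range(kmax)]

def mean_queue_length(lam, d, kmax=50):
    # E[queue length per server] = sum over k >= 1 of s_k
    return sum(tail_fractions(lam, d, kmax)[1:])
```

At λ = 0.9, the d = 1 mean queue length is 9, while d = 2 gives roughly 2.35: the doubly exponential tail decay behind the "exponential improvement" in waiting time.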
Setting 2 variables at a time yields a new lower bound for random 3-SAT (Extended Abstract)
 STOC
, 2000
Cited by 35 (4 self)
Let X be a set of n Boolean variables and denote by C(X) the set of all 3-clauses over X, i.e., the set of all 8 (n choose 3) possible disjunctions of three distinct, non-complementary literals from variables in X. Let F(n, m) be a random 3-SAT formula formed by selecting, with replacement, m clauses uniformly at random from C(X) and taking their conjunction. The satisfiability threshold conjecture asserts that there exists a constant r3 such that as n → ∞, F(n, rn) is satisfiable with probability that tends to 1 if r < r3, but unsatisfiable with probability that tends to 1 if r > r3. Experimental evidence suggests r3 ≈ 4.2. We prove r3 > 3.145, improving over the previous best lower bound r3 > 3.003 due to Frieze and Suen. For this, we introduce a satisfiability heuristic that works iteratively, permanently setting the value of a pair of variables in each round. The framework we develop for the analysis of our heuristic allows us to also derive most previous lower bounds for random 3-SAT in a uniform manner and with little effort.
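The model F(n, m) is simple to sample, and for tiny n brute force makes the threshold behavior visible. The sketch below is an illustrative generator and checker only, not the paper's two-variables-per-round heuristic:

```python
import itertools
import random

def random_3sat(n, m, rng):
    """Sample F(n, m): m clauses chosen uniformly, with replacement,
    from the 8 * (n choose 3) disjunctions of three distinct,
    non-complementary literals over variables 1..n.
    A literal is +v or -v; a clause is a 3-tuple of literals."""
    formula = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), 3)
        formula.append(tuple(v if rng.random() < 0.5 else -v
                             for v in variables))
    return formula

def satisfiable(n, formula):
    """Brute-force satisfiability check; only feasible for tiny n."""
    for bits in itertools.product([False, True], repeat=n):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in formula):
            return True
    return False
```

Sampling at clause-to-variable ratios well below and well above the conjectured threshold (e.g., r = 1 versus r = 8 for n = 12) shows the sharp shift from almost-always satisfiable to almost-always unsatisfiable.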
Analyses of Load Stealing Models Based on Differential Equations
 In Proceedings of the 10th Annual ACM Symposium on Parallel Algorithms and Architectures
, 1998
Cited by 19 (0 self)
In this paper we develop models for and analyze several randomized work stealing algorithms in a dynamic setting. Our models represent the limiting behavior of systems as the number of processors grows to infinity using differential equations. The advantages of this approach include the ability to model a large variety of systems and to provide accurate numerical approximations of system behavior even when the number of processors is relatively small. We show how this approach can yield significant intuition about the behavior of work stealing algorithms in realistic settings.
A Steady State Analysis of Diffracting Trees
, 1997
Cited by 16 (3 self)
Diffracting trees are an effective and highly scalable distributed-parallel technique for shared counting and load balancing. This paper presents the first steady-state combinatorial model and analysis for diffracting trees, and uses it to answer several critical algorithmic design questions. Our model is simple and sufficiently high-level to overcome many implementation-specific details, and yet, as we will show, it is rich enough to accurately predict empirically observed behaviors. As a result of our analysis, we were able to identify starvation problems in the original diffracting tree algorithm and modify it to create a more stable version. We are also able to identify the range in which the diffracting tree performs most efficiently, and the ranges in which its performance degrades. We believe our model and modeling approach open the way to steady-state analysis of other distributed-parallel structures such as counting networks and elimination trees.
ERROR ANALYSIS OF TAU-LEAP SIMULATION METHODS
, 2011
Cited by 15 (7 self)
We perform an error analysis for numerical approximation methods of continuous time Markov chain models commonly found in the chemistry and biochemistry literature. The motivation for the analysis is to be able to compare the accuracy of different approximation methods and, specifically, Euler tau-leaping and midpoint tau-leaping. We perform our analysis under a scaling in which the size of the time discretization is inversely proportional to some (bounded) power of the norm of the state of the system. We argue that this is a more appropriate scaling than that found in previous error analyses, in which the size of the time discretization goes to zero independent of the rest of the model. Under the present scaling, we show that midpoint tau-leaping achieves a higher order of accuracy, in both a weak and a strong sense, than Euler tau-leaping; a result that is in contrast to previous analyses. We present examples that demonstrate our findings.
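Euler tau-leaping freezes each reaction's propensity over a step of length τ and fires a Poisson-distributed number of reaction events per step. A minimal sketch for a birth-death chain (the system and rate constants are a toy example of mine, not taken from the paper):

```python
import math
import random

def poisson(lam, rng):
    """Poisson sampler via Knuth's multiplication method
    (adequate for the small means used here)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def euler_tau_leap(x0, k_birth, k_death, tau, steps, rng):
    """Euler tau-leaping for the birth-death chain
    (source) --k_birth--> X,   X --k_death * x--> (sink).
    Propensities are frozen at the start of each step of length tau,
    and each reaction fires a Poisson(propensity * tau) number of times."""
    x = x0
    for _ in range(steps):
        births = poisson(k_birth * tau, rng)
        deaths = poisson(k_death * x * tau, rng)
        x = max(x + births - deaths, 0)  # clamp: counts cannot go negative
    return x
```

With k_birth = 10 and k_death = 1 the chain's stationary distribution is Poisson with mean 10, so long runs of the leap method should fluctuate around that value; the paper's point is how the bias of such schemes behaves as τ is coupled to the system's scale.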
On Large-Scale Peer-to-Peer Streaming Systems with Network Coding
Cited by 15 (1 self)
Live peer-to-peer (P2P) streaming has recently received much research attention, with successful commercial systems showing its viability in the Internet. Nevertheless, existing analytical studies of P2P streaming systems have failed to mathematically investigate and understand their critical properties, especially with a large scale and under extreme dynamics such as a flash crowd scenario. Even more importantly, there exists no prior analytical work that focuses on an entirely new way of designing streaming protocols, with the help of network coding. In this paper, we seek to show an in-depth analytical understanding of fundamental properties of P2P streaming systems, with a particular spotlight on the benefits of network coding. We show that, if network coding is used according to certain design principles, provably good performance can be guaranteed, with respect to high playback quality, short initial buffering delays, resilience to peer dynamics, as well as minimal bandwidth costs on dedicated streaming servers. Our results are obtained with mathematical rigor, but without sacrificing realistic assumptions of system scale, peer dynamics, and upload capacities. For further insights, streaming systems using network coding are compared with traditional pull-based streaming in large-scale simulations, with a focus on fundamentals rather than protocol details. The scale of our simulations throughout this paper exceeds 200,000 peers at times, which is in sharp contrast with existing empirical studies, typically with a few hundred peers involved.