Handling Churn in a DHT
In Proceedings of the USENIX Annual Technical Conference, 2004
This paper addresses the problem of churn (the continuous process of node arrival and departure) in distributed hash tables (DHTs). We argue that DHTs should perform lookups quickly and consistently under churn rates at least as high as those observed in deployed P2P systems such as Kazaa. We then show through experiments on an emulated network that current DHT implementations cannot handle such churn rates. Next, we identify and explore three factors affecting DHT performance under churn: reactive versus periodic failure recovery, message timeout calculation, and proximity neighbor selection. We work in the context of a mature DHT implementation called Bamboo, using the ModelNet network emulator, which models in-network queuing, cross-traffic, and packet loss. These factors are typically missing in earlier simulation-based DHT studies, and we show that careful attention to them in Bamboo's design allows it to function effectively at churn rates at or higher than those observed in P2P file-sharing applications, while using lower maintenance bandwidth than other DHT implementations.
Gossip-Based Computation of Aggregate Information
2003
between computers, and a resulting paradigm shift from centralized to highly distributed systems. With massive scale also comes massive instability, as node and link failures become the norm rather than the exception. For such highly volatile systems, decentralized gossip-based protocols are emerging as an approach to maintaining simplicity and scalability while achieving fault-tolerant information dissemination.
Building Secure and Reliable Network Applications
1996
ly, the remote procedure call problem, which an RPC protocol undertakes to solve, consists of emulating LPC using message passing. LPC has a number of "properties": a single procedure invocation results in exactly one execution of the procedure body, the result returned is reliably delivered to the invoker, and exceptions are raised if (and only if) an error occurs. Given a completely reliable communication environment, which never loses, duplicates, or reorders messages, and given client and server processes that never fail, RPC would be trivial to solve. The sender would merely package the invocation into one or more messages, and transmit these to the server. The server would unpack the data into local variables, perform the desired operation, and send back the result (or an indication of any exception that occurred) in a reply message. The challenge, then, is created by failures. Were it not for the possibility of process and machine crashes, an RPC protocol capable of overcomi...
The Peer Sampling Service: Experimental Evaluation of Unstructured Gossip-Based Implementations
In Middleware ’04: Proceedings of the 5th ACM/IFIP/USENIX International Conference on Middleware, 2004
In recent years, the gossip-based communication model in large-scale distributed systems has become a general paradigm with important applications, which include information dissemination, aggregation, overlay topology management and synchronization. At the heart of all of these protocols lies a fundamental distributed abstraction: the peer sampling service. In short, the aim of this service is to provide every node with peers to exchange information with. Analytical studies reveal a high reliability and efficiency of gossip-based protocols, under the (often implicit) assumption that the peers to send gossip messages to are selected uniformly at random from the set of all nodes. In practice, instead of requiring all nodes to know all the peer nodes so that a random sample could be drawn, a scalable and efficient way to implement the peer sampling service is by constructing and maintaining dynamic unstructured overlays through gossiping membership information itself. This paper presents a generic framework to implement reliable and efficient peer sampling services. The framework generalizes existing approaches and makes it easy to introduce new ones. We use this framework to explore and compare several implementations of our abstract scheme. Through extensive experimental analysis, we show that all of them lead to different peer sampling services, none of which is uniformly random. This clearly renders traditional theoretical approaches invalid when the underlying peer sampling service is based on a gossip-based scheme. Our observations also help explain important differences between design choices of peer sampling algorithms, and how these impact the reliability of the corresponding service.
Spatial gossip and resource location protocols
2001
The dynamic behavior of a network in which information is changing continuously over time requires robust and efficient mechanisms for keeping nodes updated about new information. Gossip protocols are mechanisms for this task in which nodes communicate with one another according to some underlying deterministic or randomized algorithm, exchanging information in each communication step. In a variety of contexts, the use of randomization to propagate information has been found to provide better reliability and scalability than more regimented deterministic approaches. In many settings, such as a cluster of distributed computing hosts, new information is generated at individual nodes, and is most “interesting” to nodes that are nearby. Thus, we propose distance-based propagation bounds as a performance measure for gossip mechanisms: a node at distance d from the origin of a new piece of information should be able to learn about this information with a delay that grows slowly with d, and is independent of the size of the network. For nodes arranged with uniform density in Euclidean space, we present natural gossip mechanisms, called spatial gossip, that satisfy such a guarantee: new information is spread to
Protocols and impossibility results for gossip-based communication mechanisms
2002
In recent years, gossip-based algorithms have gained prominence as a methodology for designing robust and scalable communication schemes in large distributed systems. The premise underlying distributed gossip is very simple: in each time step, each node v in the system selects some other node w as a communication partner (generally by a simple randomized rule) and exchanges information with w; over a period of time, information spreads through the system in an “epidemic fashion”. A fundamental issue which is not well understood is the following: how does the underlying low-level gossip mechanism (the means by which communication partners are chosen) affect one’s ability to design efficient high-level gossip-based protocols? We establish one of the first concrete results addressing this question, by showing a fundamental limitation on the power of the commonly used uniform gossip mechanism for solving nearest-resource location problems. In contrast, very efficient protocols for this problem can be designed using a non-uniform spatial gossip mechanism, as established in earlier work with Alan Demers. We go on to consider the design of protocols for more complex problems, providing an efficient distributed gossip-based protocol for a set of nodes in Euclidean space to construct an approximate minimum spanning tree. Here too, we establish a contrasting limitation on the power of uniform gossip for solving this problem. Finally, we investigate gossip-based packet routing as a primitive that underpins the communication patterns in many protocols, and as a way to understand the capabilities of different gossip mechanisms at a general level.
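The uniform gossip premise described in this abstract can be sketched as a small simulation. This is only an illustration under stated assumptions: rounds are synchronous, the exchange is push-pull, and the function name `uniform_gossip_rounds` and its parameters are hypothetical rather than taken from the paper.

```python
import random


def uniform_gossip_rounds(n, seed=0):
    """Count synchronous rounds until a rumor held by node 0 reaches all n nodes.

    Assumption (not from the paper): in each round every node v picks a
    partner w uniformly at random among the other nodes, and the two
    exchange what they knew at the start of the round (push-pull).
    """
    rng = random.Random(seed)
    informed = [False] * n
    informed[0] = True
    rounds = 0
    while not all(informed):
        snapshot = list(informed)  # knowledge at the start of this round
        for v in range(n):
            w = rng.randrange(n - 1)
            if w >= v:
                w += 1  # shift past v so the partner is always another node
            if snapshot[v] or snapshot[w]:
                informed[v] = True
                informed[w] = True
        rounds += 1
    return rounds


if __name__ == "__main__":
    for size in (16, 256, 4096):
        print(size, uniform_gossip_rounds(size))
```

Running the script for growing network sizes gives a feel for how slowly the number of rounds grows with n under the uniform rule; it says nothing about the nearest-resource limitation, which depends on distances that this sketch does not model.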
Computing separable functions via gossip
In Proceedings of the Twenty-Fifth Annual ACM Symposium on Principles of Distributed Computing (PODC), 2006
Motivated by applications to sensor, peer-to-peer, and ad hoc networks, we study the problem of computing functions of values at the nodes in a network in a totally distributed manner. In particular, we consider separable functions, which can be written as linear combinations or products of functions of individual variables. The main contribution of this paper is the design of a distributed algorithm for computing separable functions based on properties of exponential random variables. We bound the running time of our algorithm in terms of the running time of an information spreading algorithm used as a subroutine by the algorithm. Since we are interested in totally distributed algorithms, we consider a randomized gossip mechanism for information spreading as the subroutine. Combining these algorithms yields a complete and simple distributed algorithm for computing separable functions. The second contribution of this paper is a characterization of the information spreading time of the gossip algorithm, and therefore the computation time for separable functions, in terms of the conductance of an appropriate stochastic matrix. Specifically, we find that for a class of graphs with small spectral gap, this time is of a smaller order than the time required to compute averages for a known iterative gossip scheme [4].
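A standard property of exponential random variables that fits this description is that the minimum of independent exponentials with rates y_1, ..., y_n is itself exponential with rate y_1 + ... + y_n. The sketch below uses that fact to estimate a sum from minima alone; it is an assumption that this matches the paper's construction in detail, and the central `min` call merely stands in for the gossip-based information spreading step. The name `estimate_sum` and its parameters are hypothetical.

```python
import random


def estimate_sum(values, repetitions=2000, seed=0):
    """Estimate sum(values) using only minima of exponential samples.

    Each node i (holding a positive value y_i) draws a sample from an
    exponential distribution with rate y_i. The minimum over all nodes is
    exponential with rate sum(values), so the reciprocal of the average
    minimum estimates the sum. A minimum is easy to spread by gossip,
    which is why this reduction is useful; here it is computed centrally.
    """
    if any(y <= 0 for y in values):
        raise ValueError("all values must be positive")
    rng = random.Random(seed)
    total_of_minima = 0.0
    for _ in range(repetitions):
        total_of_minima += min(rng.expovariate(y) for y in values)
    return repetitions / total_of_minima


if __name__ == "__main__":
    node_values = [1.0, 2.5, 4.0, 0.5]
    print("true sum:", sum(node_values))
    print("estimate:", estimate_sum(node_values))
```

The relative error of the estimate shrinks roughly as one over the square root of the number of repetitions, so accuracy is traded against the number of minima that must be spread.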
Gossip versus Deterministic Flooding: Low Message Overhead and High Reliability for Broadcasting on Small Networks
Rumor mongering (also known as gossip) is an epidemiological protocol that implements broadcasting with a reliability that can be very high. Rumor mongering is attractive because it is generic, scalable, adapts well to failures and recoveries, and has a reliability that gracefully degrades with the number of failures in a run. However, rumor mongering uses random selection for communications. We study the impact of using random selection in this paper. We present a protocol that superficially resembles rumor mongering but is deterministic. We show that this new protocol has most of the same attractions as rumor mongering. The one attraction that rumor mongering has, namely graceful degradation, comes at a high cost in terms of the number of messages sent. We compare the two approaches both at an abstract level and in terms of how they perform in an Ethernet and small wide area network of Ethernets.
Distributed Approaches to Triangulation and Embedding
In Proceedings of the 16th ACM-SIAM Symposium on Discrete Algorithms (SODA), 2005
A number of recent papers in the networking community study the distance matrix defined by the node-to-node latencies in the Internet and, in particular, provide a number of quite successful distributed approaches that embed this distance into a low-dimensional Euclidean space. In such algorithms it is feasible to measure distances among only a linear or near-linear number of node pairs; the rest of the distances are simply not available. Moreover, for applications it is desirable to spread the load evenly among the participating nodes. Indeed, several recent studies use this 'fully distributed' approach and achieve, empirically, a low distortion for all but a small fraction of node pairs. This is concurrent with the large body of theoretical work on metric embeddings, but there is a fundamental distinction: in the theoretical approaches to metric embeddings, full and centralized access to the distance matrix is assumed and heavily used. In this paper we present the first fully distributed embedding algorithm with provable distortion guarantees for doubling metrics (which have been proposed as a reasonable abstraction of Internet latencies), thus providing some insight into the empirical success of the recent Vivaldi algorithm [7]. The main ingredient of our embedding algorithm is an improved fully distributed algorithm for a more basic problem of triangulation, where the triangle inequality is used to infer the distances that have not been measured; this problem has received considerable attention in the networking community, and has also been studied theoretically in [19]. We use our techniques to extend ε-relaxed embeddings and triangulations to infinite metrics and arbitrary measures, and to improve on the approximate distance labeling scheme of Talwar [36].
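The basic triangulation idea mentioned in this abstract, using the triangle inequality to bound an unmeasured distance, can be sketched as follows. This shows only the elementary bound computation, not the paper's distributed algorithm; the function name `triangulation_bounds` and the dictionary layout (beacon name mapped to measured distance) are illustrative assumptions.

```python
def triangulation_bounds(dist_u, dist_v):
    """Bound the unmeasured distance d(u, v) from distances to shared beacons.

    dist_u and dist_v map beacon names to measured distances from u and v.
    For every beacon b measured by both nodes, the triangle inequality gives
        |d(u, b) - d(v, b)| <= d(u, v) <= d(u, b) + d(v, b),
    so the tightest bounds are the max of the left sides and the min of
    the right sides over the common beacons.
    """
    common = dist_u.keys() & dist_v.keys()
    if not common:
        raise ValueError("u and v share no beacon")
    lower = max(abs(dist_u[b] - dist_v[b]) for b in common)
    upper = min(dist_u[b] + dist_v[b] for b in common)
    return lower, upper


if __name__ == "__main__":
    # Planar example: u = (0, 0), v = (3, 4), so the true distance is 5.
    # Beacon b1 = (0, 4) and beacon b2 = (6, 8).
    u_measurements = {"b1": 4.0, "b2": 10.0}
    v_measurements = {"b1": 3.0, "b2": 5.0}
    print(triangulation_bounds(u_measurements, v_measurements))
```

In the planar example the bounds bracket the true distance of 5; how close the lower and upper bounds are to each other depends on which beacons the two nodes share, which is the quantity a triangulation scheme tries to control.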
On diffusing updates in a byzantine environment
In Proceedings of the 18th IEEE Symposium on Reliable Distributed Systems, 1999
We study how to efficiently diffuse updates to a large distributed system of data replicas, some of which may exhibit arbitrary (Byzantine) failures. We assume that strictly fewer than t replicas fail, and that each update is initially received by at least t correct replicas. The goal is to diffuse each update to all correct replicas while ensuring that correct replicas accept no updates generated spuriously by faulty replicas. To achieve reliable diffusion, each correct replica accepts an update only after receiving it from at least t others. We provide the first analysis of epidemic-style protocols for such environments. This analysis is fundamentally different from known analyses for the benign case due to our treatment of fully Byzantine failures, which, among other things, precludes the use of digital signatures for authenticating forwarded updates. We propose two epidemic-style diffusion algorithms and two measures that characterize the efficiency of diffusion algorithms in general. We characterize both of our algorithms according to these measures, and also prove lower bounds with regard to these measures that show that our algorithms are close to optimal.
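The acceptance rule stated in this abstract (accept an update only after receiving it from at least t other replicas) can be sketched as a small state machine. The class `Replica` and its method names are hypothetical, and the sketch covers only the acceptance threshold, not the two diffusion algorithms the paper proposes or how partners are chosen.

```python
class Replica:
    """Tracks which updates a correct replica has accepted (illustrative only)."""

    def __init__(self, replica_id, t):
        self.replica_id = replica_id
        self.t = t                # acceptance threshold from the abstract
        self.senders = {}         # update -> set of distinct replicas that sent it
        self.accepted = set()

    def deliver_from_client(self, update):
        # Models an update that this replica receives initially, at the source.
        self.accepted.add(update)

    def receive(self, update, sender_id):
        """Record a copy of `update` from another replica; return True once accepted."""
        if sender_id != self.replica_id:
            self.senders.setdefault(update, set()).add(sender_id)
            # Duplicates from one sender do not count: only distinct senders do.
            if len(self.senders[update]) >= self.t:
                self.accepted.add(update)
        return update in self.accepted


if __name__ == "__main__":
    replica = Replica("r0", t=3)
    for sender in ("r1", "r1", "r2"):
        print(sender, replica.receive("update-42", sender))
    print("r3", replica.receive("update-42", "r3"))
```

Because strictly fewer than t replicas are assumed faulty, the faulty replicas alone can never supply t distinct senders, so a spurious update does not cross the threshold in this model.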