Bandwidth-efficient management of DHT routing tables
, 2005
Cited by 45 (4 self)
Today an application developer using a distributed hash table (DHT) with n nodes must choose a DHT protocol from the spectrum between O(1) lookup protocols [9, 18] and O(log n) protocols [20–23, 25, 26]. O(1) protocols achieve low latency lookups on small or low-churn networks because lookups take only a few hops, but incur high maintenance traffic on large or high-churn networks. O(log n) protocols incur less maintenance traffic on large or high-churn networks but require more lookup hops in small networks. Accordion is a new routing protocol that does not force the developer to make this choice: Accordion adjusts itself to provide the best performance across a range of network sizes and churn rates while staying within a bounded bandwidth budget. The key challenges in the design of Accordion are the algorithms that choose the routing table’s size and content. Each Accordion node learns of new neighbors opportunistically, in a way that causes the density of its neighbors to be inversely proportional to their distance in ID space from the node. This distribution allows Accordion to vary the table size along a continuum while still guaranteeing at most O(log n) lookup hops. The user-specified bandwidth budget controls the rate at which a node learns about new neighbors. Each node limits its routing table size by evicting neighbors that it judges likely to have failed. High churn (i.e., short node lifetimes) leads to a high eviction rate. The equilibrium between the learning and eviction processes determines the table size. Simulations show that Accordion maintains an efficient lookup latency versus bandwidth tradeoff over a wider range of operating conditions than existing DHTs.
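The neighbor distribution described in this abstract (density inversely proportional to ID-space distance) can be illustrated with a small sketch. This is not Accordion's learning algorithm, which acquires neighbors opportunistically; it only samples distances directly from a 1/x density to show why such a table keeps roughly the same number of entries in every distance range [2^k, 2^(k+1)), whatever the table size. The function names, the 32-bit ID space, and the table size are assumptions made for illustration.

```python
import math
import random


def sample_neighbor_distances(table_size, id_bits=32, rng=None):
    """Draw ID-space distances whose density is proportional to 1/x.

    Inverse-transform sampling: if u is uniform on [0, 1), then
    2 ** (id_bits * u) has density proportional to 1/x on [1, 2 ** id_bits).
    The ID-space width (id_bits) is an assumed parameter, not from the paper.
    """
    rng = rng or random.Random()
    return [2.0 ** (id_bits * rng.random()) for _ in range(table_size)]


def neighbors_per_octave(distances, id_bits=32):
    """Count neighbors in each distance range [2**k, 2**(k+1))."""
    counts = [0] * id_bits
    for d in distances:
        # Clamp guards against floating-point rounding at the top of the range.
        k = min(int(math.log2(d)), id_bits - 1)
        counts[k] += 1
    return counts


if __name__ == "__main__":
    table = sample_neighbor_distances(3200, rng=random.Random(1))
    print(neighbors_per_octave(table))
```

Because each doubling of distance holds about the same number of neighbors, shrinking or growing the table thins or thickens every range evenly, which matches the abstract's claim that the table size can vary along a continuum while lookups stay within O(log n) hops.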
Modeling heterogeneous user churn and local resilience of unstructured p2p networks
- In ICNP
, 2006
Cited by 17 (2 self)
Previous analytical results on the resilience of unstructured P2P systems have not explicitly modeled heterogeneity of user churn (i.e., difference in online behavior) or the impact of in-degree on system resilience. To overcome these limitations, we introduce a generic model of heterogeneous user churn, derive the distribution of the various metrics observed in prior experimental studies (e.g., lifetime distribution of joining users, joint distribution of session time of alive peers, and residual lifetime of a randomly selected user), derive several closed-form results on the transient behavior of in-degree, and eventually obtain the joint in/out degree isolation probability as a simple extension of the out-degree model in [13].
Self Management and the Future of Software Design
- In: Formal Aspects of Component Software (FACS
, 2006
Cited by 10 (6 self)
Most software is fragile: even the slightest error, such as changing a single bit, can make it crash. As software complexity has increased, development techniques have kept pace to manage this fragility. But today there is a new challenge. Complexity is increasing rapidly as a result of two factors: the increasing use of distributed systems as a result of the sufficient reliability and bandwidth of the Internet, and the increasing scale of these systems as a result of the addition of many new computers to the Internet (e.g., mobile phones and other devices). To manage this new complexity, we propose an approach based on self-managing systems: systems that can maintain useful functionality despite changes in their environment. The paper motivates this approach and gives some ideas on how to build general self-managing software systems. An important part of the approach is to build systems as hierarchies of interacting feedback loops. We give several examples of these systems and we deduce some rules of thumb for their design.
On Static and Dynamic Partitioning Behavior of Large-Scale P2P Networks
, 2008
Cited by 9 (9 self)
In this paper, we analyze the problem of network disconnection in the context of large-scale P2P networks and understand how both static and dynamic patterns of node failure affect the resilience of such graphs. We start by applying classical results from random graph theory to show that a large variety of deterministic and random P2P graphs almost surely (i.e., with probability 1 − o(1)) remain connected under random failure if and only if they have no isolated nodes. This simple, yet powerful, result subsequently allows us to derive in closed-form the probability that a P2P network develops isolated nodes, and therefore partitions, under both types of node failure. We finish the paper by demonstrating that our models match simulations very well and that dynamic P2P systems are extremely resilient under node churn as long as the neighbor replacement delay is much smaller than the average user lifetime.
Self-management of large-scale distributed systems by combining peer-to-peer networks and components
, 2005
Cited by 8 (1 self)
This position paper envisions making large-scale distributed applications self-managing by combining component models and structured overlay networks. A key obstacle to deploying large-scale applications running on the Internet is the amount of management they require. Often these applications demand specialized personnel for their maintenance. Making applications self-managing will help remove this obstacle. Basing the system on a structured overlay network will allow extending the abilities of existing component models to large-scale distributed systems. Structured overlay networks provide guarantees for efficient communication, efficient load-balancing, and self-management in case of joins, leaves, and failures. Component models, on the other hand, support dynamic configuration, the ability of part of the system to reconfigure other parts at run-time. By combining overlay networks with component models we achieve both low-level as well as high-level self-management. We will target multi-tier applications, and specifically we will consider three-tier applications using a self-managing storage service.
Residual-Based Estimation of Peer and Link Lifetimes in P2P Networks
Cited by 4 (4 self)
Existing methods of measuring lifetimes in P2P systems usually rely on the so-called Create-Based Method (CBM), which divides a given observation window into two halves and samples users “created” in the first half every Δ time units until they die or the observation period ends. Despite its frequent use, this approach has no rigorous accuracy or overhead analysis in the literature. To shed more light on its performance, we first derive a model for CBM and show that small window size or large Δ may lead to highly inaccurate lifetime distributions. We then show that create-based sampling exhibits an inherent tradeoff between overhead and accuracy, which does not allow any fundamental improvement to the method. Instead, we propose a completely different approach for sampling user dynamics that keeps track of only residual lifetimes of peers and uses a simple renewal-process model to recover the actual lifetimes from the observed residuals. Our analysis indicates that for reasonably large systems, the proposed method can reduce bandwidth consumption by several orders of magnitude compared to prior approaches while simultaneously achieving higher accuracy. We finish the paper by implementing a two-tier Gnutella network crawler equipped with the proposed sampling method and obtain the distribution of ultrapeer lifetimes in a network of 6.4 million users and 60 million links. Our experimental results show that ultrapeer lifetimes are Pareto with shape α ≈ 1.1; however, link lifetimes exhibit much lighter tails with α ≈ 1.8. Index Terms: Gnutella networks, lifetime estimation, peer-to-peer, residual sampling.
Residual-based measurement of peer and link lifetimes in gnutella networks
- In IEEE InfoCom
, 2007
Cited by 3 (2 self)
Existing methods of measuring lifetimes in P2P systems usually rely on the so-called Create-Based Method (CBM) [16], which divides a given observation window into two halves and samples users “created” in the first half every ∆ time units until they die or the observation period ends. Despite its frequent use [2], [17], [19], this approach has no rigorous accuracy or overhead analysis in the literature. To shed more light on its performance, we first derive a model for CBM and show that small window size or large ∆ may lead to highly inaccurate lifetime distributions. We then show that create-based sampling exhibits an inherent tradeoff between overhead and accuracy, which does not allow any fundamental improvement to the method. Instead, we propose a completely different approach for sampling user dynamics that keeps track of only residual lifetimes of peers and uses a simple renewal-process model to recover the actual lifetimes from the observed residuals. Our analysis indicates that for reasonably large systems, the proposed method can reduce bandwidth consumption by several orders of magnitude compared to prior approaches while simultaneously achieving higher accuracy. We finish the paper by implementing a two-tier Gnutella network crawler equipped with the proposed sampling method and obtain the distribution of ultrapeer lifetimes in a network of 6.4 million users and 60 million links. Our experimental results show that ultrapeer lifetimes are Pareto with shape α ≈ 1.1; however, link lifetimes exhibit much lighter tails with α ≈ 1.9.
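The idea of recovering lifetimes from observed residuals can be sketched with a standard renewal-theory identity: in equilibrium, the residual lifetime R has density h(x) = (1 − F(x)) / E[L], so the lifetime tail is 1 − F(x) = h(x) / h(0). The sketch below checks this numerically. It is not the paper's estimator or crawler; the shifted-Pareto form of the lifetime distribution, the scale β = 1, and all function names are assumptions made for illustration.

```python
def lifetime_ccdf(x, alpha, beta):
    """Tail 1 - F(x) of an assumed shifted-Pareto lifetime: (1 + x/beta)^(-alpha)."""
    return (1.0 + x / beta) ** (-alpha)


def residual_ccdf(x, alpha, beta):
    """Tail P(R > x) of the equilibrium residual lifetime (requires alpha > 1).

    Integrating the lifetime tail and dividing by the mean beta / (alpha - 1)
    gives another shifted Pareto with shape alpha - 1.
    """
    return (1.0 + x / beta) ** (-(alpha - 1.0))


def recover_lifetime_ccdf(res_ccdf, x, step=1e-4):
    """Recover 1 - F(x) = h(x) / h(0) from a residual tail function.

    h is the residual density, approximated here by a forward difference
    of the residual tail; step is an illustrative choice.
    """
    def density(t):
        return (res_ccdf(t) - res_ccdf(t + step)) / step

    return density(x) / density(0.0)


if __name__ == "__main__":
    alpha, beta = 1.1, 1.0  # shape taken from the abstract; scale is assumed
    for x in (0.5, 2.0, 10.0):
        recovered = recover_lifetime_ccdf(lambda t: residual_ccdf(t, alpha, beta), x)
        print(x, recovered, lifetime_ccdf(x, alpha, beta))
```

Under this assumed model the residuals have an even heavier tail (shape α − 1) than the lifetimes themselves, which is why the observed residual distribution has to be converted back rather than reported directly.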
Overcoming Software Fragility with Interacting Feedback Loops
Cited by 2 (2 self)
Programs are fragile for many reasons, including software errors, partial failures, and network problems. One way to make software more robust is to design it from the start as a set of interacting feedback loops. Studying and using feedback loops is an old idea that dates back at least to Norbert Wiener’s work on Cybernetics. But almost all work in this area has focused on single feedback loops. We show that it is important to design software with multiple interacting feedback loops. We present examples taken from both biology and software to substantiate this. To make this idea practical, a necessary condition is good support for concurrent programming. We find that a message-passing model without shared state works well. Our own work focuses on extending structured overlay networks (a generalization of peer-to-peer networks) for large-scale distributed applications. Structured overlay networks are a good example of systems designed from the start as interacting feedback loops. We show how to extend them with a distributed transaction layer that keeps their good self-organization properties. We are using this system to build three realistic application scenarios taken from industrial case studies.
Understanding Disconnection and Stabilization of Chord
Previous analytical work [15], [16] on the resilience of P2P networks has been restricted to disconnection arising from simultaneous failure of all neighbors in routing tables of participating users. In this paper, we focus on a different technique for maintaining consistent graphs – Chord’s successor sets and periodic stabilizations – under both static and dynamic node failure. We derive closed-form models for the probability that Chord remains connected under both types of node failure and show the effect of using different stabilization interval lengths (i.e., exponential, uniform, and constant) on the probability of partitioning in Chord.
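As a rough illustration of the static-failure case mentioned in this abstract, the sketch below uses a simple independence approximation: if every node fails independently with probability p and each node keeps r successors, a node loses its whole successor set with probability p^r. This is a back-of-the-envelope estimate and not the closed-form model derived in the paper; the function names and the parameter values in the example are hypothetical.

```python
def prob_successor_set_dead(p, r):
    """Probability that all r successors of one node have failed.

    Assumes independent failures with probability p per node (an assumption
    of this sketch, not a result taken from the paper).
    """
    if not 0.0 <= p <= 1.0 or r < 1:
        raise ValueError("need 0 <= p <= 1 and r >= 1")
    return p ** r


def approx_prob_no_dead_successor_set(n, p, r):
    """Approximate probability that none of n nodes loses every successor.

    Treats the n per-node events as independent, which ignores the overlap
    between neighboring successor sets on the ring.
    """
    return (1.0 - prob_successor_set_dead(p, r)) ** n


if __name__ == "__main__":
    # Illustrative values only: 10,000 nodes, half of them failing at once.
    for r in (5, 10, 20, 30):
        print(r, approx_prob_no_dead_successor_set(10_000, 0.5, r))
```

The estimate improves quickly as r grows because the per-node failure term shrinks geometrically, which gives some intuition for why successor-set length matters so much for ring connectivity.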

