Minimizing Churn in Distributed Systems
- In Proc. ACM SIGCOMM, 2006
Cited by 44 (3 self)
A pervasive requirement of distributed systems is to deal with churn -- change in the set of participating nodes due to joins, graceful leaves, and failures. A high churn rate can increase costs or decrease service quality. This paper studies how to reduce churn by selecting which subset of a set of available nodes to use. First,
Proactive replication for data durability
- In Proceedings of the 5th Int’l Workshop on Peer-to-Peer Systems (IPTPS), 2006
Cited by 28 (6 self)
Many wide-area storage systems replicate data for durability. A common way of maintaining the replicas is to detect node failures and respond by creating additional copies of objects that were stored on failed nodes and hence suffered a loss of redundancy. Reactive techniques can minimize total bytes sent since they only create replicas as needed; however, they can create spikes in network use after a failure. These spikes may overwhelm application traffic and can make it difficult to provision bandwidth. This paper explores a proactive approach that creates additional copies not in response to failures, but periodically at a fixed low rate. We introduce Tempo, a distributed hash table that allows each user to specify a maximum maintenance bandwidth and uses it to perform proactive replication. Results from a simulation study suggest that Tempo can deliver high durability despite only using several kilobytes per second of bandwidth, comparable to state-of-the-art reactive systems.
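The fixed-rate idea in this abstract can be illustrated with a small sketch. This is not Tempo's algorithm; it only shows, under assumed numbers, how a bandwidth cap translates into a steady stream of extra replicas. The class name, the budget, and the object size are all hypothetical.

```python
class ProactiveReplicator:
    """Toy model: spend a fixed maintenance budget on extra replicas.

    Assumption: every object has the same size and a replica is created
    as soon as enough unused budget (credit) has accumulated.
    """

    def __init__(self, budget_bytes_per_sec: float, object_bytes: int):
        self.budget = budget_bytes_per_sec
        self.object_bytes = object_bytes
        self.credit = 0.0  # bytes of budget not yet spent

    def tick(self, seconds: float) -> int:
        """Advance time and return how many replicas were created."""
        self.credit += self.budget * seconds
        made = int(self.credit // self.object_bytes)
        self.credit -= made * self.object_bytes
        return made
```

Because leftover credit carries over between ticks, the long-run replica rate depends only on the budget and the object size, never on when failures happen. That is the property that keeps network use flat in this toy model.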
OverCite: A Cooperative Digital Research Library
- 2005
Cited by 24 (9 self)
CiteSeer is a well-known online resource for the computer science research community, allowing users to search and browse a large archive of research papers. Unfortunately, its current centralized incarnation is costly to run. Although members of the community would presumably be willing to donate hardware and bandwidth at their own sites to assist CiteSeer, the current architecture does not facilitate such distribution of resources. OverCite is a design for a new architecture for a distributed and cooperative research library based on a distributed hash table (DHT). The new architecture harnesses donated resources at many sites to provide document search and retrieval service to researchers worldwide. A preliminary evaluation of an initial OverCite prototype shows that it can service more queries per second than a centralized system, and that it increases total storage capacity by a factor of n/4 in a system of n nodes. OverCite can exploit these additional resources by supporting new features such as document alerts, and by scaling to larger data sets.
A Distributed Hash Table
- 2005
Cited by 15 (3 self)
DHash is a new system that harnesses the storage and network resources of computers distributed across the Internet by providing a wide-area storage service. DHash frees applications from re-implementing mechanisms common to any system that stores data on a collection of machines: it maintains a mapping of objects to servers, replicates data for durability, and balances load across participating servers. Applications access data stored in DHash through a familiar hash-table interface: put stores data in the system under a key; get retrieves the data. DHash has proven useful to a number of application builders and has been used to build a content-distribution system [34], a Usenet replacement [118], and new Internet naming architectures [133, 132]. These applications demand low-latency, high-throughput access
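As a rough illustration of the put/get interface and the object-to-server mapping mentioned in this abstract, the sketch below keeps everything in one process. It is not DHash's implementation: the class name ToyStore, the SHA-1 ring placement, and the successor-based replica choice are assumptions made for the example.

```python
import hashlib
from bisect import bisect_left


def _ring_position(data: bytes) -> int:
    # Assumption: servers and keys are placed on a ring by SHA-1 hash.
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


class ToyStore:
    """In-memory stand-in for a replicated put/get storage service."""

    def __init__(self, server_names, replicas=2):
        self.ring = sorted((_ring_position(n.encode()), n) for n in server_names)
        self.replicas = min(replicas, len(self.ring))
        self.disks = {name: {} for name in server_names}

    def servers_for(self, key: bytes):
        """Return the servers responsible for key: successor and followers."""
        points = [p for p, _ in self.ring]
        start = bisect_left(points, _ring_position(key)) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1]
                for i in range(self.replicas)]

    def put(self, key: bytes, value: bytes) -> None:
        for name in self.servers_for(key):
            self.disks[name][key] = value

    def get(self, key: bytes) -> bytes:
        for name in self.servers_for(key):
            if key in self.disks[name]:
                return self.disks[name][key]
        raise KeyError(key)
```

A caller only sees put and get; which servers hold the data, and how many copies exist, stays hidden behind the mapping.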
Ensuring content integrity for untrusted peer-to-peer content distribution networks
- In Proc. 4th USENIX/ACM NSDI, 2007
Cited by 8 (1 self)
Many existing peer-to-peer content distribution networks (CDNs) such as Na Kika, CoralCDN, and CoDeeN are deployed on PlanetLab, a relatively trusted environment. But scaling them beyond this trusted boundary requires protecting against content corruption by untrusted replicas. This paper presents Repeat and Compare, a system for ensuring content integrity in untrusted peer-to-peer CDNs even when replicas dynamically generate content. Repeat and Compare detects misbehaving replicas through attestation records and sampled repeated execution. Attestation records, which are included in responses, cryptographically bind replicas to their code, inputs, and dynamically generated output. Clients then forward a fraction of these records to randomly selected replicas acting as verifiers. Verifiers, in turn, reliably identify misbehaving replicas by locally repeating response generation and comparing their results with the attestation records. We have implemented our system on top of Na Kika. We quantify its detection guarantees through probabilistic analysis and show through simulations that a small sample of forwarded records is sufficient to effectively and promptly cleanse a CDN, even if large fractions of replicas or verifiers are misbehaving.
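To see why a small sample of forwarded records might suffice, consider a deliberately simplified model. It is not the paper's analysis. It assumes each attestation record is forwarded independently with a fixed probability, that an honest verifier always catches a bad record, and that a misbehaving verifier never reports anything. The function and parameter names are hypothetical.

```python
def detection_probability(sample_fraction: float,
                          bad_responses: int,
                          bad_verifier_fraction: float = 0.0) -> float:
    """Chance that at least one of a replica's bad responses is caught.

    Simplified model (an assumption, not the paper's): records are
    sampled independently, and only honest verifiers report misbehavior.
    """
    per_record = sample_fraction * (1.0 - bad_verifier_fraction)
    return 1.0 - (1.0 - per_record) ** bad_responses
```

Under these assumptions the chance of escaping detection shrinks geometrically with the number of corrupted responses, so even a low sampling rate catches a persistently misbehaving replica quickly.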
D1HT: A Distributed One Hop Hash Table
- In Proc. of the 20th IEEE Intl. Parallel & Distributed Processing Symposium (IPDPS), 2005
Cited by 7 (1 self)
Distributed Hash Tables (DHTs) have been used in a variety of applications, but most DHTs so far have opted to solve lookups with multiple hops, which sacrifices performance in order to keep little routing information and minimize maintenance traffic. In this paper, we introduce D1HT, a novel single-hop DHT that is able to maximize performance with reasonable maintenance traffic overhead even for huge and dynamic peer-to-peer (P2P) systems. We formally define the algorithm we propose to detect and announce any membership change in the system, prove its correctness and performance properties, and present a Quarantine-like mechanism to reduce the overhead caused by volatile peers. Our analyses show that D1HT has reasonable maintenance bandwidth requirements even for very large systems, while incurring at most half the bandwidth overhead of previous single-hop DHTs.
A comparison of structured and unstructured P2P approaches to heterogeneous random peer selection
- In Proc. USENIX Annual Technical Conference, 2007
Cited by 6 (1 self)
Random peer selection is used by numerous P2P applications; examples include application-level multicast, unstructured file sharing, and network location mapping. In most of these applications, support for a heterogeneous capacity distribution among nodes is desirable: in other words, nodes with higher capacity should be selected proportionally more often. Random peer selection can be performed over both structured and unstructured graphs. This paper compares these two basic approaches using a candidate example from each approach. For unstructured heterogeneous random peer selection, we use Swaplinks, from our previous work. For the structured approach, we use the Bamboo DHT adapted to heterogeneous selection using our extensions to the item-balancing technique by Karger and Ruhl. Testing the two approaches over graphs of 1000 nodes and a range of network churn levels and heterogeneity distributions, we show that Swaplinks is the superior random selection approach: (i) Swaplinks enables more accurate random selection than does the structured approach in the presence of churn, and (ii) the structured approach is sensitive to a number of hard-to-set tuning knobs that affect performance, whereas Swaplinks is essentially free of such knobs.
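The selection goal stated in this abstract, picking nodes in proportion to their capacity, can be shown with a centralized sketch. This is neither Swaplinks nor the Bamboo-based scheme, both of which work without global knowledge. It only demonstrates the target distribution, assuming one process knows every node's capacity. The function name and the capacity values are hypothetical.

```python
import random


def select_peers(capacities: dict, k: int, rng=None) -> list:
    """Draw k peers (with replacement), weighted by capacity.

    Assumption: the caller has a complete, accurate capacity table,
    which a real decentralized overlay would not have.
    """
    rng = rng or random
    names = list(capacities)
    weights = [capacities[name] for name in names]
    return rng.choices(names, weights=weights, k=k)
```

With capacities such as {"big": 3, "small": 1}, the node "big" should be returned about three times as often as "small" over many draws. A decentralized protocol has to approximate the same ratio using only local information.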
Energy Consumption and Conservation in Mobile Peer-to-Peer Systems
- In Proceedings of the 1st ACM International Workshop on Decentralized Resource Sharing in Mobile Computing and Networking (MobiShare), 2006
Cited by 5 (0 self)
Today’s mobile devices are growing in number and computational resources. Devices capable of storing gigabytes of digital content are becoming ubiquitous, making them an ideal platform for peer-to-peer content delivery and sharing. However, the always-on communication patterns of P2P networks are not a natural fit for energy-constrained mobile devices. In this paper, we perform a detailed study of the energy consumption of a structured P2P overlay on a PDA device. Using actual energy measurements, we present energy consumption results for different types of operations in P2P overlays. Based on these observations, we implement an approach to improve energy conservation in P2P protocols and show some promising preliminary results.
Optimal Search Performance in Unstructured Peer-to-Peer Networks with Clustered Demands
- 2005
Cited by 3 (1 self)
This paper derives the optimal search time and the optimal search cost that can be achieved in unstructured peer-to-peer networks when the demand pattern exhibits clustering (i.e., file popularities vary from region to region in the network). Previous work in this area assumed a uniform distribution of file replicas throughout the network, with an implicit or explicit assumption of uniform file popularity distribution, whereas in reality there is clear evidence of clustering in file popularity patterns. In this paper, we provide mechanisms for modeling clustering in file popularity distributions and the consequent nonuniform distribution of file replicas. We provide results for the search time in such networks for both random walk and flooding search mechanisms. The potential performance benefit that the clustering in demand patterns affords is captured by our results. Interestingly, the performance gains are shown to be independent of whether the search network topology reflects the clustering in file popularity. We also provide the relation between the query-processing load and the number of replicas of each file for the clustered demands case, showing that flooding searches may have lower query-processing load than random walk searches in that case.

