Efficient replica maintenance for distributed storage systems
In Proc. of NSDI, 2006
Cited by 80 (18 self)
This paper considers replication strategies for storage systems that aggregate the disks of many nodes spread over the Internet. Maintaining replication in such systems can be prohibitively expensive, since every transient network or host failure could potentially lead to copying a server’s worth of data over the Internet to maintain replication levels. The following insights in designing an efficient replication algorithm emerge from the paper’s analysis. First, durability can be provided separately from availability; the former is less expensive to ensure and a more useful goal for many wide-area applications. Second, the focus of a durability algorithm must be to create new copies of data objects faster than permanent disk failures destroy the objects; careful choice of policies for what nodes should hold what data can decrease repair time. Third, increasing the number of replicas of each data object does not help a system tolerate a higher disk failure probability, but does help tolerate bursts of failures. Finally, ensuring that the system makes use of replicas that recover after temporary failure is critical to efficiency. Based on these insights, the paper proposes the Carbonite replication algorithm for keeping data durable at a low cost. A simulation of Carbonite storing 1 TB of data over a 365 day trace of PlanetLab activity shows that Carbonite is able to keep all data durable and uses 44% more network traffic than a hypothetical system that only responds to permanent failures. In comparison, Total Recall and DHash require almost a factor of two more network traffic than this hypothetical system.
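The last insight in the abstract above (reusing replicas that come back after a temporary failure) can be illustrated with a toy sketch. It is not the Carbonite algorithm as published; the names `ObjectState`, `maintain`, `spares` and `target` are hypothetical, and the repair rule (copy only when fewer than `target` holders are reachable) is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set


@dataclass
class ObjectState:
    # Every node that was ever given a copy. Entries are never dropped,
    # so a node that returns after a transient failure counts again.
    holders: Set[str] = field(default_factory=set)
    copies_made: int = 0


def maintain(state: ObjectState, reachable: Set[str],
             spares: Iterable[str], target: int) -> Set[str]:
    """Run one maintenance round; return the holders reachable afterwards."""
    live = state.holders & reachable
    for node in spares:
        if len(live) >= target:
            break  # enough reachable copies, no repair traffic needed
        if node in reachable and node not in state.holders:
            state.holders.add(node)  # stands in for copying the object
            live.add(node)
            state.copies_made += 1
    return live
```

In this sketch, if a holder disappears and a spare receives a copy, a later round in which the original holder returns while another holder is briefly unreachable triggers no further copying, because the remembered holder is counted again.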
OASIS: Anycast for Any Service
2006
Cited by 69 (8 self)
Global anycast, an important building block for many distributed services, faces several challenging requirements. First, anycast response must be fast and accurate. Second, the anycast system must minimize probing to reduce the risk of abuse complaints. Third, the system must scale to many services and provide high availability. Finally, and most importantly, such a system must integrate seamlessly with unmodified client applications. In short, when a new client makes an anycast query for a service, the anycast system must ideally return an accurate reply without performing any probing at all. This paper
Profiling a million user DHT
In Proc. of Internet Measurement Conference, 2007
Cited by 25 (5 self)
Distributed hash tables (DHTs) provide scalable, key-based lookup of objects in dynamic network environments. Although DHTs have been studied extensively from an analytical perspective, only recently have wide deployments enabled empirical examination. This paper reports measurement results obtained from profiling the Azureus BitTorrent client’s DHT, which is in active use by more than 1 million nodes on a daily basis. The Azureus DHT operates on untrusted, unreliable end-hosts, offering a glimpse into the implementation challenges associated with making structured overlays work in practice. Our measurements provide characterizations of churn, overhead, and performance in this environment. We leverage these measurements to drive the design of a modified DHT lookup algorithm that reduces median DHT lookup time by an order of magnitude for a nominal increase in overhead.
A Distributed Hash Table
2005
Cited by 15 (3 self)
DHash is a new system that harnesses the storage and network resources of computers distributed across the Internet by providing a wide-area storage service, DHash. DHash frees applications from re-implementing mechanisms common to any system that stores data on a collection of machines: it maintains a mapping of objects to servers, replicates data for durability, and balances load across participating servers. Applications access data stored in DHash through a familiar hash-table interface: put stores data in the system under a key; get retrieves the data. DHash has proven useful to a number of application builders and has been used to build a content-distribution system [34], a Usenet replacement [118], and new Internet naming architectures [133, 132]. These applications demand low-latency, high-throughput access
Fallacies in evaluating decentralized systems
In Proceedings of IPTPS, 2006
Cited by 13 (1 self)
Research on decentralized systems such as peer-to-peer overlays and ad hoc networks has been hampered by the fact that few systems of this type are in production use, and the space of possible applications is still poorly understood. As a consequence, new ideas have mostly been evaluated using common synthetic workloads, traces from a few existing systems, testbeds like PlanetLab, and simulators like ns-2. Some of these methods have, in fact, become the “gold standard” for evaluating new systems, and are often a prerequisite for getting papers accepted at top conferences in the field. In this paper, we examine the current practice of evaluating decentralized systems under these specific sets of conditions and point out pitfalls associated with this practice. In particular, we argue that (i) despite authors’ best intentions, results from such evaluations often end up being inappropriately generalized; (ii) there is an incentive not to deviate from the accepted standard of evaluation, even if that is technically appropriate; (iii) research may gravitate towards systems that are feasible and perform well when evaluated in the accepted environments; and (iv) in the worst case, research may become ossified as a result. We close with a call to action for the community to develop tools, data, and best practices that allow systems to be evaluated across a space of workloads and environments.
MOSAIC: Unified Declarative Platform for Dynamic Overlay Composition
Cited by 10 (7 self)
Overlay networks create new networking services across nodes that communicate using pre-existing networks. MOSAIC is a unified declarative platform for constructing new overlay networks from multiple existing overlays, each possessing a subset of the desired new network’s characteristics. MOSAIC overlays are specified using Mozlog, a new declarative language for expressing overlay properties independently from their particular implementation or underlying network. This paper focuses on the runtime aspects of MOSAIC: composition and deployment of control and/or data plane functions of different overlay networks, dynamic compositions of overlay networks to meet changing application needs and network conditions, and seamless support for legacy applications. MOSAIC is validated experimentally using compositions specified in Mozlog: we combine an indirection overlay that supports mobility (i3), a resilient overlay (RON), and scalable lookups (Chord), to provide new overlay networks with new functions. MOSAIC uses runtime composition to simultaneously deliver application-aware mobility, NAT traversal and reliability. We further demonstrate MOSAIC’s dynamic composition capabilities by Chord switching its underlay from IP to RON at runtime. These benefits are obtained at a low performance cost, as demonstrated by measurements on both a local cluster and PlanetLab.
Comet: An active distributed key-value store
Cited by 7 (1 self)
Distributed key-value storage systems are widely used in corporations and across the Internet. Our research seeks to greatly expand the application space for key-value storage systems through application-specific customization. We designed and implemented Comet, an extensible, distributed key-value store. Each Comet node stores a collection of active storage objects (ASOs) that consist of a key, a value, and a set of handlers. Comet handlers run as a result of timers or storage operations, such as get or put, allowing an ASO to take dynamic, application-specific actions to customize its behavior. Handlers are written in a simple sandboxed extension language, providing properties of safety and isolation. We implemented a Comet prototype for the Vuze DHT, deployed Comet nodes on Vuze from PlanetLab, and built and evaluated over a dozen Comet applications. Our experience demonstrates that simple, safe, and restricted extensibility can significantly increase the power and range of applications that can run on distributed active storage systems. This approach facilitates the sharing of a single storage system by applications with diverse needs, allowing them to reap the consolidation benefits inherent in today’s massive clouds.
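The active storage object described in the abstract above (a key, a value, and handlers triggered by get or put) can be pictured with a minimal single-process sketch. This is not Comet's API: `ActiveStorageObject`, `ToyNode`, the event names `on_get` and `on_put`, and `count_reads` are all hypothetical, and plain Python callables are used here without any sandboxing, unlike the sandboxed extension language the abstract mentions.

```python
from typing import Any, Callable, Dict, Optional


class ActiveStorageObject:
    """Hypothetical ASO: a key, a value, and a set of named handlers."""

    def __init__(self, key: str, value: Any,
                 handlers: Optional[Dict[str, Callable]] = None):
        self.key = key
        self.value = value
        self.handlers = dict(handlers or {})

    def fire(self, event: str) -> Any:
        # Run the handler registered for this event, if there is one.
        handler = self.handlers.get(event)
        return handler(self) if handler else None


class ToyNode:
    """Hypothetical in-memory stand-in for one storage node."""

    def __init__(self) -> None:
        self.store: Dict[str, ActiveStorageObject] = {}

    def put(self, aso: ActiveStorageObject) -> None:
        self.store[aso.key] = aso
        aso.fire("on_put")

    def get(self, key: str) -> Any:
        aso = self.store.get(key)
        if aso is None:
            return None
        result = aso.fire("on_get")
        # A handler may override the reply; otherwise return the stored value.
        return aso.value if result is None else result


def count_reads(aso: ActiveStorageObject) -> None:
    # Example handler: the object tracks how often it has been read.
    aso.value["reads"] += 1
```

For instance, storing `ActiveStorageObject("k", {"data": "x", "reads": 0}, {"on_get": count_reads})` on a `ToyNode` makes each `get("k")` increment the object's own read counter, which is the kind of application-specific behavior the abstract describes.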
Group Therapy for Systems: Using link attestations to manage failures
In IPTPS, 2006
Cited by 5 (1 self)
Managing failures and configuring systems properly are of critical importance for robust distributed services. Unfortunately, protocols offering strong fault-tolerance guarantees are generally too costly and insensitive to performance criteria. Yet, system management in practice is often ad-hoc and ill-defined, leading to under-utilized capacity or adverse effects from poorly-behaving machines. This paper proposes a new abstraction called link-attestation groups (LA-Groups) for building robust distributed systems. Developers specify application-level correctness conditions or performance requirements for nodes. Nodes vouch for each other's acceptability within small groups of nodes through digitally-signed link attestations, and then apply a link-state protocol to determine these group relationships.
Why Kad Lookup Fails
Cited by 5 (0 self)
A Distributed Hash Table (DHT) is a structured overlay network service that provides a decentralized lookup for mapping objects to locations. In this paper, we study the lookup performance of locating nodes responsible for replicated information in Kad – one of the largest DHT networks existing currently. Throughout the measurement study, we found that Kad lookups locate only 18% of nodes storing replicated data. This failure leads to limited reliability and an inefficient use of resources during lookups. Ironically, we found that this poor performance is due to the high level of routing table similarity, despite the relatively high churn rate in the network. We propose solutions which either exploit the high routing table similarity or avoid the duplicate returns using multiple target keys.

