Efficient replica maintenance for distributed storage systems (2006)

by Byung-gon Chun, Frank Dabek, Andreas Haeberlen, Emil Sit, Hakim Weatherspoon, M. Frans Kaashoek, John Kubiatowicz, Robert Morris
Venue: In Proc. of NSDI
Citations: 79 (17 self)

BibTeX

@INPROCEEDINGS{Chun06efficientreplica,
    author = {Byung-gon Chun and Frank Dabek and Andreas Haeberlen and Emil Sit and Hakim Weatherspoon and M. Frans Kaashoek and John Kubiatowicz and Robert Morris},
    title = {Efficient replica maintenance for distributed storage systems},
    booktitle = {In Proc. of NSDI},
    year = {2006},
    pages = {45--58}
}

Abstract

This paper considers replication strategies for storage systems that aggregate the disks of many nodes spread over the Internet. Maintaining replication in such systems can be prohibitively expensive, since every transient network or host failure could potentially lead to copying a server’s worth of data over the Internet to maintain replication levels. The following insights in designing an efficient replication algorithm emerge from the paper’s analysis. First, durability can be provided separately from availability; the former is less expensive to ensure and a more useful goal for many wide-area applications. Second, the focus of a durability algorithm must be to create new copies of data objects faster than permanent disk failures destroy the objects; careful choice of policies for what nodes should hold what data can decrease repair time. Third, increasing the number of replicas of each data object does not help a system tolerate a higher disk failure probability, but does help tolerate bursts of failures. Finally, ensuring that the system makes use of replicas that recover after temporary failure is critical to efficiency. Based on these insights, the paper proposes the Carbonite replication algorithm for keeping data durable at a low cost. A simulation of Carbonite storing 1 TB of data over a 365-day trace of PlanetLab activity shows that Carbonite is able to keep all data durable and uses 44% more network traffic than a hypothetical system that only responds to permanent failures. In comparison, Total Recall and DHash require almost a factor of two more network traffic than this hypothetical system.
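
The final insight in the abstract (reusing replicas that come back after a temporary failure) can be illustrated with a toy model. The sketch below is not the Carbonite algorithm from the paper; it is a minimal, assumed simplification for a single object, and the names `copies_created`, `TRACE`, and `r_low` are hypothetical. It replays a hand-written availability trace and counts how many copies a repair loop makes under two policies: one that keeps counting a replica when its node returns, and one that writes a replica off as soon as its node is unreachable.

```python
from typing import List, Set


def copies_created(trace: List[Set[str]], r_low: int, reuse_returning: bool) -> int:
    """Count copies made for one object while replaying a trace of 'nodes up' sets.

    Assumed toy model: the count includes the initial placement, and no node
    in the trace ever loses its disk (all failures are transient).
    """
    holders: Set[str] = set()  # nodes the system believes hold a replica
    created = 0
    for up in trace:
        if not reuse_returning:
            # Forgetful policy: a replica on an unreachable node is written off for good.
            holders &= up
        reachable = holders & up
        # Repair: copy to reachable nodes without a replica until r_low are reachable.
        for node in sorted(up - holders):
            if len(reachable) >= r_low:
                break
            holders.add(node)
            reachable.add(node)
            created += 1
    return created


# Hypothetical trace: four nodes, each briefly offline at some point.
TRACE: List[Set[str]] = [
    {"a", "b", "c", "d"},
    {"a", "c", "d"},       # b is temporarily down
    {"a", "b", "c", "d"},  # b returns
    {"b", "c", "d"},       # a is temporarily down
    {"a", "b", "d"},       # a returns, c is temporarily down
    {"a", "b", "c", "d"},
]
```

On this trace with `r_low=2`, the reuse policy makes one extra copy during the first outage and then absorbs the later outages with the replicas it already has, while the forgetful policy makes a new copy at every outage. The trace is illustrative only and says nothing about the magnitudes reported in the paper.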
