Erasure Coding vs. Replication: A Quantitative Comparison
- In Proceedings of the First International Workshop on Peer-to-Peer Systems (IPTPS 2002), 2002
Cited by 152 (11 self)
Peer-to-peer systems are positioned to take advantage of gains in network bandwidth, storage capacity, and computational resources to provide long-term durable storage infrastructures. In this paper, we quantitatively compare building a distributed storage infrastructure that is self-repairing and resilient to faults using either a replicated system or an erasure-resilient system. We show that systems employing erasure codes have a mean time to failure many orders of magnitude higher than replicated systems with similar storage and bandwidth requirements. More importantly, erasure-resilient systems use an order of magnitude less bandwidth and storage to provide durability similar to that of replicated systems.
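To make the comparison in this abstract concrete, here is a minimal sketch of a static availability model. It is not the paper's mean-time-to-failure analysis: it assumes that each node is reachable independently with probability `p` and ignores repair entirely. The function names and the example parameters (2 replicas versus a 16-of-32 code, both with 2x storage overhead) are illustrative assumptions, not values taken from the paper.

```python
from math import comb


def replication_availability(p: float, replicas: int) -> float:
    """Probability that at least one of `replicas` full copies is reachable."""
    return 1.0 - (1.0 - p) ** replicas


def erasure_availability(p: float, m: int, n: int) -> float:
    """Probability that at least m of n fragments are reachable.

    Any m fragments are assumed to be enough to rebuild the object.
    """
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(m, n + 1))


if __name__ == "__main__":
    # Both schemes below store twice the size of the original object.
    for p in (0.3, 0.5, 0.9):
        rep = replication_availability(p, replicas=2)
        ec = erasure_availability(p, m=16, n=32)
        print(f"p={p:.1f}  replication={rep:.6f}  erasure(16 of 32)={ec:.6f}")
```

Under these assumptions the erasure-coded layout comes out ahead when nodes are mostly available and behind when they are mostly unavailable, so the outcome depends strongly on `p`.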
Transaction Support in Read Optimized and Write Optimized File Systems
- Proceedings of the 16th International Conference on Very Large Data Bases, 1990
Cited by 24 (5 self)
This paper provides a comparative analysis of five implementations of transaction support. The first of the methods is the traditional approach of implementing transaction processing within a data manager on top of a read optimized file system. The second also assumes a traditional file system but embeds transaction support inside the file system. The third model considers a traditional data manager on top of a write optimized file system. The last two models both embed transaction support inside a write optimized file system, each using a different logging mechanism. Our results show that in a transaction processing environment, a write optimized file system often yields better performance than one optimized for reads. In addition, we show that file system embedded transaction managers can perform as well as data managers when transaction throughput is limited by I/O bandwidth. Finally, even when the CPU is the critical resource, the difference in performance between a data manager an...
Erasure Code Replication Revisited
- In PTP04: 4th International Conference on Peer-to-Peer Computing. IEEE, 2004
Cited by 18 (0 self)
Erasure coding is a technique for achieving high availability and reliability in storage and communication systems. In this paper, we revisit the analysis of erasure code replication and point out some situations when whole-file replication is preferred. The switchover point (from preferring whole-file replication to erasure code replication) is studied, and characterized using asymptotic analysis. We also discuss the additional considerations in building erasure code replication systems.
Naming and Integrity: Self-Verifying Data in Peer-to-Peer Systems
- In Proc. of FuDiCo, 2002
Cited by 14 (5 self)
Peer-to-peer systems are positioned to take advantage of gains in network bandwidth, storage capacity, and computational resources to provide long-term durable storage infrastructures. In this paper, we contribute a naming technique to allow an erasure encoded document to be self-verified by the client or any other component in the system.
RACS: A Case for Cloud Storage Diversity
Cited by 6 (0 self)
The increasing popularity of cloud storage is leading organizations to consider moving data out of their own data centers and into the cloud. However, success for cloud storage providers can present a significant risk to customers; namely, it becomes very expensive to switch storage providers. In this paper, we make a case for applying RAID-like techniques used by disks and file systems, but at the cloud storage level. We argue that striping user data across multiple providers can allow customers to avoid vendor lock-in, reduce the cost of switching providers, and better tolerate provider outages or failures. We introduce RACS, a proxy that transparently spreads the storage load over many providers. We evaluate a prototype of our system and estimate the costs incurred and benefits reaped. Finally, we use trace-driven simulations to demonstrate how RACS can reduce the cost of switching storage vendors for a large organization such as the Internet Archive by seven-fold or more by varying erasure-coding parameters.
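The RAID-like idea in this abstract can be illustrated with the simplest possible case: split an object into `k` shares, add one XOR parity share, and place each share with a different provider. This is a hedged sketch, not RACS's implementation; the function names are hypothetical, and a real system would use a more general erasure code and track the original object length separately.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def stripe_with_parity(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size shares (zero-padded) plus one XOR parity share."""
    share_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(share_len * k, b"\x00")
    shares = [padded[i * share_len:(i + 1) * share_len] for i in range(k)]
    return shares + [reduce(xor_bytes, shares)]


def recover_shares(shares: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing share (marked None) by XOR-ing the others."""
    missing = [i for i, s in enumerate(shares) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity tolerates only one missing share")
    rebuilt = list(shares)
    if missing:
        present = [s for s in shares if s is not None]
        rebuilt[missing[0]] = reduce(xor_bytes, present)
    return rebuilt


if __name__ == "__main__":
    obj = b"hello cloud storage"
    placed = stripe_with_parity(obj, k=3)   # one share per provider, 4 providers
    placed[1] = None                        # simulate one provider being unavailable
    restored = recover_shares(placed)
    print(b"".join(restored[:3])[:len(obj)] == obj)
```

With `k` data shares and one parity share the storage overhead is (k+1)/k, and the object survives the loss of any single provider.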
Using Remote Memory to Stabilise Data Efficiently on an EXT2 Linux File System
- 2002
Cited by 5 (0 self)
Stable storage is an important requirement for many applications. It is usually implemented over traditional file systems via synchronous write operations to disk.
Read Optimized File System Designs: A Performance Evaluation
- In Proceedings of the Seventh International Conference on Data Engineering, 1991
Cited by 1 (0 self)
This paper presents a performance comparison of several file system allocation policies. The file systems are designed to provide high bandwidth between disks and main memory by taking advantage of parallelism in an underlying disk array, catering to large units of transfer, and minimizing the bandwidth dedicated to the transfer of metadata. All of the file systems described use a multiblock allocation strategy which allows both large and small files to be allocated efficiently. Simulation results show that these multiblock policies result in systems that are able to utilize a large percentage of the underlying disk bandwidth; more than 90% in sequential cases. As general-purpose systems are called upon to support more data-intensive applications such as databases and supercomputing, these policies offer an opportunity to provide superior performance to a larger class of users.
1. Introduction
Most current file systems can be divided into two distinct categories: fixed block systems ...
Recovery of Commodity Multi-Site Email Clusters
- 2005
Cited by 1 (0 self)
Internet service providers need ways to improve the dependability of their email services. Existing single-site and multi-site, synchronously replicated message stores have low availability. Recovery of multi-site, asynchronously replicated stores is slow and inconsistent. This paper’s two-dimensional Markov analysis shows that the availability of an email service is fundamentally limited by site mean-time-to-failure (MTTF), not by the message store MTTF. Replication of the message store at two and three sites can improve the service MTTF by two and four orders of magnitude respectively. One order of magnitude improvement in the storage mean-time-to-repair for two- and three-site replication improves the service MTTF by one and two orders of magnitude respectively. Improvements in the MTTF of current storage technology will not significantly improve email service availability.
Keywords: email, availability, distributed hash tables, Markov analysis
Keywords Peer-to-peer · Distributed hash table · Redundancy · Replication · Erasure coding
- 2007
In order to provide high data availability in peer-to-peer (P2P) DHTs, proper data redundancy schemes are required. This paper compares two popular schemes: replication and erasure coding. Unlike previous comparisons, we take user download behavior into account. Furthermore, we propose a hybrid redundancy scheme, which shares user-downloaded files for subsequent accesses and utilizes erasure coding to adjust file availability. Comparison experiments of the three schemes show that replication saves more bandwidth than erasure coding, although it requires more storage space, when average node availability is higher than 47%; moreover, our hybrid scheme saves more maintenance bandwidth with an acceptable redundancy factor.

