Deconstructing Commodity Storage Clusters
- In Proceedings of the 32nd Annual International Symposium on Computer Architecture (ISCA ’05)
, 2005
"... The traditional approach for characterizing complex systems is to run standard workloads and measure the resulting performance as seen by the end user. However, unique opportunities exist when characterizing a system that is itself constructed from standardized components: one can also look inside t ..."
Cited by 25 (8 self)
The traditional approach for characterizing complex systems is to run standard workloads and measure the resulting performance as seen by the end user. However, unique opportunities exist when characterizing a system that is itself constructed from standardized components: one can also look inside the system itself by instrumenting each of the components. In this paper, we show how intra-box instrumentation can help one understand the behavior of a large-scale storage cluster, the EMC Centera. In our analysis, we leverage standard tools for tracing both the disk and network traffic emanating from each node of the cluster. By correlating this traffic with the running workload, we are able to infer the structure of the software system (e.g., its write update protocol) as well as its policies (e.g., how it performs caching, replication, and load-balancing). Further, by imposing variable intra-box delays on network and disk traffic, we can confirm the causal relationships between network and disk events. Thus, we are able to infer the semantics of the messages between nodes without examining a single line of source code.
An efficient coding scheme for correcting triple storage node failures
- in FAST-2005: 4th Usenix Conference on File and Storage Technologies
, 2005
"... Proper data placement schemes based on erasure correcting code are one of the most important components for a highly available data storage system. For such schemes, low decoding complexity for correcting (or recovering) storage node failures is essential for practical systems. In this paper, we des ..."
Cited by 11 (0 self)
Proper data placement schemes based on erasure-correcting codes are one of the most important components of a highly available data storage system. For such schemes, low decoding complexity for correcting (or recovering from) storage node failures is essential for practical systems. In this paper, we describe a new coding scheme, which we call the STAR code, for correcting triple storage node failures (erasures). The STAR code is an extension of the double-erasure-correcting EVENODD code, and a modification of the generalized triple-erasure-correcting EVENODD code. The STAR code is an MDS code, and thus is optimal in terms of node failure recovery capability for a given data redundancy. We provide detailed decoding algorithms for the STAR code for correcting various triple node failures. We show that the decoding complexity of the STAR code is much lower than that of existing comparable codes; thus the STAR code is practically meaningful for storage systems that need higher reliability.
Highly Available Distributed Storage Systems
, 1998
"... ion from Experiments. The data service time T depends on many factors in a practical server system, such as computing power (i.e., CPU speed) of the servers and the client, local disk I/O speed of the servers and bandwidth and latency of the communication medium (usually including a reliable communi ..."
Cited by 7 (0 self)
ion from Experiments. The data service time T depends on many factors in a practical server system, such as the computing power (i.e., CPU speed) of the servers and the client, the local disk I/O speed of the servers, and the bandwidth and latency of the communication medium (usually including a reliable communication software layer) connecting the servers and the client. A model considering all the factors will be fairly complex. In this section, we will try to model the data service time as a simple probability distribution that can be analyzed rather easily and yet can approximate the real data service time closely. Such a model will be abstracted from experimental results of a real data server system. Our experimental server system consists of several servers, which are PCs running Linux. Each server has data stored on its local hard disk. Data is accessed via the Linux file system. The client is also a PC running the same Linux. The nodes are connected via Myrinet switches. A sliding window ...
DiskReduce: RAID for Data-Intensive Scalable Computing
"... Data-intensive file systems, developed for Internet services and popular in cloud computing, provide high reliability and availability by replicating data, typically three copies of everything. Alternatively high performance computing, which has comparable scale, and smaller scale enterprise storage ..."
Cited by 4 (0 self)
Data-intensive file systems, developed for Internet services and popular in cloud computing, provide high reliability and availability by replicating data, typically three copies of everything. Alternatively, high performance computing, which has comparable scale, and smaller scale enterprise storage systems get similar tolerance for multiple failures from lower overhead erasure encoding, or RAID, organizations. DiskReduce is a modification of the Hadoop distributed file system (HDFS) enabling asynchronous compression of initially triplicated data down to RAID-class redundancy overheads. In addition to increasing a cluster’s storage capacity as seen by its users by up to a factor of three, DiskReduce can delay encoding long enough to deliver the performance benefits of multiple data copies.
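The capacity claim in the abstract above can be checked with a little arithmetic. The sketch below compares raw storage consumed per byte of user data under triplication and under a parity group; the group size of 8 data blocks plus 2 parity blocks is a hypothetical RAID 6-like layout chosen for illustration, not a configuration taken from the paper.

```python
from fractions import Fraction

def storage_overhead(data_blocks: int, redundant_blocks: int) -> Fraction:
    """Raw blocks stored per block of user data."""
    return Fraction(data_blocks + redundant_blocks, data_blocks)

def capacity_gain(before: Fraction, after: Fraction) -> Fraction:
    """Factor by which user-visible capacity grows when overhead drops."""
    return before / after

# Triplication: one original block plus two extra copies.
triplication = storage_overhead(1, 2)

# Hypothetical RAID 6-like group: 8 data blocks protected by 2 parity blocks.
raid6_like = storage_overhead(8, 2)

gain = capacity_gain(triplication, raid6_like)
print(triplication, raid6_like, gain)
```

Under these assumed parameters the gain is 12/5, or 2.4 times. The gain approaches three only as the parity group grows and the encoded overhead approaches one, which is consistent with the abstract's wording "up to a factor of three".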
A Consistent History Link Connectivity Protocol
- Proceedings of the International Parallel Processing Symposium
, 1999
"... The RAIN (Reliable Array of Independent Nodes) project at Caltech is focusing on creating highly reliable distributed systems by leveraging commercially available personal computers, workstations and interconnect technologies. In particular, the issue of reliable communication is addressed by int ..."
Cited by 3 (3 self)
The RAIN (Reliable Array of Independent Nodes) project at Caltech is focusing on creating highly reliable distributed systems by leveraging commercially available personal computers, workstations and interconnect technologies. In particular, the issue of reliable communication is addressed by introducing redundancy in the form of multiple network interfaces per compute node. When using compute nodes with multiple network connections, the question of how to determine connectivity between nodes arises. We examine a connectivity protocol that guarantees that each side of a point-to-point connection sees the same history of activity over the communication channel. In other words, we maintain a consistent history of the state of the communication channel. At any given moment in time the histories as seen by each side are guaranteed to be identical to within some number of transitions. This bound on how much one side may lead or lag the other is the slack. Our main contributions ar...
Diversity coloring for distributed data storage in networks. manuscript
- IEEE Transactions on Information Theory
, 2003
"... Abstract — Distributively storing widely shared files with redundancy is an important technique for high performance and fault-tolerance in information networks. This paper proposes a new file storage scheme for encoded files, aiming at satisfying highly varied QoS requirements and guaranteeing grac ..."
Cited by 3 (3 self)
Distributively storing widely shared files with redundancy is an important technique for high performance and fault-tolerance in information networks. This paper proposes a new file storage scheme for encoded files, aiming at satisfying highly varied QoS requirements and guaranteeing graceful performance degradation when some data become inaccessible. In the scheme, every node can get a file by accessing data in its proximity of a bounded radius, and the variety of data in the proximity increases steadily when the proximity’s radius increases. We formulate the file storage scheme as a new graph coloring problem which we call diversity coloring. The diversity coloring problem is shown to be NP-hard for general graphs. An algorithm using a K-interleaving technique for obtaining diversity coloring on tree networks is presented. The algorithm is of low complexity and is guaranteed to output solutions that minimize the length of the codeword representing the file and also minimize the delay for any node to retrieve any amount of distinct data. Various other aspects of the algorithm, as well as properties of diversity coloring on trees and more general graphs, are also studied. Index Terms: Distributed networks, diversity coloring, file assignment, K-interleaving, tree.
An efficient XOR-Scheduling algorithm for erasure codes encoding
- in DSN-2009: The International Conference on Dependable Systems and Networks
, 2009
"... In large storage systems, it is crucial to protect data from loss due to failures. Erasure codes lay the foundation of this protection, enabling systems to reconstruct lost data when components fail. Erasure codes can however impose significant performance overhead in two core operations: Encoding, ..."
Cited by 3 (1 self)
In large storage systems, it is crucial to protect data from loss due to failures. Erasure codes lay the foundation of this protection, enabling systems to reconstruct lost data when components fail. Erasure codes can, however, impose significant performance overhead in two core operations: encoding, where coding information is calculated from newly written data, and decoding, where data is reconstructed after failures. This paper focuses on improving the performance of encoding, the more frequent operation. It does so by scheduling the operations of XOR-based erasure codes to optimize their use of cache memory. We call the technique XOR-scheduling and demonstrate how it applies to a wide variety of existing erasure codes. We conduct a performance evaluation of scheduling these codes on a variety of processors and show that XOR-scheduling significantly improves upon the traditional approach. Hence, we believe that XOR-scheduling has great potential to have wide impact in storage systems.
Diversity Coloring for Distributed Storage in Mobile Networks
, 2001
"... Abstract: Storing multiple copies of files is crucial for ensuring quality of service for data storage in mobile networks. This paper proposes a new scheme, called the K-out-of-N file distribution scheme, for the placement of files. In this scheme files are splitted, and Reed-Solomon codes or other ..."
Cited by 2 (0 self)
Storing multiple copies of files is crucial for ensuring quality of service for data storage in mobile networks. This paper proposes a new scheme, called the K-out-of-N file distribution scheme, for the placement of files. In this scheme files are split, and Reed-Solomon codes or other maximum distance separable (MDS) codes are used to produce file segments containing parity information. Multiple copies of the file segments are stored on gateways in the network in such a way that every gateway can retrieve enough file segments from itself and its neighbors within a certain number of hops to reconstruct the original files. The goal is to minimize the maximum number of hops it takes for any gateway to get enough file segments for the file reconstruction. We formulate the K-out-of-N file distribution scheme as a coloring problem we call diversity coloring. A diversity coloring is defined to be optimal if it uses the smallest number of colors. Upper and lower bounds on the performance of diversity coloring for general graphs are studied. Diversity coloring algorithms for several special classes of graphs (trees, rings and tori) are presented, all of which have linear time complexity. Both the algorithm for trees and the algorithm for rings output optimal diversity colorings. The algorithm for tori is guaranteed to output an optimal diversity coloring when the sizes of the tori are sufficiently large. Index Terms: Data storage, diversity coloring, file assignment problem (FAP), graph coloring, K-out-of-N scheme, maximum distance separable (MDS) codes, mobile computing, Quality of Service
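The objective described in the abstract above (how many hops a gateway needs to collect K distinct file segments) can be illustrated with a small checker. This is only a sketch for evaluating a given placement, not the paper's coloring algorithm; the function name, the adjacency-dictionary graph representation, and the simplification that each node stores exactly one segment (its "color") are assumptions made for illustration.

```python
def min_radius_for_k_colors(adj, color, k):
    """For each node, return the smallest hop radius within which at least
    k distinct colors (file segments) are reachable, or None if the node
    can never reach k distinct colors."""
    result = {}
    for start in adj:
        seen_nodes = {start}
        seen_colors = {color[start]}
        frontier = [start]
        radius = 0
        # Breadth-first expansion, one hop at a time.
        while len(seen_colors) < k and frontier:
            next_frontier = []
            for u in frontier:
                for v in adj[u]:
                    if v not in seen_nodes:
                        seen_nodes.add(v)
                        seen_colors.add(color[v])
                        next_frontier.append(v)
            frontier = next_frontier
            if frontier:
                radius += 1
        result[start] = radius if len(seen_colors) >= k else None
    return result

# A ring of six gateways; N = 3 segments, any K = 3 needed to reconstruct.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
interleaved = {0: 0, 1: 1, 2: 2, 3: 0, 4: 1, 5: 2}
clustered = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}

worst_interleaved = max(min_radius_for_k_colors(ring, interleaved, 3).values())
worst_clustered = max(min_radius_for_k_colors(ring, clustered, 3).values())
print(worst_interleaved, worst_clustered)
```

On this toy ring the interleaved placement lets every gateway reconstruct within one hop, while the clustered placement needs two hops, which is the kind of difference the minimization goal in the abstract is concerned with.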
Data Consistent Up- and Downstreaming in a Distributed Storage System
"... Distribution of large data objects among several storage servers is a common technique to speed up access rates. In combination with parity schemes, failures of single server nodes can be tolerated, so that such systems reach a certain degree of fault tolerance. In this paper such a distributed serv ..."
Cited by 2 (2 self)
Distribution of large data objects among several storage servers is a common technique to speed up access rates. In combination with parity schemes, failures of single server nodes can be tolerated, so that such systems reach a certain degree of fault tolerance. In this paper such a distributed server system is analyzed. Data objects are stored in a data layout according to RAID level 3 among disk subsystems of different computers. An access control provides concurrent up- and down-streaming of data objects to/from the distributed storage system with ensured data consistency. This consistency control is described in combination with the handling of faulty server nodes and faulty clients. Furthermore, performance is measured with several access patterns. An application of that technique is, for instance, a distributed video server, allowing permanent updates without interrupting access.
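The RAID level 3 layout mentioned in the abstract above combines byte-level striping with a dedicated parity strip. The following is a minimal sketch of that idea, assuming one strip per server and a single XOR parity strip; the function names and the zero-padding choice are illustrative assumptions and do not describe the paper's implementation or its consistency control.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def stripe_with_parity(data: bytes, n_data: int):
    """Interleave data bytewise over n_data strips (zero-padded to equal
    length) and append one XOR parity strip."""
    strip_len = -(-len(data) // n_data)  # ceiling division
    padded = data.ljust(strip_len * n_data, b"\0")
    strips = [padded[i::n_data] for i in range(n_data)]
    return strips + [xor_blocks(strips)]

def recover(strips, lost_index):
    """Rebuild the strip at lost_index from all surviving strips."""
    survivors = [s for i, s in enumerate(strips) if i != lost_index]
    return xor_blocks(survivors)

data = b"distributed video object"
strips = stripe_with_parity(data, 4)  # 4 data servers + 1 parity server

# Any single failed server (data or parity) can be rebuilt from the others.
rebuilt = [recover(strips, i) for i in range(len(strips))]
print(rebuilt == strips)
```

Because the parity strip is the XOR of all data strips, XOR-ing any four of the five strips yields the fifth, which is why the failure of a single server node can be tolerated.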
Reliable and Secure Distributed Storage Using Erasure Codes
"... Large-scale enterprise data is increasingly organized in the form of distributed storage. The three important issues that arise when moving to distributed storage are reliability (data should survive the failure of a small number of disks), security (only authorized clients should access storage) an ..."
Cited by 1 (0 self)
Large-scale enterprise data is increasingly organized in the form of distributed storage. The three important issues that arise when moving to distributed storage are reliability (data should survive the failure of a small number of disks), security (only authorized clients should access storage) and concurrency (multiple clients should be able to access storage simultaneously).

