RAID: High-Performance, Reliable Secondary Storage
- ACM Computing Surveys, 1994
- Cited by 282 (6 self)
Disk arrays were proposed in the 1980s as a way to use parallelism between multiple disks to improve aggregate I/O performance. Today they appear in the product lines of most major computer manufacturers. This paper gives a comprehensive overview of disk arrays and provides a framework in which to organize current and future work. The paper first introduces disk technology and reviews the driving forces that have popularized disk arrays: performance and reliability. It then discusses the two architectural techniques used in disk arrays: striping across multiple disks to improve performance and redundancy to improve reliability. Next, the paper describes seven disk array architectures, called RAID (Redundant Arrays of Inexpensive Disks) levels 0-6 and compares their performance, cost, and reliability. It goes on to discuss advanced research and implementation topics such as refining the basic RAID levels to improve performance and designing algorithms to maintain data consistency. Last, the paper describes six disk array prototypes or products and discusses future opportunities for research. The paper includes an annotated bibliography of disk array-related literature.
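The redundancy technique mentioned in this abstract can be illustrated with a small sketch. It assumes the simplest parity scheme, a single byte-wise XOR parity block per stripe, and is not taken from the paper; the helper name `xor_blocks` and the sample blocks are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe: three data blocks, each stored on a different (hypothetical) disk.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is the XOR of all data blocks in the stripe.
parity = xor_blocks(data_blocks)

# If the disk holding data_blocks[1] fails, XOR-ing the surviving
# data blocks with the parity block reproduces the lost block.
recovered = xor_blocks([data_blocks[0], data_blocks[2], parity])
```

Because XOR is its own inverse, the same function serves for both computing parity and rebuilding a single lost block. Tolerating more than one simultaneous failure needs a stronger code.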
A tutorial on Reed-Solomon coding for fault-tolerance in RAID-like systems
- Software – Practice & Experience, 1997
- Cited by 148 (26 self)
It is well known that Reed-Solomon codes may be used to provide error correction for multiple failures in RAID-like systems. The coding technique itself, however, is not as well known. To the coding theorist, this technique is a straightforward extension to a basic coding paradigm and needs no special mention. However, to the systems programmer with no training in coding theory, the technique may be a mystery. Currently, there are no references that describe how to perform this coding that do not assume that the reader is already well-versed in algebra and coding theory. This paper is intended for the systems programmer. It presents a complete specification of the coding algorithm plus details on how it may be implemented. This specification assumes no prior knowledge of algebra or coding theory. The goal of this paper is for a systems programmer to be able to implement Reed-Solomon coding for reliability in RAID-like systems without needing to consult any external references. Problem Specification: Let there be $n$ storage devices, $D_1, D_2, \ldots, D_n$, each of which holds $k$ bytes. These are called the "Data Devices." Let there be $m$ more storage devices ...
Semantically-Smart Disk Systems
- 2003
- Cited by 64 (14 self)
We propose and evaluate the concept of a semantically-smart disk system (SDS). As opposed to a traditional "smart" disk, an SDS has detailed knowledge of how the file system above is using the disk system, including information about the on-disk data structures of the file system. An SDS exploits this knowledge to transparently improve performance or enhance functionality beneath a standard block read/write interface. To automatically acquire this knowledge, we introduce a tool (EOF) that can discover file-system structure for certain types of file systems, and then show how an SDS can exploit this knowledge on-line to understand file-system behavior. We quantify the space and time overheads that are common in an SDS, showing that they are not excessive. We then study the issues surrounding SDS construction by designing and implementing a number of prototypes as case studies; each case study exploits knowledge of some aspect of the file system to implement powerful functionality beneath the standard SCSI interface. Overall, we find that a surprising amount of functionality can be embedded within an SDS, hinting at a future where disk manufacturers can compete on enhanced functionality and not simply cost-per-byte and performance.
Improving Storage System Availability with D-GRAID
- In Proceedings of the 3rd USENIX Symposium on File and Storage Technologies (FAST '04), 2004
- Cited by 57 (13 self)
We present the design, implementation, and evaluation of D-GRAID, a gracefully-degrading and quickly-recovering RAID storage array. D-GRAID ensures that most files within the file system remain available even when an unexpectedly high number of faults occur. D-GRAID also recovers from failures quickly, restoring only live file system data to a hot spare. Both graceful degradation and live-block recovery are implemented in a prototype SCSI-based storage system underneath unmodified file systems, demonstrating that powerful "file-system like" functionality can be implemented behind a narrow block-based interface.
Analysis of Methods for Scheduling Low Priority Disk Drive Tasks
- Proc. ACM SIGMETRICS, 2002
- Cited by 15 (1 self)
This paper analyzes various algorithms for scheduling low priority disk drive tasks. The derived closed form solution is applicable to a class of greedy algorithms that includes a variety of background disk scanning applications. By paying close attention to many characteristics of modern disk drives, the analytical solutions achieve very high accuracy: the difference between the predicted response times and the measurements on two different disks is only 3% for all but one examined workload. This paper also proves a theorem which shows that background tasks implemented by greedy algorithms can be accomplished with very little seek penalty. Using a greedy algorithm gives a 10% shorter response time for the foreground application requests and up to a 20% decrease in total background task run time compared to results from previously published techniques.
PRO: A popularity-based multi-threaded reconstruction optimization for RAID-structured storage systems
- In Proceedings of the 5th USENIX Conference on File and Storage Technologies. USENIX Association, 2007
- Cited by 13 (4 self)
This paper proposes and evaluates a novel dynamic data reconstruction optimization algorithm, called popularity-based multi-threaded reconstruction optimization (PRO), which allows the reconstruction process in a RAID-structured storage system to rebuild the frequently accessed areas prior to rebuilding infrequently accessed areas to exploit access locality. This approach has the salient advantage of simultaneously decreasing reconstruction time and alleviating user and system performance degradation. It can also be easily adopted in various conventional reconstruction approaches. In particular, we optimize the disk-oriented reconstruction (DOR) approach with PRO. The PRO-powered DOR is shown to induce a much earlier onset of response-time improvement and sustain a longer time span of such improvement than the original DOR. Our benchmark studies on read-only web workloads have shown that the PRO-powered DOR algorithm consistently outperforms the original DOR algorithm in the failure-recovery process in terms of user response time, with a 3.6% to 23.9% performance improvement and up to 44.7% reconstruction time improvement simultaneously.
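The core idea of rebuilding frequently accessed areas first can be sketched as a simple ordering step. This is only an illustration under stated assumptions: it uses a static sort over a fixed access log, whereas the abstract describes a dynamic, multi-threaded algorithm. The function name `rebuild_order` and the integer area identifiers are hypothetical.

```python
from collections import Counter

def rebuild_order(accesses, n_areas):
    """Order area indices 0..n_areas-1 so the most accessed are rebuilt first.

    `accesses` is a sequence of area indices observed in user requests.
    Ties are broken by area index so the order is deterministic.
    """
    counts = Counter(accesses)
    return sorted(range(n_areas), key=lambda area: (-counts[area], area))

# Area 2 was accessed three times, area 0 twice, area 3 once, area 1 never.
order = rebuild_order([2, 2, 0, 2, 0, 3], n_areas=4)
```

A reconstruction loop would then walk `order` instead of scanning the failed disk sequentially, so that requests to popular areas stop hitting degraded-mode reads sooner.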
Architecture and Algorithms for Scalable Wide-area Information Systems
- Dissertation, University of ..., 1998
- Cited by 5 (0 self)
I owe a debt of gratitude to many who, in varying but substantial ways, have contributed in making this endeavor possible. The journey may have seemed long and daunting at first, but with the guidance and support that I was privileged to receive, I leave with only the fondest of memories. Foremost, I would like to thank my advisor, Prof. Harrick Vin, for his guidance, patience, and most of all, his example. His contributions to this work are too many to enumerate. His constant striving for perfection and his emphasis on detail have been invaluable and inspiring. A large part of what I have learnt in graduate school is due to him. My sincere thanks to Prof. J. C. Browne, Prof. Mike Dahlin, Prof. Don Fussell, Prof. Al Mok, and Dr. Dan Dias for being on my doctoral committee. They have provided insights and ideas that have contributed greatly to my research. Dr. Dan Dias, despite his many responsibilities, found time to guide my research while I was at IBM. His cheerful spirit, constant encouragement, and keen perception made even the “real world” seem accessible. Prof. Mike Dahlin played a crucial role in shaping and guiding my work on distributed caching. He not only influenced my approach to problem solving, but also brought attention to zero-errors in
Fault Tolerance Issues in Data Declustering for Parallel Database Systems
- Bulletin of the Technical Committee on Data Engineering, 1994
- Cited by 4 (0 self)
Maintaining the integrity of data and its accessibility are crucial tasks in database systems. Although each component in the storage hierarchy can be fairly reliable, a large collection of such components is prone to failure; this is especially true of the secondary storage system, which normally contains a large number of magnetic disks. In designing a fault-tolerant secondary storage system, one should keep in mind that failures, although potentially devastating, are expected to occur fairly infrequently; hence, it is important to provide reliability techniques that do not (significantly) hinder the system's performance during normal operation. Furthermore, it is desirable to maintain a reasonable level of performance under failure as well. Since high degrees of reliability are traditionally achieved through the use of duplicate components and redundant information, it is also reasonable to use these redundancies in improving the system's performance during normal operation. In this ...

