RAID: High-Performance, Reliable Secondary Storage
ACM COMPUTING SURVEYS, 1994
Cited by 281 (6 self)
Disk arrays were proposed in the 1980s as a way to use parallelism between multiple disks to improve aggregate I/O performance. Today they appear in the product lines of most major computer manufacturers. This paper gives a comprehensive overview of disk arrays and provides a framework in which to organize current and future work. The paper first introduces disk technology and reviews the driving forces that have popularized disk arrays: performance and reliability. It then discusses the two architectural techniques used in disk arrays: striping across multiple disks to improve performance and redundancy to improve reliability. Next, the paper describes seven disk array architectures, called RAID (Redundant Arrays of Inexpensive Disks) levels 0-6 and compares their performance, cost, and reliability. It goes on to discuss advanced research and implementation topics such as refining the basic RAID levels to improve performance and designing algorithms to maintain data consistency. Last, the paper describes six disk array prototypes or products and discusses future opportunities for research. The paper includes an annotated bibliography of disk array-related literature.
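The redundancy technique mentioned in the abstract can be illustrated with a small sketch. It assumes the simplest parity scheme, one XOR parity block per stripe, and the helper names (`xor_blocks`, `make_stripe`, `reconstruct`) are hypothetical, not taken from the paper.

```python
def xor_blocks(blocks):
    """Return the byte-wise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def make_stripe(data_blocks):
    """Append one parity block to the data blocks of a stripe."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def reconstruct(stripe, failed_index):
    """Rebuild the block at failed_index by XOR-ing all surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)


# Three data "disks" plus one parity "disk" (illustrative contents).
stripe = make_stripe([b"AAAA", b"BBBB", b"CCCC"])
recovered = reconstruct(stripe, failed_index=1)
```

Because XOR is its own inverse, any single lost block, data or parity, can be rebuilt from the remaining ones; tolerating two simultaneous failures needs a stronger code.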
The Gamma database machine project
IEEE Transactions on Knowledge and Data Engineering, 1990
Cited by 203 (27 self)
This paper describes the design of the Gamma database machine and the techniques employed in its implementation. Gamma is a relational database machine currently operating on an Intel iPSC/2 hypercube with 32 processors and 32 disk drives. Gamma employs three key technical ideas which enable the architecture to be scaled to hundreds of processors. First, all relations are horizontally partitioned across multiple disk drives, enabling relations to be scanned in parallel. Second, novel parallel algorithms based on hashing are used to implement the complex relational operators such as join and aggregate functions. Third, dataflow scheduling techniques are used to coordinate multioperator queries. By using these techniques it is possible to control the execution of very complex queries with minimal coordination, a necessity for configurations involving a very large number of processors. In addition to describing the design of the Gamma software, a thorough performance evaluation of the iPSC/2 hypercube version of Gamma is also presented. In addition to measuring the effect of relation size and indices on the response time for selection, join, aggregation, and update queries, we also analyze the performance of Gamma relative to the number of processors employed when the sizes of the input relations are kept constant (speedup) and when the sizes of the input relations are increased proportionally to the number of processors (scaleup). The speedup results obtained for both selection and join queries are linear; thus, doubling the number of processors
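The speedup and scaleup metrics named in this abstract can be written down as a small sketch. The function names and the timing values are hypothetical illustrations, not measurements from the paper.

```python
def speedup(time_small_system, time_large_system):
    """Same problem size on both systems; linear speedup with twice
    the processors corresponds to a value of 2.0."""
    return time_small_system / time_large_system


def scaleup(time_small_problem_small_system, time_large_problem_large_system):
    """Problem size grows in proportion to the system; a value of 1.0
    means response time stayed constant (ideal scaleup)."""
    return time_small_problem_small_system / time_large_problem_large_system


# Hypothetical response times in seconds, chosen only for illustration.
observed_speedup = speedup(120.0, 60.0)
observed_scaleup = scaleup(100.0, 100.0)
```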
Main memory database systems: An overview
IEEE Transactions on Knowledge and Data Engineering, 1992
Cited by 155 (2 self)
Memory resident database systems (MMDB’s) store their data in main physical memory and provide very high-speed access. Conventional database systems are optimized for the particular characteristics of disk storage mechanisms. Memory resident systems, on the other hand, use different optimizations to structure and organize data, as well as to make it reliable. This paper surveys the major memory residence optimizations and briefly discusses some of the memory resident systems that have been designed or implemented. Index Terms: Access methods, application programming interface, commit processing, concurrency control, data clustering, data representation, main memory database system (MMDB), query processing, recovery. Invited Paper.
Disk Shadowing
In Proc. of the Fourteenth International Conference on Very Large Data Bases (Los, 1988
Cited by 137 (5 self)
Disk shadowing is a technique for maintaining a set of two or more identical disk images on separate disk devices. Its primary purpose is to enhance reliability and availability of secondary storage by providing multiple paths to redundant data. However, shadowing can also boost I/O performance. In this paper, we contend that intelligent device scheduling of shadowed disks increases the I/O rate, by allowing parallel reads and by substantially reducing the average seek time for random reads. In particular, we develop an analytic model which shows that the seek time for a random read in a shadow set is a monotonically decreasing function of the number of disks in the set.
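The claim that seek cost falls as disks are added to a shadow set can be checked with a rough Monte Carlo sketch. This is not the paper's analytic model: it assumes arm positions and read targets are independent and uniform on a normalized [0, 1] range, that the nearest arm serves each read, and that seek distance stands in for seek time. The function name is hypothetical.

```python
import random


def mean_seek_distance(num_disks, trials=20000, seed=0):
    """Estimate the mean distance from a random target to the nearest
    of num_disks independently positioned arms (simplifying assumption)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        target = rng.random()
        arms = [rng.random() for _ in range(num_disks)]
        total += min(abs(arm - target) for arm in arms)
    return total / trials


# Estimates for shadow sets of one to four disks.
estimates = [mean_seek_distance(k) for k in (1, 2, 3, 4)]
```

Under these assumptions the single-disk estimate sits near one third of the full stroke, and each additional disk lowers it further.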
File server scaling with network-attached secure disks
In Proceedings of the 1997 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 1997
Cited by 129 (10 self)
By providing direct data transfer between storage and client, network-attached storage devices have the potential to improve scalability for existing distributed file systems (by removing the server as a bottleneck) and bandwidth for new parallel and distributed file systems (through network striping and more efficient data paths). Together, these advantages influence a large enough fraction of the storage market to make commodity network-attached storage feasible. Realizing the technology’s full potential requires careful consideration across a wide range of file system, networking and security issues. This paper contrasts two network-attached storage architectures: (1) Networked SCSI disks (NetSCSI) are network-attached storage devices with minimal changes from the familiar SCSI interface, while (2) Network-Attached Secure Disks (NASD) are drives that support independent client access to drive object services. To estimate the potential performance benefits of these architectures, we develop an analytic model and perform trace-driven replay experiments based on AFS and NFS traces. Our results suggest that NetSCSI can reduce file server load during a burst of NFS or AFS activity by about 30%. With the NASD architecture, server load (during burst activity) can be reduced by a factor of up to five for AFS and up to ten for NFS.
Beating the I/O Bottleneck: A Case for Log-Structured File Systems
Operating Systems Review, 1988
Cited by 128 (2 self)
CPU speeds are improving at a dramatic rate, while disk speeds are not. This technology shift suggests that many engineering and office applications may become so I/O-limited that they cannot benefit from further CPU improvements. This paper discusses several techniques for improving I/O performance, including caches, battery-backed-up caches, and cache logging. We then examine in particular detail an approach called log-structured file systems, where the file system's only representation on disk is in the form of an append-only log. Log-structured file systems potentially provide order-of-magnitude improvements in write performance. When log-structured file systems are combined with arrays of small disks (which provide high bandwidth) and large main-memory file caches (which satisfy most read accesses), we believe it will be possible to achieve 1000-fold improvements in I/O performance over today's systems.
The work described here was supported in part b...
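The append-only idea behind log-structured file systems can be sketched with a toy in-memory store. This is a loose illustration only: the class `LogStructuredStore` and its in-memory index are hypothetical, and a real design would also need on-disk index structures and log cleaning.

```python
class LogStructuredStore:
    """Toy store where every write is appended to a single log."""

    def __init__(self):
        self.log = bytearray()  # stands in for the on-disk log
        self.index = {}         # name -> (offset, length) of latest version

    def write(self, name, data):
        # Never overwrite in place: append, then repoint the index.
        self.index[name] = (len(self.log), len(data))
        self.log += data

    def read(self, name):
        offset, length = self.index[name]
        return bytes(self.log[offset:offset + length])


store = LogStructuredStore()
store.write("a", b"old")
store.write("a", b"new")  # the old bytes stay in the log as garbage
latest = store.read("a")
```

All writes land sequentially at the tail of the log, which is where the write-performance argument comes from; the cost is that stale versions accumulate until they are reclaimed.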
Maximizing Performance in a Striped Disk Array
1990
Cited by 125 (10 self)
Improvements in disk speeds have not kept up with improvements in processor and memory speeds. One way to correct the resulting speed mismatch is to stripe data across many disks. In this paper, we address how to stripe data to get maximum performance from the disks. Specifically, we examine how to choose the striping unit, i.e., the amount of logically contiguous data on each disk. We synthesize rules for determining the best striping unit for a given range of workloads. We show how the choice of striping unit depends on only two parameters: 1) the number of outstanding requests in the disk system at any given time, and 2) the average positioning time times the data transfer rate of the disks. We derive an equation for the optimal striping unit as a function of these two parameters; we also show how to choose the striping unit without prior knowledge about the workload.
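To make the notion of a striping unit concrete, here is a minimal sketch of round-robin striping. It shows only the address mapping, not the paper's equation for the optimal unit; the function names and the sizes used are assumptions for illustration.

```python
def locate(logical_offset, striping_unit, num_disks):
    """Map a logical byte offset to (disk, byte offset on that disk)
    under round-robin striping."""
    unit_index = logical_offset // striping_unit
    disk = unit_index % num_disks
    row = unit_index // num_disks
    return disk, row * striping_unit + logical_offset % striping_unit


def disks_touched(offset, length, striping_unit, num_disks):
    """Return the sorted set of disks a contiguous request spans."""
    first = offset // striping_unit
    last = (offset + length - 1) // striping_unit
    return sorted({unit % num_disks for unit in range(first, last + 1)})


# An 8 KiB request on four disks, with a small and a large striping unit.
small_unit = disks_touched(0, 8192, 4096, 4)
large_unit = disks_touched(0, 8192, 65536, 4)
```

A small unit spreads one request over several disks, which helps transfer rate for that request, while a large unit keeps it on one disk and leaves the others free for concurrent requests. That trade-off is why the number of outstanding requests matters to the choice.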
I/O issues in a multimedia system
IEEE Computer, 1994
Cited by 118 (4 self)
In this paper, we look at the various I/O issues in a multimedia system. In a multimedia server, the disk requests may have constant data rate requirements and need guaranteed service. We study the impact of the real-time nature of the I/O requests on the various components of the I/O system. We study the impact of disk scheduling algorithms on the performance of a multimedia system. We investigate the impact of buffer space on the maximum number of video streams that can be supported. We show that by making the deadlines larger than the request periods, a larger number of streams can be supported. We also show how deadline extension helps in utilizing multiple disks on a single SCSI bus.
Introduction: Current computer systems put emphasis on processor performance with little attention paid to the I/O system. There have been several recent studies on improving the I/O system performance [1, 2, 3, 4]. These studies concentrate on improving the I/O throughput. Future I/O systems will ...
Disk Scheduling in a Multimedia I/O system
In Proceedings of ACM Multimedia'93, 1993
Cited by 114 (0 self)
This article provides a retrospective of our original paper by the same title in the Proceedings of the First ACM Conference on Multimedia, published in 1993. This article examines the problem of disk scheduling in a multimedia I/O system. In a multimedia server, the disk requests may have constant data rate requirements and need guaranteed service. We propose a new scheduling algorithm, SCAN-EDF, that combines the features of SCAN type of seek optimizing algorithm with an Earliest Deadline First (EDF) type of real-time scheduling algorithm. We compare SCAN-EDF with other scheduling strategies and show that SCAN-EDF combines the best features of both SCAN and EDF. We also investigate the impact of buffer space on the maximum number of video streams that can be supported. We show that by making the deadlines larger than the request periods, a larger number of streams can be supported. We also describe how we extended the SCAN-EDF algorithm in the PRISM multimedia architecture. PRISM is an integrated multimedia server, designed to satisfy the QOS requirements of multiple classes of requests. Our experience in implementing the extended SCAN-EDF algorithm in a generic operating system is discussed and performance metrics and results are presented to illustrate how the SCAN-EDF extensions and implementation strategies have succeeded in meeting the QOS requirements of different classes of requests.
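The combination of EDF and SCAN described above can be sketched as follows. This is a simplified reading, not the authors' implementation: requests are served in deadline order, and requests sharing a deadline are served in one elevator sweep starting from the current head position. The function name and the `(deadline, track)` tuple layout are hypothetical.

```python
from itertools import groupby


def scan_edf_order(requests, head):
    """Order (deadline, track) requests: earliest deadline first,
    with SCAN order among requests that share a deadline."""
    order = []
    by_deadline = sorted(requests)  # sorts by deadline, then track
    for _, group in groupby(by_deadline, key=lambda r: r[0]):
        batch = list(group)
        # Sweep upward from the head, then back down for the rest.
        ahead = sorted((r for r in batch if r[1] >= head), key=lambda r: r[1])
        behind = sorted((r for r in batch if r[1] < head),
                        key=lambda r: r[1], reverse=True)
        order.extend(ahead + behind)
        head = order[-1][1]  # head rests at the last track served
    return order


schedule = scan_edf_order([(10, 50), (10, 20), (5, 90), (10, 70)], head=40)
```

The request with deadline 5 is served first even though it is far from the head; the three requests with deadline 10 are then served in a single downward sweep, which is the seek optimization that plain EDF lacks.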
Chained Declustering: A New Availability Strategy for Multiprocessor Database
IN PROCEEDINGS OF 6TH INTERNATIONAL DATA ENGINEERING CONFERENCE, 1990
Cited by 112 (6 self)
This paper presents a new strategy for increasing the availability of data in multi-processor, shared-nothing database machines. This technique, termed chained declustering, is demonstrated to provide superior performance in the event of failures while maintaining a very high degree of data availability. Furthermore, unlike most earlier replication strategies, the implementation of chained declustering requires no special hardware and only minimal modifications to existing software.
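A minimal sketch of the placement idea follows. It assumes the commonly described layout in which the primary copy of fragment i lives on node i and its backup on the next node in the chain; the abstract itself does not spell this out, and the function names are hypothetical.

```python
def chained_placement(num_nodes):
    """Map each fragment to (primary node, backup node), where the
    backup sits on the next node in the chain (assumed layout)."""
    return {frag: (frag % num_nodes, (frag + 1) % num_nodes)
            for frag in range(num_nodes)}


def surviving_copies(placement, failed_nodes):
    """List the nodes that still hold each fragment after failures."""
    return {frag: [node for node in nodes if node not in failed_nodes]
            for frag, nodes in placement.items()}


placement = chained_placement(4)
after_one_failure = surviving_copies(placement, {2})
after_adjacent_failures = surviving_copies(placement, {1, 2})
```

With this layout any single node failure leaves every fragment reachable, and data becomes unavailable only when two adjacent nodes in the chain fail together.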

