UNIX Disk Access Patterns
, 1993
Cited by 242 (20 self)
Disk access patterns are becoming ever more important to understand as the gap between processor and disk performance increases. The study presented here is a detailed characterization of every low-level disk access generated by three quite different systems over a two month period. The contributions of this paper are the detailed information we provide about the disk accesses on these systems (many of our results are significantly different from those reported in the literature, which provide summary data only for file-level access on small-memory systems); and the analysis of a set of optimizations that could be applied at the disk level to improve performance. Our traces show that the majority of all operations are writes; disk accesses are rarely sequential; 25-50% of all accesses are asynchronous; only 13-41% of accesses are to user data (the rest result from swapping, metadata, and program execution); and I/O activity is very bursty: mean request queue lengths seen by an incoming request range from 1.7 to 8.9 (1.2-1.9 for reads, 2.0-14.8 for writes), while we saw 95th percentile queue lengths as large as 89 entries, and maxima of over 1000. Using a simulator to analyze the effect of write caching at the disk level, we found that using a small non-volatile cache at each disk allowed writes to be serviced considerably faster than with a regular disk. In particular, short bursts of writes go much faster, and such bursts are common: writes rarely come singly. Adding even 8KB of non-volatile memory per disk could reduce disk traffic by 10-18%, and 90% of metadata write traffic can be absorbed with as little as 0.2MB per disk of non-volatile RAM. Even 128KB of NVRAM cache in each disk can improve write performance by as much as a factor of three. FCFS scheduling...
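As a rough illustration of why a small non-volatile write cache can cut disk traffic, the sketch below replays a list of block writes through a write-back cache and counts how many writes actually reach the disk. It is not the simulator used in the study: the function name `disk_writes_with_nvram`, the oldest-first flush policy, and the block-granularity trace are all assumptions made for this example.

```python
from collections import OrderedDict


def disk_writes_with_nvram(write_trace, cache_blocks):
    """Count writes reaching the disk behind a write-back cache.

    Hypothetical model: `write_trace` is a sequence of block numbers,
    and `cache_blocks` is how many dirty blocks the cache can hold.
    """
    cache = OrderedDict()  # dirty blocks, least recently written first
    disk_writes = 0
    for block in write_trace:
        if block in cache:
            # Overwrite of a block still held in the cache: absorbed,
            # so no extra disk write is needed for the older copy.
            cache.move_to_end(block)
            continue
        if cache_blocks == 0:
            # No cache at all: every write goes straight to the disk.
            disk_writes += 1
            continue
        if len(cache) >= cache_blocks:
            # Cache full: flush the least recently written dirty block.
            cache.popitem(last=False)
            disk_writes += 1
        cache[block] = True
    # Blocks still dirty at the end of the trace must be flushed too.
    return disk_writes + len(cache)


if __name__ == "__main__":
    trace = [1, 2, 1, 1, 3, 2, 4, 1]  # made-up trace with repeated blocks
    for size in (0, 2, 4):
        print(size, disk_writes_with_nvram(trace, size))
```

Comparing the count for `cache_blocks=0` with the count for a small non-zero size on the same trace shows how many overwrites the cache absorbs; the benefit depends entirely on how often a burst rewrites the same blocks.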
The HP AutoRAID hierarchical storage system
- ACM Transactions on Computer Systems
, 1995
Cited by 218 (14 self)
Configuring redundant disk arrays is a black art. To configure an array properly, a system administrator must understand the details of both the array and the workload it will support. Incorrect understanding of either, or changes in the workload over time, can lead to poor performance. We present a solution to this problem: a two-level storage hierarchy implemented inside a single disk-array controller. In the upper level of this hierarchy, two copies of active data are stored to provide full redundancy and excellent performance. In the lower level, RAID 5 parity protection is used to provide excellent storage cost for inactive data, at somewhat lower performance. The technology we describe in this paper, known as HP AutoRAID, automatically and transparently manages migration of data blocks between these two levels as access patterns change. The result is a fully redundant storage system that is extremely easy to use, is suitable for a wide variety of workloads, is largely insensitive to dynamic workload changes, and performs much better than disk arrays with comparable numbers of spindles and much larger amounts of front-end RAM cache. Because the implementation of the HP AutoRAID technology is almost entirely in software, the additional hardware cost for these benefits is very small. We describe the HP AutoRAID technology in detail, provide performance data for an embodiment of it in a storage array, and summarize the results of simulation studies used to choose algorithms implemented in the array.
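The two-level idea can be sketched in a few lines. The toy model below keeps recently written blocks in a fixed-size "mirrored" level and demotes the least recently used block to a "RAID 5" level when the upper level fills. The class name `TwoLevelStore`, the promote-on-write rule, and the LRU demotion rule are assumptions chosen for illustration; the abstract does not state which migration policies the array actually uses, and no redundancy is modelled here.

```python
from collections import OrderedDict


class TwoLevelStore:
    """Toy two-level block store (hypothetical, for illustration only)."""

    def __init__(self, mirrored_capacity):
        self.mirrored_capacity = mirrored_capacity
        self.mirrored = OrderedDict()  # active blocks, least recently used first
        self.raid5 = {}                # inactive blocks

    def write(self, block, data):
        # Assumed policy: a write always places the block in the upper level.
        self.raid5.pop(block, None)
        self.mirrored[block] = data
        self.mirrored.move_to_end(block)
        # Demote least recently used blocks until the upper level fits.
        while len(self.mirrored) > self.mirrored_capacity:
            old_block, old_data = self.mirrored.popitem(last=False)
            self.raid5[old_block] = old_data

    def read(self, block):
        if block in self.mirrored:
            self.mirrored.move_to_end(block)  # a read refreshes recency
            return self.mirrored[block]
        return self.raid5[block]  # raises KeyError for unknown blocks

    def level_of(self, block):
        if block in self.mirrored:
            return "mirrored"
        if block in self.raid5:
            return "raid5"
        return None


if __name__ == "__main__":
    store = TwoLevelStore(mirrored_capacity=2)
    for name in ("a", "b", "c"):
        store.write(name, name.upper())
    print({name: store.level_of(name) for name in ("a", "b", "c")})
```

The caller never chooses a level: placement follows from the access pattern alone, which is the property the abstract emphasizes.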
Long Term Distributed File Reference Tracing: Implementation and Experience
, 1994
Cited by 82 (3 self)
DFSTrace is a system to collect and analyze long-term file reference data in a distributed UNIX workstation environment. The design of DFSTrace is unique in that it pays particular attention to efficiency, extensibility, and the logistics of long-term trace data collection in a distributed environment. The components of DFSTrace are a set of kernel hooks, a kernel buffer mechanism, a data extraction agent, a set of collection servers, and post-processing tools. Our experience with DFSTrace has been highly positive. Tracing has been virtually unnoticeable, degrading performance 3-7%, depending on the level of detail of tracing. We have collected file reference traces from approximately 30 workstations continuously for over two years. We have implemented a post-processing library to provide a convenient programmer interface to the traces, and have created an on-line database of results from a suite of analysis programs to aid trace selection. Our data has been used for a wide variety of purposes, including file system studies, performance measurement and tuning, and debugging. Extensions of DFSTrace have enabled its use in applications such as field reliability testing and determining disk geometry. This paper presents the design, implementation, and evaluation of DFSTrace and associated tools, and describes how they have been used.
Dynamic File-Access Characteristics of a Production Parallel Scientific Workload
, 1994
Cited by 76 (12 self)
Multiprocessors have permitted astounding increases in computational performance, but many cannot meet the intense I/O requirements of some scientific applications. An important component of any solution to this I/O bottleneck is a parallel file system that can provide high-bandwidth access to tremendous amounts of data in parallel to hundreds or thousands of processors. Most successful systems are based on a solid understanding of the expected workload, but thus far there have been no comprehensive workload characterizations of multiprocessor file systems. This paper presents the results of a three-week tracing study in which all file-related activity on a massively parallel computer was recorded. Our instrumentation differs from previous efforts in that it collects information about every I/O request and about the mix of jobs running in a production environment. We also present the results of a trace-driven caching simulation and recommendations for designers of multiprocessor file systems.
Detection and Exploitation of File Working Sets
- Proc. of the 11th Int. Conf. on Distributed Computing Systems
, 1991
Cited by 66 (5 self)
The work habits of most individuals yield file access patterns that are quite pronounced and can be regarded as defining working sets of files used for particular applications. This paper describes a client-side cache management technique for detecting these patterns and then exploiting them to successfully prefetch files from servers. Trace-driven simulations show the technique substantially increases the hit rate of a client file cache in an environment in which a client workstation is dedicated to a single user. Successful file prefetching carries three major advantages: (1) applications run faster, (2) there is less "burst" load placed on the network, and (3) properly-loaded client caches can better survive network outages. Our technique requires little extra code, and --- because it is simply an augmentation of the standard LRU client cache management algorithm --- is easily incorporated into existing software. This work is supported by the New York State Science and Technology Fo...
Practical prefetching techniques for multiprocessor file systems
- Journal of Distributed and Parallel Databases
, 1993
Cited by 45 (6 self)
Improvements in the processing speed of multiprocessors are outpacing improvements in the speed of disk hardware. Parallel disk I/O subsystems have been proposed as one way to close the gap between processor and disk speeds. In a previous paper we showed that prefetching and caching have the potential to deliver the performance benefits of parallel file systems to parallel applications. In this paper we describe experiments with practical prefetching policies that base decisions only on on-line reference history, and that can be implemented efficiently. We also test the ability of those policies across a range of architectural parameters. Keywords: multiprocessor file systems, parallel I/O, file caching, prefetching
Characterizing parallel file-access patterns on a large-scale multiprocessor
- In Proceedings of the Ninth International Parallel Processing Symposium
, 1995
Cited by 41 (4 self)
Rapid increases in the computational speeds of multiprocessors have not been matched by corresponding performance enhancements in the I/O subsystem. To satisfy the large and growing I/O requirements of some parallel scientific applications, we need parallel file systems that can provide high-bandwidth and high-volume data transfer between the I/O subsystem and thousands of processors. Design of such high-performance parallel file systems depends on a thorough grasp of the expected workload. So far there have been no comprehensive usage studies of multiprocessor file systems. Our CHARISMA project intends to fill this void. The first results from our study involve an iPSC/860 at NASA Ames. This paper presents results from a different platform, the CM-5 at the National Center for Supercomputing Applications. The CHARISMA studies are unique because we collect information about every individual read and write request and about the entire mix of applications running on the machines. The results of our trace analysis lead to recommendations for parallel file system design. First, the file system should support efficient concurrent access to many files, and I/O requests from many jobs
Dynamic Metadata Management for Petabyte-scale File Systems
Cited by 35 (8 self)
In petabyte-scale distributed file systems that decouple read and write from metadata operations, behavior of the metadata server cluster will be critical to overall system performance and scalability. We present a dynamic subtree partitioning and adaptive metadata management system designed to efficiently manage hierarchical metadata workloads that evolve over time. We examine the relative merits of our approach in the context of traditional workload partitioning strategies, and demonstrate the performance, scalability and adaptability advantages in a simulation environment.
A High Performance Multi-Structured File System Design
, 1991
Cited by 30 (6 self)
File system I/O is increasingly becoming a performance bottleneck in large distributed computer systems. This is due to the increased file I/O demands of new applications, the inability of any single storage structure to respond to these demands, and the slow decline of disk access times (latency and seek) relative to the rapid increase in CPU speeds, memory size, and network bandwidth.
I/O Reference Behavior of Production Database Workloads and the TPC Benchmarks - An Analysis at the Logical Level
- ACM Transactions on Database Systems
, 2001
Cited by 26 (5 self)
As improvements in processor performance continue to far outpace improvements in storage performance, I/O is increasingly the bottleneck in computer systems, especially in large database systems that manage huge amounts of data. The key to achieving good I/O performance is to thoroughly understand its characteristics. In this article we present a comprehensive analysis of the logical I/O reference behavior of the peak production database workloads from ten of the world’s largest corporations. In particular, we focus on how these workloads respond to different techniques for caching, prefetching, and write buffering. Our findings include several broadly applicable rules of thumb that describe how effective the various I/O optimization techniques are for the production workloads. For instance, our results indicate that the buffer pool miss ratio tends to be related to the ratio of buffer pool size to data size by an inverse square root rule. A similar fourth root rule relates the write miss ratio and the ratio of buffer pool size to data size. In addition, we characterize the reference characteristics of workloads similar to the Transaction Processing Performance Council (TPC) benchmarks C (TPC-C) and D (TPC-D), which are de facto standard performance measures for online transaction processing (OLTP) systems and decision support systems (DSS), respectively. Since benchmarks such as TPC-C and TPC-D can only be
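The two rules of thumb in this abstract can be turned into a quick back-of-the-envelope calculation. The sketch below assumes the miss ratio is proportional to (buffer pool size / data size) raised to the power -1/2, and reads the "fourth root rule" for writes as the same ratio raised to the power -1/4; that reading, the function name `relative_miss_ratio`, and the assumption that the proportionality constant stays fixed are all interpretations made for this example, not statements from the article.

```python
READ_EXPONENT = 0.5    # inverse square root rule for the miss ratio
WRITE_EXPONENT = 0.25  # fourth root rule, read here as an inverse fourth root


def relative_miss_ratio(growth_factor, exponent):
    """Miss ratio after growing the buffer pool by `growth_factor`
    (data size held fixed), relative to the miss ratio before."""
    if growth_factor <= 0:
        raise ValueError("growth_factor must be positive")
    return growth_factor ** (-exponent)


if __name__ == "__main__":
    for factor in (2, 4, 16):
        read_change = relative_miss_ratio(factor, READ_EXPONENT)
        write_change = relative_miss_ratio(factor, WRITE_EXPONENT)
        print(f"pool x{factor}: miss ratio x{read_change:.3f}, "
              f"write miss ratio x{write_change:.3f}")
```

Under these assumptions, halving the miss ratio takes a fourfold larger buffer pool, while halving the write miss ratio takes a sixteenfold larger one.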

