Practical prefetching techniques for parallel file systems (1991)

by D Kotz, C Ellis
Venue: In Proc. ICPADS

Results 1 - 10 of 35

Informed Prefetching and Caching

by R. Hugo Patterson, Garth A. Gibson, Eka Ginting, Daniel Stodolsky, Jim Zelenka - In Proceedings of the Fifteenth ACM Symposium on Operating Systems Principles, 1995
Cited by 321 (8 self)
The underutilization of disk parallelism and file cache buffers by traditional file systems induces I/O stall time that degrades the performance of modern microprocessor-based systems. In this paper, we present aggressive mechanisms that tailor file system resource management to the needs of I/O-intensive applications. In particular, we show how to use application-disclosed access patterns (hints) to expose and exploit I/O parallelism and to allocate dynamically file buffers among three competing demands: prefetching hinted blocks, caching hinted blocks for reuse, and caching recently used data for unhinted accesses. Our approach estimates the impact of alternative buffer allocations on application execution time and applies a cost-benefit analysis to allocate buffers where they will have the greatest impact. We implemented informed prefetching and caching in DEC’s OSF/1 operating system and measured its performance on a 150 MHz Alpha equipped with 15 disks running a range of applications including text search, 3D scientific visualization, relational database queries, speech recognition, and computational chemistry. Informed prefetching reduces the execution time of the first four of these applications by 20% to 87%. Informed caching reduces the execution time of the fifth application by up to 30%.

RAID: High-Performance, Reliable Secondary Storage

by Peter M. Chen, Edward K. Lee, Garth A. Gibson, Randy H. Katz, David A. Patterson - ACM Computing Surveys, 1994
Cited by 282 (6 self)
Disk arrays were proposed in the 1980s as a way to use parallelism between multiple disks to improve aggregate I/O performance. Today they appear in the product lines of most major computer manufacturers. This paper gives a comprehensive overview of disk arrays and provides a framework in which to organize current and future work. The paper first introduces disk technology and reviews the driving forces that have popularized disk arrays: performance and reliability. It then discusses the two architectural techniques used in disk arrays: striping across multiple disks to improve performance and redundancy to improve reliability. Next, the paper describes seven disk array architectures, called RAID (Redundant Arrays of Inexpensive Disks) levels 0-6 and compares their performance, cost, and reliability. It goes on to discuss advanced research and implementation topics such as refining the basic RAID levels to improve performance and designing algorithms to maintain data consistency. Last, the paper describes six disk array prototypes or products and discusses future opportunities for research. The paper includes an annotated bibliography of disk array-related literature.
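
The two techniques named in this abstract, striping for performance and redundancy for reliability, can be illustrated with a minimal sketch. It is not taken from the paper: the function names `stripe` and `xor_blocks`, the striping unit size, and the single-parity layout (in the spirit of the parity-based RAID levels) are assumptions chosen for illustration.

```python
from functools import reduce


def stripe(data: bytes, n_disks: int, unit: int) -> list[list[bytes]]:
    """Round-robin striping (hypothetical helper): chunk k goes to disk k % n_disks."""
    disks: list[list[bytes]] = [[] for _ in range(n_disks)]
    for offset in range(0, len(data), unit):
        disks[(offset // unit) % n_disks].append(data[offset:offset + unit])
    return disks


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Bytewise XOR of equal-length blocks; used both to build parity and to rebuild."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe row: three data blocks plus one parity block (assumed layout).
row = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(row)

# Simulate losing the second disk and rebuilding its block from the survivors.
rebuilt = xor_blocks([row[0], row[2], parity])
assert rebuilt == row[1]
```

Striping lets consecutive chunks be read from different disks in parallel, while the parity block allows any single lost block in a row to be recomputed from the remaining ones.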

PPFS: A High Performance Portable Parallel File System

by James V. Huber, Jr., Christopher L. Elford, Daniel A. Reed, Andrew A. Chien, David S. Blumenthal - In Proceedings of the 9th ACM International Conference on Supercomputing, 1995
Cited by 122 (13 self)
Rapid increases in processor performance over the past decade have outstripped performance improvements in input/output devices, increasing the importance of input/output performance to overall system performance. Further, experience has shown that the performance of parallel input/output systems is particularly sensitive to data placement and data management policies, making good choices critical. To explore this vast design space, we have developed a user-level library, the Portable Parallel File System (PPFS), which supports rapid experimentation and exploration. The PPFS includes a rich application interface, allowing the application to advertise access patterns, control caching and prefetching, and even control data placement. PPFS is both extensible and portable, making possible a wide range of experiments on a broad variety of platforms and configurations. Our initial experiments, based on simple benchmarks and two application programs, show that tailoring policies to input/out...

An Analytical Approach to File Prefetching

by Hui Lei, Dan Duchamp - In Proceedings of the USENIX 1997 Annual Technical Conference, 1997
Cited by 115 (0 self)
File prefetching is an effective technique for improving file access performance. In this paper, we present a file prefetching mechanism that is based on on-line analytic modeling of interesting system events and is transparent to higher levels. The mechanism, incorporated into a client's file cache manager, seeks to build semantic structures that capture the intrinsic correlations between file accesses. It then heuristically uses these structures to represent distinct file usage patterns and exploits them to prefetch files from a file server. We show results of a simulation study and of a working implementation. Measurements suggest that our method can predict future file accesses with an accuracy around 90%, that it can reduce cache miss rate by up to 47% and application latency by up to 40%. Our method imposes little overhead, even under antagonistic circumstances.

Automatic I/O Hint Generation through Speculative Execution

by Fay Chang, Garth A. Gibson - In Proceedings of the 3rd Symposium on Operating Systems Design and Implementation, 1999
Cited by 89 (2 self)
Aggressive prefetching is an effective technique for reducing the execution times of disk-bound applications; that is, applications that manipulate data too large or too infrequently used to be found in file or disk caches. While automatic prefetching approaches based on static analysis or historical access patterns are effective for some workloads, they are not as effective as manually-driven (programmer-inserted) prefetching for applications with irregular or input-dependent access patterns. In this paper, we propose to exploit whatever processor cycles are left idle while an application is stalled on I/O by using these cycles to dynamically analyze the application and predict its future I/O accesses. Our approach is to speculatively pre-execute the application’s code in order to discover and issue hints for its future read accesses. Coupled with an

Multiprocessor file system interfaces

by David Kotz - In Proceedings of the Second International Conference on Parallel and Distributed Information Systems, 1993
Cited by 50 (5 self)
Increasingly, file systems for multiprocessors are designed with parallel access to multiple disks, to keep I/O from becoming a serious bottleneck for parallel applications. Although file system software can transparently provide high-performance access to parallel disks, a new file system interface is needed to facilitate parallel access to a file from a parallel application. We describe the difficulties faced when using the conventional (Unix-like) interface in parallel applications, and then outline ways to extend the conventional interface to provide convenient access to the file for parallel programs, while retaining the traditional interface for programs that have no need for explicitly parallel file access. Our interface includes a single naming scheme, a multiopen operation, local and global file pointers, mapped file pointers, logical records, multifiles, and logical coercion for backward compatibility.

A Status Report on Research in Transparent Informed Prefetching

by R. Hugo Patterson, Garth A. Gibson, M. Satyanarayanan - ACM Operating Systems Review, 1993
Cited by 47 (4 self)
This paper focuses on extending the power of caching and prefetching to reduce file read latencies by exploiting application-level hints about future I/O accesses. We argue that systems that disclose high-level knowledge can transfer optimization information across module boundaries in a manner consistent with sound software engineering principles. Such Transparent Informed Prefetching (TIP) systems provide a technique for converting the high throughput of new technologies such as disk arrays and log-structured file systems into low latency for applications. Our preliminary experiments show that even without a high-throughput I/O subsystem TIP yields reduced execution time of up to 30% for applications obtaining data from a remote file server and up to 13% for applications obtaining data from a single local disk. These experiments indicate that greater performance benefits will be available when TIP is integrated with low-level resource management policies and highly parallel I/O subsys...

Caching and writeback policies in parallel file systems

by David Kotz, Carla Schlatter Ellis - Journal of Parallel and Distributed Computing, 1993
Cited by 30 (7 self)
Improvements in the processing speed of multiprocessors are outpacing improvements in the speed of disk hardware. Parallel disk I/O subsystems have been proposed as one way to close the gap between processor and disk speeds. Such parallel disk systems require parallel file system software to avoid performance-limiting bottlenecks. We discuss cache management techniques that can be used in a parallel file system implementation. We examine several writeback policies, and give results of experiments that test their performance.

Operating System I/O Speculation: How two invocations are faster than one

by Keir Fraser, Fay Chang, Google Inc
Cited by 22 (0 self)
We present an in-kernel disk prefetcher which uses speculative execution to determine what data an application is likely to require in the near future. By placing our design within the operating system, we provide several benefits compared to the previous application-level design. Not only is our system easier to implement and deploy, but by handling page faults as well as traditional file-access methods we are able to apply speculative execution to swapping applications, which often spend the majority of their execution time fetching non-resident pages. We also present two new OS features that further improve the performance of speculative execution for applications that have large page tables and working sets. These are a fast method for synchronizing an errant speculative process with normal execution, and a modified form of copy-on-write which preserves application semantics without delaying normal execution. Finally, by leveraging OS knowledge about memory usage and contention, we design a mechanism for estimating and limiting the memory overhead of speculative executions.

Optimal Read-Once Parallel Disk Scheduling

by Mahesh Kallahalla, Peter J. Varman - In IOPADS, 1999
Cited by 18 (8 self)
An optimal prefetching and I/O scheduling algorithm L-OPT, for parallel I/O systems, using a read-once model of block references is presented. The algorithm uses knowledge of the next L references, L-block lookahead, to create a minimal-length I/O schedule. We show that the competitive ratio of L-OPT is Θ(√(MD/L)), L ≥ M, which matches the lower bound of any prefetching algorithm with L-block lookahead. Tight bounds for the remaining ranges of lookahead are also presented. In addition we show that L-OPT is the optimal offline algorithm: when the lookahead consists of the entire reference string, it performs the absolute minimum possible number of I/Os. Finally, we show that L-OPT is comparable to the best on-line algorithm with the same amount of lookahead; the ratio of the length of its schedule to the length of the optimal schedule is always within a constant factor of the best possible. Supported in part by the National Science Foundation under grant CCR-9704562 an...