Informed Prefetching and Caching
- In Proceedings of the Fifteenth ACM Symposium on Operating Systems Principles
, 1995
Cited by 321 (8 self)
The underutilization of disk parallelism and file cache buffers by traditional file systems induces I/O stall time that degrades the performance of modern microprocessor-based systems. In this paper, we present aggressive mechanisms that tailor file system resource management to the needs of I/O-intensive applications. In particular, we show how to use application-disclosed access patterns (hints) to expose and exploit I/O parallelism and to dynamically allocate file buffers among three competing demands: prefetching hinted blocks, caching hinted blocks for reuse, and caching recently used data for unhinted accesses. Our approach estimates the impact of alternative buffer allocations on application execution time and applies a cost-benefit analysis to allocate buffers where they will have the greatest impact. We implemented informed prefetching and caching in DEC’s OSF/1 operating system and measured its performance on a 150 MHz Alpha equipped with 15 disks running a range of applications including text search, 3D scientific visualization, relational database queries, speech recognition, and computational chemistry. Informed prefetching reduces the execution time of the first four of these applications by 20% to 87%. Informed caching reduces the execution time of the fifth application by up to 30%.
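As a rough illustration of why disclosed access patterns can cut stall time, the sketch below compares demand fetching with prefetching in hint order across several disks. It is a toy model and not the cost-benefit estimator described in the abstract: the function names, the fixed per-block compute time, the fixed disk access time, round-robin striping, and the assumption of unlimited prefetch buffers are all hypothetical simplifications.

```python
def demand_fetch_time(n_blocks, compute_ms, disk_ms):
    """Total time when every read stalls for a full disk access (no prefetching)."""
    return n_blocks * (compute_ms + disk_ms)


def hinted_prefetch_time(n_blocks, compute_ms, disk_ms, n_disks):
    """Total time when all hinted blocks are prefetched starting at time 0.

    Assumptions (illustrative only): blocks are striped round-robin over
    n_disks, each disk serves its queue back to back, and there are enough
    buffers to hold every prefetched block.
    """
    disk_free = [0.0] * n_disks      # time at which each disk finishes its queue so far
    arrival = []                     # time at which each block is in the cache
    for i in range(n_blocks):
        d = i % n_disks
        disk_free[d] += disk_ms
        arrival.append(disk_free[d])

    clock = 0.0
    for ready_at in arrival:
        # The application stalls only if the block has not arrived yet.
        clock = max(clock, ready_at) + compute_ms
    return clock


if __name__ == "__main__":
    print(demand_fetch_time(8, 1, 10))        # no hints, one access at a time
    print(hinted_prefetch_time(8, 1, 10, 4))  # hints, four disks working in parallel
```

Under these assumptions, 8 blocks with 1 ms of computation and a 10 ms disk access take 88 ms with demand fetching and 24 ms with hinted prefetching over four disks. With a single disk the hinted case still takes 81 ms, which reflects the abstract's point that hints pay off by exposing I/O parallelism.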
Automatic Compiler-Inserted I/O Prefetching for Out-of-Core Applications
, 1996
Cited by 138 (6 self)
Current operating systems offer poor performance when a numeric application's working set does not fit in main memory. As a result, programmers who wish to solve "out-of-core" problems efficiently are typically faced with the onerous task of rewriting an application to use explicit I/O operations (e.g., read/write). In this paper, we propose and evaluate a fully-automatic technique which liberates the programmer from this task, provides high performance, and requires only minimal changes to current operating systems. In our scheme, the compiler provides the crucial information on future access patterns without burdening the programmer, the operating system supports non-binding prefetch and release hints for managing I/O, and the operating system cooperates with a run-time layer to accelerate performance by adapting to dynamic behavior and minimizing prefetch overhead. This approach maintains the abstraction of unlimited virtual memory for the programmer, gives the compiler the flexibility to aggressively move prefetches back ahead of references, and gives the operating system the flexibility to arbitrate between the competing resource demands of multiple applications. We have implemented our scheme using the SUIF compiler and the Hurricane operating system. Our experimental results demonstrate that our fully-automatic scheme effectively hides the I/O latency in out-of-core versions of the entire NAS Parallel benchmark suite, thus resulting in speedups of roughly twofold for five of the eight applications, with one application speeding up by over threefold.
Informed Multi-Process Prefetching and Caching
- In Proceedings of the 1997 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems
, 1997
Cited by 54 (1 self)
Informed prefetching and caching based on application disclosure of future I/O accesses (hints) can dramatically reduce the execution time of I/O-intensive applications. A recent study showed that, in the context of a single hinting application, prefetching and caching algorithms should adapt to the dynamic load on the disks to obtain the best performance. In this paper, we show how to incorporate adaptivity to disk load into the TIP2 system, which uses cost-benefit analysis to allocate global resources among multiple processes. We compare the resulting system, which we call TIPTOE (TIP with Temporal Overload Estimators), to Cao et al.'s LRU-SP allocation scheme, also modified to include adaptive prefetching. Using disk-accurate trace-driven simulation we show that, averaged over eleven experiments involving pairs of hinting applications, and with data striped over one to ten disks, TIPTOE delivers 7% lower execution time than LRU-SP. Where the computation and I/O demands of each experi...
Multiprocessor file system interfaces
- In Proceedings of the Second International Conference on Parallel and Distributed Information Systems
, 1993
Cited by 50 (5 self)
Increasingly, file systems for multiprocessors are designed with parallel access to multiple disks, to keep I/O from becoming a serious bottleneck for parallel applications. Although file system software can transparently provide high-performance access to parallel disks, a new file system interface is needed to facilitate parallel access to a file from a parallel application. We describe the difficulties faced when using the conventional (Unix-like) interface in parallel applications, and then outline ways to extend the conventional interface to provide convenient access to the file for parallel programs, while retaining the traditional interface for programs that have no need for explicitly parallel file access. Our interface includes a single naming scheme, a multiopen operation, local and global file pointers, mapped file pointers, logical records, multifiles, and logical coercion for backward compatibility.
HFS: A performance-oriented flexible file system based on building-block compositions
- ACM Transactions on Computer Systems
, 1997
Cited by 49 (8 self)
The Hurricane File System (HFS) is designed for (potentially large-scale) shared-memory multiprocessors. Its architecture is based on the principle that, in order to maximize performance for applications with diverse requirements, a file system must support a wide variety of file structures, file system policies, and I/O interfaces. Files in HFS are implemented using simple building blocks composed in potentially complex ways. This approach yields great flexibility, allowing an application to customize the structure and policies of a file to exactly meet its requirements. As an extreme example, HFS allows a file’s structure to be optimized for concurrent random-access write-only operations by 10 threads, something no other file system can do. Similarly, the prefetching, locking, and file cache management policies can all be chosen to match an application’s access pattern. In contrast, most parallel file systems support a single file structure and a small set of policies. We have implemented HFS as part of the Hurricane operating system running on the Hector shared-memory multiprocessor. We demonstrate that the flexibility of HFS comes with little processing or I/O overhead. We also show that for a number of file access patterns, HFS is able to deliver to the applications the full I/O bandwidth of the disks on our system.
Integrating Theory and Practice in Parallel File Systems
- In Proceedings of the 1993 DAGS/PC Symposium (The Dartmouth Institute for Advanced Graduate Studies)
, 1993
Cited by 48 (11 self)
Several algorithms for parallel disk systems have appeared in the literature recently, and they are asymptotically optimal in terms of the number of disk accesses. Scalable systems with parallel disks must be able to run these algorithms. We present for the first time a list of capabilities that must be provided by the system to support these optimal algorithms: control over declustering, querying about the configuration, independent I/O, and turning off parity, file caching, and prefetching. We summarize recent theoretical and empirical work that justifies the need for these capabilities. In addition, we sketch an organization for a parallel file interface with low-level primitives and higher-level operations.
Input/Output Access Pattern Classification Using Hidden Markov Models
- In Proceedings of the Fifth Workshop on Input/Output in Parallel and Distributed Systems
, 1997
Cited by 48 (4 self)
Input/output performance on current parallel file systems is sensitive to a good match of application access pattern to file system capabilities. Automatic input/output access classification can determine application access patterns at execution time, guiding adaptive file system policies. In this paper we examine a new method for access pattern classification that uses hidden Markov models, trained on access patterns from previous executions, to create a probabilistic model of input/output accesses. We compare this approach to a neural network classification framework, presenting performance results from parallel and sequential benchmarks and applications.
1 Introduction
Input/output is a critical bottleneck for many important scientific applications. One reason is that performance of extant parallel file systems is particularly sensitive to file access patterns. Often the application programmer must match application input/output requirements to the capabilities of the file system....
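To make the idea of classifying an access pattern with hidden Markov models concrete, here is a minimal sketch that scores a block-access trace under two small HMMs and picks the more likely one. It is not the paper's method: the paper trains its models on previous executions, whereas the two models below use hand-picked, hypothetical parameters, and the two-symbol encoding (0 for "next block follows the previous one", 1 for "jump") is an assumption made for illustration.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log P(obs | model), computed with the scaled forward algorithm."""
    if not obs:
        return 0.0
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        alpha = [a / scale for a in alpha]   # rescale to avoid underflow
        log_p += math.log(scale)
    return log_p


# Hidden states: 0 = "streaming", 1 = "seeking" (hypothetical labels).
EMIT = [[0.95, 0.05], [0.20, 0.80]]          # P(symbol | state)
MODELS = {
    "sequential": ([0.9, 0.1], [[0.95, 0.05], [0.50, 0.50]], EMIT),
    "random":     ([0.1, 0.9], [[0.50, 0.50], [0.05, 0.95]], EMIT),
}


def to_symbols(blocks):
    """Encode a trace of block numbers: 0 if the next block is adjacent, else 1."""
    return [0 if b == a + 1 else 1 for a, b in zip(blocks, blocks[1:])]


def classify(blocks):
    """Return the name of the model under which the trace is most likely."""
    obs = to_symbols(blocks)
    return max(MODELS, key=lambda name: forward_log_likelihood(obs, *MODELS[name]))


if __name__ == "__main__":
    print(classify([0, 1, 2, 3, 4, 5]))
    print(classify([7, 2, 9, 4, 1, 8]))
```

A file system could, in principle, use such a label to choose a caching or prefetching policy; how the paper's classifier feeds into policy selection is not shown here.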
A Status Report on Research in Transparent Informed Prefetching
- ACM Operating Systems Review
, 1993
Cited by 47 (4 self)
This paper focuses on extending the power of caching and prefetching to reduce file read latencies by exploiting application-level hints about future I/O accesses. We argue that systems that disclose high-level knowledge can transfer optimization information across module boundaries in a manner consistent with sound software engineering principles. Such Transparent Informed Prefetching (TIP) systems provide a technique for converting the high throughput of new technologies such as disk arrays and log-structured file systems into low latency for applications. Our preliminary experiments show that even without a high-throughput I/O subsystem TIP yields reduced execution time of up to 30% for applications obtaining data from a remote file server and up to 13% for applications obtaining data from a single local disk. These experiments indicate that greater performance benefits will be available when TIP is integrated with low-level resource management policies and highly parallel I/O subsys...
Intelligent, Adaptive File System Policy Selection
- In Proceedings of the Sixth Symposium on the Frontiers of Massively Parallel Computation
, 1996
Cited by 36 (7 self)
Traditionally, maximizing input/output performance has required tailoring application input/output patterns to the idiosyncrasies of specific input/output systems. In this paper, we show that one can achieve high application input/output performance via a low-overhead input/output system that automatically recognizes file access patterns and adaptively modifies system policies to match application input/output needs. This approach reduces the application developer's input/output optimization effort by isolating input/output optimization decisions within a retargetable file system infrastructure. To validate these claims, we have built a lightweight file system policy testbed that uses a trained learning mechanism to recognize access patterns. The file system then uses these access pattern classifications to select appropriate caching strategies, dynamically adapting file system policies to changing input/output demands throughout application execution. Our experimental data show dram...
Expanding the potential for disk-directed I/O
- In Proceedings of the 1995 IEEE Symposium on Parallel and Distributed Processing
, 1995
Cited by 22 (6 self)
As parallel computers are increasingly used to run scientific applications with large data sets, and as processor speeds continue to increase, it becomes more important to provide fast, effective parallel file systems for data storage and for temporary files. In an earlier work we demonstrated that a technique we call disk-directed I/O has the potential to provide consistent high performance for large, collective, structured I/O requests. In this paper we expand on this potential by demonstrating the ability of a disk-directed I/O system to read irregular subsets of data from a file, and to filter and distribute incoming data according to data-dependent functions.

