Integrated Parallel Prefetching and Caching
, 1995
Cited by 63 (5 self)
Recently there has been a great deal of interest in prefetching from parallel disks, as a technique for enabling serial applications to improve I/O performance. Studies have also shown that for optimal performance, it is important to properly integrate prefetching and caching. In this paper, we study integrated prefetching and caching strategies for multiple disks. We present two algorithms, regular aggressive and reverse aggressive, and show that reverse aggressive is close to optimal. Using trace-driven simulation on a collection of file access traces, we evaluated these algorithms under a variety of data placement alternatives. Our results show that both algorithms can achieve near linear speedup when the load is distributed evenly on the disks, and reverse aggressive performs well even when the placement of blocks on disks distributes the load unevenly. Our simulations also show that, for file system traces, replicating data, even across all of the disks, offers little performance ...
Energy Efficient Prefetching and Caching
- In Proceedings of the USENIX Annual Technical Conference
, 2004
Cited by 59 (5 self)
Traditional disk management strategies---prefetching and caching in particular---are designed to maximize performance. In mobile systems they conflict with strategies that attempt to save energy by powering down the disk when it is idle. We present new rules for prefetching and caching that maximize power-down opportunities (without performance loss) by creating an access pattern characterized by intense bursts of activity separated by long idle times. We also describe an automatic system that monitors past application behavior in order to generate appropriate prefetching hints, and a general system of kernel enhancements that coordinate I/O activity across all running applications. We have
Implementing Cooperative Prefetching and Caching in a Globally-Managed Memory System
- In Proceedings of the Joint International Conference on Measurement and Modeling of Computer Systems
, 1998
Cited by 56 (11 self)
This paper presents cooperative prefetching and caching --- the use of network-wide global resources (memories, CPUs, and disks) to support prefetching and caching in the presence of hints of future demands. Cooperative prefetching and caching effectively unites disk-latency reduction techniques from three lines of research: prefetching algorithms, cluster-wide memory management, and parallel I/O. When used together, these techniques greatly increase the power of prefetching relative to a conventional (non-global-memory) system. We have designed and implemented PGMS, a cooperative prefetching and caching system, under the Digital Unix operating system running on a 1.28 Gb/sec Myrinet-connected cluster of DEC Alpha workstations. Our measurements and analysis show that by using available global resources, cooperative prefetching can obtain significant speedups for I/O-bound programs. For example, for a graphics rendering application, our system achieves a speedup of 4.9 over a non-prefetc...
Input/Output Access Pattern Classification Using Hidden Markov Models
- In Proceedings of the Fifth Workshop on Input/Output in Parallel and Distributed Systems
, 1997
Cited by 48 (4 self)
Input/output performance on current parallel file systems is sensitive to a good match of application access pattern to file system capabilities. Automatic input/output access classification can determine application access patterns at execution time, guiding adaptive file system policies. In this paper we examine a new method for access pattern classification that uses hidden Markov models, trained on access patterns from previous executions, to create a probabilistic model of input/output accesses. We compare this approach to a neural network classification framework, presenting performance results from parallel and sequential benchmarks and applications.
1 Introduction
Input/output is a critical bottleneck for many important scientific applications. One reason is that performance of extant parallel file systems is particularly sensitive to file access patterns. Often the application programmer must match application input/output requirements to the capabilities of the file system....
Energy Efficiency Through Burstiness
- In Proceedings of the 5th IEEE Workshop on Mobile Computing Systems and Applications
, 2003
Cited by 35 (5 self)
OS resource management policies traditionally employ buffering to “smooth out” fluctuations in resource demand. By minimizing the length of idle periods and the level of contention during non-idle periods, such smoothing tends to maximize overall throughput and minimize the latency of individual requests. For certain important devices, however (disks, network interfaces, or even computational elements), smoothing eliminates opportunities to save energy using low-power modes. As devices with such modes proliferate, and as energy efficiency becomes an increasingly important design consideration, we argue that OS policies should be redesigned to increase burstiness for energy-sensitive devices. We are currently experimenting with techniques to increase the disk access pattern burstiness of the Linux operating system. Our results indicate that the deliberate creation of bursty activity can save up to 78.5% of the energy consumed by a Hitachi DK23DA disk (in comparison with current policies), while simultaneously decreasing the negative impact of disk congestion and spin-up latency on application performance.
C-Miner: Mining Block Correlations in Storage Systems
- In Proceedings of the 3rd USENIX Symposium on File and Storage Technologies (FAST ’04)
, 2004
Cited by 30 (3 self)
systems. These correlations can be exploited for improving the effectiveness of storage caching, prefetching, data layout and disk scheduling. Unfortunately, information about block correlations is not available at the storage system level. Previous approaches for discovering file correlations in file systems do not scale well enough to be used for discovering block correlations in storage systems.
Storage-Aware Caching: Revisiting Caching For Heterogeneous Storage Systems
, 2002
Cited by 25 (2 self)
Modern storage environments are composed of a variety of devices with different performance characteristics. In this paper, we explore storage-aware caching algorithms, in which the file buffer replacement algorithm explicitly accounts for differences in performance across devices. We introduce a new family of storage-aware caching algorithms that partition the cache, with one partition per device. The algorithms set the partition sizes dynamically to balance work across the devices. Through simulation, we show that our storage-aware policies perform similarly to LANDLORD, a cost-aware algorithm previously shown to perform well in Web caching environments. We also demonstrate that partitions can be easily incorporated into the Clock replacement algorithm, thus increasing the likelihood of deploying cost-aware algorithms in modern operating systems.
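As a rough illustration of the partitioning idea in this abstract, the sketch below keeps one LRU partition per device and shifts a cache slot toward the device whose misses are currently costing the most. The class name `PartitionedCache`, the `rebalance` heuristic, and the `miss_cost` input are hypothetical choices made for this example; the abstract does not describe how the paper's algorithms actually size partitions.

```python
from collections import OrderedDict


class PartitionedCache:
    """Block cache split into one LRU partition per device (illustrative only)."""

    def __init__(self, sizes):
        # sizes: {device_name: number_of_cache_slots}
        self.sizes = dict(sizes)
        self.parts = {dev: OrderedDict() for dev in sizes}

    def access(self, device, block):
        """Return True on a hit; on a miss, insert the block and evict LRU."""
        part = self.parts[device]
        if block in part:
            part.move_to_end(block)  # mark as most recently used
            return True
        part[block] = True
        self._trim(device)
        return False

    def _trim(self, device):
        # Evict least recently used blocks until the partition fits its size.
        part = self.parts[device]
        while len(part) > self.sizes[device]:
            part.popitem(last=False)

    def rebalance(self, miss_cost):
        """Hypothetical heuristic: move one slot from the device with the
        lowest observed miss cost to the device with the highest."""
        slow = max(miss_cost, key=miss_cost.get)
        fast = min(miss_cost, key=miss_cost.get)
        if slow != fast and self.sizes[fast] > 1:
            self.sizes[fast] -= 1
            self.sizes[slow] += 1
            self._trim(fast)
```

Calling `rebalance` periodically with measured per-device miss costs would grow the partition of a slow device at the expense of a fast one, which is one simple way to approximate "balancing work across the devices".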
The Performance Impact of Kernel Prefetching on Buffer Cache Replacement Algorithms
- In SIGMETRICS
, 2005
Cited by 19 (5 self)
A fundamental challenge in improving file system performance is to design effective block replacement algorithms to minimize buffer cache misses. Despite the well-known interactions between prefetching and caching, almost all buffer cache replacement algorithms have been proposed and studied comparatively, without taking into account file system prefetching, which exists in all modern operating systems. This paper shows that such kernel prefetching can have a significant impact on the relative performance in terms of the number of actual disk I/Os of many well-known replacement algorithms; it can not only narrow the performance gap but also change the relative performance benefits of different algorithms. Moreover, since prefetching can increase the number of blocks clustered for each disk I/O and, hence, the time to complete the I/O, the reduction in the number of disk I/Os may not translate into proportional reduction in the total I/O time. These results demonstrate the importance of buffer caching research taking file system prefetching into consideration and comparing the actual disk I/Os and the execution time under different replacement algorithms. Index Terms: Metrics/measurement, operating systems, file systems management, operating systems performance, measurements, simulation.
Increasing Disk Burstiness for Energy Efficiency
, 2002
Cited by 18 (3 self)
Hard disks for portable devices, and the operating systems that manage them, incorporate spin-down policies that idle the disk after a certain period of inactivity. In essence, these policies use a recent period of inactivity to predict that the disk will remain inactive in the near future. We propose an alternative strategy, in which the operating system deliberately seeks to cluster disk operations in time, to maximize the utilization of the disk when it is spun up and the time that the disk can be spun down. In order to cluster disk operations we postpone the service of non-urgent operations, and use aggressive prefetching and file prediction to reduce the likelihood that synchronous reads will have to go to disk. In addition, we present a novel predictive spin-down/spin-up policy that exploits high level operating system knowledge to decrease disk idle time prior to spin-down, and application wait time due to spin-up. We evaluate our strategy through trace-driven simulation of several different workload scenarios. Our results indicate that the deliberate creation of bursty activity can save up to 55% of the energy consumed by an IBM TravelStar disk, while simultaneously decreasing significantly the negative impact of disk spin-up latency on application performance.
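The intuition behind clustering disk operations can be checked with a small calculation. The sketch below models a fixed-timeout spin-down policy and compares many short idle gaps against one long gap of the same total length. The function name `disk_energy` and all power, timeout, and spin-up figures are made-up illustrative values, not measurements from the paper or from any real disk.

```python
def disk_energy(idle_gaps, timeout, p_idle, p_standby, e_spinup):
    """Energy (joules) spent over a list of idle gaps (seconds) when the disk
    spins down after `timeout` seconds of inactivity.

    p_idle / p_standby are power draws in watts; e_spinup is the energy cost
    of spinning back up after a spin-down. All values are assumptions.
    """
    total = 0.0
    for gap in idle_gaps:
        if gap <= timeout:
            # Gap too short: the disk stays spinning the whole time.
            total += gap * p_idle
        else:
            # Spin for `timeout`, sleep for the rest, then pay for spin-up.
            total += timeout * p_idle + (gap - timeout) * p_standby + e_spinup
    return total


# Same 60 s of total idle time, distributed two different ways.
params = dict(timeout=10.0, p_idle=1.0, p_standby=0.2, e_spinup=5.0)
smooth_energy = disk_energy([5.0] * 12, **params)  # twelve short gaps
bursty_energy = disk_energy([60.0], **params)      # one long gap
```

With these assumed numbers the smooth pattern never reaches the timeout, so the disk never spins down, while the bursty pattern spends most of the gap in standby and uses less energy overall.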
A Cost-Benefit Scheme for High Performance Predictive Prefetching
- In Proceedings of SC99: High Performance Networking and Computing
, 1999
Cited by 17 (1 self)
High-performance computing systems will increasingly rely on prefetching data from disk to overcome long disk access times and maintain high utilization of parallel I/O systems. This paper evaluates a prefetching technique that chooses which blocks to prefetch based on their probability of access and decides whether to prefetch a particular block at a given time using a cost-benefit analysis. The algorithm uses a probability tree to record past accesses and to predict future access patterns. We simulate this prefetching algorithm with a variety of I/O traces. We show that our predictive prefetching scheme combined with simple one-block-lookahead prefetching produces good performance for a variety of workloads. The scheme reduces file cache miss rates by up to 36% for workloads that receive no benefit from sequential prefetching. We show that the memory requirements for building the probability tree are reasonable, requiring about a megabyte for good performance. The probabilit...
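A minimal sketch of the general idea, assuming a much simpler structure than the paper's: past accesses are counted per recent context (a flattened stand-in for a probability tree), and a block is prefetched only when its estimated probability times the miss penalty exceeds the prefetch cost. The names `AccessTree` and `should_prefetch`, the context depth, and the cost model are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class AccessTree:
    """Counts which block followed each recent context of up to `depth` blocks."""

    def __init__(self, depth=2):
        self.depth = depth
        # counts[context_tuple][next_block] -> number of times observed
        self.counts = defaultdict(lambda: defaultdict(int))
        self.history = []

    def record(self, block):
        """Update follower counts for every context length, then slide history."""
        for k in range(1, self.depth + 1):
            if len(self.history) >= k:
                context = tuple(self.history[-k:])
                self.counts[context][block] += 1
        self.history.append(block)
        self.history = self.history[-self.depth:]

    def predict(self):
        """Return {block: probability} using the longest context seen before."""
        for k in range(min(self.depth, len(self.history)), 0, -1):
            followers = self.counts.get(tuple(self.history[-k:]))
            if followers:
                total = sum(followers.values())
                return {b: n / total for b, n in followers.items()}
        return {}


def should_prefetch(probability, miss_penalty, prefetch_cost):
    """Assumed cost-benefit rule: prefetch when expected savings exceed cost."""
    return probability * miss_penalty > prefetch_cost
```

For example, after recording the trace 1, 2, 3, 1, 2, 3, 1, 2, 4, 1, 2, the context (1, 2) has been followed by block 3 twice and block 4 once, so `predict()` favours block 3 and `should_prefetch` can be used to decide whether either candidate is worth fetching.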

