Anticipatory scheduling: A disk scheduling framework to overcome deceptive idleness in synchronous I/O
, 2001
Cited by 94 (2 self)
Disk schedulers in current operating systems are generally work-conserving, i.e., they schedule a request as soon as the previous request has finished. Such schedulers often require multiple outstanding requests from each process to meet system-level goals of performance and quality of service. Unfortunately, many common applications issue disk read requests in a synchronous manner, interspersing successive requests with short periods of computation. The scheduler chooses the next request too early; this induces deceptive idleness, a condition where the scheduler incorrectly assumes that the last request-issuing process has no further requests, and becomes forced to switch to a request from another process.
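The wait-or-dispatch choice described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the linear seek model, the `max_wait_ms` cap, the `expected_think_ms` table, and all names are hypothetical, and the sketch assumes the anticipated request would land close to the current head position.

```python
from dataclasses import dataclass


@dataclass
class Request:
    pid: int    # issuing process
    block: int  # target disk block


def seek_cost_ms(head, block, ms_per_block=0.0001):
    """Toy linear seek model (assumption): cost grows with distance."""
    return abs(block - head) * ms_per_block


def next_action(head, last_pid, pending, expected_think_ms, max_wait_ms=5.0):
    """Return ("dispatch", request) or ("wait", None).

    A work-conserving scheduler would always dispatch when `pending` is
    non-empty. This sketch idles briefly instead when the process that was
    just served is expected to issue another request sooner than the seek
    to the nearest pending request would take.
    """
    if not pending:
        return ("wait", None)
    best = min(pending, key=lambda r: seek_cost_ms(head, r.block))
    if best.pid == last_pid:
        # The same process already has a request queued: no need to wait.
        return ("dispatch", best)
    think = expected_think_ms.get(last_pid)
    if (think is not None and think <= max_wait_ms
            and think < seek_cost_ms(head, best.block)):
        return ("wait", None)
    return ("dispatch", best)
```

With the head at block 1000 and only a distant request from another process pending, the sketch waits when process 1 typically "thinks" for 1 ms and dispatches when it typically thinks for 50 ms.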
File Access Prediction with Adjustable Accuracy
, 2002
Cited by 35 (8 self)
We describe a novel on-line file access predictor, Recent Popularity, capable of rapid adaptation to workload changes while simultaneously predicting more events with greater accuracy than prior efforts. We distinguish the goal of predicting the most events accurately from the goal of offering the most accurate predictions (when declining to offer a prediction is acceptable). For this purpose we present two distinct measures of accuracy, general and specific accuracy, corresponding to these goals. We describe how our new predictor and an earlier effort, Noah, can trade the number of events predicted for prediction accuracy by modifying simple parameters. When prediction accuracy is strictly more important than the number of predictions offered, trace-based evaluation demonstrates error rates as low as 2%, while offering predictions for more than 60% of all file access events.
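The abstract names general and specific accuracy without defining them. The Python sketch below assumes one plausible reading: general accuracy is the share of all events predicted correctly, and specific accuracy is the share of offered predictions that were correct. The function name and the outcome encoding are hypothetical.

```python
def accuracy_measures(outcomes):
    """Compute (general, specific, coverage) from per-event outcomes.

    Each outcome is None (predictor declined), True (correct prediction),
    or False (incorrect prediction). The definitions are an assumed
    reading of the abstract, not taken from the paper itself.
    """
    total = len(outcomes)
    offered = sum(1 for o in outcomes if o is not None)
    correct = sum(1 for o in outcomes if o is True)
    general = correct / total if total else 0.0       # correct / all events
    specific = correct / offered if offered else 0.0  # correct / predictions offered
    coverage = offered / total if total else 0.0      # predictions offered / all events
    return general, specific, coverage
```

Under this reading, a predictor that declines more often can raise specific accuracy while lowering coverage, which is the trade-off the abstract describes.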
NFS Tricks and Benchmarking Traps
, 2003
Cited by 23 (2 self)
We describe two modifications to the FreeBSD 4.6 NFS server to increase read throughput by improving the read-ahead heuristic to deal with reordered requests and stride access patterns. We show that for some stride access patterns, our new heuristics improve end-to-end NFS throughput by nearly a factor of two. We also show that benchmarking and experimenting with changes to an NFS server can be a subtle and challenging task, and that it is often difficult to distinguish the impact of a new algorithm or heuristic from the quirks of the underlying software and hardware with which they interact. We discuss these quirks and their potential effects.
Performing File Prediction with a Program-Based Successor Model
- IN PROCEEDINGS OF THE 9TH INTERNATIONAL SYMPOSIUM ON MODELING, ANALYSIS, AND SIMULATION OF COMPUTER AND TELECOMMUNICATION SYSTEMS (MASCOTS ’01)
, 2001
Cited by 17 (7 self)
Recent increases in CPU performance have surpassed those in hard drives. As a result, disk operations have become more expensive in terms of the number of CPU cycles spent waiting for them to complete. File prediction can mitigate this problem by prefetching files into cache before they are accessed. Identifying relationships between individual files plays a key role in successfully performing file prefetching. It is well-known that previous patterns of file references can be used to predict future references. Nevertheless, knowledge about the programs producing the relationships between individual files has rarely been investigated. We present a Program-Based Successor (PBS) model that identifies relationships between files through the names of the programs accessing them. We develop a Program-based Last Successor (PLS) model derived from PBS to do file prediction. Our simulation results show that PLS makes 21% fewer incorrect predictions and roughly the same number of correct predictions as the Last-Successor (LS) model. We also examine the cache hit ratio achieved by applying PLS to the Least Recently Used (LRU) caching algorithm and show that a cache using PLS and LRU together can perform better than a cache up to 40 times larger using LRU alone. Finally, we argue that because program-based successors are more likely to be used soon, incorrectly prefetched program-based successors are more likely to be used and thus less incorrect than incorrectly prefetched files from non-program-based models.
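The difference between a last-successor table and a program-keyed one can be sketched in Python as follows. This is an illustrative reading of the abstract, not the authors' implementation: the class names, the `(program, file)` key, and the toy trace are hypothetical.

```python
class LastSuccessor:
    """LS: predict the file that last followed the previous file, system-wide."""

    def __init__(self):
        self.successor = {}  # file -> file that followed it last time
        self.prev = None

    def predict(self):
        return self.successor.get(self.prev)

    def observe(self, file):
        if self.prev is not None:
            self.successor[self.prev] = file
        self.prev = file


class ProgramLastSuccessor:
    """PLS (assumed form): track successors separately for each program."""

    def __init__(self):
        self.successor = {}  # (program, file) -> next file that program accessed
        self.prev = {}       # program -> last file it accessed

    def predict(self, program):
        return self.successor.get((program, self.prev.get(program)))

    def observe(self, program, file):
        prev = self.prev.get(program)
        if prev is not None:
            self.successor[(program, prev)] = file
        self.prev[program] = file


def evaluate(trace):
    """Replay (program, file) events; return [correct, incorrect] per model."""
    ls, pls = LastSuccessor(), ProgramLastSuccessor()
    stats = {"LS": [0, 0], "PLS": [0, 0]}
    for program, file in trace:
        for name, guess in (("LS", ls.predict()), ("PLS", pls.predict(program))):
            if guess is not None:
                stats[name][0 if guess == file else 1] += 1
        ls.observe(file)
        pls.observe(program, file)
    return stats
```

On a toy trace where two programs interleave differently on a second run, the system-wide table is misled by the interleaving while the program-keyed table is not. Real traces would of course behave less cleanly.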
I/O System Performance Debugging Using Model-driven Anomaly Characterization
- USENIX CONFERENCE ON FILE AND STORAGE TECHNOLOGIES
, 2005
Cited by 13 (5 self)
It is challenging to identify performance problems and pinpoint their root causes in complex systems, especially when the system supports wide ranges of workloads and when performance problems only materialize under particular workload conditions. This paper proposes a model-driven anomaly characterization approach and uses it to discover operating system performance bugs when supporting disk I/O-intensive online servers. We construct a whole-system I/O throughput model as the reference of expected performance and we use statistical clustering and characterization of performance anomalies to guide debugging. Unlike previous performance debugging methods offering detailed statistics at specific execution settings, our approach focuses on comprehensive anomaly characterization over wide ranges of workload conditions and system configurations. Our approach helps us quickly identify four performance bugs in the I/O system of the recent Linux 2.6.10 kernel (one in the file system prefetching, two in the anticipatory I/O scheduler, and one in the elevator I/O scheduler). Our experiments with two Web server benchmarks, a trace-driven index searching server, and the TPC-C database benchmark show that the corrected kernel improves system throughput by up to five-fold compared with the original kernel (averaging 6%, 32%, 39%, and 16% for the four server workloads).
Improving Disk Throughput in Data-Intensive Servers
, 2004
Cited by 12 (4 self)
Low disk throughput is one of the main impediments to improving the performance of data-intensive servers. In this paper, we propose two management techniques for the disk controller cache that can significantly increase disk throughput. The first technique, called File-Oriented Read-ahead (FOR), adjusts the number of read-ahead blocks brought into the disk controller cache according to file system information. The second technique, called Host-guided Device Caching (HDC), gives the host control over part of the disk controller cache. As an example use of this mechanism, we keep the blocks that cause the most misses in the host buffer cache permanently cached in the disk controller. Our detailed simulations of real server workloads show that FOR and HDC can increase disk throughput by up to 34% and 24%, respectively, in comparison to conventional disk controller cache management techniques. When combined, the techniques can increase throughput by up to 47%.
Using Multiple Predictors to Improve the Accuracy of File Access Predictions
- In Proceedings of the 20th IEEE / 11th NASA Goddard Conference on Mass Storage Systems and Technologies
, 2003
Cited by 10 (4 self)
Existing file access predictors keep track of previous file access patterns and rely on a single heuristic to predict which of the previous successors to the file being currently accessed is the most likely to be accessed next. We present here a novel composite predictor that applies multiple heuristics to this selection problem. As a result, it can make use of specialized heuristics that can make very accurate predictions when access patterns are observed to meet their particular criteria. Simulation results involving a total of seven file access traces indicate that our predictor delivers more correct predictions and fewer inaccurate guesses than predictors relying on a single heuristic for selecting a successor.
Workload-Specific File System Benchmarks
, 2001
Cited by 9 (0 self)
A fundamental problem with the current generation of file system benchmarks is that they fail to take into account the fact that a file system’s performance can vary depending on the workload running on it. Many benchmarks attempt to reduce file system performance to a single number, producing a simplistic one-dimensional ordering of the systems being tested. Although this may be useful for marketing literature, the performance of file systems in the real world is more complicated. Different workloads place different demands on the file system, and can result in different behavior from the underlying system. A file system that provides superior performance for a web server may have inferior performance when running a software development workload. In this dissertation I demonstrate that the “one size fits all” approach of current file system benchmarks does not accurately predict the performance of different workloads on different file systems. I then present a new benchmarking methodology
The Design and Evaluation of Web Prefetching and Caching Techniques
, 2002
Cited by 7 (2 self)
User-perceived retrieval latencies in the World Wide Web can be improved by pre-loading a local cache with resources likely to be accessed. A user requesting content that can be served by the cache is able to avoid the delays inherent in the Web, such as congested networks and slow servers. The difficulty, then, is to determine what content to prefetch into the cache. This work explores machine learning algorithms for user sequence prediction, both in general and specifically for sequences of Web requests. We also consider information retrieval techniques to allow the use of the content of Web pages to help predict future requests. Although history-based mechanisms can provide strong performance in predicting future requests, performance can be improved by including predictions from additional sources. While past researchers have used a variety of techniques for evaluating caching algorithms and systems, most of those methods were not applicable to the evaluation of prefetching algorithms or systems. Therefore, two new mechanisms for evaluation are introduced. The first is a detailed trace-based simulator, built from scratch,
Caching Files with a Program-based Last n Successors Model
- In Proceedings of the Workshop on Caching, Coherency and Consistency (WC3
Cited by 7 (2 self)
Recent increases in CPU performance have outpaced increases in hard drive performance. As a result, disk operations have become more expensive in terms of CPU cycles spent waiting for disk operations to complete. File prediction can mitigate this problem by prefetching files into cache before they are accessed. However, incorrect prediction is to a certain degree both unavoidable and costly. We present the Program-based Last N Successors (PLNS) file prediction model that identifies relationships between files through the names of the programs accessing them. Our simulation results show that PLNS makes at least 21.11% fewer incorrect predictions and roughly the same number of correct predictions as the last-successor model. We also examine the cache hit ratio of applying PLNS to the Least Recently Used (LRU) caching algorithm and show that a cache using PLNS and LRU together can perform as well as a cache up to 40 times larger using LRU alone.

