2Q: A Low Overhead High Performance Buffer Management Replacement Algorithm
Cited by 167 (2 self)
In a path-breaking paper last year Pat and Betty O'Neil and Gerhard Weikum proposed a self-tuning improvement to the Least Recently Used (LRU) buffer management algorithm [15]. Their improvement is called LRU/k and advocates giving priority to buffer pages based on the kth most recent access. (The standard LRU algorithm is denoted LRU/1 according to this terminology.) If P1's kth most recent access is more recent than P2's, then P1 will be replaced after P2. Intuitively, LRU/k for k > 1 is a good strategy, because it gives low priority to pages that have been scanned or to pages that belong to a big randomly accessed file (e.g., the account file in TPC/A). They found that LRU/2 achieves most of the advantage of their method. The one problem of LRU/2 is the processor ...
Supported by U.S. Office of Naval Research #N00014-91-J1472 and #N00014-92-J-1719, U.S. National Science Foundation grants #CCR-9103953 and IRI-9224601, and USRA #5555-19. Part of this work was performed while Theodo...
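As a rough illustration of the LRU/k priority rule described in this abstract (not of the algorithm the paper itself goes on to propose), the Python sketch below chooses an eviction victim by comparing each page's kth most recent access time. The class name `LRUK`, the logical clock, the choice to treat pages with fewer than k recorded accesses as infinitely old, and the decision to forget history on eviction are all assumptions made for the example.

```python
from collections import deque


class LRUK:
    """Toy LRU/k buffer: evicts the page whose k-th most recent access is oldest."""

    def __init__(self, capacity, k=2):
        self.capacity = capacity
        self.k = k
        self.clock = 0        # logical time, advanced once per access
        self.history = {}     # page -> deque holding its last k access times

    def access(self, page):
        """Record an access to `page`; return the evicted page, or None."""
        self.clock += 1
        evicted = None
        if page not in self.history:
            if len(self.history) >= self.capacity:
                evicted = self._victim()
                del self.history[evicted]   # assumption: history is dropped on eviction
            self.history[page] = deque(maxlen=self.k)
        self.history[page].append(self.clock)
        return evicted

    def _kth_recent(self, page):
        times = self.history[page]
        # Assumption: a page with fewer than k accesses counts as infinitely old.
        return times[0] if len(times) == self.k else float("-inf")

    def _victim(self):
        # Oldest k-th most recent access loses; ties fall back to plain LRU.
        return min(self.history,
                   key=lambda p: (self._kth_recent(p), self.history[p][-1]))
```

With k=2 a page touched only once (for example by a scan) is evicted before a page that has been touched twice, whereas with k=1 the sketch behaves like ordinary LRU.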
Buffer Management Policy for an On-Demand Video Server
- IBM Research Report, RC 19347, Yorktown Heights
Cited by 48 (5 self)
In an on-demand video server environment, multimedia objects (e.g. movies) are very large and are read sequentially. Hence it is not economical to cache the entire object. However, caching random fractions of a multimedia object is not beneficial. This is due to the stringent response time requirements where continuous availability of a stream has to be guaranteed; whereas caching random fractions will result in unpredictable load on the disks. Therefore, traditional buffer management policies such as LRU are not effective. In addition, the sequential access implies pages brought in by a stream can be reused by a closely following stream and subsequently discarded, thus buffering only a fraction of the entire object. In this paper, we propose a buffer management policy called the interval caching policy based on the above idea that identifies certain streams and temporarily buffers the pages brought in by those streams. We study the efficacy of this technique for reducing disk overload...
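The idea of reusing pages between closely following streams can be sketched as follows. This is only an illustration under stated assumptions: the function name `choose_cached_intervals`, the representation of a stream as a `(movie, position)` pair measured in blocks, and the rule of admitting the smallest intervals first until the buffer is full are choices made for the example and are not stated in the abstract.

```python
def choose_cached_intervals(streams, buffer_blocks):
    """Pick leader/follower stream pairs whose gaps fit in the buffer.

    streams: list of (movie_id, position_in_blocks) for active streams.
    buffer_blocks: total buffer capacity in blocks.
    Returns a list of (movie_id, leader_position, follower_position).
    """
    # Group stream positions by the movie being played.
    by_movie = {}
    for movie, pos in streams:
        by_movie.setdefault(movie, []).append(pos)

    # An interval is the gap between two consecutive streams of one movie;
    # caching it lets the follower read what the leader brought in.
    intervals = []
    for movie, positions in by_movie.items():
        positions.sort(reverse=True)            # furthest-ahead stream first
        for leader, follower in zip(positions, positions[1:]):
            intervals.append((leader - follower, movie, leader, follower))

    # Assumption: admit the smallest intervals first, stopping when the next
    # one no longer fits.
    intervals.sort()
    chosen, used = [], 0
    for size, movie, leader, follower in intervals:
        if used + size > buffer_blocks:
            break
        chosen.append((movie, leader, follower))
        used += size
    return chosen
```

Small gaps are cheap to buffer and still save one full disk stream each, which is why the sketch prefers them.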
Exploiting gray-box knowledge of buffer-cache management
- In Proceedings of the USENIX Annual Technical Conference (USENIX ’02)
, 2002
Cited by 35 (10 self)
The buffer-cache replacement policy of the OS can have a significant impact on the performance of I/O-intensive applications. In this paper, we introduce a simple fingerprinting tool, Dust, which uncovers the replacement policy of the OS. Specifically, we are able to identify how initial access order, recency of access, frequency of access, and long-term history are used to determine which blocks are replaced from the buffer cache. We show that our fingerprinting tool can identify popular replacement policies described in the literature (e.g.,
CAR: Clock with Adaptive Replacement
- In Proceedings of the USENIX Conference on File and Storage Technologies (FAST)
, 2004
Cited by 32 (0 self)
CLOCK is a classical cache replacement policy dating back to 1968 that was proposed as a low-complexity approximation to LRU. On every cache hit, the policy LRU needs to move the accessed item to the most recently used position, at which point, to ensure consistency and correctness, it serializes cache hits behind a single global lock. CLOCK eliminates this lock contention, and, hence, can support high concurrency and high throughput environments such as virtual memory (for example, Multics, UNIX, BSD, AIX) and databases (for example, DB2). Unfortunately, CLOCK is still plagued by disadvantages of LRU such as disregard for "frequency", susceptibility to scans, and low performance.
As our main contribution, we propose a simple and elegant new algorithm, namely, CLOCK with Adaptive Replacement (CAR), that has several advantages over CLOCK: (i) it is scan-resistant; (ii) it is self-tuning and it adaptively and dynamically captures the "recency" and "frequency" features of a workload; (iii) it uses essentially the same primitives as CLOCK, and, hence, is low-complexity and amenable to a high-concurrency implementation; and (iv) it outperforms CLOCK across a wide-range of cache sizes and workloads. The algorithm CAR is inspired by the Adaptive Replacement Cache (ARC) algorithm, and inherits virtually all advantages of ARC including its high performance, but does not serialize cache hits behind a single global lock. As our second contribution, we introduce another novel algorithm, namely, CAR with Temporal filtering (CART), that has all the advantages of CAR, but, in addition, uses a certain temporal filter to distill pages with long-term utility from those with only short-term utility.
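For readers unfamiliar with the baseline, here is a minimal sketch of classical CLOCK, the policy that CAR builds on. It is not CAR or CART. The class name `Clock`, the frame layout, and the choice to insert new pages with a cleared reference bit are assumptions made for the example.

```python
class Clock:
    """Toy CLOCK cache: one reference bit per frame and a sweeping hand."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.frames = []     # each frame is [page, reference_bit]
        self.index = {}      # page -> position in self.frames
        self.hand = 0

    def access(self, page):
        """Return True on a hit, False on a miss."""
        if page in self.index:
            # A hit only sets a bit; nothing is moved, so no list reordering
            # (and no lock around it) is needed.
            self.frames[self.index[page]][1] = 1
            return True
        if len(self.frames) < self.capacity:
            self.index[page] = len(self.frames)
            self.frames.append([page, 0])   # assumption: new pages start at 0
            return False
        # Sweep: give referenced pages a second chance by clearing their bit.
        while self.frames[self.hand][1] == 1:
            self.frames[self.hand][1] = 0
            self.hand = (self.hand + 1) % self.capacity
        victim = self.frames[self.hand][0]
        del self.index[victim]
        self.frames[self.hand] = [page, 0]
        self.index[page] = self.hand
        self.hand = (self.hand + 1) % self.capacity
        return False
```

The hit path touches a single bit, which is the property the abstract credits for CLOCK's suitability to high-concurrency settings; the single bit also shows why CLOCK cannot distinguish a page referenced once from one referenced many times.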
Characterization of Database Access Pattern for Analytic Prediction of Buffer Hit Probability
- VLDB Journal
, 1995
Cited by 27 (1 self)
The analytic prediction of buffer hit probability, based on the characterization of database accesses from real reference traces, is extremely useful for workload management and system capacity planning. The knowledge can be helpful for proper allocation of buffer space to various database relations, as well as for the management of buffer space for a mixed transaction and query environment. Access characterization can also be used to predict the buffer invalidation effect in a multi-node environment which, in turn, can influence transaction routing strategies. However, it is a challenge to characterize the database access pattern of a real workload reference trace in a simple manner that can easily be used to compute buffer hit probability. In this article, we use a characterization method that distinguishes three types of access patterns from a trace: (1) locality within a transaction, (2) random accesses by transactions, and (3) sequential accesses by long queries. We then propose a concise way to characterize the access skew across randomly accessed pages by logically grouping the large number of data pages into a small number of partitions such that the frequency of accessing each page within a partition can be treated as equal. Based on this approach, we present a recursive binary partitioning algorithm that can infer the access skew characterization from the buffer hit probabilities for a subset of the buffer sizes. We validate the buffer hit predictions for single and multiple node systems using production database traces. We further show that the proposed approach can predict the buffer hit probability of a composite workload from those of its component files. Key Words. Database access characterization, access skew, sequential access, reference trace, workload management, analytic prediction.
I/O Reference Behavior of Production Database Workloads and the TPC Benchmarks - An Analysis at the Logical Level
- ACM Transactions on Database Systems
, 2001
Cited by 26 (5 self)
As improvements in processor performance continue to far outpace improvements in storage performance, I/O is increasingly the bottleneck in computer systems, especially in large database systems that manage huge amounts of data. The key to achieving good I/O performance is to thoroughly understand its characteristics. In this article we present a comprehensive analysis of the logical I/O reference behavior of the peak production database workloads from ten of the world’s largest corporations. In particular, we focus on how these workloads respond to different techniques for caching, prefetching, and write buffering. Our findings include several broadly applicable rules of thumb that describe how effective the various I/O optimization techniques are for the production workloads. For instance, our results indicate that the buffer pool miss ratio tends to be related to the ratio of buffer pool size to data size by an inverse square root rule. A similar fourth root rule relates the write miss ratio and the ratio of buffer pool size to data size. In addition, we characterize the reference characteristics of workloads similar to the Transaction Processing Performance Council (TPC) benchmarks C (TPC-C) and D (TPC-D), which are de facto standard performance measures for online transaction processing (OLTP) systems and decision support systems (DSS), respectively. Since benchmarks such as TPC-C and TPC-D can only be
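One way to read the two rules of thumb is as power laws in the ratio of buffer pool size B to data size D. The symbols m, m_w, c, and c_w below are notation introduced for this note, the constants are workload-dependent and not given in the abstract, and reading the "similar fourth root rule" as an inverse fourth root is an assumption.

```latex
\[
  m(B) \approx c \left(\frac{B}{D}\right)^{-1/2},
  \qquad
  m_w(B) \approx c_w \left(\frac{B}{D}\right)^{-1/4}
\]
% Worked consequence: doubling the buffer pool at fixed data size D
\[
  \frac{m(2B)}{m(B)} = 2^{-1/2} \approx 0.71,
  \qquad
  \frac{m_w(2B)}{m_w(B)} = 2^{-1/4} \approx 0.84
\]
```

Under this reading, doubling the buffer pool would cut the overall miss ratio by roughly 29% but the write miss ratio by only about 16%.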
Empirical Evaluation of Multi-level Buffer Cache Collaboration for Storage Systems
- In Proceedings of the 2005 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems (SIGMETRICS ’05)
, 2005
Cited by 18 (2 self)
To bridge the increasing processor-disk performance gap, buffer caches are used in both storage clients (e.g. database systems) and storage servers to reduce the number of slow disk accesses. These buffer caches need to be managed effectively to deliver the performance commensurate to the aggregate buffer cache size. To address this problem, two paradigms have been proposed recently to collaboratively manage these buffer caches together: the hierarchy-aware caching maintains the same I/O interface and is fully transparent to the storage client software, and the aggressively-collaborative caching trades off transparency for performance and requires changes to both the interface and the storage client software. Before storage industry starts to implement collaborative caching in real systems, it is crucial to find out whether sacrificing transparency is really worthwhile, i.e., how much can we gain by using
Second-Level Buffer Cache Management
- IEEE Transactions on Parallel and Distributed Systems
, 2004
Cited by 18 (1 self)
Buffer caches are commonly used in servers to reduce the number of slow disk accesses or network messages. These buffer caches form a multilevel buffer cache hierarchy. In such a hierarchy, second-level buffer caches have different access patterns from first-level buffer caches because accesses to a second-level are actually misses from a first-level. Therefore, commonly used cache management algorithms such as the Least Recently Used (LRU) replacement algorithm that work well for single-level buffer caches may not work well for second-level. This paper investigates multiple approaches to effectively manage second-level buffer caches. In particular, it reports our research results in 1) second-level buffer cache access pattern characterization, 2) a new local algorithm called Multi-Queue (MQ) that performs better than nine tested alternative algorithms for second-level buffer caches, 3) a set of global algorithms that manage a multilevel buffer cache hierarchy globally and significantly improve second-level buffer cache hit ratios over corresponding local algorithms, and 4) implementation and evaluation of these algorithms in a real storage system connected with commercial database servers (Microsoft SQL Server and Oracle) running industrial-strength online transaction processing benchmarks. Index Terms: Cache memories, storage hierarchy, storage management.
An Analytical Model for Buffer Hit Rate Prediction
- In CASCON ’01: Proc. Conf. of the Centre for Advanced Studies on Collaborative Research. IBM
, 2001
Cited by 6 (0 self)
Of the many tuning parameters available in a database management system (DBMS), one of the most crucial to performance is the buffer pool size. Choosing an appropriate size, however, can be a difficult task. In this paper we present an analytical modeling approach to predicting the buffer pool hit rate that can be used to simplify the process of buffer pool sizing. Since the buffer replacement algorithm determines the buffer hit rate, we model the replacement algorithm which, in the case of DB2/UDB, is a variation of the GCLOCK algorithm. A Markov Chain model of GCLOCK is used to estimate the hit rate for a buffer pool. We evaluate the accuracy of the model's estimates with experiments carried out on DB2/UDB with the TPC-C benchmark. The model is validated for both single and multiple buffer pool cases.

