STOW: A Spatially and Temporally Optimized Write Caching Algorithm
Cited by 1 (1 self)
Non-volatile write-back caches enable storage controllers to provide quick write response times by hiding the latency of the disks. Managing a write cache well is critical to the performance of storage controllers. Over two decades, various algorithms have been proposed, including the most popular, LRW, CSCAN, and WOW. While LRW leverages temporal locality in the workload, and CSCAN creates spatial locality in the destages, WOW combines the benefits of both temporal and spatial localities in a unified ordering for destages. However, there remains an equally important aspect of write caching to be considered, namely, the rate of destages. For the best performance, it is important to destage at a steady rate while making sure that the write cache is not under-utilized or over-committed. Most algorithms have not seriously considered this problem, and as a consequence, forgo a significant portion of the performance gains that can be achieved. We propose a simple and adaptive algorithm, STOW, which not only exploits both spatial and temporal localities in a new order of destages, but also facilitates and controls the rate of destages effectively. Further, STOW partitions the write cache into a sequential queue and a random queue, and dynamically and continuously adapts their relative sizes. Treating the two kinds of writes separately provides for better destage rate control, resistance to one-time sequential requests polluting the cache, and a workload-responsive write caching policy. STOW represents a leap ahead of all previously proposed write cache management algorithms. As anecdotal evidence, with a write cache of 32K pages, serving a 4+P RAID-5 array, using an SPC-1 Like Benchmark, STOW
WSCLOCK - A Simple and Effective Algorithm for Virtual Memory Management
A new virtual memory management algorithm, WSCLOCK, has been synthesized from the local working set (WS) algorithm, the global CLOCK algorithm, and a new load control mechanism for auxiliary memory access. The new algorithm combines the most useful feature of WS, a natural and effective load control that prevents thrashing, with the simplicity and efficiency of CLOCK. Studies are presented to show that the performance of WS and WSCLOCK is equivalent, even if the savings in overhead are ignored.
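For readers unfamiliar with the CLOCK component mentioned in this abstract, the following is a minimal sketch of plain CLOCK (second-chance) replacement. It is not WSCLOCK itself: the working-set window and the load control are omitted, and the class name `ClockCache`, its methods, and the choice to set the reference bit when a page is first loaded are assumptions made for illustration.

```python
class ClockCache:
    """Fixed-size page cache using plain CLOCK (second-chance) replacement.

    Hypothetical illustration only; not the WSCLOCK algorithm.
    """

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.frames = []   # each frame is [page, referenced_bit]
        self.index = {}    # page -> position of its frame
        self.hand = 0      # clock hand: next frame to examine

    def access(self, page):
        """Touch a page. Returns True on a hit, False on a fault."""
        pos = self.index.get(page)
        if pos is not None:
            self.frames[pos][1] = True  # hit: just set the reference bit
            return True
        if len(self.frames) < self.capacity:
            # Free frame available: load the page (assumption: bit starts set).
            self.index[page] = len(self.frames)
            self.frames.append([page, True])
            return False
        # Sweep: clear reference bits until an unreferenced frame is found.
        while self.frames[self.hand][1]:
            self.frames[self.hand][1] = False
            self.hand = (self.hand + 1) % self.capacity
        victim = self.frames[self.hand][0]
        del self.index[victim]
        self.frames[self.hand] = [page, True]
        self.index[page] = self.hand
        self.hand = (self.hand + 1) % self.capacity
        return False

    def pages(self):
        """Return the set of resident pages."""
        return {page for page, _ in self.frames}
```

A hit costs only a bit write, which is the simplicity the abstract attributes to CLOCK; the sweep gives each recently referenced page one more pass before eviction.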
Flexible Physical Memory Management
, 1995
This paper presents a variety of memory management systems, ranging from virtual memory to file systems and databases. The diverse needs of a broad range of clients have caused each of these systems to provide some kind of flexible memory management. These systems are described and examined for common facilities and requirements. Finally, an analysis of the systems' similarities leads to a discussion of the requirements of a common denominator for physical memory management: flexible virtual memory that can be tailored to best suit the needs of various systems. 1 Introduction This paper describes the facilities and requirements of a variety of memory management systems, ranging from virtual memory to file systems and databases. The diverse needs of a broad range of clients have caused each of these systems to provide some kind of flexible memory management. The rest of this section describes the task of memory management and outlines various motivations for flexibility. 1.1 Ph...
High-Performance Adaptive Routing in Multicomputers Using Dynamic Virtual Circuits
- 6th Distributed Memory Computing Conference
, 1991
A message transport mechanism which provides high-bandwidth, low-latency interprocessor communication is the key to the ability of multicomputers to achieve high performance. The system should adapt to changing conditions by routing packets around congested areas and failed links or nodes. We introduce a new message transport mechanism, called Dynamic Virtual Circuits, that combines the best features of circuit switching, packet switching, and static virtual circuits. Routing through intermediate nodes usually requires only a single lookup in a small table, and packets include minimal control information and are delivered in FIFO order. Nodes in the middle of a Dynamic Virtual Circuit can break it and later reestablish it through a different physical path, thus supporting adaptive routing while maintaining the semantics of virtual circuits. We present the basic algorithms for Dynamic Virtual Circuits and the required hardware support in the context of a VLSI communication coprocessor for multicomputers.
A Multi-Process Design Of A Paging System
This report is a minor revision of a thesis of the same title submitted to the Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, on May 19, 1976 in partial fulfillment of the requirements for the degrees of Master of Science and Electrical Engineer. I wish to thank my advisor, Dave Clark, for his patience in what has been a rather protracted effort. The original idea for this thesis is due to him. Three people were of great help to me in implementing the design presented in this thesis: Bernie Greenberg explained many of the mysteries of Multics page control and gladly contributed his time, knowledge and enthusiasm. Bob Mabee implemented some of the code necessary to permit page control to be implemented on Multics as parallel processes, and helped in getting the design working on Multics. Doug Wells was expert at finding my programming errors and explaining the pitfalls of PL/1. Without their help, I would still be debugging. Many other members of the Computer System Research Division contributed in ways too numerous to mention
IEEE 4 Computer
- Computer
, 2004
overhead algorithm that responds online to changing access patterns. ARC continually balances between the recency and frequency features of the workload, demonstrating that adaptation eliminates the need for the workload-specific pretuning that plagued many previous proposals to improve LRU. ARC's online adaptation will likely have benefits for real-life workloads due to their richness and variability with time. These workloads can contain long sequential I/Os or moving hot spots, changing frequency and scale of temporal locality and fluctuating between stable, repeating access patterns and patterns with transient clustered references. Like LRU, ARC is easy to implement, and its running time per request is essentially independent of the cache size. A real-life implementation revealed that ARC has a low space overhead: 0.75 percent of the cache size. Also, unlike LRU, ARC is scan-resistant in that it allows one-time sequential requests to pass through without poll
a Memory Management Expert. Currently, he is with Radware.
"... AMSQM: adaptive multiple super-page queue ..."
IEEE International Conference on Data Engineering BP-Wrapper: A System Framework Making Any Replacement Algorithms (Almost) Lock Contention Free
In a high-end database system, the execution concurrency level rises continuously in a multiprocessor environment due to the increase in the number of concurrent transactions and the introduction of multi-core processors. A new challenge for buffer management is to retain its scalability in responding to highly concurrent data processing demands and environments. The page replacement algorithm, a major component of buffer management, can seriously degrade the system’s performance if the algorithm is not implemented in a scalable way. A lock-protected data structure is used in most replacement algorithms, where high contention is caused by concurrent accesses. A common practice is to modify a replacement algorithm to reduce the contention, for example by approximating LRU replacement with the clock algorithm. Unfortunately, this type of modification usually hurts the hit ratios of the original algorithms. This problem may not exist or can be tolerated in an environment of low concurrency, and thus has not been given enough attention for a long time. In this paper, instead of making a trade-off between the high hit ratio of a replacement algorithm and the low lock contention of its approximation, we propose a system framework, called BP-Wrapper, that (almost) eliminates lock contention for any replacement algorithm without requiring any changes to the algorithm. In BP-Wrapper, we use batching and prefetching techniques to reduce lock contention and to retain a high hit ratio. The implementation of BP-Wrapper in PostgreSQL version 8.2 adds only about 300 lines of C code. It can increase throughput up to twofold compared with the replacement algorithms with lock contention when running TPC-C-like and TPC-W-like workloads.
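The batching idea in this abstract can be illustrated with a rough sketch: each thread records page accesses in a private list and takes the shared lock only once per batch to replay them into the replacement structure. This is not the BP-Wrapper implementation (which is C code inside PostgreSQL); the class `BatchedLRU`, its methods, the batch size, and the simplification of treating hits and misses identically are all assumptions for illustration. Prefetching is not shown.

```python
import threading
from collections import OrderedDict


class BatchedLRU:
    """LRU bookkeeping updated in batches to cut lock acquisitions.

    Hypothetical sketch of the batching idea; not BP-Wrapper itself.
    """

    def __init__(self, capacity, batch_size=4):
        self.capacity = capacity
        self.batch_size = batch_size
        self._lock = threading.Lock()
        self._lru = OrderedDict()        # page -> None, least recent first
        self._local = threading.local()  # per-thread pending accesses
        self.lock_acquisitions = 0       # counter kept for demonstration

    def _pending(self):
        if not hasattr(self._local, "pending"):
            self._local.pending = []
        return self._local.pending

    def record_access(self, page):
        """Note an access without taking the lock; flush when the batch fills."""
        pending = self._pending()
        pending.append(page)
        if len(pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Replay this thread's pending accesses under a single lock hold."""
        pending = self._pending()
        if not pending:
            return
        with self._lock:
            self.lock_acquisitions += 1
            for page in pending:
                self._lru[page] = None
                self._lru.move_to_end(page)        # mark most recently used
                while len(self._lru) > self.capacity:
                    self._lru.popitem(last=False)  # evict least recently used
        pending.clear()

    def order(self):
        """Return resident pages from least to most recently used."""
        with self._lock:
            return list(self._lru)
```

With a batch size of 4, four accesses cost one lock acquisition instead of four, while the replacement order after a flush matches what an unbatched LRU would produce for the same access sequence from a single thread.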
Outperforming LRU with an Adaptive Replacement Cache Algorithm
The self-tuning, low-overhead, scan-resistant adaptive replacement cache algorithm outperforms the least-recently-used algorithm by dynamically responding to changing access patterns and continually balancing between workload recency and frequency features.
PS-BC: Power-saving Considerations in Design of Buffer Caches Serving Heterogeneous Storage Devices
Under a replacement policy, existing operating systems identify and maintain the most frequently used storage data in buffer caches located in main memory, aiming at low-latency I/O data accesses. However, replacement policies can also strongly affect the energy consumption of the various connected storage devices, which has not been a consideration in the design and implementation of buffer cache management. In this paper, we present a system framework for energy-aware buffer cache replacement, called PS-BC (power-saving buffer cache). By considering several critical factors affecting system energy consumption, PS-BC can effectively improve system energy efficiency, while flexibly incorporating conventional performance-oriented buffer cache replacement policies for different performance objectives. Our experimental studies based on trace-driven simulation show that the PS-BC framework embedded with the CLOCK replacement policy can achieve an energy saving rate of up to 32.5% with minimal overhead for various workloads.

