Results 1 - 10
of
43
Cooperative caching for chip multiprocessors
- In Proceedings of the 33nd Annual International Symposium on Computer Architecture
, 2006
"... Chip multiprocessor (CMP) systems have made the on-chip caches a critical resource shared among co-scheduled threads. Limited off-chip bandwidth, increasing on-chip wire delay, destructive inter-thread interference, and diverse workload characteristics pose key design challenges. To address these ch ..."
Abstract
-
Cited by 87 (1 self)
- Add to MetaCart
Chip multiprocessor (CMP) systems have made the on-chip caches a critical resource shared among co-scheduled threads. Limited off-chip bandwidth, increasing on-chip wire delay, destructive inter-thread interference, and diverse workload characteristics pose key design challenges. To address these challenge, we propose CMP cooperative caching (CC), a unified framework to efficiently organize and manage on-chip cache resources. By forming a globally managed, shared cache using cooperative private caches. CC can effectively support two important caching applications: (1) reduction of average memory access latency and (2) isolation of destructive inter-thread interference. CC reduces the average memory access latency by balancing between cache latency and capacity opti-mizations. Based private caches, CC naturally exploits their access latency benefits. To improve the effective cache capacity, CC forms a “shared ” cache using replication control and LRU-based global replacement policies. Via cooperation throttling, CC provides a spectrum of caching behaviors between the two extremes of private and shared caches, thus enabling dynamic adaptation to suit workload requirements. We show that CC can achieve a robust performance advantage over private and shared cache schemes across different processor, cache and memory configurations, and a wide selection of multithreaded and multiprogrammed
Coordinated Placement and Replacement for Large-Scale Distributed Caches
- IEEE Transactions on Knowledge and Data Engineering
, 1998
"... In a large-scale information system such as a digital library or the web, a set of distributed caches can improve their effectiveness by coordinating their data placement decisions. In this paper, we examine the design space for cooperative placement and replacement algorithms. Our main focus is on ..."
Abstract
-
Cited by 64 (7 self)
- Add to MetaCart
In a large-scale information system such as a digital library or the web, a set of distributed caches can improve their effectiveness by coordinating their data placement decisions. In this paper, we examine the design space for cooperative placement and replacement algorithms. Our main focus is on the placement algorithms, which attempt to solve the following problem: given a set of caches, the network distances between caches, and predictions of the access rates from each cache to a set of objects, determine where to place each object in order to minimize the average access cost. Replacement algorithms also attempt to minimize access cost, but they work by selecting which objects to evict when a cache miss occurs. Using simulation, we examine three practical cooperative placement algorithms including one that is provably close to optimal, and we compare these algorithms to the optimal placement algorithm and several cooperative and non-cooperative replacement algorithms. We draw fiv...
Three-level caching for efficient query processing in large web search engines
- In Proc. of the 14th Int. World Wide Web Conference
, 2005
"... Large web search engines have to answer thousands of queries per second with interactive response times. Due to the sizes of the data sets involved, often in the range of multiple terabytes, a single query may require the processing of hundreds of megabytes or more of index data. To keep up with thi ..."
Abstract
-
Cited by 32 (5 self)
- Add to MetaCart
Large web search engines have to answer thousands of queries per second with interactive response times. Due to the sizes of the data sets involved, often in the range of multiple terabytes, a single query may require the processing of hundreds of megabytes or more of index data. To keep up with this immense workload, large search engines employ clusters of hundreds or thousands of machines, and a number of techniques such as caching, index compression, and index and query pruning are used to improve scalability. In particular, two-level caching techniques cache results of repeated identical queries at the frontend, while index data for frequently used query terms are cached in each node at a lower level. We propose and evaluate a three-level caching scheme that adds an intermediate level of caching for additional performance gains. This intermediate level attempts to exploit frequently occurring pairs of terms by caching intersections or projections of the corresponding inverted lists. We propose and study several offline and online algorithms for the resulting weighted caching problem, which turns out to be surprisingly rich in structure. Our experimental evaluation based on a large web crawl and real search engine query log shows significant performance gains for the best schemes, both in isolation and in combination with the other caching levels. We also observe that a careful selection of cache admission and eviction policies is crucial for best overall performance.
Refreshment Policies for Web Content Caches
- In Proceedings of the Twentieth Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2001
, 2001
"... Web content caches are often placed between end-users and origin servers as a mean to reduce server load, network usage, and ultimately, user-perceived latency. Cached objects typically have associated expiration times, after which they are considered stale and must be validated with a remote server ..."
Abstract
-
Cited by 25 (2 self)
- Add to MetaCart
Web content caches are often placed between end-users and origin servers as a mean to reduce server load, network usage, and ultimately, user-perceived latency. Cached objects typically have associated expiration times, after which they are considered stale and must be validated with a remote server (origin or another cache) before they can be sent to a client. A considerable fraction of cache "hits" involve stale copies that turned out to be current. These validations of current objects have small message size, but nonetheless, often induce latency comparable to full-edged cache misses. Thus, the functionality of caches as a latency-reducing mechanism highly depends not only on content availability but also on its freshness. We propose policies for caches to proactively validate selected objects as they become stale, and thus allow for more client requests to be processed locally. Our policies operate within the existing protocols and exploit natural properties of request patterns such as frequency and recency. We evaluated and compared different policies using trace-based simulations.
Storage-Aware Caching: Revisiting Caching For Heterogeneous Storage Systems
, 2002
"... Modern storage environments are composed of a variety of devices with different performance characteristics. In this paper, we explore storage-aware caching algorithms, in which the file buffer replacement algorithm explicitly accounts for differences in performance across devices. We introduce a ne ..."
Abstract
-
Cited by 25 (2 self)
- Add to MetaCart
Modern storage environments are composed of a variety of devices with different performance characteristics. In this paper, we explore storage-aware caching algorithms, in which the file buffer replacement algorithm explicitly accounts for differences in performance across devices. We introduce a new family of storage-aware caching algorithms that partition the cache, with one partition per device. The algorithms set the partition sizes dynamically to balance work across the devices. Through simulation, we show that our storage-aware policies perform similarly to LANDLORD, a cost-aware algorithm previously shown to perform well in Web caching environments. We also demonstrate that partitions can be easily incorporated into the Clock replacement algorithm, thus increasing the likelihood of deploying cost-aware algorithms in modern operating systems.
Performance of Compressed Inverted List Caching in Search Engines
- In WWW
, 2008
"... Due to the rapid growth in the size of the web, web search engines are facing enormous performance challenges. The larger engines in particular have to be able to process tens of thousands of queries per second on tens of billions of documents, making query throughput a critical issue. To satisfy th ..."
Abstract
-
Cited by 25 (11 self)
- Add to MetaCart
Due to the rapid growth in the size of the web, web search engines are facing enormous performance challenges. The larger engines in particular have to be able to process tens of thousands of queries per second on tens of billions of documents, making query throughput a critical issue. To satisfy this heavy workload, search engines use a variety of performance optimizations including index compression, caching, and early termination. We focus on two techniques, inverted index compression and index caching, which play a crucial rule in web search engines as well as other high-performance information retrieval systems. We perform a comparison and evaluation of several inverted list compression algorithms, including new variants of existing algorithms that have not been studied before. We then evaluate different inverted list caching policies on large query traces, and finally study the possible performance benefits of combining compression and caching. The overall goal of this paper is to provide an updated discussion and evaluation of these two techniques, and to show how to select the best set of approaches and settings depending on parameter such as disk speed and main memory cache size.
Page Replacement for General Caching Problems
, 1999
"... Caching (paging) is a well-studied problem in online algorithms, usually studied under the assumption that all pages have a uniform size and a uniform fault cost (uni- form caching). However, recent applications related to the web involve situations in which pages can be of different sizes and cost ..."
Abstract
-
Cited by 24 (2 self)
- Add to MetaCart
Caching (paging) is a well-studied problem in online algorithms, usually studied under the assumption that all pages have a uniform size and a uniform fault cost (uni- form caching). However, recent applications related to the web involve situations in which pages can be of different sizes and costs. This general caching problem seems more intricate than the uniform version. In particular, the offline case itself is NP-hard. Only a few results exist for the general caching problem [8, 17]. This paper seeks to develop good offline page replacement policies for the general caching problem, with the hope that any insight gained here may lead to good online algorithms. Our first main result is that by using only a small amount of additional memory, say O(1) times the largest page size, we can obtain an O(1)-approximation to the general caching problem. Note that the largest page size is typically a very small fraction of the total cache size, say 1%. Our second result is that when no add...
DULO: An effective buffer cache management scheme to exploit both temporal and spatial localities
- In USENIX Conference on File and Storage Technologies (FAST
, 2005
"... Sequentiality of requested blocks on disks, or their spatial locality, is critical to the performance of disks, where the throughput of accesses to sequentially placed disk blocks can be an order of magnitude higher than that of accesses to randomly placed blocks. Unfortunately, spatial locality of ..."
Abstract
-
Cited by 23 (9 self)
- Add to MetaCart
Sequentiality of requested blocks on disks, or their spatial locality, is critical to the performance of disks, where the throughput of accesses to sequentially placed disk blocks can be an order of magnitude higher than that of accesses to randomly placed blocks. Unfortunately, spatial locality of cached blocks is largely ignored and only temporal locality is considered in system buffer cache management. Thus, disk performance for workloads without dominant sequential accesses can be seriously degraded. To address this problem, we propose a scheme called DULO (DUal LOcality), which exploits both temporal and spatial locality in buffer cache management. Leveraging the filtering effect of the buffer cache, DULO can influence the I/O request stream by making the requests passed to disk more sequential, significantly increasing the effectiveness of I/O scheduling and prefetching for disk performance improvements. DULO has been extensively evaluated by both tracedriven simulations and a prototype implementation in Linux 2.6.11. In the simulations and system measurements, various application workloads have been tested, including Web Server, TPC benchmarks, and scientific programs. Our experiments show that DULO can significantly increase system throughput and reduce program execution times. 1
Evaluating Server-Assisted Cache Replacement in the Web
- In Proceedings of the 6th European Symposium on Algorithms
, 1998
"... . 1 To reduce user-perceived latency in retrieving documents on the world wide web, a commonly used technique is caching both at the client's browser and more gainfully (due to sharing) at a proxy. The effectiveness of Web caching hinges on the replacement policy that determines the relative value ..."
Abstract
-
Cited by 17 (6 self)
- Add to MetaCart
. 1 To reduce user-perceived latency in retrieving documents on the world wide web, a commonly used technique is caching both at the client's browser and more gainfully (due to sharing) at a proxy. The effectiveness of Web caching hinges on the replacement policy that determines the relative value of caching different objects. An important component of such policy is to predict next-request times. We propose a caching policy utilizing per-resource statistics of inter-request times. Such statistics can be collected either locally or at the server, and piggybacked to the proxy. Using various Web server logs, we compared existing cache replacement policies with our server-assisted schemes. The experiments show that utilizing the server knowledge of access patterns can greatly improve the effectiveness of proxy caches. Our experimental evaluation uses an interval caching model. Interval caching values the utility of a unit of cache storage as a function of time. Instead of the usual trad...
Exploiting regularities in Web traffic patterns for cache replacement
- Proceedings of 31st ACM STOC
, 1999
"... Caching web pages at proxies and in web servers' memories can greatly enhance performance. Proxy caching is known to reduce network load and both proxy and server caching can significantly decrease latency. web caching problems have different properties than traditional operating systems caching, an ..."
Abstract
-
Cited by 16 (5 self)
- Add to MetaCart
Caching web pages at proxies and in web servers' memories can greatly enhance performance. Proxy caching is known to reduce network load and both proxy and server caching can significantly decrease latency. web caching problems have different properties than traditional operating systems caching, and cache replacement can benet by recognizing and exploiting these differences. We address two aspects of the predictability of traffic patterns: the overall load experienced by large proxy and web servers, and the distinct access patterns of individual pages. We formalize the notion of "cache load" under various replacement policies, including LRU and LFU, and demonstrate that the trace of a large proxy server exhibits regular load. Predictable load allows for improved design, analysis, and experimental evaluation of replacement policies. We provide a simple and (near)-optimal replacement policy when each page request has an associated distribution function on the next request time of the page. Without the predictable load assumption, no such online policy is possible and it is known that even obtaining an offline optimum is hard. For experiments, predictable load enables comparing and evaluating cache replacement policies using partial traces, containing requests made to only a subset of the pages. Our results are based on considering a simpler caching model which we call the interval caching model. We relate traditional and interval-caching policies under predictable load, and derive (near)-optimal replacement policies from their optimal interval-caching counterparts.

