Summary cache: A scalable wide-area web cache sharing protocol
, 1998
Cited by 596 (2 self)
The sharing of caches among Web proxies is an important technique to reduce Web traffic and alleviate network bottlenecks. Nevertheless, it is not widely deployed due to the overhead of existing protocols. In this paper we propose a new protocol called "Summary Cache"; each proxy keeps a summary of the URLs of cached documents of each participating proxy and checks these summaries for potential hits before sending any queries. Two factors contribute to the low overhead: the summaries are updated only periodically, and the summary representations are economical -- as low as 8 bits per entry. Using trace-driven simulations and a prototype implementation, we show that compared to the existing Internet Cache Protocol (ICP), Summary Cache reduces the number of inter-cache messages by a factor of 25 to 60, reduces the bandwidth consumption by over 50%, and eliminates between 30% and 95% of the CPU overhead, while at the same time maintaining almost the same hit ratio as ICP. Hence Summary Cache enables cache sharing among a large number of proxies.
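The idea of checking per-peer summaries before sending any query can be sketched roughly as follows. This is an illustrative sketch only: the abstract above does not say how the summaries are encoded, so a Bloom filter is assumed here as one compact representation, and the names `UrlSummary` and `peers_to_query`, the filter size, and the hash count are all hypothetical.

```python
import hashlib


class UrlSummary:
    """Compact summary of the URLs cached at one proxy.

    Assumption: a Bloom filter is used as the encoding; sizes are arbitrary.
    """

    def __init__(self, num_bits=8 * 1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, url):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{url}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, url):
        for pos in self._positions(url):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, url):
        # False means "definitely not cached"; True may be a false positive.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(url)
        )


def peers_to_query(url, summaries):
    """Return the peers whose summary reports a potential hit for url."""
    return [name for name, s in summaries.items() if s.might_contain(url)]


summaries = {"proxy-a": UrlSummary(), "proxy-b": UrlSummary()}
summaries["proxy-a"].add("http://example.com/index.html")
print(peers_to_query("http://example.com/index.html", summaries))  # ['proxy-a']
```

Only peers listed by `peers_to_query` would receive a query, which is where the message savings over a query-every-peer scheme would come from. Because the summaries are refreshed only periodically, a real deployment would also have to tolerate stale entries as well as false positives.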
Cache Digests
- Computer Networks and ISDN Systems
, 1998
Cited by 86 (0 self)
This paper presents Cache Digest, a novel protocol and optimization technique for cooperative Web caching. Cache Digest allows proxies to make information about their cache contents available to peers in a compact form. A peer uses digests to identify neighbors that are likely to have a given document. Cache Digest is a promising alternative to traditional per-request query/reply schemes such as ICP. We discuss the design ideas behind Cache Digest and its implementation in the Squid proxy cache. The performance of Cache Digest is compared to ICP using real-world Web caches operated by NLANR. Our analysis shows that Cache Digest outperforms ICP in several categories. Finally, we outline improvements to the techniques we are currently working on.
1 Introduction
One of the most difficult problems in the design of Web cache hierarchies is efficiently locating objects held in neighbor caches. When a cache needs to forward a request, how does it know whether to use a sibling, a parent, or p...
WebWave: Globally Load Balanced Fully Distributed Caching of Hot Published Documents
- In Proceedings of the 17th International Conference on Distributed Computing Systems
, 1997
Cited by 45 (1 self)
Document publication service over such a large network as the Internet challenges us to harness available server and network resources to meet fast growing demand. In this paper, we show that large-scale dynamic caching can be employed to globally minimize server idle time, and hence maximize the aggregate server throughput of the whole service. To be efficient, scalable and robust, a successful caching mechanism must have three properties: (1) maximize the global throughput of the system, (2) find cache copies without recourse to a directory service, or to a discovery protocol, and (3) be completely distributed in the sense of operating only on the basis of local information. In this paper, we develop a precise definition, which we call tree load-balance (TLB), of what it means for a mechanism to satisfy these three goals. We present an algorithm that computes TLB off-line, and a distributed protocol that induces a load distribution that converges quickly to a TLB one. Both algorithms...
Placement Algorithms for Hierarchical Cooperative Caching
, 1999
Cited by 44 (7 self)
Consider a hierarchical network in which each node periodically issues a request for an object drawn from a fixed set of unit-size objects. Suppose further that the following conditions are satisfied: the frequency with which each node accesses each object is known; each node has a cache of known capacity; any cache can be accessed by any node; any request is satisfied by the closest node with a copy of the desired object, at a cost proportional to the distance between the accessing node and the closest copy. In such an environment, it is desirable to fill the available cache space with copies of objects in such a way that the average access cost is minimized. We provide both exact and approximate polynomial-time algorithms for this hierarchical placement problem. Our exact algorithm is based on a reduction to min-cost flow, and does not appear to be practical for large problem sizes. Thus we are motivated to search for a faster approximation algorithm. Our main result is a simple constant-factor approximation algorithm for the hierarchical placement problem that admits an efficient distributed implementation.
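The objective described above (each request served by the closest copy, at a cost proportional to distance) can be written down as a small evaluation function. This is a sketch of the cost being minimized, not of the paper's placement algorithms. The function name, the dictionary layout, and the `miss_cost` charged when no node holds an object are assumptions made for illustration, and cache capacities are not checked.

```python
def average_access_cost(freq, dist, placement, miss_cost):
    """Average cost per request for a given placement.

    freq[i][obj]  -- rate at which node i requests obj
    dist[i][j]    -- distance between nodes i and j
    placement[j]  -- set of objects cached at node j
    miss_cost     -- hypothetical cost when no node holds the object
    """
    total_cost = 0.0
    total_requests = 0.0
    for i, demands in freq.items():
        for obj, rate in demands.items():
            holders = [j for j, objs in placement.items() if obj in objs]
            # The request is satisfied by the closest node holding a copy.
            nearest = min((dist[i][j] for j in holders), default=miss_cost)
            total_cost += rate * nearest
            total_requests += rate
    return total_cost / total_requests if total_requests else 0.0


dist = {"a": {"a": 0, "b": 2}, "b": {"a": 2, "b": 0}}
freq = {"a": {"x": 3}, "b": {"x": 1}}

# Caching x at the heavier requester is cheaper on average.
print(average_access_cost(freq, dist, {"a": {"x"}, "b": set()}, miss_cost=10))  # 0.5
print(average_access_cost(freq, dist, {"a": set(), "b": {"x"}}, miss_cost=10))  # 1.5
```

A placement algorithm would search over the `placement` argument, subject to each node's capacity, for the assignment that minimizes this value.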
A Taste of Crispy Squid
- In Proceedings of the Workshop on Internet Server Performance
, 1998
Cited by 42 (2 self)
Distributed proxy caches are in use throughout the world to reduce access latency and bandwidth demands for Internet object transfer. The CRISP project seeks to build more effective distributed Web caches by exploring alternatives to the hierarchical structure and multicast handling of probes common to the most popular distributed Web cache systems. CRISP caches are structured as a collective of autonomous Web proxy servers sharing their cache directories through a common mapping service that can be queried with at most one message exchange. Individual servers may be configured to replicate all or part of the global map in order to balance access cost, overhead and hit ratio, depending on the size and geographic dispersion of the collective cache. We have prototyped several CRISP cache structures in Crispy Squid, an extension to the Squid Internet Object Cache. We are evaluating these cache structures using Proxycizer, a full-featured package for replaying traces of observed request tr...
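The structure described above, in which autonomous proxies share their cache directories through a common mapping service that answers in a single exchange, might look roughly like the sketch below. It is a toy model under stated assumptions: `MappingService`, `Proxy`, and their methods are hypothetical names, the map is a single in-process dictionary, and the replication and partitioning options mentioned in the abstract are not modeled.

```python
class MappingService:
    """Hypothetical global map from URL to a proxy that caches it."""

    def __init__(self):
        self._directory = {}

    def publish(self, url, proxy_name):
        self._directory[url] = proxy_name

    def lookup(self, url):
        # One request/reply: returns a proxy name, or None if unmapped.
        return self._directory.get(url)


class Proxy:
    """Autonomous proxy that consults the shared map on a local miss."""

    def __init__(self, name, mapping):
        self.name = name
        self.mapping = mapping
        self.cache = set()

    def resolve(self, url):
        """Return where the object is served from: 'local', a peer, or 'origin'."""
        if url in self.cache:
            return "local"
        holder = self.mapping.lookup(url)  # the single exchange with the map
        source = holder if holder is not None else "origin"
        # After fetching, cache the object and advertise it in the map.
        self.cache.add(url)
        self.mapping.publish(url, self.name)
        return source


mapping = MappingService()
a, b = Proxy("a", mapping), Proxy("b", mapping)
url = "http://example.com/page"
print(a.resolve(url))  # origin
print(b.resolve(url))  # a
print(b.resolve(url))  # local
```

Unlike a probe-all-neighbors scheme, a local miss here costs one lookup regardless of how many proxies participate. The trade-off is that the map itself has to be kept reachable and reasonably fresh.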
A Web caching primer
- IEEE Internet Computing
, 2001
Cited by 27 (5 self)
This material is posted here with permission of the IEEE. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by sending an email message to pubs-permissions@ieee.org.
Directory Structures for Scalable Internet Caches
, 1997
Cited by 24 (4 self)
Use of Internet caches is a cheap and effective way to improve performance for all Internet users. Distributed caches offer the potential to serve larger user communities and to deliver higher hit ratios on shared Web documents. The key to building effective distributed caches is a directory structure that allows individual caching servers to locate objects cached at neighboring sites, combining them into a logically unified collective cache. This paper uses Web traces to evaluate a range of alternatives for managing directories in distributed Internet caches. We use trace-driven executions and simulations of prototype caches to compare multicast-based queries of local maps (Harvest) with unicast queries of a global map (CRISP). We then use properties of the traces to predict performance of CRISP variants in which the global map is partitioned or replicated. Finally, we propose a novel lazy CRISP structure based on weakly consistent replication of the most valuable subset of the global ...
One to Many Reliable Bulk-Data Transfer in the MBone
- Proceedings of the Third International Workshop on High Performance Protocol Architectures, HIPPARCH '97
, 1997
Cited by 19 (0 self)
In this paper we describe and evaluate the performance of a protocol for reliable bulk-data transfer from one sender to many receivers simultaneously, using the Internet multicast infrastructure (MBone). The protocol, which features a TCP-friendly congestion control algorithm, is designed to scale fully with respect to the number of receivers. For this reason both reliability and congestion control are carried out by the receivers, which keeps the sender out of the congestion control feedback loop and relieves it of the burden of carrying out retransmissions for all receivers. The receiver-driven congestion control algorithm presented is based on a redundant layered organisation of data, where redundancy is used to spread the information being sent across a large number of data units, so that receivers can accept some units rather than others and still complete the reception. Reliability is based on a pro...
A Secure, Publisher-Centric Web Caching Infrastructure
Cited by 18 (2 self)
The current Web cache infrastructure, though it has a number of performance benefits, does not address many of the publishers' requirements. We argue that web caches should be enhanced to address publishers' needs. For example, caches will need to log client accesses, run scripts to dynamically produce content, and give publishers QoS guarantees. In this paper, we propose Gemini, a publisher-centric web caching infrastructure. Central to our design is the architectural assumption that the global web cache infrastructure will be heterogeneous, like the Internet itself: caches will belong to many different administrative domains and have different functionalities. The heterogeneous aspect of the infrastructure raises several issues. For example, because caches can alter content, traditional end-to-end security mechanisms can no longer ensure the integrity and authenticity of content. In this paper, we study issues associated with designing such a publisher-centric caching infrastructure. In particular, we propose a security architecture that protects publishers and caches from each other in a heterogeneous caching environment. In our design, we ensure that Gemini is incrementally deployable and seamlessly interoperates with the existing caching infrastructure. Along with a system design, we also present experience gained from implementation and preliminary performance results.

