Summary cache: A scalable wide-area web cache sharing protocol
1998
Cited by 596 (2 self)
The sharing of caches among Web proxies is an important technique to reduce Web traffic and alleviate network bottlenecks. Nevertheless, it is not widely deployed due to the overhead of existing protocols. In this paper we propose a new protocol called "Summary Cache"; each proxy keeps a summary of the URLs of cached documents of each participating proxy and checks these summaries for potential hits before sending any queries. Two factors contribute to the low overhead: the summaries are updated only periodically, and the summary representations are economical, as low as 8 bits per entry. Using trace-driven simulations and a prototype implementation, we show that compared to the existing Internet Cache Protocol (ICP), Summary Cache reduces the number of inter-cache messages by a factor of 25 to 60, reduces the bandwidth consumption by over 50%, and eliminates 30% to 95% of the CPU overhead, while at the same time maintaining almost the same hit ratio as ICP. Hence Summary Cache enables cache sharing among a large number of proxies.
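The summary check described in this abstract can be illustrated with a small sketch. It assumes a Bloom filter as the compact summary representation; the class name `UrlSummary`, the function `peers_to_query`, the filter size, and the hash construction are all hypothetical choices for illustration, not the protocol's actual parameters.

```python
import hashlib


class UrlSummary:
    """Illustrative Bloom-filter summary of the URLs cached by one proxy."""

    def __init__(self, num_bits: int = 8 * 1024, num_hashes: int = 4) -> None:
        # Assumption: SHA-256 yields 32 bytes, so at most 8 four-byte slices.
        if not 1 <= num_hashes <= 8:
            raise ValueError("num_hashes must be between 1 and 8")
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, url: str):
        # Derive the bit positions from disjoint slices of a single digest.
        digest = hashlib.sha256(url.encode("utf-8")).digest()
        for i in range(self.num_hashes):
            chunk = digest[4 * i: 4 * i + 4]
            yield int.from_bytes(chunk, "big") % self.num_bits

    def add(self, url: str) -> None:
        for pos in self._positions(url):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, url: str) -> bool:
        # False means "definitely not cached"; True may be a false positive.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(url)
        )


def peers_to_query(url: str, summaries: dict) -> list:
    """Return only the peers whose summary suggests a possible hit."""
    return [peer for peer, summary in summaries.items()
            if summary.might_contain(url)]
```

A proxy holding one such summary per peer would send a query only to the peers returned by `peers_to_query`, instead of querying every peer on each local miss. Because summaries are refreshed only periodically, a real deployment would also have to tolerate stale answers in both directions.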
On the Scale and Performance of Cooperative Web Proxy Caching
ACM Symposium on Operating Systems Principles, 1999
Cited by 250 (15 self)
While algorithms for cooperative proxy caching have been widely studied, little is understood about cooperative-caching performance in the large-scale World Wide Web environment. This paper uses both trace-based analysis and analytic modelling to show the potential advantages and drawbacks of inter-proxy cooperation. With our traces, we evaluate quantitatively the performance-improvement potential of cooperation between 200 small-organization proxies within a university environment, and between two large-organization proxies handling 23,000 and 60,000 clients, respectively. With our model, we extend beyond these populations to project cooperative caching behavior in regions with millions of clients. Overall, we demonstrate that cooperative caching has performance benefits only within limited population bounds. We also use our model to examine the implications of future trends in Web-access behavior and traffic.
Giggle: A Framework for Constructing Scalable Replica Location Services
2002
Cited by 122 (36 self)
In wide area computing systems, it is often desirable to create remote read-only copies (replicas) of files. Replication can be used to reduce access latency, improve data locality, and/or increase robustness, scalability and performance for distributed applications. We define a replica location service (RLS) as a system that maintains and provides access to information about the physical locations of copies. An RLS typically functions as one component of a data grid architecture. This paper makes the following contributions. First, we characterize RLS requirements. Next, we describe a parameterized architectural framework, which we name Giggle (for GIGa-scale Global Location Engine), within which a wide range of RLSs can be defined. We define several concrete instantiations of this framework with different performance characteristics. Finally, we present initial performance results for an RLS prototype, demonstrating that RLS systems can be constructed that meet performance goals.
Beyond hierarchies: Design considerations for distributed caching on the internet
In Proceedings of the 19th International Conference on Distributed Computing Systems (ICDCS), 1998
Cited by 100 (6 self)
In this paper, we examine several distributed caching strategies to improve the response time for accessing data over the Internet. By studying several Internet caches and workloads, we derive four basic design principles for large scale distributed ...
Design considerations for distributed caching on the Internet
In ICDCS, 1999
Cited by 91 (17 self)
In this paper, we describe the design and implementation of an integrated architecture for cache systems that scale to hundreds or thousands of caches with thousands to millions of users. Rather than simply try to maximize hit rates, we take an end-to-end approach to improving response time by also considering hit times and miss times. We begin by studying several Internet caches and workloads, and we derive three core design principles for large scale distributed caches: (1) minimize the number of hops to locate and access data on both hits and misses, (2) share data among many users and scale to many caches, and (3) cache data close to clients. Our strategies for addressing these issues are built around a scalable, high-performance data-location service that tracks where objects are replicated. We describe how to construct such a service and how to use this service to provide direct access to remote data and push-based data replication. We evaluate our system through trace-driven simulation and find that these strategies together provide response time speedups of 1.27 to 2.43 compared to a traditional three-level cache hierarchy for a range of trace workloads and simulated environments.
Cache Digests
Computer Networks and ISDN Systems, 1998
Cited by 86 (0 self)
This paper presents Cache Digest, a novel protocol and optimization technique for cooperative Web caching. Cache Digest allows proxies to make information about their cache contents available to peers in a compact form. A peer uses digests to identify neighbors that are likely to have a given document. Cache Digest is a promising alternative to traditional per-request query/reply schemes such as ICP. We discuss the design ideas behind Cache Digest and its implementation in the Squid proxy cache. The performance of Cache Digest is compared to ICP using real-world Web caches operated by NLANR. Our analysis shows that Cache Digest outperforms ICP in several categories. Finally, we outline improvements to the techniques we are currently working on.

1 Introduction
One of the most difficult problems in the design of Web cache hierarchies is efficiently locating objects held in neighbor caches. When a cache needs to forward a request, how does it know whether to use a sibling, a parent, or p...
Web Caching and Content Distribution: A View From the Interior
Computer Communications, 2000
Cited by 54 (4 self)
Research in Web caching has yielded analytical tools to model the behavior of large-scale Web caches. Recently, Wolman et al. have proposed an analytical model and used it to evaluate the potential of cooperative Web proxy caching for large populations. This paper shows how to apply the Wolman model to study the behavior of interior cache servers in multi-level caching systems. Focusing on interior caches gives a different perspective on the model's implications, and it allows three new uses of the model. First, we apply the model to large-scale caching systems in which the interior nodes belong to third-party content distribution services. Second, we explore the effectiveness of content distribution services as conventional Web proxy caching becomes more prevalent. Finally, we correlate the model's predictions of interior cache behavior with empirical observations from the root caches of the NLANR cache hierarchy.
A Taste of Crispy Squid
In Proceedings of the Workshop on Internet Server Performance, 1998
Cited by 42 (2 self)
Distributed proxy caches are in use throughout the world to reduce access latency and bandwidth demands for Internet object transfer. The CRISP project seeks to build more effective distributed Web caches by exploring alternatives to the hierarchical structure and multicast handling of probes common to the most popular distributed Web cache systems. CRISP caches are structured as a collective of autonomous Web proxy servers sharing their cache directories through a common mapping service that can be queried with at most one message exchange. Individual servers may be configured to replicate all or part of the global map in order to balance access cost, overhead and hit ratio, depending on the size and geographic dispersion of the collective cache. We have prototyped several CRISP cache structures in Crispy Squid, an extension to the Squid Internet Object Cache. We are evaluating these cache structures using Proxycizer, a full-featured package for replaying traces of observed request tr...
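The common mapping service described in this abstract can be pictured as a shared directory from URLs to the proxies that hold them. The following is a minimal in-process sketch under that assumption; `MappingService` and its methods are hypothetical names for illustration, not part of Crispy Squid or Squid.

```python
from collections import defaultdict


class MappingService:
    """Illustrative global map from URL to the proxies caching it."""

    def __init__(self) -> None:
        self._directory = defaultdict(set)

    def register(self, url: str, proxy: str) -> None:
        # Called when a proxy stores a new object.
        self._directory[url].add(proxy)

    def unregister(self, url: str, proxy: str) -> None:
        # Called when a proxy evicts an object; drop empty entries.
        holders = self._directory.get(url)
        if holders is None:
            return
        holders.discard(proxy)
        if not holders:
            del self._directory[url]

    def lookup(self, url: str) -> list:
        # A single lookup stands in for the single message exchange.
        return sorted(self._directory.get(url, ()))
```

On a local miss, a proxy would call `lookup` once and then fetch the object either from a listed peer or from the origin server. Replicating all or part of the map at each proxy, as the abstract mentions, would trade update overhead for lower lookup cost.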
Directory Structures for Scalable Internet Caches
1997
Cited by 24 (4 self)
Use of Internet caches is a cheap and effective way to improve performance for all Internet users. Distributed caches offer the potential to serve larger user communities and to deliver higher hit ratios on shared Web documents. The key to building effective distributed caches is a directory structure that allows individual caching servers to locate objects cached at neighboring sites, combining them into a logically unified collective cache. This paper uses Web traces to evaluate a range of alternatives for managing directories in distributed Internet caches. We use trace-driven executions and simulations of prototype caches to compare multicast-based queries of local maps (Harvest) with unicast queries of a global map (CRISP). We then use properties of the traces to predict performance of CRISP variants in which the global map is partitioned or replicated. Finally, we propose a novel lazy CRISP structure based on weakly consistent replication of the most valuable subset of the global ...
A Decentralized, Adaptive Replica Location Mechanism
In Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing (HPDC-11), 2002
Cited by 18 (2 self)
We describe a decentralized, adaptive mechanism for replica location in wide-area distributed systems. Unlike traditional, hierarchical (e.g., DNS) and more recent (e.g., CAN, Chord, Gnutella) distributed search and indexing schemes, nodes in our location mechanism do not route queries; instead, they organize into an overlay network and distribute location information. We contend that this approach works well in environments where replica location queries are prevalent but the dynamic component of the system (e.g., node and network failures, replica add/delete operations) cannot be neglected. We argue that a replica location mechanism that combines probabilistic representations of replica location information with soft-state protocols and a flat overlay network of nodes brings important benefits: genuine decentralization, low query latency, and flexibility to introduce adaptive communication schedules. We support these claims in two ways. First, we provide a rough resource consumption evaluation: we show that, for environments similar to those encountered in large scientific data analysis projects, generated network traffic is limited and, more importantly, is comparable to the traffic generated by a request routing scheme. Second, we provide encouraging performance data from a prototype implementation.

