Summary cache: A scalable wide-area web cache sharing protocol
, 1998
Cited by 596 (2 self)
The sharing of caches among Web proxies is an important technique to reduce Web traffic and alleviate network bottlenecks. Nevertheless, it is not widely deployed due to the overhead of existing protocols. In this paper we propose a new protocol called "Summary Cache"; each proxy keeps a summary of the URLs of cached documents of each participating proxy and checks these summaries for potential hits before sending any queries. Two factors contribute to the low overhead: the summaries are updated only periodically, and the summary representations are economical, as low as 8 bits per entry. Using trace-driven simulations and a prototype implementation, we show that compared to the existing Internet Cache Protocol (ICP), Summary Cache reduces the number of inter-cache messages by a factor of 25 to 60, reduces the bandwidth consumption by over 50%, and eliminates between 30% and 95% of the CPU overhead, while at the same time maintaining almost the same hit ratio as ICP. Hence Summary Cache enables cache sharing among a large number of proxies.
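A minimal sketch of the summary idea described in this abstract, assuming a Bloom filter as the compact representation (the abstract itself only says the summaries are economical, so the data structure, the hash choice, and the names `UrlSummary` and `peers_to_query` are illustrative assumptions, not the paper's implementation). A proxy consults its copies of peer summaries and contacts only the peers that report a potential hit:

```python
import hashlib


class UrlSummary:
    """Bloom-filter summary of the URLs held in one proxy's cache (hypothetical)."""

    def __init__(self, num_bits: int, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, url: str):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{url}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, url: str) -> None:
        for pos in self._positions(url):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def may_contain(self, url: str) -> bool:
        # False means "definitely absent"; True means "possibly cached".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(url)
        )


def peers_to_query(url: str, summaries: dict) -> list:
    """Return the names of peers whose summary reports a potential hit."""
    return [name for name, summary in summaries.items() if summary.may_contain(url)]


# Size each summary at 8 bits per expected entry, the figure quoted in the abstract.
proxy_a = UrlSummary(num_bits=8 * 100)
proxy_b = UrlSummary(num_bits=8 * 100)
proxy_a.add("http://example.com/index.html")
candidates = peers_to_query(
    "http://example.com/index.html", {"proxy-a": proxy_a, "proxy-b": proxy_b}
)
```

A summary of this kind can report false hits but never misses a URL that was added to it, which fits the abstract's claim of nearly the same hit ratio as ICP with far fewer messages. Because summaries are updated only periodically, a real deployment would also have to tolerate stale entries; that is not modeled here.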
Cost-Aware WWW Proxy Caching Algorithms
- In Proceedings of the 1997 USENIX Symposium on Internet Technology and Systems
, 1997
Cited by 433 (6 self)
Web caches can not only reduce network traffic and downloading latency, but can also affect the distribution of web traffic over the network through cost-aware caching. This paper introduces GreedyDual-Size, which incorporates locality with cost and size concerns in a simple and non-parameterized fashion for high performance. Trace-driven simulations show that with the appropriate cost definition, GreedyDual-Size outperforms existing web cache replacement algorithms in many aspects, including hit ratios, latency reduction and network cost reduction. In addition, GreedyDual-Size can potentially improve the performance of main-memory caching of Web documents.
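The abstract does not spell out the replacement rule, so the following is only a sketch of GreedyDual-Size as it is commonly described: each cached document carries a value H = L + cost/size, the document with the smallest H is evicted, and the inflation value L is raised to the evicted H. The class name, the `access` method, and the default cost of 1.0 are assumptions made for illustration.

```python
class GreedyDualSizeCache:
    """Sketch of a GreedyDual-Size style cache (hypothetical interface)."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity = capacity_bytes
        self.used = 0
        self.inflation = 0.0  # the running value L
        self.entries = {}  # url -> (H, size, cost)

    def _priority(self, size: int, cost: float) -> float:
        # H = L + cost / size; size is assumed to be positive.
        return self.inflation + cost / size

    def access(self, url: str, size: int, cost: float = 1.0) -> bool:
        """Return True on a hit; on a miss, insert the document if it can fit."""
        if url in self.entries:
            # A hit restores the document's H using the current L.
            _, size, cost = self.entries[url]
            self.entries[url] = (self._priority(size, cost), size, cost)
            return True
        if size > self.capacity:
            return False  # too large to cache at all
        while self.used + size > self.capacity:
            # Evict the document with the smallest H and raise L to that value.
            victim = min(self.entries, key=lambda u: self.entries[u][0])
            h, victim_size, _ = self.entries.pop(victim)
            self.inflation = h
            self.used -= victim_size
        self.entries[url] = (self._priority(size, cost), size, cost)
        self.used += size
        return False


cache = GreedyDualSizeCache(capacity_bytes=100)
cache.access("a", size=60)
cache.access("b", size=30)
cache.access("c", size=40)  # evicts "a", the entry with the smallest H
```

With a uniform cost of 1.0 the policy favors small documents, which tends to raise the hit ratio; a cost such as download latency or hop count would instead target the latency and network cost reductions mentioned in the abstract.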
A survey of web caching schemes for the internet
- ACM Computer Communication Review
, 1999
Cited by 200 (1 self)
The World Wide Web can be considered as a large distributed information system that provides access to shared data objects. As one of the most popular applications currently running on the Internet, the World Wide Web is growing exponentially in size, which results in network congestion and server overloading. Web caching has been recognized as one of the effective schemes to alleviate the service bottleneck and reduce network traffic, thereby minimizing user access latency. In this paper, we first describe the elements of a Web caching system and its desirable properties. Then, we survey the state-of-the-art techniques that have been used in Web caching systems. Finally, we discuss the research frontier
ns Notes and Documentation
, 2000
Cited by 167 (0 self)
This document (ns Notes and Documentation) provides reference documentation for ns. Although we begin with a simple simulation script, resources like Marc Greis's tutorial web pages (at http://titan.cs.uni-bonn.de/~greis/ns/ns.html) or the slides from one of the ns tutorials are probably better places to begin for the ns novice.
Workload Characterization of the 1998 World Cup Web Site
- IEEE Network
, 1999
Cited by 157 (5 self)
Keywords: Web, workload characterization, performance, servers, caching, World Cup. © Copyright Hewlett-Packard Company 1999. This paper presents a detailed workload characterization study of the 1998 World Cup Web site. Measurements from this site were collected over a three-month period. During this time the site received 1.35 billion requests, making this the largest Web workload analyzed to date. By examining this extremely busy site and through comparison with existing characterization studies we are able to determine how Web server workloads are evolving. We find that improvements in the caching architecture of the World-Wide Web are changing the workloads of Web servers, but that major improvements to that architecture are still necessary. In particular, we uncover evidence that a better consistency mechanism is required for World-Wide Web caches.
The Content and Access Dynamics of a Busy Web Site: Findings and Implications
, 2000
Cited by 104 (9 self)
In this paper, we study the dynamics of the MSNBC news site, one of the busiest Web sites in the Internet today. Unlike many other efforts that have analyzed client accesses as seen by proxies, we focus on the server end. We analyze the dynamics of both the server content and client accesses made to the server. The former considers the content creation and modification process while the latter considers page popularity and locality in client accesses. Some of our key results are: (a) files tend to change little when they are modified, (b) a small set of files tends to get modified repeatedly, (c) file popularity follows a Zipf-like distribution with a parameter α that is much larger than reported in previous, proxy-based studies, and (d) there is significant temporal stability in file popularity but not much stability in the domains from which clients access the popular content. We discuss the implications of these findings for techniques such as Web caching (including cache consisten...
Beyond hierarchies: Design considerations for distributed caching on the internet
- In Proceedings of the 19th International Conference on Distributed Computing Systems (ICDCS)
, 1998
Cited by 100 (6 self)
In this paper, we examine several distributed caching strategies to improve the response time for accessing data over the Internet. By studying several Internet caches and workloads, we derive four basic design principles for large scale distributed
The ns Manual
, 2000
Cited by 100 (0 self)
This document (ns Notes and Documentation) provides reference documentation for ns. Although we begin with a simple simulation script, resources like Marc Greis's tutorial web pages (originally at his web site, now at http://www.isi.edu/nsnam/ns/tutorial/) or the slides from one of the ns tutorials are probably better places to begin for the ns novice.
Design considerations for distributed caching on the Internet
- In ICDCS
, 1999
Cited by 91 (17 self)
In this paper, we describe the design and implementation of an integrated architecture for cache systems that scale to hundreds or thousands of caches with thousands to millions of users. Rather than simply try to maximize hit rates, we take an end-to-end approach to improving response time by also considering hit times and miss times. We begin by studying several Internet caches and workloads, and we derive three core design principles for large scale distributed caches: (1) minimize the number of hops to locate and access data on both hits and misses, (2) share data among many users and scale to many caches, and (3) cache data close to clients. Our strategies for addressing these issues are built around a scalable, high-performance data-location service that tracks where objects are replicated. We describe how to construct such a service and how to use this service to provide direct access to remote data and push-based data replication. We evaluate our system through trace-driven simulation and find that these strategies together provide response time speedups of 1.27 to 2.43 compared to a traditional three-level cache hierarchy for a range of trace workloads and simulated environments.
Improving end-to-end performance of the web using server volumes and proxy filters
- Tech. Rep. 980206-01, AT&T Labs - Research
, 1998
Cited by 91 (13 self)
The rapid growth of the World Wide Web has caused serious performance degradation on the Internet. This paper offers an end-to-end approach to improving Web performance by collectively examining the Web components (clients, proxies, servers, and the network). Our goal is to reduce user-perceived latency and the number of TCP connections, improve cache coherency and cache replacement, and enable prefetching of resources that are likely to be accessed in the near future. In our scheme, server response messages include piggybacked information customized to the requesting proxy. Our enhancement to the existing request-response protocol requires no per-proxy state at a server and only a very small amount of transient per-server state at the proxy, and can be implemented without changes to HTTP 1.1. The server groups related resources into volumes (based on access patterns and the file system's directory structure) and applies a proxy-generated filter (indicating the type of information of interest to the proxy) to tailor the piggyback information. We present efficient data structures for constructing server volumes and applying proxy filters, and a transparent way to perform volume maintenance and piggyback generation at a router along the path between the proxy and the server. We demonstrate the effectiveness of our end-to-end approach by evaluating various volume construction and filtering techniques across a collection of large client and server logs.

