A survey of web caching schemes for the internet
ACM Computer Communication Review, 1999
Cited by 200 (1 self)
The World Wide Web can be considered as a large distributed information system that provides access to shared data objects. As one of the most popular applications currently running on the Internet, the World Wide Web is growing exponentially in size, which results in network congestion and server overloading. Web caching has been recognized as one of the effective schemes to alleviate the service bottleneck and reduce the network traffic, thereby minimizing the user access latency. In this paper, we first describe the elements of a Web caching system and its desirable properties. Then, we survey the state-of-the-art techniques which have been used in Web caching systems. Finally, we discuss the research frontier
Web Prefetching Between Low-Bandwidth Clients and Proxies: Potential and Performance
1999
Cited by 91 (0 self)
The majority of the Internet population access the World Wide Web via dial-up modem connections. Studies have shown that the limited modem bandwidth is the main contributor to latency perceived by users. In this paper, we investigate one approach to reduce latency: prefetching between caching proxies and browsers. The approach relies on the proxy to predict which cached documents a user might reference next, and takes advantage of the idle time between user requests to push or pull the documents to the user. Using traces of modem Web accesses, we evaluate the potential of the technique at reducing client latency, examine the design of prediction algorithms, and investigate their performance varying the parameters and implementation concerns. Our results show that prefetching combined with large browser cache and delta-compression can reduce client latency up to 23.4%. The reduction is achieved using the Prediction-by-Partial-Matching (PPM) algorithm, whose accuracy ranges from 40% to ...
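The prediction step described in this abstract can be illustrated with a small sketch. This is a simplified PPM-style predictor, not the authors' implementation: the class name `PPMPredictor`, the `max_order` and `threshold` parameters, and the back-off rule (try the longest matching context first, then shorter ones) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


class PPMPredictor:
    """Simplified PPM-style next-URL predictor (illustrative sketch only)."""

    def __init__(self, max_order=2):
        self.max_order = max_order
        # counts[context][next_url] = times next_url followed that context
        self.counts = defaultdict(Counter)

    def train(self, session):
        """Record, for every request, the 1..max_order requests that preceded it."""
        for i in range(1, len(session)):
            for order in range(1, self.max_order + 1):
                if i - order < 0:
                    break
                context = tuple(session[i - order:i])
                self.counts[context][session[i]] += 1

    def predict(self, history, threshold=0.5):
        """Return (url, probability) for the best candidate, or None.

        The longest context seen in training is tried first; shorter
        contexts are used only when the longer one gives no candidate
        whose estimated probability reaches the threshold.
        """
        for order in range(min(self.max_order, len(history)), 0, -1):
            followers = self.counts.get(tuple(history[-order:]))
            if not followers:
                continue
            total = sum(followers.values())
            url, count = followers.most_common(1)[0]
            if count / total >= threshold:
                return url, count / total
        return None
```

A proxy could call `predict` during the idle time between user requests and push the returned document only when a candidate is found; raising `threshold` trades fewer prefetches for less wasted bandwidth.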
Prefetching Hyperlinks
1999
Cited by 84 (0 self)
This paper develops a new method for prefetching Web pages into the client cache. Clients send reference information to Web servers, which aggregate the reference information in near-real-time and then disperse the aggregated information to all clients, piggybacked on GET responses. The information indicates how often hyperlink URLs embedded in pages have been previously accessed relative to the embedding page. Based on knowledge about which hyperlinks are generally popular, clients initiate prefetching of the hyperlinks and their embedded images according to any algorithm they prefer. Both client and server may cap the prefetching mechanism's space overhead and waste of network resources due to speculation. The result of these differences is improved prefetching: lower client latency (by 52.3%) and less wasted network bandwidth (24.0%).
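The aggregation idea in this abstract (how often each embedded hyperlink is followed relative to views of the embedding page) can be sketched as follows. The names `ReferenceAggregator`, `usage_report` and `choose_prefetch`, and the `min_ratio` and `max_links` caps, are hypothetical; the abstract leaves the client-side selection algorithm open, so the threshold rule here is only one possible choice.

```python
from collections import defaultdict


class ReferenceAggregator:
    """Server-side tally of page views and of links followed from each page."""

    def __init__(self):
        self.page_hits = defaultdict(int)
        self.link_hits = defaultdict(lambda: defaultdict(int))

    def record_page(self, page):
        self.page_hits[page] += 1

    def record_follow(self, page, link):
        self.link_hits[page][link] += 1

    def usage_report(self, page):
        """Fraction of views of `page` that were followed by each link.

        In the scheme described above, this is the kind of summary a
        server would piggyback on a GET response.
        """
        views = self.page_hits.get(page, 0)
        if views == 0:
            return {}
        followed = self.link_hits.get(page, {})
        return {link: hits / views for link, hits in followed.items()}


def choose_prefetch(report, min_ratio=0.25, max_links=2):
    """Client-side policy: prefetch the most popular links, capped in number."""
    candidates = [(ratio, link) for link, ratio in report.items() if ratio >= min_ratio]
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [link for _, link in candidates[:max_links]]
```

The `max_links` cap stands in for the limits on speculation that the abstract says both client and server may impose.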
Distributions of Surfers' Paths through the World Wide Web: Empirical Characterizations
World Wide Web, 1999
Cited by 52 (2 self)
Surfing the World Wide Web (WWW) involves traversing hyperlink connections among documents. The ability to predict surfing patterns could solve many problems facing producers and consumers of WWW content. We analyzed WWW server logs for a WWW site, collected over ten days, to compare different path reconstruction methods and to investigate how past surfing behavior predicts future surfing choices. Since log files do not explicitly contain user paths, various methods have evolved to reconstruct user paths. Session times, number of clicks per visit, and Levenshtein Distance analyses were performed to show the impact of various reconstruction methods. Different methods for measuring surfing patterns were also compared. Markov model approximations were used to model the probability of users choosing links conditional on past surfing paths. Information-theoretic (entropy) measurements suggest that information is gained by using longer paths to estimate the conditional probability of link choice given surf path. The improvements diminish, however, as one increases the length of path beyond one. Information-theoretic (Total Divergence to the Average entropy) measurements suggest that the conditional probabilities of link choice given surf path are more stable over time for shorter paths than longer paths. Direct examination of the accuracy of the conditional probability models in predicting test data also suggests that shorter paths yield more stable models and can be estimated reliably with less data than longer paths.
Keywords: WWW, paths, prediction, user modeling, log file analysis
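The entropy measurement mentioned in this abstract can be made concrete with a short sketch. It estimates the conditional entropy of the next page given the previous `order` pages from a set of reconstructed paths. The function name `conditional_entropy` and the toy input format (each path as a list of page identifiers) are assumptions; the paper's exact estimator and data preparation are not given in the abstract.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(paths, order):
    """Estimate H(next page | previous `order` pages) in bits.

    Each path is a list of page identifiers. Lower values mean the
    preceding pages tell us more about the next link choice.
    """
    # context_counts[context][next_page] = number of observations
    context_counts = defaultdict(Counter)
    for path in paths:
        for i in range(order, len(path)):
            context_counts[tuple(path[i - order:i])][path[i]] += 1

    total = sum(sum(c.values()) for c in context_counts.values())
    if total == 0:
        return 0.0

    entropy = 0.0
    for followers in context_counts.values():
        n = sum(followers.values())
        for count in followers.values():
            p = count / n
            # Weight each context by how often it was observed.
            entropy -= (n / total) * p * math.log2(p)
    return entropy
```

Comparing the value for `order=1` against `order=2` on the same logs shows how much uncertainty a longer path removes. Longer contexts are observed less often, so their estimates rest on less data, which matches the abstract's remark that shorter paths can be estimated reliably with less data.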
Analysis and Characterization of Large-Scale Web Server Access Patterns and Performance
World Wide Web, 1999
Cited by 40 (7 self)
In this paper we develop a general methodology for characterizing the access patterns of Web server requests based on a time-series analysis of finite collections of observed data from real systems. Our approach is used together with the access logs from the IBM Web site for the Olympic Games to demonstrate some of its advantages over previous methods and to construct a particular class of benchmarks for large-scale heavily-accessed Web server environments. We then apply an instance of this class of benchmarks to analyze aspects of large-scale Web server performance, demonstrating some additional problems with commonly used methods to evaluate Web server performance at different request traffic intensities.
Alleviating the latency and bandwidth problems in WWW browsing
Proceedings of the 1997 USENIX Symposium on Internet Technology and Systems, 1997
Cited by 25 (0 self)
This work addresses three problems that are associated with Web browsing: (a) low bandwidth available to the end user who is connected via slow modems or outdoor wireless networks, (b) long and variable latencies in document access, and (c) temporary disconnections of mobile users. Three techniques are used with a variety of heuristics in order to overcome these problems: (a) profiling user and group access patterns and using these profiles in order to pre-fetch documents, (b) filtering HTTP requests and responses in order to reduce data transmission over bottleneck links, and (c) hoarding documents based on user profiles in order to support limited web browsing even during disconnection. In this paper, we describe the design and implementation of a WWW proxy-based system that incorporates the above techniques. We describe our experiences with the proxy system, and present performance results that show an improvement in the experience of Web browsing using this system.
Enabling Scalable Online Personalization on the Web
2000
Cited by 13 (1 self)
Online personalization is of great interest to e-companies. Virtually all personalization technologies are based on the idea of storing as much historical customer session data as possible, and then querying the data store as customers navigate through a web site. The holy grail of on-line personalization is an environment where fine-grained, detailed historical session data can be queried based on current online navigation patterns to formulate real-time responses. Unfortunately, as more consumers become e-shoppers, the user load and the amount of historical data continue to increase, causing scalability-related problems for almost all current personalization technologies. This paper chronicles the development of a real-time interaction management engine through the integration of historical data and on-line visitation patterns of e-commerce site visitors. This paper describes the scientific underpinnings of the system, as well as the architecture and a performance evaluation....
Measuring Similarity of Interests for Clustering Web-Users
Proceedings of the 12th Australian Database Conference ADC 2001, 2001
Cited by 13 (2 self)
There has been an increased demand for understanding of web-users due to the web development and the increased number of web-based applications. Informative knowledge extracted from web user access patterns has been used for many applications, such as the prefetching of pages between clients and proxies. This paper presents an approach for measuring similarity of interests among web users, based on the interest items collected from web users' access logs. A matrix-based algorithm is then developed to cluster web users such that the users in the same cluster are closely related with respect to the similarity measure. As an application example, a web document pre-fetching technique is proposed that utilizes the similarity measure and clusters obtained. Experiments have been conducted and the results have shown that our clustering method is capable of clustering web users with similar interests, and the prefetching method is practical.
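As a rough illustration of the pipeline this abstract outlines (a similarity measure over users' interest items, then clustering on a similarity matrix), the sketch below uses cosine similarity over page-visit counts and groups users whose similarity reaches a threshold. Both choices are stand-ins: the abstract does not specify the paper's similarity measure or its matrix-based algorithm, and the names `cosine_similarity`, `cluster_users` and `threshold` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two {page: visit_count} profiles."""
    shared = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def cluster_users(profiles, threshold=0.5):
    """Group users connected by pairwise similarity >= threshold.

    Builds the full user-by-user similarity matrix, then returns the
    connected components of the graph it induces (an assumed, simple
    clustering rule, not the paper's algorithm).
    """
    users = sorted(profiles)
    sim = {a: {b: cosine_similarity(profiles[a], profiles[b]) for b in users}
           for a in users}

    clusters, seen = [], set()
    for start in users:
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            current = stack.pop()
            group.append(current)
            for other in users:
                if other not in seen and sim[current][other] >= threshold:
                    seen.add(other)
                    stack.append(other)
        clusters.append(sorted(group))
    return clusters
```

Under this sketch, a prefetcher could treat the pages popular within a user's cluster as candidates for that user, which is one way to read the application example in the abstract.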
Potential and Limits of Web Prefetching Between Low-Bandwidth Clients and Proxies
In Proceedings of the Third International WWW Caching Workshop, 1998
Cited by 12 (0 self)
The majority of the Internet population access the World Wide Web via dial-up modem connections. Studies have shown that limited modem bandwidth is the main contributing factor to the Web access latency perceived by the users. In this paper, we investigate one approach to reduce the user-perceived latency: pre-pushing from the proxy to the browsers. The approach takes advantage of the idle time between user Web requests and uses prediction algorithms to predict what document a user might reference next. It then relies on proxies to send ("push") the documents to the user. Using existing modem Web access traces, we evaluate the potential of the technique at reducing user latency, examine the design of prediction algorithms and measure their accuracy as well as overhead, and evaluate the latency reduction of pre-push schemes using the algorithms. Our results show that with perfect predictors, proxy-based Web pre-pushing with a 256K-byte pre-push buffer at the browser side can reduce lat...

