Results 1 - 10
of
275
Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes
, 1996
"... Recently the notion of self-similarity has been shown to apply to wide-area and local-area network traffic. In this paper we examine the mechanisms that give rise to the self-similarity of network traffic. We present a hypothesized explanation for the possible self-similarity of traffic by using a p ..."
Abstract
-
Cited by 1023 (22 self)
- Add to MetaCart
Recently the notion of self-similarity has been shown to apply to wide-area and local-area network traffic. In this paper we examine the mechanisms that give rise to the self-similarity of network traffic. We present a hypothesized explanation for the possible self-similarity of traffic by using a particular subset of wide area traffic: traffic due to the World Wide Web (WWW). Using an extensive set of traces of actual user executions of NCSA Mosaic, reflecting over half a million requests for WWW documents, we examine the dependence structure of WWW traffic. While our measurements are not conclusive, we show evidence that WWW traffic exhibits behavior that is consistent with self-similar traffic models. Then we show that the self-similarity insuch traffic can be explained based on the underlying distributions of WWW document sizes, the effects of caching and user preference in le transfer, the effect of user "think time", and the superimposition of many such transfers in a local area network. To do this we rely on empirically measured distributions both from our traces and from data independently collected at over thirty WWW sites.
Generating Representative Web Workloads for Network and Server Performance Evaluation
, 1997
"... One role for workload generation is as a means for understanding how servers and networks respond to variation in load. This enables management and capacity planning based on current and projected usage. This paper applies a number of observations of Web server usage to create a realistic Web worklo ..."
Abstract
-
Cited by 681 (8 self)
- Add to MetaCart
One role for workload generation is as a means for understanding how servers and networks respond to variation in load. This enables management and capacity planning based on current and projected usage. This paper applies a number of observations of Web server usage to create a realistic Web workload generation tool which mimics a set of real users accessing a server. The tool, called Surge (Scalable URL Reference Generator) generates references matching empirical measurements of 1) server file size distribution; 2) request size distribution; 3) relative file popularity; 4) embedded file references; 5) temporal locality of reference; and 6) idle periods of individual users. This paper reviews the essential elements required in the generation of a representative Web workload. It also addresses the technical challenges to satisfying this large set of simultaneous constraints on the properties of the reference stream, the solutions we adopted, and their associated accuracy. Finally, we present evidence that Surge exercises servers in a manner significantly different from other Web server benchmarks.
On the Scale and Performance of Cooperative Web Proxy Caching
- ACM Symposium on Operating Systems Principles
, 1999
"... While algorithms for cooperative proxy caching have been widely studied, little is understood about cooperative-caching performance in the large-scale World Wide Web environment. This paper uses both trace-based analysis and analytic modelling to show the potential advantages and drawbacks of inter- ..."
Abstract
-
Cited by 250 (15 self)
- Add to MetaCart
While algorithms for cooperative proxy caching have been widely studied, little is understood about cooperative-caching performance in the large-scale World Wide Web environment. This paper uses both trace-based analysis and analytic modelling to show the potential advantages and drawbacks of inter-proxy cooperation. With our traces, we evaluate quantitatively the performance-improvement potential of cooperation between 200 small-organization proxies within a university environment, and between two large-organization proxies handling 23,000 and 60,000 clients, respectively. With our model, we extend beyond these populations to project cooperative caching behavior in regions with millions of clients. Overall, we demonstrate that cooperative caching has performance benefits only within limited population bounds. We also use our model to examine the implications of future trends in Web-access behavior and traffic.
Flash: An efficient and portable Web server
, 1999
"... This paper presents the design of a new Web server architecture called the asymmetric multiprocess event-driven (AMPED) architecture, and evaluates the performance of an implementation of this architecture, the Flash Web server. The Flash Web server combines the high performance of single-process ev ..."
Abstract
-
Cited by 240 (23 self)
- Add to MetaCart
This paper presents the design of a new Web server architecture called the asymmetric multiprocess event-driven (AMPED) architecture, and evaluates the performance of an implementation of this architecture, the Flash Web server. The Flash Web server combines the high performance of single-process event-driven servers on cached workloads with the performance of multi-process and multithreaded servers on disk-bound workloads. Furthermore, the Flash Web server is easily portable since it achieves these results using facilities available in all modern operating systems. The performance of different Web server architectures is evaluated in the context of a single implementation in order to quantify the impact of a server's concurrency architecture on its performance. Furthermore, the performance of Flash is compared with two widely-used Web servers, Apache and Zeus. Results indicate that Flash can match or exceed the performance of existing Web servers by up to 50 % across a wide range of real workloads. We also present results that show the contribution of various optimizations embedded in Flash.
An Empirical Model of HTTP Network Traffic
, 1997
"... The workload of the global Internet is dominated by the Hypertext Transfer Protocol (HTTP), an application protocol used by World Wide Web clients and servers. Simulation studies of this environment will require a model of the traffic patterns of the World Wide Web, in order to investigate the perfo ..."
Abstract
-
Cited by 210 (1 self)
- Add to MetaCart
The workload of the global Internet is dominated by the Hypertext Transfer Protocol (HTTP), an application protocol used by World Wide Web clients and servers. Simulation studies of this environment will require a model of the traffic patterns of the World Wide Web, in order to investigate the performance aspects of this increasingly popular application. We have developed an empirical model of network traffic produced by HTTP. Instead of relying on server or client logs, our approach is based on gathering packet traces of HTTP network conversations. Through traffic analysis, we have determined statistics and distributions for higher-level quantities such as the size of HTTP items retrieved, the number of items per "Web page", think time, and user browsing behavior. These quantities form a model can then be used by simulations to mimic World Wide Web network applications in wide-area IP internetworks. Keywords: World Wide Web, HTTP, traffic model, traffic measurements, workload, Interne...
On the Relationship Between File Sizes, Transport Protocols, and Self-Similar Network Traffic
- In Proc. IEEE International Conference on Network Protocols
, 1996
"... Recent measurements of local-area and wide-area traffic have shown that network traffic exhibits variability at a wide range of scales. In this paper, we examine a mechanism that gives rise to self-similar network traffic and present some of its performance implications. The mechanism we study is th ..."
Abstract
-
Cited by 193 (21 self)
- Add to MetaCart
Recent measurements of local-area and wide-area traffic have shown that network traffic exhibits variability at a wide range of scales. In this paper, we examine a mechanism that gives rise to self-similar network traffic and present some of its performance implications. The mechanism we study is the transfer of files or messages whose size is drawn from a heavy-tailed distribution. First, we show that in a “realistic ” client/server network environment—i.e., one with bounded resources and coupling among traffic sources competing for resources—the degree to which file sizes are heavy-tailed can directly determine the degree of traffic self-similarity at the link level. We show that this causal relationship is robust with respect to changes in network resources (bottleneck bandwidth and
SPAND: Shared Passive Network Performance Discovery
- IN USENIX SYMPOSIUM ON INTERNET TECHNOLOGIES AND SYSTEMS
, 1997
"... In the Internet today, users and applications must often make decisions based on the performance they expect to receive from other Internet hosts. For example, users can often view many Web pages in low-bandwidth or high-bandwidth versions, while other pages present users with long lists of mirror s ..."
Abstract
-
Cited by 190 (8 self)
- Add to MetaCart
In the Internet today, users and applications must often make decisions based on the performance they expect to receive from other Internet hosts. For example, users can often view many Web pages in low-bandwidth or high-bandwidth versions, while other pages present users with long lists of mirror sites to chose from. Current techniques to perform these decisions are often ad hoc or poorly designed. The most common solution used today is to require the user to manually make decisions based on their own experience and whatever information is provided by the application. Previous efforts to automate this decision-making process have relied on isolated, active network probes from a host. Unfortunately, this method of making measurements has several problems. Active probing introduces unnecessary network traffic that can quickly become a significant part of the total traffic handled by busy Web servers. Probing from a single host results in less accurate information and more redundant network probes than a system that shares information with nearby hosts. In this paper, we propose a system called SPAND (Shared Passive Network Performance Discovery) that determines network characteristics by making shared, passive measurements from a collection of hosts. In this paper, we show why using passive measurements from a collection of hosts has advantages over using active measurements from a single host. We also show that sharing measurements can significantly increase the accuracy and timeliness of predictions. In addition, we present a initial prototype design of SPAND, the current implementation status of our system, and initial performance results that show the potential benefits of SPAND.
Potential benefits of delta encoding and data compression for HTTP (Corrected version)
, 1997
"... ..."
Removal Policies in Network Caches for World-Wide Web Documents
, 1996
"... World-Wide Web proxy servers that cache documents can potentially reduce three quantities: the number of requests that reach popular servers, the volume of network traffic resulting from document requests, and the latency that an end-user experiences in retrieving a document. This paper examines the ..."
Abstract
-
Cited by 186 (4 self)
- Add to MetaCart
World-Wide Web proxy servers that cache documents can potentially reduce three quantities: the number of requests that reach popular servers, the volume of network traffic resulting from document requests, and the latency that an end-user experiences in retrieving a document. This paper examines the first two using the measures of cache hit rate (or fraction of client-requested URLs returned by the proxy) and weighted cache hit rate (or fraction of client-requested bytes returned by the proxy). A client request for an uncached document may cause the removal of one or more cached documents. Variable document sizes and types allow a rich variety of policies to select a document for removal, in contrast to policies for CPU caches or demand paging, that manage homogeneous objects. We present a taxonomy of removal policies. Through trace-driven simulation, we determine the maximum possible hit rate and weighted hit rate that a cache could ever achieve, and the removal policy that maximizes hi...

