Web servers under overload: How scheduling can help, 2003
Cited by 51 (4 self)
Most well-managed web servers perform well most of the time. Occasionally, however, every popular web server experiences transient overload. An overloaded web server typically displays signs of its affliction within a few seconds. Work enters the web server at a greater rate than the web server can complete it, causing the number of connections at the server to build up. This implies large delays for clients accessing the server. This paper provides a systematic performance study of exactly what happens when a web server is run under transient overload, both from the perspective of the server and from the perspective of the client. Second, this paper proposes and evaluates a particular kernel-level solution for improving the performance of web servers under overload. The solution is based on SRPT connection scheduling. We show that SRPT-based scheduling improves overload performance across a variety of client and server-oriented metrics.
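As a rough illustration of the SRPT idea this abstract refers to, the sketch below simulates a single preemptive server and compares mean response time under SRPT and FCFS. It is a toy model, not the paper's kernel-level implementation: the function names, the `(arrival_time, size)` job format, and the example workload are assumptions made here for illustration.

```python
import heapq

def srpt_mean_response(jobs):
    """Mean response time under preemptive SRPT on one server.

    jobs: non-empty list of (arrival_time, size) pairs (hypothetical format).
    """
    jobs = sorted(jobs)
    heap = []            # entries are (remaining_work, arrival_time)
    t = 0.0              # current simulation time
    i = 0                # index of the next job that has not arrived yet
    total = 0.0
    n = len(jobs)
    while i < n or heap:
        if not heap:
            # Server is idle: jump to the next arrival.
            t = max(t, jobs[i][0])
        # Admit every job that has arrived by time t.
        while i < n and jobs[i][0] <= t:
            heapq.heappush(heap, (jobs[i][1], jobs[i][0]))
            i += 1
        # Serve the job with the least remaining work until it finishes
        # or until the next arrival, whichever comes first.
        remaining, arrival = heapq.heappop(heap)
        next_arrival = jobs[i][0] if i < n else float("inf")
        run = min(remaining, next_arrival - t)
        t += run
        if run < remaining:
            heapq.heappush(heap, (remaining - run, arrival))
        else:
            total += t - arrival
    return total / n

def fcfs_mean_response(jobs):
    """Mean response time under first-come-first-served, for comparison."""
    t = 0.0
    total = 0.0
    for arrival, size in sorted(jobs):
        t = max(t, arrival) + size
        total += t - arrival
    return total / len(jobs)

# One long request followed by two short ones (made-up workload).
example = [(0, 10), (1, 1), (2, 1)]
print(srpt_mean_response(example), fcfs_mean_response(example))
```

In this made-up workload the two short requests no longer wait behind the long one, which is the intuition behind favoring requests with little remaining work.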
Analysis of LAS Scheduling for Job Size Distributions with High Variance, 2003
Cited by 44 (8 self)
Recent studies of Internet traffic have shown that flow size distributions often exhibit a high variability property in the sense that most of the flows are short and more than half of the total load is constituted by a small percentage of the largest flows. In the light of this observation, it is interesting to revisit scheduling policies that are known to favor small jobs in order to quantify the benefit for small and the penalty for large jobs. Among all scheduling policies that do not require knowledge of job size, the least attained service (LAS) scheduling policy is known to favor small jobs the most.
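A minimal sketch of least attained service (LAS) may help make the policy concrete. It assumes a discrete-time single server with integer arrival times and a fixed quantum. The function name, the job format, and the tie-breaking rule (earlier arrival first) are assumptions for illustration and are not taken from the paper. Note that the scheduler never looks at a job's size when choosing what to run; size is only used to detect completion.

```python
def las_completion_times(jobs, quantum=1):
    """Completion time of each job under LAS, in input order.

    jobs: list of (arrival_time, size) pairs with integer values
    (hypothetical format chosen for this sketch).
    """
    n = len(jobs)
    attained = [0] * n      # service received so far by each job
    done = [None] * n       # completion time, None while unfinished
    t = 0
    unfinished = n
    while unfinished:
        active = [j for j in range(n) if jobs[j][0] <= t and done[j] is None]
        if not active:
            # Nothing to run: jump to the next arrival.
            t = min(jobs[j][0] for j in range(n) if done[j] is None)
            continue
        # Run the job with the least attained service; ties go to the
        # earlier arrival (an assumption of this sketch).
        j = min(active, key=lambda k: (attained[k], jobs[k][0]))
        attained[j] += quantum
        t += quantum
        if attained[j] >= jobs[j][1]:
            done[j] = t
            unfinished -= 1
    return done

# A long job arrives first, then two shorter ones (made-up workload).
print(las_completion_times([(0, 4), (1, 1), (2, 2)]))
```

In this example the short jobs finish at times 2 and 5, while the long job that arrived first finishes last at time 7. This shows the benefit for small jobs and the penalty for large ones that the abstract sets out to quantify.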
Open Versus Closed: A Cautionary Tale - In NSDI, 2006
Cited by 23 (0 self)
Workload generators may be classified as based on a closed system model, where new job arrivals are only triggered by job completions (followed by think time), or an open system model, where new jobs arrive independently of job completions. In general, system designers pay little attention to whether a workload generator is closed or open. Using a combination of implementation and simulation experiments, we illustrate that there is a vast difference in behavior between open and closed models in real-world settings. We synthesize these differences into eight simple guiding principles, which serve three purposes. First, the principles specify how scheduling policies are impacted by closed and open models, and explain the differences in user level performance. Second, the principles motivate the use of partly open system models, whose behavior we show to lie between that of closed and open models. Finally, the principles provide guidelines to system designers for determining which system model is most appropriate for a given workload.
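The distinction between the two models can be sketched with a toy single-server FCFS simulation. This is not the paper's experimental setup: the function names, the fixed service time, the evenly spaced open arrivals, and the parameter values are all assumptions chosen to keep the example deterministic.

```python
import heapq

def open_arrivals(interarrival, count):
    """Open model: arrival times are fixed in advance, independent of completions."""
    return [k * interarrival for k in range(count)]

def fcfs_open(arrivals, service):
    """Response times for an open arrival stream at one FCFS server."""
    free_at = 0.0
    responses = []
    for a in arrivals:
        free_at = max(free_at, a) + service
        responses.append(free_at - a)
    return responses

def fcfs_closed(users, think, service, horizon):
    """Closed model: each user resubmits only after its previous job
    completes and a think time has elapsed."""
    pending = [(0.0, u) for u in range(users)]   # (submit_time, user)
    heapq.heapify(pending)
    free_at = 0.0
    responses = []
    while pending:
        submit, user = heapq.heappop(pending)
        if submit > horizon:
            break
        free_at = max(free_at, submit) + service
        responses.append(free_at - submit)
        # The next submission is triggered by this completion.
        heapq.heappush(pending, (free_at + think, user))
    return responses

# Open: jobs arrive twice as fast as the server can complete them.
open_resp = fcfs_open(open_arrivals(0.5, 20), service=1.0)
# Closed: three users with zero think time.
closed_resp = fcfs_closed(users=3, think=0.0, service=1.0, horizon=10.0)
print(max(open_resp), max(closed_resp))
```

In the closed run at most three jobs are ever in the system, so response time is bounded by three service times. In the open run the backlog keeps growing for as long as arrivals outpace the server.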
SWIFT: Scheduling in Web Servers for Fast Response Time, 2003
Cited by 21 (0 self)
This paper addresses the problem of how to service web requests quickly in order to minimize the client response time. Some recent work uses Shortest Remaining Processing Time (SRPT) scheduling in Web servers to give preference to requests for short files. However, by considering only the size of the file when determining the priority of requests, previous work fails to capture potentially useful scheduling information contained in the interaction between networks and end systems. To address this, this paper proposes and implements an algorithm, SWIFT, that considers server and network characteristics together. Our approach prioritizes requests based on the size of the file requested and the distance of the client from the server. The implementation is at the kernel level for finer-grained control over the packets entering the network. We present the results of experiments conducted in a WAN environment to test the efficacy of SWIFT. The results show that for large files, SWIFT improves on the SRPT scheme by 2.5% to 10% for the tested server loads.
Quantifying fairness in queueing systems: Principles and applications - RUTCOR, Rutgers University, 2004
Cited by 16 (9 self)
In this paper we discuss fairness in queues, view it in the perspective of social justice at large and survey the recently published research work and publications dealing with the issue of measuring fairness of queues. The emphasis is placed on the underlying principles of the different measuring approaches, on reviewing their methodology and on examining their applicability and intuitive appeal. Some quantitative results are also presented. The paper has three major parts (sections) and a short concluding discussion. In the first part, fairness in queues and its importance are discussed in the broader context of the prevailing conception of social justice at large. A special effort, including illustrative examples, is made to differentiate between fairness of the queue and fairness at large, which derives from favoring the more needy. The second part is dedicated to explaining and discussing the three main properties expected of a fairness measure: conformity to the general concept of social justice, granularity, and intuitive appeal and rationality. The third part reviews the fairness of the queue evaluation and
Workload-aware load balancing for clustered web servers - IEEE Trans. Parallel Distrib. Syst, 2005
Cited by 15 (5 self)
We focus on load balancing policies for homogeneous clustered Web servers that tune their parameters on-the-fly to adapt to changes in the arrival rates and service times of incoming requests. The proposed scheduling policy, ADAPTLOAD, monitors the incoming workload and self-adjusts its balancing parameters according to changes in the operational environment such as rapid fluctuations in the arrival rates or document popularity. Using actual traces from the 1998 World Cup Web site, we conduct a detailed characterization of the workload demands and demonstrate how online workload monitoring can play a significant part in meeting the performance challenges of robust policy design. We show that the proposed load balancing policy, based on statistical information derived from recent workload history, provides performance benefits similar to those of locality-aware allocation schemes, without requiring locality data. Extensive experimentation indicates that ADAPTLOAD results in an effective scheme, even when servers must support both static and dynamic Web pages. Index Terms: Clustered Web servers, self-managing clusters, load balance, locality awareness, workload characterization, static and dynamic pages.
Handling Multiple Bottlenecks in Web Servers Using Adaptive Inbound Controls, 2002
Cited by 14 (1 self)
Web servers become overloaded when one or several server resources are overutilized. In this paper we present an adaptive architecture that prevents resource overutilization in web servers by performing admission control based on application-level information found in HTTP headers and knowledge about resource consumption of requests. In addition, we use an efficient early discard mechanism that consumes only a small amount of resources when rejecting requests. This mechanism first comes into play when the request rate is very high in order to avoid making uninformed request rejections that might abort ongoing sessions. We present our dual admission control architecture and various experiments that show that it can sustain high throughput and low response times even during high load.
Size-based Scheduling Policies with Inaccurate Scheduling Information - In Proc. of IEEE Mascots, 2004
Cited by 13 (5 self)
Size-based scheduling policies such as SRPT have been studied since the 1960s and have been applied in various arenas including packet networks and web server scheduling. SRPT has been proven to be optimal in the sense that it yields, compared to any other conceivable strategy, the smallest mean value of occupancy and therefore also of waiting and delay time. One important prerequisite to applying size-based scheduling is to know the sizes of all jobs in advance, which are unfortunately not always available.
Analysis of join-the-shortest-queue routing for web server farms - In PERFORMANCE 2007. IFIP WG 7.3 International Symposium on Computer Modeling, Measurement and Evaluation, 2007
The foreground-background queue: a survey, 2006
Cited by 11 (6 self)
Computer systems researchers have begun to apply the Foreground-Background (FB) scheduling discipline to a variety of applications, and as a result, there has been a resurgence in theoretical research studying FB. In this paper, we bring together results from both of these research streams to provide a survey of state-of-the-art theoretical results characterizing the performance of FB. Our emphasis throughout is on the impact of these results on computer systems.

