Resilient Peer-to-Peer Streaming
In Proc. of IEEE ICNP, 2003
Cited by 124 (3 self)
We consider the problem of distributing "live" streaming media content to a potentially large and highly dynamic population of hosts. Peer-to-peer content distribution is attractive in this setting because the bandwidth available to serve content scales with demand. A key challenge, however, is making content distribution robust to peer transience. Our approach to providing robustness is to introduce redundancy, both in network paths and in data. We use multiple, diverse distribution trees to provide redundancy in network paths and multiple description coding (MDC) to provide redundancy in data. We present
Web servers under overload: How scheduling can help
2003
Cited by 51 (4 self)
Most well-managed web servers perform well most of the time. Occasionally, however, every popular web server experiences transient overload. An overloaded web server typically displays signs of its affliction within a few seconds. Work enters the web server at a greater rate than the web server can complete it, causing the number of connections at the server to build up. This implies large delays for clients accessing the server. First, this paper provides a systematic performance study of exactly what happens when a web server is run under transient overload, both from the perspective of the server and from the perspective of the client. Second, this paper proposes and evaluates a particular kernel-level solution for improving the performance of web servers under overload. The solution is based on SRPT connection scheduling. We show that SRPT-based scheduling improves overload performance across a variety of client and server-oriented metrics.
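As a rough illustration of the scheduling idea named in the abstract above, the following sketch compares mean response time under FIFO and under preemptive SRPT (shortest remaining processing time) on a single idealized server. It is a toy queue simulation, not the kernel-level connection scheduler the paper evaluates; the function names `fifo_mean_response` and `srpt_mean_response` and the job format `(arrival_time, size)` are assumptions made for this example.

```python
import heapq


def fifo_mean_response(jobs):
    """Mean response time when jobs run to completion in arrival order.

    jobs: list of (arrival_time, size) pairs (hypothetical format).
    """
    clock = 0.0
    total = 0.0
    for arrival, size in sorted(jobs):
        # The server starts the job once it is free and the job has arrived.
        clock = max(clock, arrival) + size
        total += clock - arrival
    return total / len(jobs)


def srpt_mean_response(jobs):
    """Mean response time under preemptive SRPT on one server.

    At every arrival or completion, the job with the least remaining
    work is the one that runs next.
    """
    jobs = sorted(jobs)
    pending = []  # min-heap of (remaining_work, arrival_time)
    clock = 0.0
    total = 0.0
    i = 0
    n = len(jobs)
    while i < n or pending:
        if not pending:
            # Server is idle: jump ahead to the next arrival.
            clock = max(clock, jobs[i][0])
        # Admit every job that has arrived by now.
        while i < n and jobs[i][0] <= clock:
            heapq.heappush(pending, (jobs[i][1], jobs[i][0]))
            i += 1
        remaining, arrival = heapq.heappop(pending)
        next_arrival = jobs[i][0] if i < n else float("inf")
        # Run until the job finishes or a new arrival may preempt it.
        run = min(remaining, next_arrival - clock)
        clock += run
        if run < remaining:
            heapq.heappush(pending, (remaining - run, arrival))
        else:
            total += clock - arrival
    return total / n


if __name__ == "__main__":
    # One long job followed by two short ones (made-up workload).
    workload = [(0, 10), (1, 1), (2, 1)]
    print("FIFO mean response:", fifo_mean_response(workload))
    print("SRPT mean response:", srpt_mean_response(workload))
```

In this made-up workload the two short jobs no longer wait behind the long one under SRPT, which is the intuition behind favoring requests with little remaining work during overload.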
Dynamic Surge Protection: An Approach to Handling Unexpected Workload Surges with Resource Actions that Have Lead Times
2003
Cited by 19 (3 self)
Today's information technology departments have widely varying demands for resources due to unexpected surges in subscriber demands (e.g., a large response to a product promotion). Further complicating matters is that many resource actions done in response to surges (e.g., provisioning or de-provisioning an application server) have substantial delays (lead times) between initiating the resource action and its taking effect. This paper describes dynamic surge protection, an approach to handling unexpected workload surges in systems that have lead times for resource actions. Dynamic surge protection incorporates three technologies: adaptive short-term forecasting, on-line capacity planning, and configuration management. The paper includes empirical results from evaluations done on a research testbed, including favorable comparisons with a threshold-based heuristic. The results from an extended test also show that service objectives can be maintained cost-effectively.
Experimental Evaluation of an Adaptive Flash Crowd Protection System
2003
Cited by 3 (0 self)
Network early warning system (NEWS) is an adaptive flash crowd protection system. Unlike approaches using manually configured request rate limits, NEWS regulates incoming requests by observing response performance, automatically adapting to changing traffic mixes. We have previously studied NEWS performance through simulation; this paper presents an implementation of NEWS on a Linux-based router. We evaluate this implementation in testbed experiments with an HTTP server log recorded during a flash crowd. Our first contribution is to use implementation and testbed experiments to evaluate NEWS performance in a server memory-limited scenario, which was not considered in our previous simulation study. Our results show that NEWS is effective in both network- and server-limited scenarios. Second, we evaluate the run-time cost of NEWS traffic monitoring in practice, and find that it consumes little CPU time and relatively little memory. Finally, we extend core NEWS algorithms to include a simple hot-spot identification function to protect bystander traffic from flash crowds efficiently.
Effectiveness of Dynamic Resource Allocation for Handling Internet Flash Crowds
2003
Cited by 3 (0 self)
Internet data centers host multiple Web applications on shared hardware resources. These data centers are typically provisioned to meet the expected peak demands of the hosted applications based on normal time-of-day effects. Such an over-provisioning approach is not robust to flash crowd scenarios, where the load increase of some hosted applications is much higher than their expected peak loads. In such scenarios, data centers can utilize their resources better by employing dynamic resource allocation. In this paper, we present a prototype data center implementation that we use to study the effectiveness of dynamic resource allocation for handling flash crowds with different characteristics. This prototype implements a multi-tiered server architecture along with mechanisms for monitoring, load detection, load balancing and dynamic allocation. Our experiments with this prototype show that a carefully designed dynamic allocation scheme can be effective for handling flash crowds. We show that in order to handle very sharp growth in loads, a dynamic allocation scheme must be either extremely responsive or employ low overhead mechanisms such as using hot spare servers. On the other hand, gradually increasing flash crowds can be handled equally well with larger overheads and slower reaction times. We also show that even in the presence of large allocation overhead, it is possible to achieve the same application performance by either allocating multiple servers simultaneously or allocating a few servers often. Using our results, we conclude that even without large-scale over-provisioning, it is possible to effectively handle flash crowd conditions using a dynamic allocation scheme that responds quickly to workload changes, and that can mask large allocation ove...
Guardian: A Router Mechanism for Extreme Overload Prevention
Proceedings of Scalability and Traffic Control in IP Networks, 2002
Cited by 2 (0 self)
Disasters such as the 9/11 attacks, as well as major and unpredictable events, can cause extreme network overload. By "extreme overload" we mean, first, that the offered load at a link is significantly higher than the link's capacity, and second, that the average throughput per session is too low. Under such conditions, the network can suffer from a form of "livelock" in which even though links are fully utilized, most users cannot complete their transfers. The underlying reasons are that the network carries many retransmitted packets, and that it services flows that are finally aborted by users or applications.
The SCADS Director: Scaling a Distributed Storage System Under Stringent Performance Requirements
Cited by 2 (0 self)
Elasticity of cloud computing environments provides an economic incentive for automatic resource allocation of stateful systems running in the cloud. However, these systems have to meet strict performance Service-Level Objectives (SLOs) expressed using upper percentiles of request latency, such as the 99th. Such latency measurements are very noisy, which complicates the design of the dynamic resource allocation. We design and evaluate the SCADS Director, a control framework that reconfigures the storage system on-the-fly in response to workload changes using a performance model of the system. We demonstrate that such a framework can respond to both unexpected data hotspots and diurnal workload patterns without violating strict performance SLOs.
A Collaborative Approach to Stochastic Load Balancing with Networked Queues of Autonomous Service
Cited by 1 (0 self)
Load balancing has been an increasingly important issue for handling computationally intensive tasks in a distributed system such as in Grid and cluster computing. In such systems, multiple server instances are installed for handling requests from client applications, and each request (or task) typically needs to stay in a queue before an available server is assigned to process it. In this paper, we propose a high-performance queueing method for implementing a shared queue for collaborative clusters of servers. Each cluster of servers maintains a local queue and queues of different clusters are networked to form a unified (or shared) queue that may dispatch tasks to all available servers. We propose a new randomized algorithm for forwarding requests in an overcrowded local queue to a networked queue based on load information of the local and neighboring clusters. The algorithm achieves both load balancing and locality awareness.
Advanced RDMA-based Admission Control for Modern Data-Centers
Cited by 1 (1 self)
Current data-centers employ admission control mechanisms to maintain low response time and high throughput under overloaded scenarios. Existing mechanisms use internal (on the overloaded server) or external (on the front-end proxies) admission control approaches. External admission control is preferred since it can be performed transparently without any modifications to the overloaded server, and global decisions can be made based on the load information of all the back-end servers. However, external admission control mechanisms are bound to use the TCP/IP communication protocol to get the load information from the back-end servers and rely on coarse-grained load monitoring due to the overheads associated with fine-grained load monitoring. In this paper, we provide a fine-grained external admission control mechanism by leveraging the one-sided RDMA feature of high-speed interconnects and consequently provide superior performance, response time guarantees and overload control in a data-center environment. Our design is implemented over InfiniBand-based clusters working in conjunction with Apache-based servers. Experimental evaluations with single file, world cup and zipf traces show that our admission control can improve the response time by up to 28%, 17% and 23%, respectively, as compared to performing TCP/IP-based admission control, and 51%, 36% and 42%, respectively, as compared to the base performance without any admission control. Further, our evaluations also show that the RDMA-based admission control mechanism can provide better QoS guarantees as compared to TCP/IP-based admission control and no admission control approaches.
Web Content Adaptation for Peak Loads Trading Quality for Performance
This work addresses the need of a web site to serve unpredictably high load volumes, such as those caused by a Slashdot effect or other exceptional event. We focus on content adaptation methods which reduce the quality of page content as a tradeoff for a boost in server performance, in terms of the number of concurrent users it can successfully serve. Based on an estimate of the cost of each type of server activity, defined as utilization of various hardware and system resources, we develop a strategy for effective optimization of the site. The main components of our optimizations are a reduction in the number of HTTP requests as a result of eliminating some embedded objects, and a reduction in the number of transmitted bytes per page, achieved by compressing graphical and other media content. We then build an experimental setup to validate the effectiveness of the proposed optimization methods, comparing the performance of the Apache web server when serving the original page content versus the content at two levels of optimization. Results show a reduction of 91% in the average response time for a single transaction, and an increase of 340% in the number of user transactions per second that could be successfully served.

