Results 1 - 10
of
126
Resource Containers: A New Facility for Resource Management in Server Systems
, 1999
"... General-purpose operating systems provide inadequate support for resource management in large-scale servers. Applications lack sufficient control over scheduling and management of machine resources, which makes it difficult to enforce priority policies, and to provide robust and controlled service. ..."
Abstract
-
Cited by 391 (9 self)
- Add to MetaCart
General-purpose operating systems provide inadequate support for resource management in large-scale servers. Applications lack sufficient control over scheduling and management of machine resources, which makes it difficult to enforce priority policies, and to provide robust and controlled service. There is a fundamental mismatch between the original design assumptions underlying the resource management mechanisms of current general-purpose operating systems, and the behavior of modern server applications. In particular, the operating system's notions of protection domain and resource principal coincide in the process abstraction. This coincidence prevents a process that manages large numbers of network connections, for example, from properly allocating system resources among those connections. We propose and evaluate a new operating system abstraction called a resource container, which separates the notion of a protection domain from that of a resource principal. Resource containers ...
Flash: An efficient and portable Web server
, 1999
"... This paper presents the design of a new Web server architecture called the asymmetric multiprocess event-driven (AMPED) architecture, and evaluates the performance of an implementation of this architecture, the Flash Web server. The Flash Web server combines the high performance of single-process ev ..."
Abstract
-
Cited by 240 (23 self)
- Add to MetaCart
This paper presents the design of a new Web server architecture called the asymmetric multiprocess event-driven (AMPED) architecture, and evaluates the performance of an implementation of this architecture, the Flash Web server. The Flash Web server combines the high performance of single-process event-driven servers on cached workloads with the performance of multi-process and multithreaded servers on disk-bound workloads. Furthermore, the Flash Web server is easily portable since it achieves these results using facilities available in all modern operating systems. The performance of different Web server architectures is evaluated in the context of a single implementation in order to quantify the impact of a server's concurrency architecture on its performance. Furthermore, the performance of Flash is compared with two widely-used Web servers, Apache and Zeus. Results indicate that Flash can match or exceed the performance of existing Web servers by up to 50 % across a wide range of real workloads. We also present results that show the contribution of various optimizations embedded in Flash.
Application performance and flexibility on Exokernel systems
- In Proceedings of the Sixteenth ACM Symposium on Operating Systems Principles
, 1997
"... The exokernel operating system architecture safely gives untrusted software efficient control over hardware and software resources by separating management from protection. This paper describes an exokernel system that allows specialized applications to achieve high performance without sacrificing t ..."
Abstract
-
Cited by 168 (9 self)
- Add to MetaCart
The exokernel operating system architecture safely gives untrusted software efficient control over hardware and software resources by separating management from protection. This paper describes an exokernel system that allows specialized applications to achieve high performance without sacrificing the performance of unmodified UNIX programs. It evaluates the exokernel architecture by measuring end-to-end application performance on Xok, an exokernel for Intel x86-based computers, and by comparing Xok’s performance to the performance of two widely-used 4.4BSD UNIX systems (Free-BSD and OpenBSD). The results show that common unmodified UNIX applications can enjoy the benefits of exokernels: applications either perform comparably on Xok/ExOS and the BSD UNIXes, or perform significantly better. In addition, the results show that customized applications can benefit substantially from control over their resources (e.g., a factor of eight for a Web server). This paper also describes insights about the exokernel approach gained through building three different exokernel systems, and presents novel approaches to resource multiplexing. 1
Scalable kernel performance for Internet servers under realistic loads
, 1998
"... UNIX Internet servers with an event-driven architecture often perform poorly under real workloads, even if they perform well under laboratory benchmarking conditions. We investigated the poor performance of event-driven servers. We found that the delays typical in wide-area networks cause busy serve ..."
Abstract
-
Cited by 86 (9 self)
- Add to MetaCart
UNIX Internet servers with an event-driven architecture often perform poorly under real workloads, even if they perform well under laboratory benchmarking conditions. We investigated the poor performance of event-driven servers. We found that the delays typical in wide-area networks cause busy servers to manage a large number of simultaneous connections. We also observed that the select system call implementation in most UNIX kernels scales poorly with the number of connections being managed by a process. The UNIX algorithm for allocating file descriptors also scales poorly. These algorithmic problems lead directly to the poor performance of event-driven servers. We implemented scalable versions of the select system call and the descriptor allocation algorithm. This led to an improvement of up to 58% in Web proxy and Web server throughput, and dramatically improved the scalability of the system.
Mondrian Memory Protection
, 2002
"... Mondrian memory protection (MMP) is a fine-grained protection scheme that allows multiple protection domains to flexibly share memory and export protected services. In contrast to earlier pagebased systems, MMP allows arbitrary permissions control at the granularity of individual words. We use a com ..."
Abstract
-
Cited by 82 (1 self)
- Add to MetaCart
Mondrian memory protection (MMP) is a fine-grained protection scheme that allows multiple protection domains to flexibly share memory and export protected services. In contrast to earlier pagebased systems, MMP allows arbitrary permissions control at the granularity of individual words. We use a compressed permissions table to reduce space overheads and employ two levels of permissions caching to reduce run-time overheads. The protection tables in our implementation add less than 9% overhead to the memory space used by the application. Accessing the protection tables adds less than 8% additional memory references to the accesses made by the application. Although it can be layered on top of demandpaged virtual memory, MMP is also well-suited to embedded systems with a single physical address space. We extend MMP to support segment translation which allows a memory segment to appear at another location in the address space. We use this translation to implement zero-copy networking underneath the standard read system call interface, where packet payload fragments are connected together by the translation system to avoid data copying. This saves 52% of the memory references used by a traditional copying network stack.
Capturing, indexing, clustering, and retrieving system history
- In SOSP
, 2005
"... system performance, Bayesian networks, information retrieval, problem signatures We present a method for automatically extracting from a running system an indexable signature that distills the essential characteristic from a system state and that can be subjected to automated clustering and similari ..."
Abstract
-
Cited by 65 (5 self)
- Add to MetaCart
system performance, Bayesian networks, information retrieval, problem signatures We present a method for automatically extracting from a running system an indexable signature that distills the essential characteristic from a system state and that can be subjected to automated clustering and similarity-based retrieval to identify when an observed system state is similar to a previously-observed state. This allows operators to identify and quantify the frequency of recurrent problems, to leverage previous diagnostic efforts, and to establish whether problems seen at different installations of the same site are similar or distinct. We show that the naive approach to constructing these signatures based on simply recording the actual "raw " values of collected measurements is
A Performance Evaluation of Hyper Text Transfer Protocols
- IN PROCEEDINGS OF ACM SIGMETRICS ’99
, 1998
"... Version 1.1 of the Hyper Text Transfer Protocol (HTTP) was principally developed as a means for reducing both document transfer latency and network traffic. The rationale for the performance enhancements in HTTP/1.1 is based on the assumption that the network is the bottleneck in Web transactions ..."
Abstract
-
Cited by 60 (4 self)
- Add to MetaCart
Version 1.1 of the Hyper Text Transfer Protocol (HTTP) was principally developed as a means for reducing both document transfer latency and network traffic. The rationale for the performance enhancements in HTTP/1.1 is based on the assumption that the network is the bottleneck in Web transactions. In practice, however, the Web server can be the primary source of document transfer latency. In this paper, we characterize and compare the performance of HTTP/1.0 and HTTP/1.1 in terms of throughput at the server and transfer latency at the client. Our approach is based on considering a broader set of bottlenecks in an HTTP transfer# we examine how bottlenecks in the network, CPU, and in the disk system affect the relative performance of HTTP/1.0 versus HTTP/1.1. We show that the network demands under HTTP/1.1 are somewhat lower than HTTP/1.0, and we quantify those differences in terms of packets transferred, server congestion window size and data bytes per packet. We show that when the CPU is the bottleneck, there is relatively little difference in performance between HTTP/1.0 and HTTP/1.1. Surprisingly, we show that when the disk system is the bottleneck, performance using HTTP/1.1 can be much worse than with HTTP/1.0. Based on these observations, we suggest a connection management policy for HTTP/1.1 that can improve throughput, decrease latency, and keep network traffic low when the disk system is the bottleneck.
Optimizing network virtualization in xen
- In Proceedings of the USENIX Annual Technical Conference
, 2006
"... In this paper, we propose and evaluate three techniques for optimizing network performance in the Xen virtualized environment. Our techniques retain the basic Xen architecture of locating device drivers in a privileged ‘driver ’ domain with access to I/O devices, and providing network access to unpr ..."
Abstract
-
Cited by 60 (4 self)
- Add to MetaCart
In this paper, we propose and evaluate three techniques for optimizing network performance in the Xen virtualized environment. Our techniques retain the basic Xen architecture of locating device drivers in a privileged ‘driver ’ domain with access to I/O devices, and providing network access to unprivileged ‘guest ’ domains through virtualized network interfaces. First, we redefine the virtual network interfaces of guest domains to incorporate high-level network offfload features available in most modern network cards. We demonstrate the performance benefits of high-level offload functionality in the virtual interface, even when such functionality is not supported in the underlying physical interface. Second, we optimize the implementation of the data transfer path between guest and driver domains. The optimization avoids expensive data remapping operations on the transmit path, and replaces page remapping by data copying on the receive path. Finally, we provide support for guest operating systems to effectively utilize advanced virtual memory features such as superpages and global page mappings. The overall impact of these optimizations is an improvement in transmit performance of guest domains by a factor of 4.4. The receive performance of the driver domain is improved by 35 % and reaches within 7 % of native Linux performance. The receive performance in guest domains improves by 18%, but still trails the native Linux performance by 61%. We analyse the performance improvements in detail, and quantify the contribution of each optimization to the overall performance. 1
Efficient support for P-HTTP in cluster-based Web servers
- IN IN PROCEEDINGS OF USENIX'99 TECHNICAL CONFERENCE
, 1999
"... This paper studies mechanisms and policies for supporting HTTP/1.1 persistent connections in cluster-based Web servers that employ content-based request distribution. We present two mechanisms for the efficient, content-based distribution of HTTP/1.1 requests among the back-end nodes of a cluster se ..."
Abstract
-
Cited by 55 (5 self)
- Add to MetaCart
This paper studies mechanisms and policies for supporting HTTP/1.1 persistent connections in cluster-based Web servers that employ content-based request distribution. We present two mechanisms for the efficient, content-based distribution of HTTP/1.1 requests among the back-end nodes of a cluster server. A trace-driven simulation shows that these mechanisms, combined with an extension of the locality-aware request distribution (LARD) policy, are effective in yielding scalable performance for HTTP/1.1 requests. We implemented the simpler of these two mechanisms, back-end forwarding. Measurements of this mechanism in connection with extended LARD on a prototype cluster, driven with traces from actual Web servers, con rm the simulation results. The throughput of the prototype is up to four times better than that achieved by conventional weighted round-robin request distribution. In addition, throughput with persistent connections is up to 26 % better than without.
Measuring the Capacity of a Web Server under Realistic Loads
- World Wide Web Journal (Special Issue on World Wide Web Characterization and Performance Evaluation
, 1999
"... The World Wide Web and its related applications place substantial performance demands on network servers. The ability to measure the effect of these demands is important for tuning and optimizing the various software components that make up a Web server. To measure these effects, it is necessary to ..."
Abstract
-
Cited by 48 (7 self)
- Add to MetaCart
The World Wide Web and its related applications place substantial performance demands on network servers. The ability to measure the effect of these demands is important for tuning and optimizing the various software components that make up a Web server. To measure these effects, it is necessary to generate realistic HTTP client requests in a test-bed environment. Unfortunately, the state-of-the-art approach for benchmarking Web servers is unable to generate client request rates that exceed the capacity of the server being tested, even for short periods of time. Moreover, it fails to model important characteristics of the wide area networks on which most servers are deployed (e.g. delay and packet loss). This paper examines pitfalls that one encounters when measuring Web server capacity using a synthetic workload. We propose and evaluate a new method for Web traffic generation that can generate bursty traffic, with peak loads that exceed the capacity of the server. Our method also mod...

