An Analysis of TCP Processing Overhead
- IEEE Communications Magazine
, 1989
Cited by 303 (0 self)
networks, have been getting faster, perceived throughput at the application has not always increased accordingly. Various performance bottlenecks have been encountered, each of which has to be analyzed and corrected. One aspect of networking often suspected of contributing to low throughput is the transport layer of the protocol suite. This layer, especially in connectionless protocols, has considerable functionality, and is typically executed in software by the host processor at the end points of the network. It is thus a likely source of processing overhead. While this theory is appealing, a preliminary examination suggested to us that other aspects of networking may be a more serious source of overhead. To test this proposition, a detailed study was made of a popular transport protocol, Transmission Control Protocol (TCP) [1]. This paper provides results of that
Implementing Network Protocols at User Level
, 1993
Cited by 135 (1 self)
Traditionally, network software has been structured in a monolithic fashion with all protocol stacks executing either within the kernel or in a single trusted user-level server. This organization is motivated by performance and security concerns. However, considerations of code maintenance, ease of debugging, customization, and the simultaneous existence of multiple protocols argue for separating the implementations into more manageable user-level libraries of protocols. This paper describes the design and implementation of transport protocols as user-level libraries. We begin by motivating the need for protocol implementations as user-level libraries and placing our approach in the context of previous work. We then describe our alternative to monolithic protocol organization, which has been implemented on Mach workstations connected not only to traditional Ethernet, but also to a more modern network, the DEC SRC AN1. Based on our experience, we discuss the implications for host-network ...
Soft Timers: Efficient Microsecond Software Timer Support for Network Processing
- In Proc. of the 17th Symp. on Operating Systems Principles
, 1999
Cited by 70 (1 self)
This paper proposes and evaluates soft timers, a new operating system facility that allows the efficient scheduling of software events at a granularity down to tens of microseconds. Soft timers can be used to avoid interrupts and reduce context switches associated with network processing, without sacrificing low communication delays. More specifically, soft timers enable transport protocols like TCP to efficiently perform rate-based clocking of packet transmissions. Experiments indicate that soft timers allow a server to employ rate-based clocking with little CPU overhead (2–6%) at high aggregate bandwidths. Soft timers can also be used to perform network polling, which eliminates network interrupts and increases the memory access locality of the network subsystem without sacrificing delay. Experiments show that this technique can improve the throughput of a Web server by up to 25%.
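To see why rate-based clocking calls for timers at a granularity of tens of microseconds, it helps to work out the gap between packets at a given sending rate. The sketch below is only illustrative arithmetic: the function name `pacing_interval_us`, the 1500-byte packet size, and the link rates are assumptions chosen for the example, not values taken from the paper.

```python
def pacing_interval_us(packet_bytes, rate_bits_per_sec):
    """Return the spacing between packet transmissions, in microseconds,
    needed to send packets of the given size at the given rate."""
    if packet_bytes <= 0 or rate_bits_per_sec <= 0:
        raise ValueError("packet size and rate must be positive")
    # bits per packet, scaled to microseconds, divided by bits per second
    return packet_bytes * 8 * 1_000_000 / rate_bits_per_sec


# Hypothetical example values: 1500-byte packets on 100 Mbit/s and 1 Gbit/s paths.
interval_100m = pacing_interval_us(1500, 100_000_000)
interval_1g = pacing_interval_us(1500, 1_000_000_000)
print(interval_100m, interval_1g)
```

Under these assumed values the spacing is 120 microseconds at 100 Mbit/s and 12 microseconds at 1 Gbit/s, far below the millisecond-scale ticks of a conventional periodic timer interrupt.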
Performance issues in parallelized network protocols
- In First USENIX Symposium on Operating Systems Design and Implementation
, 1994
Cited by 50 (11 self)
Parallel processing has been proposed as a means of improving network protocol throughput. Several different strategies have been taken towards parallelizing protocols. A relatively popular approach is packet-level parallelism, where packets are distributed across processors. This paper provides an experimental performance study of packet-level parallelism on a contemporary shared-memory multiprocessor. We examine several unexplored areas in packet-level parallelism and investigate how various protocol structuring and implementation techniques can affect performance. We study TCP/IP and UDP/IP protocol stacks, implemented with a parallel version of the x-kernel running in user space on Silicon Graphics multiprocessors. Our results show that only limited packet-level parallelism can be achieved within a single connection under TCP, but that using multiple connections can improve available parallelism. We also demonstrate that packet ordering plays a key role in determining single-connection TCP performance, that careful use of locks is a necessity, and that selective exploitation of caching can improve throughput. We also describe experiments that compare parallel protocol performance on two generations of a parallel machine and show how computer architectural trends can influence performance.
Transport System Architecture Services for High-Performance Communications Systems
- IEEE Journal on Selected Areas in Communication
, 1993
Cited by 36 (14 self)
Providing end-to-end gigabit communication support for high-bandwidth multimedia applications requires transport systems that transfer data efficiently via network protocols such as TCP, TP4, XTP, and ST-II. This paper describes and classifies transport system services that integrate operating system resources such as CPU(s), virtual memory, and I/O devices together with network protocols to support distributed multimedia applications running on local and wide area networks. A taxonomy is presented that compares and evaluates four commercial and experimental transport systems in terms of their protocol processing support. The systems covered in this paper include System V UNIX STREAMS, the BSD UNIX networking subsystem, the x-kernel, and the Choices Conduit system. This paper is intended to navigate researchers and developers through the transport system design space by describing alternative approaches for key transport system services.
TCP: Improving Startup Dynamics by Adaptive Timers and Congestion Control
, 1998
Cited by 21 (0 self)
This paper studies the startup dynamics of TCP on both high and low bandwidth-delay network paths and proposes a set of enhancements that improve both the latency and the throughput of relatively short TCP transfers. Numerous studies have shown that the timer and congestion control mechanisms in TCP can have a limiting effect on performance in the startup phase. Based on the results of our study, we propose mechanisms for adapting TCP in order to yield increased performance. First, we propose a framework for the management of timing in TCP. Second, we show how TCP can utilize the proposed timer framework to reduce the overly conservative delay associated with a retransmission timeout. Third, we propose the use of packet pacing in the initial slow-start to improve the performance of the relatively short transfers that characterize web traffic. Finally, we quantify the importance of estimating the initial slow-start threshold in TCP, especially on high bandwidth-delay paths.
Speedup vs. Simulation Granularity
- IEEE/ACM Transactions on Networking
, 1996
Cited by 16 (0 self)
This paper describes a packet network simulator whose timing granularity can shift continuously from fine, packet-level detail to coarse, conversation-level detail. Simulation run time decreases with coarser timing granularity, but the details in the underlying model become faded as the timing granularity coarsens. The finer the granularity, the slower but more precise the simulation. If a simulation becomes resource limited, it is possible to coarsen the timing granularity to scale the simulation larger. This paper introduces a new simulation technique to speed up simulation of high-speed, wide area networks. The new technique can yield order-of-magnitude speedup and memory savings on simulations of large-scale packet networks. The speedup is achieved by introducing a degree of approximation into abstracting packet streams. We call this technique Flowsim. Flowsim can yield different simulation metrics than packet simulation due to its different degree of simulation granularity. We have...
TCP Implementation Enhancements for Improving Webserver Performance
, 1999
Cited by 14 (0 self)
This paper studies the performance of BSD-based TCP implementations in Web servers. We find that lack of scalability with respect to high TCP connection rates reduces the throughput of Web servers by up to 25% and imposes a memory overhead of up to 32 MB on the kernel. We also find that insufficient accuracy in TCP's timers results in overly conservative delays for retransmission timeouts, causing poor response time, low network utilization and throughput loss. The paper proposes enhancements to the TCP implementation that eliminate these problems, without requiring changes to the protocol or the API. We also find that conventional benchmark environments do not fully expose certain significant performance aspects of TCP implementations and propose techniques that allow these benchmarks to more accurately predict the performance of real servers. Keywords: Internet, TCP, Webserver, Timers
Implementation and Evaluation of the KOM RSVP Engine
- In Proceedings of the 20th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2001)
, 2001
Cited by 13 (6 self)
In this paper, we describe implementation aspects and performance results of an innovative and publicly available RSVP implementation. Much debate exists about the applicability of RSVP as a signalling protocol in the Internet, particularly for a large number of unicast flows. While there has been a significant amount of work published on the theoretical concepts of RSVP signalling and conjectures about its presumed shortcomings, rather little attention has been paid to the implementation details of the core protocol engine. With our work, in spite of being still far from a final judgement, we try to shed light on this issue by presenting certain design details of a new implementation and a study about its performance. One particular result is given by the observation that a relatively cheap router based on PC hardware can sustain the signalling for more than 50,000 unicast flows.
Hashed and Hierarchical Timing Wheels: Efficient Data Structures for Implementing a Timer Facility
, 1996
Cited by 12 (0 self)
Conventional algorithms to implement an Operating System timer module take O(n) time to start or maintain a timer, where n is the number of outstanding timers: this is expensive for large n. This paper shows that by using a circular buffer or timing wheel, it takes O(1) time to start, stop, and maintain timers within the range of the wheel. Two extensions for larger values of the interval are described. In the first, the timer interval is hashed into a slot on the timing wheel. In the second, a hierarchy of timing wheels with different granularities is used to span a greater range of intervals. The performance of these two schemes and various implementation tradeoffs are discussed. We have used one of our schemes to replace the current BSD UNIX callout and timer facilities. Our new implementation can support thousands of outstanding timers without much overhead. Our timer schemes have also been implemented in other operating systems and network protocol packages.
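The hashed variant described in this abstract can be illustrated with a small sketch. This is a minimal illustration under stated assumptions, not the paper's implementation: the class name `HashedTimingWheel`, its methods, the use of Python dictionaries as slot buckets, and the per-timer "rounds" counter are all choices made for this example.

```python
class HashedTimingWheel:
    """Sketch of a hashed timing wheel: a timer's expiry tick is hashed
    (modulo the wheel size) into one slot of a circular buffer."""

    def __init__(self, num_slots=8):
        self.num_slots = num_slots
        # One bucket per slot, mapping timer id -> [rounds_left, callback].
        self.slots = [dict() for _ in range(num_slots)]
        self.current_tick = 0
        self.next_id = 0

    def start(self, interval, callback):
        """Schedule callback to run `interval` ticks from now; O(1)."""
        if interval < 1:
            raise ValueError("interval must be at least one tick")
        slot = (self.current_tick + interval) % self.num_slots
        # Number of visits to this slot that must be skipped before firing.
        rounds = (interval - 1) // self.num_slots
        timer_id = self.next_id
        self.next_id += 1
        self.slots[slot][timer_id] = [rounds, callback]
        return (slot, timer_id)

    def stop(self, handle):
        """Cancel a timer by handle; O(1). Returns True if it was pending."""
        slot, timer_id = handle
        return self.slots[slot].pop(timer_id, None) is not None

    def tick(self):
        """Advance one tick and run timers in the current slot that are due."""
        self.current_tick += 1
        bucket = self.slots[self.current_tick % self.num_slots]
        due = []
        for timer_id in list(bucket):
            entry = bucket[timer_id]
            if entry[0] == 0:
                del bucket[timer_id]
                due.append(entry[1])
            else:
                entry[0] -= 1  # not yet due: wait one more trip around the wheel
        for callback in due:
            callback()


fired = []
wheel = HashedTimingWheel(num_slots=4)
wheel.start(2, lambda: fired.append("a"))
cancelled = wheel.start(3, lambda: fired.append("b"))
wheel.start(9, lambda: fired.append("c"))  # longer than the wheel: wraps around
wheel.stop(cancelled)
for _ in range(9):
    wheel.tick()
print(fired)
```

Starting and stopping touch only one bucket, which is the constant-time behaviour the abstract refers to. In this sketch each tick scans only the current slot, so per-tick work depends on how many timers hash to that slot rather than on the total number outstanding.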

