Eliminating receive livelock in an interrupt-driven kernel
- ACM Transactions on Computer Systems
, 1997
Cited by 241 (4 self)
Most operating systems use interface interrupts to schedule network tasks. Interrupt-driven systems can provide low overhead and good latency at low offered load, but degrade significantly at higher arrival rates unless care is taken to prevent several pathologies. These are various forms of receive livelock, in which the system spends all its time processing interrupts, to the exclusion of other necessary tasks. Under extreme conditions, no packets are delivered to the user application or the output of the system. To avoid livelock and related problems, an operating system must schedule network interrupt handling as carefully as it schedules process execution. We modified an interrupt-driven networking implementation to do so; this eliminates receive livelock without degrading other aspects of system performance. We present measurements demonstrating the success of our approach.
Experiences with a High-Speed Network Adaptor: A Software Perspective
, 1994
Cited by 149 (10 self)
This paper describes our experiences, from a software perspective, with the OSIRIS network adaptor. It first identifies the problems we encountered while programming OSIRIS and optimizing network performance, and outlines how we either addressed them in the software, or had to modify the hardware. It then describes the opportunities provided by OSIRIS that we were able to exploit in the host operating system (OS); opportunities that suggested techniques for making the OS more effective in delivering network data to application programs. The most novel of these techniques, called application device channels, gives application programs running in user space direct access to the adaptor. The paper concludes with the lessons drawn from this work, which we believe will benefit the designers of future network adaptors. 1 Introduction With the emergence of high-speed network facilities, several research efforts are focusing on the design and implementation of network adaptors [5, 2, 3, 16, 2...
The APIC Approach to High Performance Network Interface Design: Protected DMA and Other Techniques
- In Proceedings of INFOCOM '97
, 1997
Cited by 55 (1 self)
We are building a very high performance 1.2 Gb/s ATM network interface chip called the APIC (ATM Port Interconnect Controller). In addition to borrowing useful ideas from a number of research and commercial prototypes, the APIC design embraces several innovative features, and integrates all of these pieces into a coherent whole. This paper describes some of the novel ideas that have been incorporated in the APIC design with a view to improving the bandwidth and latency seen by end-applications. Among the techniques described, Protected DMA and Protected I/O were designed to allow applications to queue data for transmission or reception directly from user-space, effectively bypassing the kernel. This argues for moving the entire protocol stack, including the interface device driver, into user-space, thereby yielding better latency and throughput performance than kernel-resident implementations. Pool DMA, when used with Packet Splitting, is a technique that can be used to build true zero-co...
A Systematic Approach to Host Interface Design for High-Speed Networks
- IEEE Computer
, 1994
Cited by 49 (5 self)
In recent years, networks with media rates of 100 Mbit/second or more have become widely available (FDDI, ATM, HIPPI, ...). However, many computer systems cannot make use of the available bandwidth because of the high overhead associated with network communication. In this paper we review the operations involved in communication over high-speed networks, and we describe optimizations of the network interface that improve network throughput. We also discuss how the payoff of the optimizations is influenced by features of the host software and architecture. This paper is based on our experience with the interfaces for the Nectar and Gigabit Nectar networks. Keywords: network interfaces, high-speed networks, buffer management, memory hierarchy. This research was sponsored by the Defense Advanced Research Projects Agency (DOD) under contract number MDA972-90-C-0035, in part by the National Science Foundation and the Defense Advanced Research Projects Agency under Cooperative Agreement NCR-...
Software Support for Outboard Buffering and Checksumming
- In Proceedings of the ACM SIGCOMM ’95 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication
, 1995
Cited by 34 (7 self)
Data copying and checksumming are the most expensive operations when doing high-bandwidth network I/O over a high-speed network. Under some conditions, outboard buffering and checksumming can eliminate accesses to the data, thus making communication less expensive and faster. One of the scenarios in which outboard buffering pays off is the common case of applications accessing the network using the Berkeley sockets interface and the Internet protocol stack. In this paper we describe the changes that were made to a BSD protocol stack to make use of a network adaptor that supports outboard buffering and checksumming. Our goal is not only to achieve "single copy" communication for applications that use sockets, but also to have efficient communication for in-kernel applications and for applications using other networks. Performance measurements show that for large reads and writes the single-copy path through the stack is significantly more efficient than the original implementation.
Operating System Support for a Video-On-Demand File Service
, 1993
Cited by 32 (1 self)
This paper describes the design and implementation of a continuous media file server intended for use in emerging video-on-demand applications. The main focus and contribution of the paper is in scheduling and admission control algorithms for accessing the server’s processor and storage resources. The scheduling algorithms support multiple classes of tasks with diverse performance requirements and allow for the co-existence of guaranteed real-time requests with sporadic and unsolicited requests. The scheduler maintains performance guarantees for real-time streams in the presence of unpredictably varying non-real-time traffic while ensuring system stability even during overloads. A prototype video file server was implemented on an Intel 486 platform. Performance results show that a large number of streams can be supported, while maintaining efficient utilization of system resources.
Improving TCP Throughput over Two-Way Asymmetric Links: Analysis and Solutions
, 1998
Cited by 21 (0 self)
We study several schemes for improving the performance of two-way TCP traffic over asymmetric links where the bandwidths in the two directions may differ substantially, possibly by many orders of magnitude. The sharing of a common buffer by data segments and acknowledgments in such an environment produces the effect of ack compression, often causing dramatic reductions in throughput. We first demonstrate the significance of the problem by means of measurements on an experimental network and then proceed to study approaches to improve the throughput of the connections. These approaches reduce the effect of ack compression by carefully controlling the flow of data packets and acknowledgments. We first examine a scheme where acknowledgments are transmitted at a higher priority than data. By analysis and simulation, we show that prioritizing acks can lead to starvation of the low-bandwidth connection. The second approach makes use of a connection-level backpressure mechanism to limit the m...
SPINE: An operating system for intelligent network adapters
, 1998
Cited by 21 (5 self)
The emergence of fast, cheap embedded processors presents the opportunity for processing to occur on the network adapter. We are investigating how a system design incorporating such an intelligent network adapter can be used for applications that benefit from being tightly integrated with the network subsystem. We are developing a safe, extensible operating system, called SPINE, which enables applications to compute directly on the network adapter. We demonstrate the feasibility of our approach with two applications: a video client and an Internet Protocol router. As a result of our system structure, image data is transferred only once over the I/O bus and places no load on the host CPU to display video at aggregate rates exceeding 100 Mbps. Similarly, the IP router can forward roughly 10,000 packets per second on each network adapter, while placing no load on the host CPU. Based on our experiences, we describe three hardware features useful for improving performance. Finally, we conclude that offloading work to the network adapter can make sense, even using current embedded processor technology.
Operating System Support for High-Speed Communication
- Communications of the ACM
, 1996
Cited by 12 (0 self)
This paper looks at the I/O bottleneck in operating systems, with particular focus on high-speed networking. We start by identifying the causes of this bottleneck, which are rooted in a mismatch of operating system behavior with the performance characteristics of modern computer hardware. Then, traditional approaches to supporting I/O in operating systems are re-evaluated in light of current hardware performance tradeoffs. This re-evaluation gives rise to a set of novel techniques that eliminate the I/O bottleneck. The root cause of the OS I/O bottleneck is that speed improvements of main memory have lagged behind those of the central processing unit (CPU) and I/O devices during the past decade [6]. In state-of-the-art computer systems, the bandwidth of main memory is orders of magnitude lower than the bandwidth of the CPU, and the bandwidths of the fastest I/O devices approach that of main memory.

