Fbufs: A High-Bandwidth Cross-Domain Transfer Facility
- in Proceedings of the Fourteenth ACM symposium on Operating Systems Principles
, 1993
Cited by 290 (15 self)
We have designed and implemented a new operating system facility for I/O buffer management and data transfer across protection domain boundaries on shared memory machines. This facility, called fast buffers (fbufs), combines virtual page remapping with shared virtual memory, and exploits locality in I/O traffic to achieve high throughput without compromising protection, security, or modularity. Its goal is to help deliver the high bandwidth afforded by emerging high-speed networks to user-level processes, both in monolithic and microkernel-based operating systems. This paper outlines the requirements for a cross-domain transfer facility, describes the design of the fbuf mechanism that meets these requirements, and experimentally quantifies the impact of fbufs on network performance. 1 Introduction Optimizing operations that cross protection domain boundaries has received a great deal of attention recently [2, 3]. This is because an efficient cross-domain invocation facility enables a ...
Experiences with a High-Speed Network Adaptor: A Software Perspective
, 1994
Cited by 149 (10 self)
This paper describes our experiences, from a software perspective, with the OSIRIS network adaptor. It first identifies the problems we encountered while programming OSIRIS and optimizing network performance, and outlines how we either addressed them in the software, or had to modify the hardware. It then describes the opportunities provided by OSIRIS that we were able to exploit in the host operating system (OS); opportunities that suggested techniques for making the OS more effective in delivering network data to application programs. The most novel of these techniques, called application device channels, gives application programs running in user space direct access to the adaptor. The paper concludes with the lessons drawn from this work, which we believe will benefit the designers of future network adaptors. 1 Introduction With the emergence of high-speed network facilities, several research efforts are focusing on the design and implementation of network adaptors [5, 2, 3, 16, 2...
Limits to Low-Latency Communication on High-Speed Networks
- ACM Transactions on Computer Systems
, 1993
Cited by 95 (3 self)
The throughput of local area networks is rapidly increasing. For example, the bandwidth of new ATM networks and FDDI token rings is an order of magnitude greater than that of Ethernets. Other network technologies promise a bandwidth increase of yet another order of magnitude in a few years. However, in distributed systems, lowered latency rather than increased throughput is often of primary concern. This paper examines the system-level effects of newer high-speed network technologies on low-latency, cross-machine communications. To evaluate a number of influences, both hardware and software, we designed and implemented a new remote procedure call system targeted at providing low latency. We then ported this system to several hardware platforms (DECstation and SPARCstation) with several different networks and controllers (ATM, FDDI, and Ethernet). Comparing these systems allows us to explore the performance impact of alternative designs in the communication system with respect to achieving low latency, e.g., the network, the network controller, the host architecture and cache system, and the kernel and user-level runtime software. Our RPC system, which achieves substantially reduced call times (170 μseconds on an ATM network using DECstation 5000/200 hosts), allows us to isolate those components of next-
Hardware/Software Organization of a High Performance ATM Host Interface
- IEEE Journal on Selected Areas in Communications
, 1993
Cited by 65 (14 self)
Concurrent increases in network bandwidths and processor speeds have created a performance bottleneck at the workstation-to-network host interface. This is especially true for BISDN networks where the fixed length ATM cell is mismatched with application requirements for data transfer; a successful hardware/software architecture will resolve such differences and offer high end-to-end performance. The solution we report carefully splits protocol processing functions into hardware and software implementations. The interface hardware is highly parallel and performs all per-cell functions with dedicated logic to maximize performance. Software provides support for the transfer of data between the interface and application memory, as well as the state management necessary for virtual circuit setup and maintenance. In addition, all higher level protocol processing is implemented with host software. The prototype connects an IBM RISC System/6000 to a SONET-based ATM network carrying data at th...
SPINE: An operating system for intelligent network adapters
, 1998
Cited by 21 (5 self)
The emergence of fast, cheap embedded processors presents the opportunity for processing to occur on the network adapter. We are investigating how a system design incorporating such an intelligent network adapter can be used for applications that benefit from being tightly integrated with the network subsystem. We are developing a safe, extensible operating system, called SPINE, which enables applications to compute directly on the network adapter. We demonstrate the feasibility of our approach with two applications: a video client and an Internet Protocol router. As a result of our system structure, image data is transferred only once over the I/O bus and places no load on the host CPU to display video at aggregate rates exceeding 100 Mbps. Similarly, the IP router can forward roughly 10,000 packets per second on each network adapter, while placing no load on the host CPU. Based on our experiences, we describe three hardware features useful for improving performance. Finally, we conclude that offloading work to the network adapter can make sense, even using current embedded processor technology.
VISA: Netstation's Virtual Internet SCSI Adapter
- in Proceedings of the 8th Symposium on Architectural Support for Programming Languages and Operating Systems
, 1998
Cited by 15 (3 self)
In this paper we describe the implementation of VISA, our Virtual Internet SCSI Adapter. VISA was built to evaluate the performance impact on the host operating system of using IP to communicate with peripherals, especially storage devices. We have built and benchmarked file systems on VISA-attached emulated disk drives using UDP/IP. By using IP, we expect to take advantage of its scaling characteristics and support for heterogeneous media to build large, long-lived systems. Detailed file system and network CPU utilization and performance data indicate that it is possible for UDP/IP to reach more than 80% of SCSI's maximum throughput without the use of network coprocessors. We conclude that IP is a viable alternative to special-purpose storage network protocols, and presents numerous advantages. 1 Introduction Storage system architectures are increasingly network-oriented, exploiting the ubiquity of networks to replace the direct host channel. Peripherals attached directly to network...
Multiplexing Traffic at the Entrance to Wide-Area Networks
, 1992
Cited by 12 (0 self)
Many application-level traffic streams, or conversations, are multiplexed at the points where local-area networks meet the wide-area portion of an internetwork. Multiplexing policies and mechanisms acting at these points should provide good performance to each conversation, allocate network resources fairly among conversations, and make efficient use of network resources. In order to characterize wide-area network traffic, we have analyzed traces from four Internet sites. We identify characteristics common to all conversations of each major type of traffic, and find that these characteristics are stable across time and geographic site. Our results contradict many prevalent beliefs. For example, previous simulation models of wide-area traffic have assumed bulk transfers ranging from 80 Kilobytes to 2 Megabytes of data. In contrast, we find that up to 90% of all bulk transfers involve 10 Kilobytes or less. This and other findings may affect results of previous studies and should be taken into account in future models of wide-area traffic. We derive from our traces a new workload model for driving simulations of wide-area internetworks. It generates traffic for individual conversations of each major type of traffic. The model accurately and efficiently reproduces behavior specific to each traffic type by sampling measured probability distributions through the inverse transform method. Our model is valid for network conditions other than those prevalent during the measurements because it samples only network-independent traffic characteristics. We also describe a new wide-area internetwork simulator that includes both our workload model and realistic models of network components. We then present a simulation study of policies for multiplexing datagrams over virtual circu...
Increasing Communication Performance with a Minimal-Copy Data Path Supporting ILP and ALF
, 1996
Cited by 11 (2 self)
Many current implementations of communication subsystems on workstation class computers transfer communication data to and from primary memory several times. This is due to software copying between user and operating system address spaces, presentation layer data conversion and other data manipulation functions. The consequence is that memory bandwidth is one of the major performance bottlenecks limiting high speed communication on these systems. We propose a communication subsystem architecture with a minimal-copy data path to widen this bottleneck. The architecture is tailored for protocol implementations using Integrated Layer Processing (ILP) and Application Layer Framing (ALF). We choose to implement these protocols in the address space of the application program. We present a new application program interface (API) between the protocols and the communication service in the operating system kernel. The API does not copy data, but instead passes pointers to page size data buffers....

