Active Messages: a Mechanism for Integrated Communication and Computation
, 1992
Cited by 911 (72 self)
The design challenge for large-scale multiprocessors is (1) to minimize communication overhead, (2) allow communication to overlap computation, and (3) coordinate the two without sacrificing processor cost/performance. We show that existing message passing multiprocessors have unnecessarily high communication costs. Research prototypes of message driven machines demonstrate low communication overhead, but poor processor cost/performance. We introduce a simple communication mechanism, Active Messages, show that it is intrinsic to both architectures, allows cost effective use of the hardware, and offers tremendous flexibility. Implementations on nCUBE/2 and CM-5 are described and evaluated using a split-phase shared-memory extension to C, Split-C. We further show that active messages are sufficient to implement the dynamically scheduled languages for which message driven machines were designed. With this mechanism, latency tolerance becomes a programming/compiling concern. Hardware suppor...
Extensibility, safety and performance in the SPIN operating system
, 1995
Cited by 392 (14 self)
This paper describes the motivation, architecture and performance of SPIN, an extensible operating system. SPIN provides an extension infrastructure, together with a core set of extensible services, that allow applications to safely change the operating system's interface and implementation. Extensions allow an application to specialize the underlying operating system in order to achieve a particular level of performance and functionality. SPIN uses language and link-time mechanisms to inexpensively export fine-grained interfaces to operating system services. Extensions are written in a type safe language, and are dynamically linked into the operating system kernel. This approach offers extensions rapid access to system services, while protecting the operating system code executing within the kernel address space. SPIN and its extensions are written in Modula-3 and run on DEC Alpha workstations.
Hamlyn - an Interface for sender-based communications
, 1992
Cited by 38 (5 self)
This paper uses a characterization of three different types of interconnect traffic to drive the development of an innovative high-speed interconnect interface. This uses sender-controlled message placement at the recipient, which has the effect of greatly reducing the cost and complexity of message handling. The contributions of this work are in (a) elucidating the traffic model; (b) in defining the sender-driven communication scheme; and (c) in the detailed description of an efficient, protected interface to the interconnect hardware that allows applications running in nonprivileged mode to access the interconnect directly, without operating system intervention. This version of the paper contains a complete high-level design for the first version of Hamlyn---a hardware interface that accommodates all the Hamlyn functionality. Future work on the protocol stacks and implementation work will doubtless improve and modify this interface. Until then, this description serves as a functionally complete snapshot of the Hamlyn approach.
Examining web latency: Performance analysis of a wide-area distributed system
, 2000
Cited by 2 (0 self)
In this paper, we develop a methodology for determining where the time goes between when a user clicks on a Web page and when the page appears on the display. Our methodology uses only client-based measurements, requiring no special support from the Internet or the Web server. This is crucial to our approach since performance bottlenecks may span many different administrative authorities, making coordinated measurement of the client, network and server very difficult in practice. Using only data available to any Web client, we are able to distinguish delays due to name translation, network propagation, network queueing, transport, the server, packet losses, and the browser with a combination of direct measurement, analytical modelling, and statistical inference. Our results show that there is no single network or server delay that dominates object transfer time, but that browser overheads are a significant fraction of overall latency.
An Efficient Transport Independent Active Messaging Implementation for PVM
, 1998
this paper. Thus, through a small number of changes, PVM has gained complete protocol independence, providing the developer with increased portability and the user with increased performance. For situations where an advanced network interface is available, PVMAM clearly outperforms stock PVM, with the greatest benefit coming from the bypass of the operating system and its buffers. New transport layers for the AM API are not particularly difficult to write, so the library can grow as networking technology progresses. For the normal case where no advanced network interface is available, our AM implementation performs on par with stock PVM communicating over direct-routed TCP connections. On the Linux cluster, the UDP socket-based implementation of Active Messages actually performed much better than PVM, both in terms of latency and bandwidth. These characteristics can be attributed to the lower number of system calls, automatic flow control and the tuning of advanced socket options in the AM library. It is thought that with some minor changes to the transport layer and the message dispatch routines, the extra memory copy on the receive side could be eliminated. This modification would likely increase the performance well beyond direct-routed PVM over TCP for most platforms.
CMAM - Introduction to the CM-5 Active Message communication layer
The CM-5 Active Message layer CMAM (<see-mam>) provides a set of communication primitives intended to expose the communication capabilities of the hardware to the programmer and/or compiler at a reasonably high level. A wide variety of programming models, ranging from send&receive message

