Results 1 - 10
of
24
U-Net: A User-Level Network Interface for Parallel and Distributed Computing
- In Fifteenth ACM Symposium on Operating System Principles
, 1995
"... The U-Net communication architecture provides processes with a virtual view of a network interface to enable userlevel access to high-speed communication devices. The architecture, implemented on standard workstations using offthe-shelf ATM communication hardware, removes the kernel from the communi ..."
Abstract
-
Cited by 518 (14 self)
- Add to MetaCart
The U-Net communication architecture provides processes with a virtual view of a network interface to enable userlevel access to high-speed communication devices. The architecture, implemented on standard workstations using offthe-shelf ATM communication hardware, removes the kernel from the communication path, while still providing full protection. The model presented by U-Net allows for the construction of protocols at user level whose performance is only limited by the capabilities of network. The architecture is extremely flexible in the sense that traditional protocols like TCP and UDP, as well as novel abstractions like Active Messages can be implemented efficiently. A U-Net prototype on an 8-node ATM cluster of standard workstations offers 65 microseconds round-trip latency and 15 Mbytes/sec bandwidth. It achieves TCP performance at maximum network bandwidth and demonstrates performance equivalent to Meiko CS-2 and TMC CM-5 supercomputers on a set of Split-C benchmarks. 1
An Implementation and Analysis of the Virtual Interface Architecture
- IN PROCEEDINGS OF SC'98
, 1998
"... Rapid developments in networking technology and a rise in clustered computing have driven research studies in high performance communication architectures. In an effort to standardize the work in this area, industry leaders have developed the Virtual Interface Architecture (VIA) specification. This ..."
Abstract
-
Cited by 91 (7 self)
- Add to MetaCart
Rapid developments in networking technology and a rise in clustered computing have driven research studies in high performance communication architectures. In an effort to standardize the work in this area, industry leaders have developed the Virtual Interface Architecture (VIA) specification. This architecture seeks to provide an operating system-independent infrastructure for high-performance user-level networking in a generic environment. This paper evaluates the inherent costs and performance potential of the Virtual Interface Architecture through a prototype implementation over Myrinet. The VIA prototype is compared against established research user-level networks using simple communication benchmarks on the same hardware. We consider extensions to the VI Architecture that improve its performance for certain types of communication traffic and outline further research areas in the VIA design space that merit investigation.
High-Performance LocalArea Communication With Fast Sockets
- In Proceedings of the USENIX Technical Conference
, 1997
"... Modern switched networks such as ATM and Myrinet enable low-latency, high-bandwidth communication. This performance has not been realized by current applications, because of the high processing overheads imposed by existing communications software. These overheads are usually not hidden with large p ..."
Abstract
-
Cited by 62 (2 self)
- Add to MetaCart
Modern switched networks such as ATM and Myrinet enable low-latency, high-bandwidth communication. This performance has not been realized by current applications, because of the high processing overheads imposed by existing communications software. These overheads are usually not hidden with large packets; most network traffic is small. We have developed Fast Sockets, a local-area communication layer that utilizes a high-performance protocol and exports the Berkeley Sockets programming interface. Fast Sockets realizes round-trip transfer times of 60 microseconds and maximum transfer bandwidth of 33 MB/second between two UltraSPARC 1s connected by a Myrinet network. Fast Sockets obtains performance by collapsing protocol layers, using simple buffer management strategies, and utilizing knowledge of packet destinations for direct transfer into user buffers. Using receive posting, we make the Sockets API a single-copy communications layer and enable regular Sockets programs to exploit the performance of modern networks. Fast Sockets transparently reverts to standard TCP/IP protocols for wide-area communication.
Parallelizing the Murφ verifier
- Computer Aided Verification. 9th International Conference
, 1997
"... With the use of state and memory reduction techniques in verification by explicit state enumeration, runtime becomes a major limiting factor. We describe a parallel version of the explicit state enumeration verifier Murφ for distributed memory multiprocessors and networks of workstations that is ba ..."
Abstract
-
Cited by 49 (0 self)
- Add to MetaCart
With the use of state and memory reduction techniques in verification by explicit state enumeration, runtime becomes a major limiting factor. We describe a parallel version of the explicit state enumeration verifier Murφ for distributed memory multiprocessors and networks of workstations that is based on the message passing paradigm. In experiments with three complex cache coherence protocols, parallel Murφ shows close to linear speedups, which are largely insensitive to communication latency and bandwidth. There is some slowdown with increasing communication overhead, for which a simple yet relatively accurate approximation formula is given. Techniques to reduce overhead and required bandwidth and to allow heterogeneity and dynamically changing load in the parallel machine are discussed, which we expect will allow good speedups when using conventional networks of workstations.
The Web Service Discovery Architecture
, 2002
"... In this paper, we propose the Web Service Discovery Architecture (WSDA). At runtime, Grid applications can use this architecture to discover and adapt to remote services. WSDA promotes an interoperable web service discovery layer by defining appropriate services, interfaces, operations and protocol ..."
Abstract
-
Cited by 31 (6 self)
- Add to MetaCart
In this paper, we propose the Web Service Discovery Architecture (WSDA). At runtime, Grid applications can use this architecture to discover and adapt to remote services. WSDA promotes an interoperable web service discovery layer by defining appropriate services, interfaces, operations and protocol bindings, based on industry standards. It is unified because it subsumes an array of disparate concepts, interfaces and protocols under a single semi-transparent umbrella. It is modular because it defines a small set of orthogonal multipurpose communication primitives (building blocks) for discovery. These primitives cover service identification, service description retrieval, data publication as well as minimal and powerful query support. The architecture is open and flexible because each primitive can be used, implemented, customized and extended in many ways. It is powerful because the individual primitives can be combined and plugged together by specific clients and services to yield a wide range of behaviors and emerging synergies.
Performance and Experience with LAPI - a New High-Performance
- Communication Library for the IBM RS/6000 SP. In Proceedings of the International Parallel Processing Symposium
, 1998
"... LAPI is a low-level, high-performance communication interface available on the IBM RS/6000 SP system. It provides an activemessage-like interface along with remote memory copy and synchronization functionality. It is designed primarily for use by experienced programmers in developing parallel subsys ..."
Abstract
-
Cited by 27 (8 self)
- Add to MetaCart
LAPI is a low-level, high-performance communication interface available on the IBM RS/6000 SP system. It provides an activemessage-like interface along with remote memory copy and synchronization functionality. It is designed primarily for use by experienced programmers in developing parallel subsystems, libraries and tools, but we also expect power programmers to use it in end-user applications. IBM developed LAPI as a part of a project with Pacific Northwest National Laboratory (PNNL) to optimize the performance of the Global Arrays (GA) toolkit and its applications on the IBM RS/6000 SP. We provide an overview of LAPI characteristics and discuss its differences from other models such as MPI-2. We present some base performance parameters of LAPI including latency and bandwidth and compare it with performance of the MPI/MPL. The Global Arrays library from PNNL was ported to LAPI to exploit the performance benefits of this new interface. Experience using LAPI to implement GA and the performance of the resulting library are presented. 1
Runtime Support For Portable Distributed Data Structures
, 1995
"... Multipol is a library of distributed data structures designed for irregular applications, including those with asynchronous communication patterns. In this paper, we describe the Multipol runtime layer, which provides an efficient and portable abstraction underlying the data structures. It contains ..."
Abstract
-
Cited by 20 (0 self)
- Add to MetaCart
Multipol is a library of distributed data structures designed for irregular applications, including those with asynchronous communication patterns. In this paper, we describe the Multipol runtime layer, which provides an efficient and portable abstraction underlying the data structures. It contains a thread system to express computations with varying degrees of parallelism and to support multiple threads per processor for hiding communication latency. To simplify programming in a multithreaded environment, Multipol threads are small, finite-length computations that are executed atomically. Rather than enforcing a single scheduling policy on threads, users may write their own schedulers or choose one of the schedulers provided by Multipol. The system is designed for distributed memory architectures and performs communication optimizations such as message aggregation to improve efficiency on machines with high communication startup overhead. The runtime system currently runs on the Think...
Algorithms for Supporting Compiled Communication
- IEEE Transactions on Parallel and Distributed Systems
, 2003
"... In this paper, we investigate the compiler algorithms to support compiled communication in multiprocessor environments and study the benefits of compiled communication assuming that the underlying network is an all-optical Time-Division-Multiplexing (TDM) network. We present an experimental compil ..."
Abstract
-
Cited by 15 (5 self)
- Add to MetaCart
In this paper, we investigate the compiler algorithms to support compiled communication in multiprocessor environments and study the benefits of compiled communication assuming that the underlying network is an all-optical Time-Division-Multiplexing (TDM) network. We present an experimental compiler, E-SUIF, that supports compiled communication for High Performance Fortran (HPF) like programs on all-optical TDM networks, describe and evaluate the compiler algorithms used in E-SUIF. We further demonstrate the effectiveness of compiled communication on all-optical TDM networks by comparing the performance of compiled communication with that of a traditional communication method using a number of application programs.
A Unified Peer-to-Peer Database Framework for Xqueries over Dynamic Distributed Content and Its Application for Scalable Service Discovery
, 2002
"... In a large distributed system spanning administrative domains such as a Grid, it is desirable to maintain and query dynamic and timely information about active participants such as services, resources and user communities. The web services vision promises that programs are made more flexible and pow ..."
Abstract
-
Cited by 13 (10 self)
- Add to MetaCart
In a large distributed system spanning administrative domains such as a Grid, it is desirable to maintain and query dynamic and timely information about active participants such as services, resources and user communities. The web services vision promises that programs are made more flexible and powerful by querying Internet databases (registries) at runtime in order to discover information and network attached third-party building blocks. Services can advertise themselves and related metadata via such databases, enabling the assembly of distributed higher-level components. In support of this vision, this thesis shows how to support expressive general-purpose queries over a view that integrates autonomous dynamic database nodes from a wide range of distributed system topologies.

