A high-performance, portable implementation of the MPI message passing interface standard
Parallel Computing, 1996
Cited by 651 (37 self)
MPI (Message Passing Interface) is a specification for a standard library for message passing that was defined by the MPI Forum, a broadly based group of parallel computer vendors, library writers, and applications specialists. Multiple implementations of MPI have been developed. In this paper, we describe MPICH, unique among existing implementations in its design goal of combining portability with high performance. We document its portability and performance and describe the architecture by which these features are simultaneously achieved. We also discuss the set of tools that accompany the free distribution of MPICH, which constitute the beginnings of a portable parallel programming environment. A project of this scope inevitably imparts lessons about parallel computing, the specification being followed, the current hardware and software environment for parallel computing, and project management; we describe those we have learned. Finally, we discuss future developments for MPICH, including those necessary to accommodate extensions to the MPI Standard now being contemplated by the MPI Forum.
Starfish: Fault-Tolerant Dynamic MPI Programs on Clusters of Workstations (Extended Abstract)
Cited by 84 (6 self)
This paper reports on the architecture and design of Starfish, an environment for executing dynamic (and static) MPI-2 programs on a cluster of workstations. Starfish is unique in being efficient, fault-tolerant, highly available, and dynamic as a system internally, and in supporting fault-tolerance and dynamicity for its application programs as well. Starfish achieves these goals by combining group communication technology with checkpoint/restart, and uses a novel architecture that is both flexible and portable and keeps group communication outside the critical data path, for maximum performance.
Managing Multiple Communication Methods in High-Performance Networked Computing Systems
Journal of Parallel and Distributed Computing, 1997
Cited by 79 (13 self)
Modern networked computing environments and applications often require, or can benefit from, the use of multiple communication substrates, transport mechanisms, and protocols, chosen according to where communication is directed, what is communicated, or when communication is performed. We propose techniques that allow multiple communication methods to be supported transparently in a single application, with either automatic or user-specified selection criteria guiding the methods used for each communication. We explain how communication link and remote service request mechanisms facilitate the specification and implementation of multimethod communication. These mechanisms have been implemented in the Nexus multithreaded runtime system, and we use this system to illustrate solutions to various problems that arise when implementing multimethod communication. We also illustrate the application of our techniques by describing a multimethod, multithreaded implementation of the Message Pas...
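The idea of automatic versus user-specified method selection mentioned in this abstract can be illustrated with a minimal sketch. This is not the Nexus API: the `Method` class, the method names, the "node.site" host naming convention, and `select_method` are all hypothetical, chosen only to show one way a priority-ordered list with an optional user override might behave.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Method:
    """A hypothetical communication method and the test for when it applies."""
    name: str
    applicable: Callable[[str, str], bool]  # (source host, destination host)


def same_host(src: str, dst: str) -> bool:
    return src == dst


def same_site(src: str, dst: str) -> bool:
    # Assumption: hosts are named "node.site", so the text after the
    # first dot identifies the site.
    return src.split(".", 1)[-1] == dst.split(".", 1)[-1]


# Ordered from most to least specialised; the last entry always applies.
METHODS: List[Method] = [
    Method("shared_memory", same_host),
    Method("local_network", same_site),
    Method("tcp", lambda src, dst: True),
]


def select_method(src: str, dst: str,
                  preferred: Optional[str] = None,
                  methods: List[Method] = METHODS) -> str:
    """Return the name of the method to use between src and dst.

    A user-specified preference wins when it is applicable; otherwise the
    first applicable method in priority order is chosen automatically.
    """
    if preferred is not None:
        for m in methods:
            if m.name == preferred and m.applicable(src, dst):
                return m.name
    for m in methods:
        if m.applicable(src, dst):
            return m.name
    raise LookupError("no applicable communication method")
```

For example, `select_method("a.site1.org", "b.site1.org")` falls through the shared-memory test and picks the hypothetical `local_network` method, while passing `preferred="tcp"` forces the general-purpose method even between co-located endpoints.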
Wide-Area Implementation of the Message Passing Interface
Parallel Computing, 1998
Cited by 43 (10 self)
The Message Passing Interface (MPI) can be used as a portable, high-performance programming model for wide-area computing systems. The wide-area environment introduces challenging problems for the MPI implementor, due to the heterogeneity of both the underlying physical infrastructure and the software environment at different sites. In this article, we describe an MPI implementation that incorporates solutions to these problems. This implementation has been constructed by extending the Argonne MPICH implementation of MPI to use communication services provided by the Nexus communication library and authentication, resource allocation, process creation/management, and information services provided by the I-Soft system (initially) and the Globus metacomputing toolkit (work in progress). Nexus provides multimethod communication mechanisms that allow multiple communication methods to be used in a single computation with a uniform interface; I-Soft and Globus provided standard authent...
MPI-StarT: Delivering Network Performance to Numerical Applications
In SC, 1998
Cited by 29 (1 self)
We describe an MPI implementation for a cluster of SMPs interconnected by a high-performance interconnect. This work is a collaboration between a numerical applications programmer and a cluster interconnect architect. The collaboration started with the modest goal of satisfying the communication needs of a specific numerical application, MITMatlab. However, by supporting the MPI standard, MPI-StarT readily extends support to a host of applications. MPI-StarT is derived from MPICH by developing a custom implementation of the Channel Interface. Some changes in MPICH's ADI and Protocol Layers are also necessary for correct and optimal operation. MPI-StarT relies on the host SMPs' shared memory mechanism for intra-SMP communication. Inter-SMP communication is supported through StarT-X. The StarT-X NIU allows a cluster of PCI-equipped host platforms to communicate over the Arctic Switch Fabric. Currently, StarT-X is utilized by a cluster of SUN E5000 SMPs as well as a cluster of Intel Pen...
MPI on the I-WAY: A Wide-Area, Multimethod Implementation of the Message Passing Interface
1996
Cited by 16 (8 self)
High-speed wide-area networks enable innovative applications that integrate geographically distributed computing, database, graphics, and networking resources. The Message Passing Interface (MPI) can be used as a portable, high-performance programming model for such systems. However, the wide-area environment introduces challenging problems for the MPI implementor, because of the heterogeneity of both the underlying physical infrastructure and the authentication and software environment at different sites. In this article, we describe an MPI implementation that incorporates solutions to these problems. This implementation, which was developed for the I-WAY distributed-computing experiment, was constructed by layering MPICH on the Nexus multithreaded runtime system. Nexus provides automatic configuration mechanisms that can be used to select and configure authentication, process creation, and communication mechanisms in heterogeneous systems.
COMPaS: A Pentium Pro PC-based SMP Cluster and its Experience
1998
Cited by 16 (4 self)
We have built an eight-node SMP cluster called COMPaS (Cluster Of Multi-Processor Systems), each node of which is a quad-processor Pentium Pro PC. We have designed and implemented a remote-memory-based user-level communication layer which provides low overhead and high bandwidth using Myrinet. We designed a hybrid programming model in order to take advantage of locality in each SMP node. Intra-node computations utilize a multi-threaded programming style (Solaris threads) and inter-node programming is based on message passing and remote memory operations. In this paper we report on this hybrid shared memory/distributed memory programming on COMPaS and its preliminary evaluation. The performance of COMPaS is affected by data size and access patterns, and the proportion of inter-node communication. If the data size is small enough to fit entirely in the cache, parallel efficiency exceeds 1.0 using the hybrid programming model on COMPaS. But the performance is limited by the low memory bus band...
An Operating System Support to Low-Overhead Communications in NOW Clusters
In Proceedings of Communication and Architectural Support for Network-Based Parallel Computing, CANPC97, 1997
Cited by 11 (2 self)
This paper describes an Operating System approach to the problem of delivering low-latency, high-bandwidth communications for PC clusters running a public domain OS like Linux and connected by standard, off-the-shelf networks like Fast-Ethernet. The main goal of the PARMA² project is to design the new light-weight protocol suite PRP, in order to drastically reduce the software overhead introduced by TCP/IP. PRP aims to offer a stream-socket-oriented interface at the high level and compatibility with any device driver at the low level. High-level compatibility is crucial in facilitating the porting of existing applications or message passing packages to PRP. Moreover, an optimized version of MPI, based on PRP and evolved from the widespread MPICH implementation, is under development, allowing for a very effective reduction of the communication latencies in synchronous communications, compared to the TCP/IP-based MPI.
1 Introduction
Today the use of workstation or PC cluster...
Parallel Raytracing: A Case Study on Partitioning and Scheduling on Workstation Clusters
In Proc. Thirtieth International Conference on System Sciences, Hawaii, 1997
Cited by 11 (4 self)
In this paper, a case study is presented which is aimed at investigating the performance of several parallel versions of the POV-Ray raytracing package implemented on a workstation cluster using the MPI message passing library. Based on a manager/worker scheme, variants of workload partitioning and message scheduling strategies, in conjunction with different task granularities, are evaluated with respect to their runtime behaviour. The results indicate that dynamic, adaptive strategies are required to cope with both the unbalanced workload characteristics of the parallel raytracing application and the different computational capabilities of the machines in a workstation cluster environment.
1 Introduction
Raytracing [9, 13, 24] is a widely used method for generating realistic-looking images on a computer, and it is employed by many 3D modelling and animation systems for the final image rendering. The input to a raytracing algorithm is the scene: the description of the geometry...
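The contrast between static partitioning and dynamic manager/worker scheduling described in this abstract can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's implementation: it uses no MPI, the task costs and worker speeds are made-up numbers, and the function names `static_makespan` and `dynamic_makespan` are hypothetical. Communication cost is ignored.

```python
import heapq
from typing import List


def static_makespan(task_costs: List[float], worker_speeds: List[float]) -> float:
    """Split the tasks into equal-sized contiguous blocks, one per worker,
    and return the time at which the slowest worker finishes."""
    n = len(worker_speeds)
    block = -(-len(task_costs) // n)  # ceiling division
    finish_times = []
    for i, speed in enumerate(worker_speeds):
        work = sum(task_costs[i * block:(i + 1) * block])
        finish_times.append(work / speed)
    return max(finish_times)


def dynamic_makespan(task_costs: List[float], worker_speeds: List[float]) -> float:
    """Manager/worker self-scheduling: the manager hands the next task to
    whichever worker becomes idle first."""
    # Heap of (time at which the worker becomes idle, worker index).
    idle = [(0.0, w) for w in range(len(worker_speeds))]
    heapq.heapify(idle)
    for cost in task_costs:
        free_at, w = heapq.heappop(idle)
        heapq.heappush(idle, (free_at + cost / worker_speeds[w], w))
    return max(t for t, _ in idle)


# Hypothetical workload: cheap tasks first, expensive tasks last
# (for instance, image regions of very different complexity),
# run on one slow and one fast machine.
costs = [1, 1, 1, 1, 8, 8, 8, 8]
speeds = [1.0, 2.0]
print(static_makespan(costs, speeds))   # 16.0
print(dynamic_makespan(costs, speeds))  # 13.0
```

In this made-up case the static split leaves the slow machine idle after 4 time units while the fast one works until 16; handing out tasks on demand finishes at 13. Smaller task granularity would balance the load further, at the price of more manager/worker messages, which this sketch does not model.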
A multi-threaded Message Passing Interface (MPI) architecture: performance and program issues
Cited by 10 (0 self)
This paper discusses a multi-threaded software architecture for the Message-Passing Interface (MPI) software specification. The architecture is thread-safe, allows for concurrent communication over several communications media (multi-fabric communication), efficiently utilizes available hardware concurrency over a wide range of target platforms, and allows for concurrent communication and computation within the limits imposed by the hardware. The architecture is developed in the framework of the MPICH software architecture, a well-known MPI implementation used worldwide. The proposed architecture retains the wide portability of the MPICH design and remedies some of its deficiencies, such as inefficient multi-fabric communication and lack of thread safety. The paper also considers the issues concerning development of high-performance portable message-passing systems for general-purpose architectures. The contributions of the paper are: improving architecture and addressing thread safety of modern reli...

