MPI-FM: High Performance MPI on Workstation Clusters
- Journal of Parallel and Distributed Computing
, 1997
Cited by 71 (13 self)
Despite the emergence of high speed LANs, the communication performance available to applications on workstation clusters still falls short of that available on MPPs. A new generation of efficient messaging layers is needed to take advantage of the hardware performance and to deliver it to the application level. Communication software is the key element in bridging the communication performance gap separating MPPs and workstation clusters. MPI-FM is a high performance implementation of MPI for networks of workstations connected with a Myrinet network, built on top of the Fast Messages (FM) library. Based on the FM version 1.1 released in Fall 1995, MPI-FM achieves a minimum one-way latency of 19 µs and a peak bandwidth of 17.3 MByte/s with common MPI send and receive function calls. A direct comparison using published performance figures shows that MPI-FM running on SPARCstation 20 workstations connected with a relatively inexpensive Myrinet network outperforms the MPI implementations a...
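To put the two figures quoted above in relation to each other, the following is a minimal sketch of a simple linear cost model, T(n) = latency + n / bandwidth. The model itself is an assumption and is not taken from the abstract; the function names are hypothetical, and 1 MByte is assumed to mean 10^6 bytes. Only the 19 µs latency and the 17.3 MByte/s peak bandwidth come from the text.

```python
# Linear cost model sketch (an assumption, not the paper's own analysis).
# Figures from the abstract: 19 microseconds minimum one-way latency,
# 17.3 MByte/s peak bandwidth (1 MByte assumed to be 10**6 bytes).
LATENCY_S = 19e-6
PEAK_BANDWIDTH = 17.3e6  # bytes per second


def transfer_time(nbytes, latency_s=LATENCY_S, bandwidth=PEAK_BANDWIDTH):
    """Modelled one-way time in seconds for a message of nbytes."""
    return latency_s + nbytes / bandwidth


def effective_bandwidth(nbytes, latency_s=LATENCY_S, bandwidth=PEAK_BANDWIDTH):
    """Modelled delivered bandwidth in bytes per second."""
    return nbytes / transfer_time(nbytes, latency_s, bandwidth)


def half_power_point(latency_s=LATENCY_S, bandwidth=PEAK_BANDWIDTH):
    """Message size at which the model delivers half the peak bandwidth."""
    return latency_s * bandwidth


if __name__ == "__main__":
    for size in (16, 256, 2048, 65536):
        mb_per_s = effective_bandwidth(size) / 1e6
        print(f"{size:6d} bytes: {mb_per_s:.2f} MByte/s (modelled)")
```

Under this model, small messages are dominated by the fixed latency term, which is why latency and bandwidth are reported separately. The values it prints are model outputs, not measurements from the paper.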
Implementing MPI using Interrupts and Remote Copying for the AP1000/AP1000+
- Fujitsu Laboratories Ltd
, 1995
Cited by 3 (0 self)
This paper documents an experimental MPI [1] library which has been built for both the AP1000 [3] and AP1000+ [2, 4] machines. Although the previous implementation of MPI [5, 6, 7] produced messaging performance that was almost identical to using CellOS calls, the library contained a number of unsatisfactory features. These features will be identified, and a new communication mechanism will be described which uses interrupts and the get
An Efficient Implementation of the Message Passing Interface (MPI) on the Fujitsu AP1000
- Proceedings of the Third Parallel Computing Workshop
, 1994
Cited by 3 (2 self)
The message passing interface standard, released in April 1994 by the MPI Forum [2], defines a set of message passing primitives for multicomputers and clustered systems. The standard provides a large collection of functions, with the aim of providing efficient implementations, source code portability and support for the development of parallel libraries. In this paper, we describe the implementation and performance of MPI on the Fujitsu AP1000. To produce an efficient implementation, the existing operating system had to be modified to better support MPI operations. These modifications are discussed, along with the hardware operations that were utilised. A selective broadcast operation was developed which provided efficient implementations of many of the collective routines regardless of group size. The performance of MPI and CellOS point-to-point and broadcast operations is compared, along with benchmarks of some of the MPI collective routines. More details of the implementation may...
High Performance MPI Implementation On A Network Of Workstations
, 1996
Cited by 1 (0 self)
Despite the emergence of fast LANs, the communication performance available to user applications on workstations still falls short of the level of performance seen on MPPs. A radically new generation of messaging layers is needed to take advantage of the hardware performance and to deliver it to the application level. The thesis presented in this work is that it is possible to achieve high enough application-level performance on a network of workstations to compete with commodity MPPs. The thesis is proved by presenting MPI-FM, a high performance implementation of an industry-standard user-level library realized on a network of workstations, and by comparing the achieved results with those published for the same library on two MPPs. When compared to MPI-F, one of the fastest incarnations of the standard, MPI-FM outperforms MPI-F in latency and bandwidth for messages of 2 KB or less. MPI-FM is a high performance full implementation of MPI for networks of workstations connected with a M...
Mapping MPI to Machine: Implementing the MPI Standard
Device Interface (ADI), which is implemented as compiler macros. The ADI provides four main functions: sending and receiving, data transfer, queuing, and device-dependent operations. The job of the implementor is thus to tailor the lower, device-dependent layer to the target machine; the upper, device-independent layer remains virtually unchanged. Since one of the goals of MPICH is to demonstrate the efficiency of MPI, several optimizations are included. One of them is optimization by message length. Four send protocols are supported. The short send protocol piggybacks the message inside the message envelope. The eager send protocol delivers the message data without waiting for the receiver to request it, on the assumption that the probability is high that the receiver will accept the message. The rendezvous protocol doesn't deliver the data until the receiver explicitly requests it, thus allowing the setup time necessary to send large messages with high bandwidth. And the get protoco...
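The optimization by message length described above can be sketched as a size-based dispatch. This is only an illustration under stated assumptions, not MPICH's actual code: the threshold values, the `SendProtocol` enum and the `choose_protocol` function are hypothetical, and the get protocol is left out because its description is truncated in the text.

```python
from enum import Enum

# Hypothetical size limits in bytes, chosen only for illustration.
# Real implementations tune such limits per device.
SHORT_LIMIT = 1024
EAGER_LIMIT = 128 * 1024


class SendProtocol(Enum):
    SHORT = "short"            # payload piggybacked inside the envelope
    EAGER = "eager"            # data sent before the receiver asks for it
    RENDEZVOUS = "rendezvous"  # data held until the receiver requests it


def choose_protocol(nbytes, short_limit=SHORT_LIMIT, eager_limit=EAGER_LIMIT):
    """Pick a send protocol from the message length alone."""
    if nbytes < 0:
        raise ValueError("message length must be non-negative")
    if nbytes <= short_limit:
        return SendProtocol.SHORT
    if nbytes <= eager_limit:
        return SendProtocol.EAGER
    return SendProtocol.RENDEZVOUS


if __name__ == "__main__":
    for size in (64, 8192, 1_000_000):
        print(size, choose_protocol(size).value)
```

The ordering reflects the trade-off in the text: small messages avoid a separate data transfer, mid-sized messages are sent optimistically, and large messages pay a handshake so the transfer can be set up for high bandwidth.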

