An MPI Implementation of the BLACS
- in Proc. 2nd MPI Developers Conf., (MPIDC'96, Notre
, 1996
Abstract
Cited by 1 (0 self)
An MPI implementation of the Basic Linear Algebra Communication Subprograms (BLACS) is presented. A wide spectrum of MPI functionality has been used to implement the BLACS as succinctly as possible, making the implementation concise while still yielding good performance. We discuss some of the implementation details and present performance results for several parallel architectures with different MPI libraries. Finally, we gather our experiences in using MPI and make some suggestions for future functionality.
1. Introduction
In this paper an MPI [9] implementation of the Basic Linear Algebra Communication Subprograms (BLACS) is presented. The BLACS are message-passing routines that communicate matrices among processes arranged in a two-dimensional virtual process topology. They form the basic communication layer for ScaLAPACK [2, 1]. MPI provides the most suitable message-passing layer for the BLACS, since it is widely available, has high-level functionality to support the BLACS communication ...
An MPI Implementation of the BLACS
1996
Abstract
In this report, an MPI implementation of the Basic Linear Algebra Communication Subprograms (BLACS) is presented. A wide spectrum of MPI functionality has been used to implement the BLACS as succinctly as possible, making the implementation concise while still yielding good performance. We discuss some of the implementation details and present results for several architectures with different MPI libraries. Finally, we gather our experiences in using MPI and make some suggestions for future functionality in MPI-2. The MPI-BLACS library is available free under copyright for research purposes.
Keywords. MPI, BLACS, parallel architectures, libraries
This work was performed as part of the Joint CSCS/NEC Collaboration in Parallel Processing and will be presented at the MPI Developers Conference, 1996.
1 Swiss Center for Scientific Computing (CSCS/SCSC-ETH), Via Cantonale, CH-6928 Manno, Switzerland vaibhav@cscs.ch and sawyer@cscs.ch
2 Dept. of Computer Science, University of W...
The Performance of Fast Givens Rotations Problem Implemented with MPI Extensions in Multicomputers
Abstract
In this paper, issues related to implementing an MPI version of the fast Givens rotations problem are investigated. We have chosen this algorithm because it has no predictable communication pattern. The Message Passing Interface (MPI) is an attempt to standardise the communication library for distributed-memory computing systems. The message-passing paradigm is attractive because of its wide portability and scalability. It is compatible with distributed-memory multicomputers, shared-memory multiprocessors, networks of workstations (NOWs), and combinations of these. Currently, there are several commercial and free, public-domain implementations of MPI. We have chosen MPICH, the most common implementation of MPI. In this paper we present the MPI algorithm for the fast Givens rotations and give some preliminary performance results on a network of personal computers. Our results also point out the strengths and weaknesses of the implementation.

