Results 11 - 20 of 407
Design and Implementation of MPICH2 over InfiniBand with RDMA Support
2003
"... For several years, MPI has been the de facto standard for writing parallel applications. One of the most popular MPI implementations is MPICH. Its successor, MPICH2, features a completely new design that provides more performance and flexibility. To ensure portability, it has a hierarchical structure based on which porting can be done at different levels. In this paper, we present our experiences designing and implementing MPICH2 over InfiniBand. Because of its high performance and open standard, InfiniBand is gaining popularity in the area of high-performance computing. Our study focuses ..."
Designing Passive Synchronization for MPI-2 One-Sided Communication to Maximize Overlap
Cited by 7 (4 self)
"... Scientific computing has seen immense growth in recent years. MPI has become the de facto standard parallel programming model for distributed memory systems. The MPI-2 standard also introduced the one-sided programming model. Computation and communication overlap is an important goal for one-sided applications. While the passive synchronization mechanism for MPI-2 one-sided communication allows for good overlap, the actual overlap achieved is often limited by the design of both the MPI library and the application. In this paper we aim to improve the performance of MPI-2 one-sided communication ..."
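For readers unfamiliar with the passive mode this abstract refers to: in MPI-2, passive-target epochs are bracketed by MPI_Win_lock and MPI_Win_unlock, and the overlap being maximized comes from computing between those two calls while the RMA operation is (potentially) in flight. A minimal sketch, assuming a run with at least two processes and a hypothetical compute_locally() placeholder:

    #include <mpi.h>

    /* Hypothetical local work overlapped with the in-flight put. */
    static void compute_locally(void) { /* ... application computation ... */ }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        double buf[128] = {0};          /* window memory on every process */
        MPI_Win win;
        MPI_Win_create(buf, sizeof(buf), sizeof(double),
                       MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        if (rank == 0 && size > 1) {
            double src[128] = {0};
            /* Passive-target epoch: rank 1 makes no matching MPI call. */
            MPI_Win_lock(MPI_LOCK_SHARED, 1, 0, win);
            MPI_Put(src, 128, MPI_DOUBLE, 1, 0, 128, MPI_DOUBLE, win);
            compute_locally();          /* the overlap opportunity */
            MPI_Win_unlock(1, win);     /* completes the put at the origin */
        }

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }

Whether any overlap actually occurs is up to the MPI library; many implementations defer the transfer until MPI_Win_unlock, which is precisely the limitation this paper targets.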
Performance of HPC Middleware over InfiniBand WAN
Cited by 3 (1 self)
"... High performance interconnects such as InfiniBand (IB) have enabled large scale deployments of High Performance Computing (HPC) systems. High performance communication and IO middleware such as MPI and NFS over RDMA have also been redesigned to leverage the performance of these modern interconnects. ..."
RDMA-Based Job Migration Framework for MPI over InfiniBand
Cited by 3 (0 self)
"... Coordinated checkpoint and recovery is a common approach to achieve fault tolerance on large-scale systems. The traditional mechanism dumps the process images of all the processes involved in the parallel job to a local disk or a central storage area. When a failure occurs, the processes are ... migrated from the health-deteriorating node to a healthy spare node and resumed from the spare node. RDMA-based process image transmission is designed to take advantage of high performance communication in InfiniBand. Experimental results show that the Job Migration scheme can achieve a speedup of 4 ..."
Natively supporting true one-sided communication in MPI on multi-core systems with InfiniBand
In Proceedings of the 9th International Symposium on Cluster Computing and the Grid (CCGrid), 2009
Cited by 3 (1 self)
"... As high-end computing systems continue to grow in scale, the performance that applications can achieve on such large scale systems depends heavily on their ability to avoid explicitly synchronized communication with other processes in the system. Accordingly, several modern and legacy parallel programming ... high-speed networks such as InfiniBand (IB) to allow for true one-sided communication in MPI. In this paper, we extend this work to natively take advantage of one-sided atomic operations on cache-coherent multi-core/multi-processor architectures while still utilizing the benefits of networks such as IB. ..."
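By way of illustration, the one-sided atomic operations the excerpt mentions can be expressed through the MPI-2 interface with MPI_Accumulate, which the standard requires to be applied atomically per basic element; whether that atomicity is realized with network atomics or with processor atomics on a cache-coherent node is the design space this paper explores. A minimal sketch, with the window layout chosen arbitrarily:

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        long counter = 0;               /* one shared counter per rank */
        MPI_Win win;
        MPI_Win_create(&counter, sizeof(long), sizeof(long),
                       MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        /* Every rank atomically adds 1 to the counter owned by rank 0. */
        long one = 1;
        MPI_Win_lock(MPI_LOCK_SHARED, 0, 0, win);
        MPI_Accumulate(&one, 1, MPI_LONG, 0, 0, 1, MPI_LONG, MPI_SUM, win);
        MPI_Win_unlock(0, win);

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }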
Scalable Startup of Parallel Programs over InfiniBand
2004
Cited by 4 (0 self)
"... Fast and scalable process startup is one of the major challenges in parallel computing over large scale clusters. The startup of a parallel job typically can be divided into two phases: process initiation and connection setup. Both of these phases can become performance bottlenecks. In this paper, we characterize the startup of MPI programs in InfiniBand clusters and identify two startup scalability issues: serialized process initiation in the initiation phase and high communication overhead in the connection setup phase. We propose different approaches to reduce communication overhead ..."
High Performance Implementation of MPI Derived Datatype Communication over InfiniBand
In Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS), 2004
"... MPI derived datatypes are a powerful method to define arbitrary collections of noncontiguous data in memory and to enable noncontiguous data communication in a single MPI function call. It can be expected that MPI derived datatypes could become a key aid in application development. In practice, however ..."
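As a concrete illustration of the mechanism, a derived datatype lets one MPI call move noncontiguous data, for example one column of a row-major matrix described with MPI_Type_vector. A minimal sketch, with the matrix shape and ranks chosen arbitrarily:

    #include <mpi.h>

    #define ROWS 4
    #define COLS 8

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        double a[ROWS][COLS];

        /* One column = ROWS blocks of 1 double, strided COLS apart. */
        MPI_Datatype column;
        MPI_Type_vector(ROWS, 1, COLS, MPI_DOUBLE, &column);
        MPI_Type_commit(&column);

        if (rank == 0 && size > 1) {
            for (int i = 0; i < ROWS; i++)
                for (int j = 0; j < COLS; j++)
                    a[i][j] = i * COLS + j;
            /* Column 2 leaves in a single call, no manual packing. */
            MPI_Send(&a[0][2], 1, column, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            double col[ROWS];           /* arrives contiguously */
            MPI_Recv(col, ROWS, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        }

        MPI_Type_free(&column);
        MPI_Finalize();
        return 0;
    }

How the library packs such a type, or avoids packing via gather/scatter DMA, is exactly where InfiniBand-specific optimizations of the kind this paper studies come in.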
A Portable InfiniBand Module for MPICH2/Nemesis: Design and Evaluation
"... With the emergence of multi-core-based processors, it is becoming significantly important to optimize both intra-node and inter-node communication in an MPI stack. The MPICH2 group has recently introduced a new Nemesis-based MPI stack which provides a highly optimized design for intra-node communication. It also provides a modular design for different inter-node networks. Currently, the MPICH2/Nemesis stack has support for TCP/IP and Myrinet only. The TCP/IP interface allows this stack to run on the emerging InfiniBand network with IPoIB support. However, this approach does not deliver good performance ..."
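The modular design mentioned in the excerpt is, in outline, a per-network table of operations behind a fixed upper-layer interface, so a new network module can be slotted in without touching the intra-node path. The sketch below only illustrates that pattern; the identifiers are hypothetical, not the actual MPICH2/Nemesis netmod API:

    #include <stdio.h>
    #include <stddef.h>

    /* Hypothetical per-network operation table. */
    typedef struct {
        const char *name;
        int (*init)(void);
        int (*send)(int dest, const void *buf, size_t len);
    } netmod_ops;

    /* Stub "tcp" module standing in for a real backend. */
    static int tcp_init(void) { puts("tcp: init"); return 0; }
    static int tcp_send(int dest, const void *buf, size_t len)
    {
        (void)buf;
        printf("tcp: send %zu bytes to rank %d\n", len, dest);
        return 0;
    }
    static const netmod_ops tcp_mod = { "tcp", tcp_init, tcp_send };

    /* Upper layer: picks one module, calls through the table. */
    int main(void)
    {
        const netmod_ops *active = &tcp_mod; /* an IB module would slot in here */
        active->init();
        active->send(1, "hello", 5);
        return 0;
    }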
High performance checksum computation for fault-tolerant MPI over InfiniBand
In Proceedings of the 19th European MPI Users' Group Meeting (EuroMPI), 2012
Cited by 1 (0 self)
"... With the increase in the number of nodes in clusters, the probability of failures and unusual events increases. In this paper, we present checksum mechanisms to detect data corruption. We study the impact of checksums on network communication performance and we propose a mechanism to amortize their cost on InfiniBand. We have implemented our mechanisms in the NEWMADELEINE communication library. Our evaluation shows that our mechanisms to ensure message integrity do not noticeably impact application performance, which is an improvement over state-of-the-art MPI implementations. ..."
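The excerpt does not name the checksum algorithms studied, so the following is only a generic sketch of the integrity scheme such work builds on: compute a checksum at the sender, recompute at the receiver, compare. The bitwise CRC-32 here is for illustration; a high-performance library would use a table-driven or hardware-assisted CRC, and amortizing the cost over InfiniBand traffic is the paper's contribution, not shown:

    #include <stdint.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Bitwise CRC-32 (IEEE polynomial, reflected). Illustration only. */
    static uint32_t crc32(const void *data, size_t len)
    {
        const uint8_t *p = data;
        uint32_t crc = 0xFFFFFFFFu;
        for (size_t i = 0; i < len; i++) {
            crc ^= p[i];
            for (int b = 0; b < 8; b++)
                crc = (crc >> 1) ^ (0xEDB88320u & (uint32_t)-(int32_t)(crc & 1));
        }
        return ~crc;
    }

    int main(void)
    {
        char msg[] = "payload to protect in flight";

        uint32_t sent = crc32(msg, sizeof msg);  /* sender side */
        /* ... message and checksum cross the network here ... */
        uint32_t got = crc32(msg, sizeof msg);   /* receiver side */

        puts(got == sent ? "integrity OK" : "corruption detected");
        return 0;
    }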
UPC Queues for Scalable Graph Traversals: Design and Evaluation on InfiniBand Clusters
"... PGAS languages like UPC are growing in popularity because of their ability to provide a shared memory programming model over distributed memory machines. While this abstraction provides better programmability, some applications require mutual exclusion when operating on shared data. Locks are ... benchmark over the current version are about 14% and 10% for similar scale runs, respectively. Our work is based on the Berkeley UPC Runtime and the Unified Communication Runtime (UCR) for UPC and MPI, developed at The Ohio State University. ..."