A Component Architecture for LAM/MPI (2003)

by Jeffrey M. Squyres, Andrew Lumsdaine
Venue: In Proceedings, 10th European PVM/MPI Users’ Group Meeting, number 2840 in Lecture Notes in Computer Science
Citations: 63 - 11 self

Active Bibliography

1 Parallel Checkpoint/Restart for MPI Applications – Sriram Sankaran, Jeffrey M. Squyres, Brian Barrett, Andrew Lumsdaine
67 The LAM/MPI checkpoint/restart framework: System-initiated checkpointing – Sriram Sankaran, Jeffrey M. Squyres, Brian Barrett, Andrew Lumsdaine - 2003
LAM/MPI Installation Guide Version 7.1.1 – The LAM/MPI Team, Open Systems Lab - 2004
LAM/MPI Installation Guide Version 7.1.2 – The LAM/MPI Team, Open Systems Lab
8 Blocking vs. non-blocking coordinated checkpointing for large-scale fault tolerant MPI – Camille Coti, Thomas Herault, Pierre Lemarinier, Ala Rezmerita, Eric Rodriguez - 2006
2 Towards MPI progression layer elimination with TCP and SCTP – Bradley Thomas Penoff - 2006
16 Improved message logging versus improved coordinated checkpointing for fault tolerant MPI – Pierre Lemarinier, Aurelien Bouteiller, Thomas Herault, Geraud Krawezik - 2004
5 MPICH-V Project: a Multiprotocol Automatic Fault Tolerant MPI – Aurelien Bouteiller, Franck Cappello, Thomas Herault, Geraud Krawezik, Pierre Lemarinier, Frederic Magniette
3 Interconnect agnostic checkpoint/restart in Open MPI – Joshua Hursey, Timothy I. Mattox, Andrew Lumsdaine - 2009
22 The Component Architecture of Open MPI: Enabling Third-Party Collective Algorithms – Jeffrey M. Squyres, Andrew Lumsdaine - 2004
2 Improving the Communication Subsystem Performance of WARPED – Umesh Kumar V. Rajasekaran - 1998
208 MPICH-G2: A Grid-Enabled Implementation of the Message Passing Interface – Nicholas T. Karonis, Brian Toonen, Ian Foster - 2002
2 Implementing High-Level Parallelism on Computational GRIDs – Abdallah Deeb I. Al Zain (supervisors: Phil Trinder, Greg Michaelson) - 2006
LAM/MPI User's Guide – The LAM/MPI Team - 2004
6 A Checkpoint and Restart Service Specification for Open MPI – Joshua Hursey, Jeffrey M. Squyres, Andrew Lumsdaine - 2006
Transparent Fault Tolerance for Job Healing in HPC Environments – Chao Wang
15 A Job Pause Service under LAM/MPI+BLCR for Transparent Fault Tolerance – Chao Wang, Frank Mueller, Christian Engelmann, Stephen L. Scott - 2007
2 Recent Advances in Checkpoint/Recovery Systems – Greg Bronevetsky, Rohit Fern, Daniel Marques, Keshav Pingali, Paul Stodghill
1 Improving MPI Multicast Performance over Grid Environment using Intelligent Message Scheduling – Theewara Vorakosit, Putchong Uthayopas - 2004