A Component Architecture for LAM/MPI (2003)

by Jeffrey M. Squyres, Andrew Lumsdaine
Venue: In Proceedings, 10th European PVM/MPI Users’ Group Meeting, number 2840 in Lecture Notes in Computer Science
Citations: 75 (11 self)

Documents Related by Co-Citation

721 A high-performance, portable implementation of the MPI message passing interface standard – Ewing Lusk, Nathan Doss, Anthony Skjellum - 1996
152 Open MPI: Goals, concept, and design of a next generation MPI implementation – Edgar Gabriel, Graham E. Fagg, George Bosilca, Thara Angskun, Jack J. Dongarra, Jeffrey M. Squyres, Vishal Sahay, Prabhanjan Kambadur, Brian Barrett, Andrew Lumsdaine, Ralph H. Castain, David J. Daniel, Richard L. Graham, Timothy S. Woodall - 2004
212 LAM: An open cluster environment for MPI – G Burns, R Daoud, J Vaigl - 1994
64 A Network-Failure-tolerant Message-Passing system for Terascale Clusters – Richard L. Graham, Sung-eun Choi, David J. Daniel, Nehal N. Desai, Ronald G. Minnich, Craig E. Rasmussen, L. Dean Risinger, Mitchel W. Sukalski - 2003
32 HARNESS and fault tolerant MPI – Graham E. Fagg, Antonin Bukovsky, Jack J. Dongarra - 2001
84 The LAM/MPI checkpoint/restart framework: System-initiated checkpointing – Sriram Sankaran, Jeffrey M. Squyres, Brian Barrett, Andrew Lumsdaine - 2003
20 Architecture of LA-MPI, a network-fault-tolerant MPI – Rob T. Aulwes, David J. Daniel, Nehal N. Desai, Richard L. Graham, L. Dean Risinger, Mark A. Taylor, Timothy S. Woodall - 2004
83 The design and implementation of Berkeley Lab’s Linux Checkpoint/Restart – Jason Duell - 2003
114 MPICH-V: Toward a Scalable Fault Tolerant MPI for Volatile Nodes – George Bosilca, Aurelien Bouteiller, Franck Cappello, Samir Djailali, Gilles Fedak, Cecile Germain, Thomas Herault, Pierre Lemarinier, Oleg Lodygensky, Frederic Magniette, Vincent Neri, Anton Selikhov - 2002
101 FT-MPI: Fault Tolerant MPI, supporting dynamic applications in a dynamic world – Graham E. Fagg, Jack J. Dongarra - 2000
271 Libckpt: Transparent Checkpointing under Unix – James S. Plank, Micah Beck, Gerry Kingsley, Kai Li - 1995
196 CoCheck: Checkpointing and Process Migration for MPI – Georg Stellner - 1996
151 MagPIe: MPI’s Collective Communication Operations for Clustered Wide Area Systems – Thilo Kielmann, Rutger F. H. Hofman, Henri E. Bal, Aske Plaat, Raoul A. F. Bhoedjang - 1999
18 Towards efficient execution of MPI applications on the grid: Porting and optimization issues – Rainer Keller, Edgar Gabriel, Bettina Krammer, Matthias S. Müller, Michael M. Resch - 2003
14 TEG: A high-performance, scalable, multi-network point-to-point communications methodology – T. S. Woodall, R. L. Graham, R. H. Castain, D. J. Daniel, M. W. Sukalski, E. Gabriel, G. Bosilca, T. Angskun, J. J. Dongarra, J. M. Squyres, P. Kambadur, B. Barrett, A. Lumsdaine - 2004
542 A Survey of Rollback-Recovery Protocols in Message-Passing Systems – E. N. (Mootaz) Elnozahy, Lorenzo Alvisi, Yi-min Wang, David B. Johnson - 1996
78 Automated application-level checkpointing of MPI programs – Greg Bronevetsky, Daniel Marques, Keshav Pingali, Paul Stodghill - 2003
26 Collective Operations in an Application-level Fault Tolerant MPI System – Greg Bronevetsky, Daniel Marques, Keshav Pingali, Paul Stodghill - 2003
16 Proactive fault tolerance in large systems – S Chakravorty, C Mendes, L Kale - 2005