Starfish: Fault-Tolerant Dynamic MPI Programs on Clusters of Workstations (Extended Abstract)

by Adnan M. Agbaria, et al.
Citations: 87 (6 self)

Active Bibliography

Reliability in High Performance Distributed Computing Systems – Adnan Agbaria
546 A Survey of Rollback-Recovery Protocols in Message-Passing Systems – E. N. (Mootaz) Elnozahy, Lorenzo Alvisi, Yi-Min Wang, David B. Johnson - 1996
719 A high-performance, portable implementation of the MPI message passing interface standard – Ewing Lusk, Nathan Doss, Anthony Skjellum - 1996
1076 The physiology of the grid: An open grid services architecture for distributed systems integration – Ian Foster - 2002
537 U-Net: A User-Level Network Interface for Parallel and Distributed Computing – Thorsten von Eicken, Anindya Basu, Vineet Buch, Werner Vogels - 1995
631 Design and Evaluation of a Wide-Area Event Notification Service – Antonio Carzaniga, David S. Rosenblum, Alexander L. Wolf
997 A reliable multicast framework for light-weight sessions and application level framing – Sally Floyd, Van Jacobson, Ching-Gung Liu, Steven McCanne, Lixia Zhang - 1995
627 Virtual Time and Global States of Distributed Systems – Friedemann Mattern - 1988
862 Virtual time – David R. Jefferson - 1985