Results 1 - 3 of 3
Closure and Convergence: A Foundation of Fault-Tolerant Computing
IEEE Transactions on Software Engineering, 1993
Abstract

Cited by 133 (30 self)
We give a formal definition of what it means for a system to "tolerate" a class of "faults". The definition consists of two conditions: One, if a fault occurs when the system state is within a set of "legal" states, the resulting state is within some larger set and, if faults continue occurring, the system state remains within that larger set (Closure). And two, if faults stop occurring, the system eventually reaches a state within the legal set (Convergence). We demonstrate the applicability of our definition for specifying and verifying the fault-tolerance properties of a variety of digital and computer systems. Further, using the definition, we obtain a simple classification of fault-tolerant systems and discuss methods for their systematic design. as traditionally been studied in the context of specifi...
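The two conditions in this abstract can be checked mechanically on a small finite-state system. The following is a minimal sketch, not the paper's own formalism: the names `reachable`, `is_closed`, and `converges`, and the toy counter system, are assumptions made for illustration. It also ignores fairness, so convergence here means that every program-only execution from the larger set reaches the legal set.

```python
from typing import Callable, Iterable, List, Set

State = int
Step = Callable[[State], Iterable[State]]  # successor relation (hypothetical type)


def reachable(start: Set[State], steps: List[Step]) -> Set[State]:
    """States reachable from `start` using any of the given step relations."""
    seen = set(start)
    frontier = list(start)
    while frontier:
        s = frontier.pop()
        for step in steps:
            for t in step(s):
                if t not in seen:
                    seen.add(t)
                    frontier.append(t)
    return seen


def is_closed(states: Set[State], steps: List[Step]) -> bool:
    """Closure: no step leads from inside `states` to outside it."""
    return all(t in states for s in states for step in steps for t in step(s))


def converges(span: Set[State], legal: Set[State], program: Step) -> bool:
    """Convergence (finite systems, no fairness assumption): once faults stop,
    every program-only execution starting in `span` reaches `legal`."""
    good = set(legal)
    changed = True
    while changed:
        changed = False
        for s in span - good:
            successors = list(program(s))
            # s is good if it can move and every move lands in a good state.
            if successors and all(t in good for t in successors):
                good.add(s)
                changed = True
    return span <= good


# Toy system (an assumption for illustration): a counter in 0..MAX.
MAX = 3

def program(x: State) -> List[State]:
    return [max(x - 1, 0)]              # program counts down and rests at 0

def fault(x: State) -> List[State]:
    return [x + 1] if x < MAX else []   # a fault bumps the counter up

LEGAL = {0}
FAULT_SPAN = reachable(LEGAL, [program, fault])  # the "larger set"

legal_closed = is_closed(LEGAL, [program])
span_closed = is_closed(FAULT_SPAN, [program, fault])
span_converges = converges(FAULT_SPAN, LEGAL, program)
```

In this toy system the larger set is computed as everything reachable from the legal states under program and fault steps together; other choices of the larger set are possible as long as both conditions hold.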
Self-stabilizing mutual exclusion in the presence of faulty nodes
In Proceedings of the 25th International Symposium on Fault-Tolerant Computing, Digest of Papers, 1995
A Wait-Free Sorting Algorithm
1997
Abstract

Cited by 6 (0 self)
Sorting is one of a set of fundamental problems in computer science. In this paper we present the first wait-free algorithm for sorting an input array of size $N$ using $P \le N$ processors to achieve optimal running time. We show two variants of the algorithm, one deterministic and one randomized, and prove that, with high probability, the latter suffers no more than $O(\sqrt{P})$ contention when run synchronously. Known sorting algorithms, when made wait-free through previously established transformation techniques, have complexity $O(\log^3 N)$. The algorithm we present here, when run in the CRCW PRAM model, executes with high probability in optimal $O(\log N)$ time when $P = N$, and $O(N \log N / P)$ otherwise. The wait-free property guarantees that the sort will complete despite any delays or failures incurred by the processors. This is a very desirable property from an operating systems point of view, since it allows oblivious thread scheduling as well as thread creation and deletion, w...