Closure and Convergence: A Foundation of Fault-Tolerant Computing
IEEE Transactions on Software Engineering, 1993
Cited by 103 (28 self)
We give a formal definition of what it means for a system to "tolerate" a class of "faults". The definition consists of two conditions. First, if a fault occurs when the system state is within a set of "legal" states, the resulting state is within some larger set and, if faults continue occurring, the system state remains within that larger set (Closure). Second, if faults stop occurring, the system eventually reaches a state within the legal set (Convergence). We demonstrate the applicability of our definition for specifying and verifying the fault-tolerance properties of a variety of digital and computer systems. Further, using the definition, we obtain a simple classification of fault-tolerant systems and discuss methods for their systematic design. as traditionally been studied in the context of specifi...
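The two conditions in this abstract can be checked mechanically on a small finite-state system. The following is a minimal sketch, not the paper's formalism: the names `is_closed`, `converges`, and `tolerates`, the dictionary encoding of transitions, and the five-state example are all assumptions made for illustration.

```python
from typing import Dict, Hashable, Set

State = Hashable
Transitions = Dict[State, Set[State]]  # state -> set of possible next states


def is_closed(states: Set[State], *relations: Transitions) -> bool:
    """Closure: no transition in the given relations leaves `states`."""
    return all(
        nxt in states
        for rel in relations
        for s in states
        for nxt in rel.get(s, set())
    )


def converges(span: Set[State], legal: Set[State], program: Transitions) -> bool:
    """Convergence: every fault-free computation from `span` reaches `legal`.

    On a finite system this is a least fixed point: a state is "good" if it
    is legal, or if it has at least one successor and all of its successors
    are already good (so it can neither deadlock nor cycle outside `legal`).
    """
    good = set(legal)
    changed = True
    while changed:
        changed = False
        for s in span - good:
            succ = program.get(s, set())
            if succ and succ <= good:
                good.add(s)
                changed = True
    return span <= good


def tolerates(legal: Set[State], span: Set[State],
              program: Transitions, faults: Transitions) -> bool:
    """Both conditions together, for a candidate legal set and larger set."""
    return (
        legal <= span
        and is_closed(legal, program)         # legal states stay legal without faults
        and is_closed(span, program, faults)  # faults never push the state outside span
        and converges(span, legal, program)   # once faults stop, legal is reached
    )


# Hypothetical example: states 0..4, with {0, 1} legal.
program = {0: {1}, 1: {0}, 2: {1}, 3: {2}, 4: {3}}
faults = {0: {3}, 1: {4}}
legal, span = {0, 1}, {0, 1, 2, 3, 4}
```

With these definitions, `tolerates(legal, span, program, faults)` holds for the example. Replacing the transition out of state 4 with a self-loop breaks convergence while leaving closure intact.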
Constraint Satisfaction as a Basis for Designing Nonmasking Fault-Tolerance
1996
Cited by 23 (9 self)
We present a method for the design of nonmasking fault-tolerant programs. In our method, a set of constraints is associated with each program. As long as faults do not occur, the constraints are continually satisfied under the execution of program actions. Whenever some of the constraints are violated, due to certain faults, all constraints are eventually reestablished by subsequent execution of the program actions. To design such programs, two types of program actions are distinguished: "closure" actions and "convergence" actions. Closure actions are the actions that perform the intended computation of the program when all of the constraints are satisfied. Convergence actions are the actions that reestablish the constraints when they have been violated. Sufficient conditions for the validation of closure and convergence actions are formalized in terms of a "constraint graph". These conditions are illustrated by designing nonmasking fault-tolerant programs for diffusing computations, ...
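The split between closure and convergence actions can be illustrated with a toy program. This is a sketch under stated assumptions, not one of the paper's designs: the constraints (`a[i] <= a[i + 1]` for each adjacent pair), the repair rule, and all function names are hypothetical. The example is chosen so that repairing one constraint can disturb only the next constraint in the chain, which is why repeated repairs terminate here.

```python
from typing import List


def violated(a: List[int]) -> List[int]:
    """Indices i whose constraint a[i] <= a[i + 1] currently fails."""
    return [i for i in range(len(a) - 1) if a[i] > a[i + 1]]


def closure_action(a: List[int]) -> None:
    """Intended computation (hypothetical): bump the last element.

    Executed only when every constraint holds, and it preserves all of them.
    """
    a[-1] += 1


def convergence_action(a: List[int], i: int) -> None:
    """Reestablish constraint i by raising a[i + 1] to a[i].

    This never breaks constraints 0..i; it can break only constraint i + 1.
    """
    a[i + 1] = a[i]


def step(a: List[int]) -> None:
    """One program step: repair the lowest violated constraint, else compute."""
    bad = violated(a)
    if bad:
        convergence_action(a, bad[0])
    else:
        closure_action(a)


def run(a: List[int], steps: int) -> List[int]:
    """Execute `steps` program steps in place and return the list."""
    for _ in range(steps):
        step(a)
    return a


# A fault has corrupted the state: constraints 0 and 2 are violated.
state = [5, 1, 3, 2]
run(state, 3)   # three convergence actions restore every constraint
run(state, 1)   # the next step is an ordinary closure action
```

Because the lowest violated index strictly increases after each repair, a list of length n needs at most n - 1 convergence actions before closure actions resume.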
A Self-Stabilizing Leader Election Algorithm for Tree Graphs
Journal of Parallel and Distributed Computing, 1996
Cited by 12 (2 self)
We propose a self-stabilizing algorithm (protocol) for leader election in a tree graph. We show the correctness of the proposed algorithm by using a new technique involving induction.
1 Introduction
In a distributed system, the computing elements or nodes exchange information only by message passing. Every node has a set of local variables whose contents specify the local state of the node. The state of the entire system, called the global state, is the union of the local states of all the nodes in the system. Each node is allowed to have only a partial view of the global state, and this view depends on the connectivity of the system and the propagation delay of different messages. Yet the objective in a distributed system is to arrive at a desirable global final state (legitimate state), defined by some invariance relation on the global state. Systems that reach the legitimate state starting from any arbitrary (possibly illegitimate) state in a finite number of steps are called self-stabil...

