We introduce the concept of unreliable failure detectors and study how they can be used to solve Consensus in asynchronous systems with crash failures. We characterise unreliable failure detectors in terms of two properties: completeness and accuracy. We show that Consensus can be solved even with unreliable failure detectors that make an infinite number of mistakes, and determine which ones can be used to solve Consensus despite any number of crashes, and which ones require a majority of correct processes. We prove that Consensus and Atomic Broadcast are reducible to each other in asynchronous systems with crash failures; thus the above results also apply to Atomic Broadcast. A companion paper shows that one of the failure detectors introduced here is the weakest failure detector for solving Consensus [Chandra et al. 1992].
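To make the two properties concrete, the following is a minimal sketch of one way an unreliable failure detector might be realised, assuming heartbeat messages and per-process timeouts. The abstract does not prescribe any implementation; the class `TimeoutFailureDetector`, its methods, and the timeout-doubling rule are all hypothetical names and choices made for illustration. A process that stops sending heartbeats ends up suspected for good (a form of completeness), while a slow but live process may be suspected by mistake and later cleared (imperfect accuracy).

```python
class TimeoutFailureDetector:
    """Hypothetical heartbeat-based detector (an assumption, not from the paper).

    It may wrongly suspect a live process and retract the suspicion later.
    """

    def __init__(self, processes, initial_timeout=1.0):
        # Per-process timeout, last heartbeat time, and current suspect set.
        self.timeout = {p: initial_timeout for p in processes}
        self.last_heard = {p: 0.0 for p in processes}
        self.suspected = set()

    def on_heartbeat(self, p, now):
        """Record a heartbeat from p received at time `now`."""
        self.last_heard[p] = now
        if p in self.suspected:
            # A mistake: p was suspected but is alive.
            # Retract the suspicion and wait longer next time.
            self.suspected.discard(p)
            self.timeout[p] *= 2

    def check(self, now):
        """Suspect every process that has been silent longer than its timeout."""
        for p, heard in self.last_heard.items():
            if now - heard > self.timeout[p]:
                self.suspected.add(p)
        return set(self.suspected)


fd = TimeoutFailureDetector(["p", "q"])
fd.on_heartbeat("p", 0.5)
fd.on_heartbeat("q", 0.5)
first = fd.check(2.0)      # both have been silent for 1.5 > 1.0
fd.on_heartbeat("p", 2.1)  # p was only slow, so its suspicion is retracted
second = fd.check(2.5)     # q stays suspected; p is trusted again
```

In this run `q` plays the role of a crashed process and `p` the role of a slow one. Doubling the timeout after each mistake is one common heuristic for reducing false suspicions over time; whether mistakes eventually stop depends on timing assumptions that the sketch does not model.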
|
1089
|
Impossibility of Distributed Consensus with One Faulty Process
– Fischer, Lynch, et al.
- 1985
|
|
867
|
The Byzantine Generals Problem
– Lamport, Shostak, et al.
- 1982
|
|
589
|
Implementing fault-tolerant services using the state machine approach: a tutorial
– Schneider
- 1990
|
|
455
|
Reliable Communication in the Presence of Failures
– Birman, Joseph
- 1987
|
|
397
|
Knowledge and common knowledge in a distributed environment
– Halpern, Moses
- 1990
|
|
374
|
Reaching agreement in the presence of faults
– Pease, Shostak, et al.
- 1980
|
|
343
|
Reliable broadcast protocols
– Chang, Maxemchuk
- 1984
|
|
326
|
Transis: A communication subsystem for high availability
– Amir, Dolev, et al.
- 1992
|
|
307
|
Consensus in the Presence of Partial Synchrony
– Dwork, Lynch, et al.
- 1988
|
|
298
|
The weakest failure detector for solving Consensus
– Chandra, Hadzilacos, et al.
- 1996
|
|
265
|
Fault-tolerant broadcasts and related problems
– Hadzilacos, Toueg
- 1993
|
|
212
|
Preserving and using context information in interprocess communication
– Peterson, Bucholz, et al.
- 1989
|
|
206
|
Atomic broadcast: From simple message diffusion to Byzantine agreement
– Cristian, Aghili, et al.
- 1985
|
|
194
|
On the minimal synchronism needed for distributed consensus
– Dolev, Dwork, et al.
- 1987
|
|
153
|
Another advantage of free choice: completely asynchronous agreement protocols (extended abstract)
– Ben-Or
- 1983
|
|
153
|
Using process groups to implement failure detection in asynchronous environments
– Ricciardi, Birman
- 1991
|
|
150
|
Delta-4: A Generic Architecture for Dependable Distributed Computing
– Powell
- 1991
|
|
132
|
Memory requirements for agreement among unreliable asynchronous processes
– Loui, Abu-Amara
- 1987
|
|
122
|
A modular approach to fault-tolerant broadcasts and related problems
– Hadzilacos, Toueg
- 1994
|
|
116
|
Asynchronous consensus and broadcast protocols
– Bracha, Toueg
- 1985
|
|
94
|
The Consensus Problem in Unreliable Distributed Systems (A Brief Survey)
– Fischer
- 1983
|
|
78
|
Reaching approximate agreement in the presence of faults
– Dolev, Lynch, et al.
- 1986
|
|
72
|
SIFT: Design and analysis of a fault-tolerant computer for aircraft control
– Wensley
|
|
69
|
Revisiting the relationship between non blocking atomic commitment and consensus
– Guerraoui
- 1995
|
|
59
|
The implementation of reliable distributed multiprocess systems
– Lamport
- 1978
|
|
58
|
Automatically increasing the fault-tolerance of distributed algorithms
– Neiger, Toueg
- 1990
|
|
47
|
Randomization in Byzantine Agreement
– Chor, Dwork
- 1989
|
|
45
|
Fault-tolerance in the advanced automation system
– Cristian, Dancey, et al.
- 1990
|
|
40
|
Bounds on the time to reach agreement in the presence of timing uncertainty
– Attiya, Dwork, et al.
- 1991
|
|
34
|
Achievable cases in an asynchronous environment
– Attiya, Bar-Noy, et al.
- 1987
|
|
34
|
Using failure detectors to solve consensus in asynchronous shared-memory systems
– Lo, Hadzilacos
- 1994
|
|
29
|
A combinatorial characterization of the distributed tasks that are solvable in the presence of one faulty processor
– Biran, Moran, et al.
- 1988
|
|
26
|
Towards Optimal Distributed Consensus
– Berman, Garay, et al.
- 1989
|
|
25
|
Election vs. consensus in asynchronous systems
– Sabel, Marzullo
- 1995
|
|
24
|
Cheating husbands and other stories: a case study of knowledge, action, and communication
– Moses, Dolev, et al.
- 1986
|
|
19
|
Fault-tolerant decision making in totally asynchronous distributed systems
– Bridgland, Watro
- 1987
|
|
18
|
Time and message efficient reliable broadcasts
– Chandra, Toueg
- 1990
|
|
17
|
Reliable scheduling in a TMR database system
– Pittelli, Garcia-Molina
- 1989
|
|
16
|
A new solution for the Byzantine generals problem
– Reischuk
- 1982
|
|
16
|
A combinatorial characterization of the distributed tasks that are solvable in the presence of one faulty processor
– Biran, Moran, et al.
- 1988
|
|
15
|
Impossibility of group membership in asynchronous systems
– Chandra, Hadzilacos, et al.
- 1995
|
|
12
|
Early-delivery atomic broadcast
– Gopal, Strong, et al.
- 1990
|
|
12
|
Failure detectors and the wait-free hierarchy
– Neiger
- 1995
|
|
9
|
The Amoeba Distributed operating system: Selected papers
– Mullender
- 1987
|
|
8
|
Issues in the design of highly available computing services
– Cristian
- 1987
|
|
5
|
Isis - A Distributed Programming Environment
– Birman
- 1990
|
|
3
|
Early-stopping distributed bidding and applications
– Budhiraja, Gopal, et al.
- 1990
|
|
1
|
E-mail correspondence. Showed that ◇W cannot be used to solve non-blocking atomic commit
– Chandra, Larrea
- 1994
|
|
1
|
Time and message efficient reliable broadcasts
– Chandra, Toueg
- 1990
|
|
1
|
Achievable cases in an asynchronous environment
– Attiya, Bar-Noy, et al.
- 1987
|