Impossibility of Distributed Consensus with One Faulty Process
, 1985
Cited by 1300 (28 self)
The consensus problem involves an asynchronous system of processes, some of which may be unreliable. The problem is for the reliable processes to agree on a binary value. We show that every protocol for this problem has the possibility of nontermination, even with only one faulty process. By way of contrast, solutions are known for the synchronous case, the "Byzantine Generals" problem.
The Consensus Problem in Unreliable Distributed Systems (A Brief Survey)
, 2000
Cited by 102 (2 self)
Agreement problems involve a system of processes, some of which may be faulty. A fundamental problem of fault-tolerant distributed computing is for the reliable processes to reach a consensus. We survey the considerable literature on this problem that has developed over the past few years and give an informal overview of the major theoretical results in the area.
Bounds on the Time to Reach Agreement in the Presence of Timing Uncertainty (Extended Abstract)
, 1991
Cited by 41 (5 self)
Upper and lower bounds are proved for the real time complexity of the problem of reaching agreement in a distributed network, in the presence of process failures and inexact information about time. It is assumed that the amount of (real) time between any two consecutive steps of any nonfaulty process is at least c1 and at most c2; thus, C = c2/c1 is a measure of the timing uncertainty. It is also assumed that the time for message delivery is at most d. Processes are assumed to fail by stopping, so that process failures can be detected by timeouts. Let T denote...
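As a rough illustration of why the ratio C = c2/c1 matters when failures are detected by timeouts, the sketch below shows a process that can only measure time by counting its own steps. It is a reading of the stated assumptions, not the paper's algorithm or its bounds; the function names `steps_to_cover` and `worst_case_wait` are hypothetical.

```python
import math


def steps_to_cover(duration: float, c1: float) -> int:
    """Number of local steps a process must count to be sure that at least
    `duration` units of real time have passed, given that every step takes
    at least c1 (assumption taken from the abstract)."""
    return math.ceil(duration / c1)


def worst_case_wait(duration: float, c1: float, c2: float) -> float:
    """Longest real time those counted steps can take, given that every
    step takes at most c2. For durations that are multiples of c1 this is
    (c2 / c1) * duration, i.e. C times the intended wait."""
    return steps_to_cover(duration, c1) * c2


if __name__ == "__main__":
    c1, c2, d = 1.0, 3.0, 10.0  # illustrative values, not from the paper
    print(steps_to_cover(d, c1))       # steps counted before timing out
    print(worst_case_wait(d, c1, c2))  # real time the timeout may consume
```

Under these assumptions, a timeout meant to cover the message delay d can cost up to C times d in real time, which is one way the timing uncertainty enters the running time.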
Method for Distributed Transaction Commit and Recovery Using Byzantine Agreement Within Clusters of Processors
- In Proceedings of the Second Annual ACM Symposium on Principles of Distributed Computing
, 1983
Cited by 21 (2 self)
to distributed transaction commit. We replace the second phase of one of the commit algorithms of [MoLi83] with Byzantine Agreement, providing certain trade-offs and advantages at the time of commit and providing speed advantages at the time of recovery from failure. The present work differs from that presented in [DoSt82b] by increasing the scope (handling a general tree of processes, and multi-cluster transactions) and by providing an explicit set of recovery algorithms. We also provide a model for classifying failures that allows comparisons to be made among various proposed distributed commit algorithms. The context for our work is the Highly Available Systems project at the IBM San Jose Research Laboratory [AAF-KM83].
The Distributed Firing Squad Problem
, 1985
Cited by 18 (2 self)
In this paper we justify the design assumption of simultaneous starts. Specifically, we provide algorithms to solve the associated synchronization problem, which we call the distributed firing squad problem (abbreviated DFS). A distributed algorithm for the DFS problem has two properties: (1) if any correct processor receives a message to start a DFS synchronization, then at some future time all correct processors will "fire" (formally, enter a special state), and (2) the correct processors all fire at exactly the same step.
On the improbability of reaching Byzantine agreements
- In Proceedings of the 21st Annual ACM Symposium on the Theory of Computing
, 1989
Cited by 10 (0 self)
It is well known that for the Byzantine Generals Problem, no deterministic protocol can exist for an n-processor system if the number t of faulty processors is allowed to be as large as n/3. In this paper we investigate the maximum achievable agreement probability $\beta_{n,t}$ in a model in which the faulty processors can be as devious and powerful as possible. We also discuss a restricted model with $\beta^{*}_{n,t}$ denoting the corresponding maximum achievable probability. We will prove that: (i) for n = 3, t = 1 (the first nontrivial case), $\beta_{3,1} = (\sqrt{5} - 1)/2$ (the reciprocal of the golden ratio); (ii) for all $\epsilon$ with $0 < \epsilon < 1$, if $t/n > 1 - \frac{\log (1-\epsilon)^{1/2}}{\log\left(1 - (1-\epsilon)^{1/2}\right)}$ ...
Cynthia Dwork and Nancy Lynch
- Journal of the ACM
, 1988
The concept of partial synchrony in a distributed system is introduced. Partial synchrony lies between the cases of a synchronous system and an asynchronous system. In a synchronous system, there is a known fixed upper bound $\Delta$ on the time required for a message to be sent from one processor to another and a known fixed upper bound $\Phi$ on the relative speeds of different processors. In an asynchronous system no fixed upper bounds $\Delta$ and $\Phi$ exist. In one version of partial synchrony, fixed bounds $\Delta$ and $\Phi$ exist, but they are not known a priori. The problem is to design protocols that work correctly in the partially synchronous system regardless of the actual values of the bounds $\Delta$ and $\Phi$. In another version of partial synchrony, the bounds are known, but are only guaranteed to hold starting at some unknown time T, and protocols must be designed to work correctly regardless of when time T occurs. Fault-tolerant consensus protocols are given for various cases of partial synchrony and various fault models. Lower bounds that show in most cases that our protocols are optimal with respect to the number of faults tolerated are also given. Our consensus protocols for partially synchronous processors use new protocols for fault-tolerant "distributed clocks" that allow partially synchronous processors to reach some approximately common notion of time.
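To make the second version of partial synchrony concrete, the sketch below checks a recorded message trace against a delay bound that is only required to hold from some time T onward. It is a simplified reading of the abstract, not the paper's formal model: only the message-delay bound is checked, the processor-speed bound is ignored, and the function name `bound_holds_from` and the trace format are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def bound_holds_from(
    messages: Iterable[Tuple[float, float]],
    delta: float,
    t_stable: float,
) -> bool:
    """Return True if every message sent at or after `t_stable` was
    delivered within `delta` time units.

    `messages` holds (send_time, delivery_time) pairs. Messages sent before
    `t_stable` are unconstrained, mirroring the idea that the bound is only
    guaranteed to hold starting at some time T.
    """
    return all(
        delivered - sent <= delta
        for sent, delivered in messages
        if sent >= t_stable
    )


if __name__ == "__main__":
    # Illustrative trace: one very slow early message, then timely ones.
    trace = [(0.0, 50.0), (5.0, 9.0), (6.0, 7.0)]
    print(bound_holds_from(trace, delta=4.0, t_stable=5.0))
    print(bound_holds_from(trace, delta=4.0, t_stable=0.0))
```

A protocol in this setting cannot run such a check itself, since T is unknown to it; the check only describes which executions the model allows.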

