Results 1-10 of 31
Unreliable Failure Detectors for Reliable Distributed Systems
Journal of the ACM, 1996
Cited by 1094 (19 self)
We introduce the concept of unreliable failure detectors and study how they can be used to solve Consensus in asynchronous systems with crash failures. We characterise unreliable failure detectors in terms of two properties — completeness and accuracy. We show that Consensus can be solved even with unreliable failure detectors that make an infinite number of mistakes, and determine which ones can be used to solve Consensus despite any number of crashes, and which ones require a majority of correct processes. We prove that Consensus and Atomic Broadcast are reducible to each other in asynchronous systems with crash failures; thus the above results also apply to Atomic Broadcast. A companion paper shows that one of the failure detectors introduced here is the weakest failure detector for solving Consensus [Chandra et al. 1992].
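The two properties named in this abstract, completeness and accuracy, can be illustrated on a recorded run. The following is a minimal sketch, not the paper's formalism: it assumes a hypothetical finite trace in which each step maps every correct process to the set of processes it currently suspects, and it approximates "eventually ... permanently" by "from some step until the end of the trace". The names `stable_from`, `completeness_from`, and `eventual_weak_accuracy` are illustrative.

```python
from typing import Callable, Dict, List, Optional, Set

# Hypothetical representation: trace[t][p] is the set of processes
# that correct process p suspects at step t.
Snapshot = Dict[str, Set[str]]
Trace = List[Snapshot]


def stable_from(trace: Trace, pred: Callable[[Snapshot], bool]) -> Optional[int]:
    """Earliest step from which pred holds at every step until the end of
    the trace, or None if pred fails at the last step (or trace is empty)."""
    start = None
    for t, snapshot in enumerate(trace):
        if pred(snapshot):
            if start is None:
                start = t
        else:
            start = None
    return start


def completeness_from(trace: Trace, correct: Set[str], crashed: Set[str]) -> Optional[int]:
    """Finite-trace stand-in for strong completeness: the step from which
    every correct process suspects every crashed process until the end."""
    return stable_from(trace, lambda s: all(crashed <= s[p] for p in correct))


def eventual_weak_accuracy(trace: Trace, correct: Set[str]) -> bool:
    """Finite-trace stand-in for eventual weak accuracy: some correct process
    is, from some step on, suspected by no correct process."""
    return any(
        stable_from(trace, lambda s, q=q: all(q not in s[p] for p in correct)) is not None
        for q in correct
    )


# Example run: c crashes; b wrongly suspects a for the first two steps.
correct, crashed = {"a", "b"}, {"c"}
trace = [
    {"a": set(), "b": {"a"}},
    {"a": {"c"}, "b": {"a", "c"}},
    {"a": {"c"}, "b": {"c"}},
    {"a": {"c"}, "b": {"c"}},
]
print(completeness_from(trace, correct, crashed))
print(eventual_weak_accuracy(trace, correct))
```

The detector in this run makes mistakes (b suspects the correct process a) yet still satisfies both finite-trace checks, which is the sense in which such a detector is unreliable but useful.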
Round-by-round fault detectors: Unifying synchrony and asynchrony
In Proc. of the 17th ACM Symp. Principles of Distributed Computing (PODC), 1998
Cited by 55 (9 self)
For many years, researchers studying synchronous message-passing systems have considered algorithms composed of rounds of computation. In each round, a process sends a message to the others and then waits to receive messages from the other processes. The synchronous nature of the system ensures that, by the end of the round, each process receives all messages sent to it in that round by correct processes. In the parlance of Elrad and Francez [1], then, each round of a synchronous system is a communication-closed layer.
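The round structure described here can be sketched as a small failure-free simulation. This is an illustration under stated assumptions, not code from the paper: the function name `run_synchronous_rounds` is hypothetical, and the update rule (each process keeps the minimum value it has seen) is chosen only to make the effect of a round visible.

```python
from typing import Dict, List


def run_synchronous_rounds(inputs: Dict[str, int], rounds: int) -> Dict[str, List[int]]:
    """Simulate failure-free synchronous rounds and return each process's
    value after every round (index 0 is the initial value)."""
    state = dict(inputs)
    history = {p: [v] for p, v in inputs.items()}
    for _ in range(rounds):
        # Send phase: every process sends its current value to all others.
        inbox: Dict[str, List[int]] = {p: [] for p in state}
        for sender, value in state.items():
            for receiver in state:
                if receiver != sender:
                    inbox[receiver].append(value)
        # Receive phase: every message sent in this round is delivered before
        # the next round starts, so no message crosses a round boundary.
        # Assumed update rule: keep the smallest value seen so far.
        state = {p: min([state[p]] + inbox[p]) for p in state}
        for p in state:
            history[p].append(state[p])
    return history


history = run_synchronous_rounds({"p1": 3, "p2": 1, "p3": 2}, rounds=2)
print(history)
```

Because the inbox is rebuilt at the start of each round, a round only ever consumes messages sent in that same round, which is the communication-closed property the text refers to.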
Shared Memory vs Message Passing
2004
Cited by 16 (10 self)
This paper determines the computational strength of the shared memory abstraction (a register) emulated over a message passing system, and compares it with fundamental message passing abstractions like consensus and various forms of reliable broadcast. We introduce
On failure detectors and type boosters
In Proceedings of the 17th International Symposium on Distributed Computing (DISC’03), 2003
Cited by 16 (7 self)
The power of an object type T can be measured as the maximum number n of processes that can solve consensus using only objects of T and registers. This number, denoted cons(T), is called the consensus power of T. This paper addresses the question of the weakest failure detector to solve consensus among a number k > n of processes that communicate using shared objects of a type T with consensus power n. In other words, we seek a failure detector that is sufficient and necessary to “boost” the consensus power of a type T from n to k. It was shown in [21] that a certain failure detector, denoted Ωn, is sufficient to boost the power of a type T from n to k, and it was conjectured that Ωn was also necessary. In this paper, we prove this conjecture for one-shot deterministic types. We show that, for any one-shot deterministic type T with cons(T) ≤ n, Ωn is necessary to boost the power of T from n to any k > n. Our result generalizes, in a precise sense, the result of the weakest failure detector to solve consensus in asynchronous message-passing systems [6]. As a corollary of our result, we show that Ωn is the weakest failure detector to boost the resilience of a system of (n − 1)-resilient objects of any types and wait-free registers with respect to the consensus problem.
On the Weakest Failure Detector Ever
PODC'07, 2007
Cited by 14 (4 self)
Many problems in distributed computing are impossible when no information about process failures is available. It is common to ask what information about failures is necessary and sufficient to circumvent some specific impossibility, e.g., consensus, atomic commit, mutual exclusion, etc. This paper asks what information about failures is needed to circumvent any impossibility and sufficient to circumvent some impossibility. In other words, what is the minimal yet nontrivial failure information? We present an abstraction, denoted Υ, that provides very little failure information. In every run of the distributed system, Υ eventually informs the processes that some set of processes in the system cannot be the set of correct processes in that run. Although seemingly weak, for it might provide random information for an arbitrarily long period
Looking for the Weakest Failure Detector for k-Set Agreement in Message-passing Systems: Is Πk the End of the Road?
INRIA, 2009, http://hal.inria.fr/inria00384993/en/, PI 1929
Cited by 13 (3 self)
In the k-set agreement problem, each process (in a set of n processes) proposes a value and has to decide a proposed value in such a way that at most k different values are decided. While this problem can easily be solved in asynchronous systems prone to t process crashes when k > t, it cannot be solved when k ≤ t. For several years, the failure detector-based approach has been investigated to circumvent this impossibility. While the weakest failure detector class to solve the k-set agreement problem in read/write shared-memory systems has recently been discovered (PODC 2009), the situation is different in message-passing systems, where the weakest failure detector classes are known only for the extreme cases k = 1 (consensus) and k = n − 1 (set agreement). This paper introduces a candidate for the general case. It presents a new failure detector class, denoted Πk, and shows Π1 = Σ × Ω (the weakest class for k = 1), and Πn−1 = L (the weakest class for k = n − 1). Then, the paper investigates the structure of Πk and shows it is the combination of two failure detector classes denoted Σk and Ωk (that generalize the previous “quorums” and “eventual leaders” failure detector classes). Finally, the paper proves that Σk is a necessary requirement (as far as information on failures is concerned) to solve the k-set agreement problem in message-passing systems. The paper also presents a Πn−1-based algorithm that solves the (n − 1)-set agreement problem. This algorithm provides us with a new algorithmic insight into the way the (n − 1)-set agreement problem can be solved in asynchronous message-passing systems (insight from the point of view of the non-partitioning constraint defined by Σn−1).
In search of the holy grail: Looking for the weakest failure detector for wait-free set agreement
2006
The Combined Power of Conditions and Information on Failures to Solve Asynchronous Set Agreement
2008
Cited by 8 (5 self)
To cope with the impossibility of solving agreement problems in asynchronous systems made up of n processes and prone to t process crashes, system designers tailor their algorithms to run fast in “normal” circumstances. Two orthogonal notions of “normality” have been studied in the past through failure detectors that give processes information about process crashes, and through conditions that restrict the inputs to an agreement problem. This paper investigates how the two approaches can benefit from each other to solve the kset agreement problem, where processes must agree on at most k of their input values (when k = 1 we have the famous consensus problem). It proposes novel failure detectors for solving kset agreement, and a protocol that combines them with conditions, establishing a new bridge among asynchronous, synchronous and partially synchronous systems with respect to agreement problems. The
Failure Detectors to Solve Asynchronous k-Set Agreement: a Glimpse of Recent Results
Cited by 5 (1 self)
In the k-set agreement problem, each process proposes a value and has to decide a value in such a way that a decided value is a proposed value and at most k different values are decided. This problem can easily be solved in synchronous systems or in asynchronous systems prone to t process crashes when t < k. In contrast, it has been shown that k-set agreement cannot be solved in asynchronous systems when k ≤ t. Hence, for several years, the failure detector-based approach has been investigated to circumvent this impossibility. This approach consists of enriching the underlying asynchronous system with an additional module per process that provides it with information on failures. Hence, without becoming synchronous, the enriched system is no longer fully asynchronous. This paper surveys this approach in both asynchronous shared memory systems and asynchronous message passing systems. It presents and discusses recent results and associated k-set agreement algorithms.
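The two safety conditions in this definition (every decided value was proposed, and at most k distinct values are decided) can be checked mechanically on the outcome of a run. The sketch below is only an illustration: the function name `check_k_set_agreement` and the dictionary representation are assumptions, and termination (every correct process eventually decides) is not checked.

```python
from typing import Dict


def check_k_set_agreement(proposed: Dict[str, int], decided: Dict[str, int], k: int) -> bool:
    """Check the safety conditions of k-set agreement on one finished run.

    proposed maps each process to the value it proposed; decided maps each
    process that decided to its decision. Termination is not checked.
    """
    proposals = set(proposed.values())
    # Validity: a decided value is a proposed value.
    validity = all(v in proposals for v in decided.values())
    # Agreement: at most k different values are decided.
    agreement = len(set(decided.values())) <= k
    return validity and agreement


proposed = {"p1": 10, "p2": 20, "p3": 30}
decided = {"p1": 10, "p2": 20, "p3": 10}
print(check_k_set_agreement(proposed, decided, k=2))
print(check_k_set_agreement(proposed, decided, k=1))
```

With k = 1 the agreement condition requires a single decided value, which is the consensus case; the same run that is acceptable for k = 2 is rejected for k = 1.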
Sharing is harder than agreeing
In: PODC 2008: Proceedings of the Twenty-Seventh Annual ACM Symposium on Principles of Distributed Computing, 2008
Cited by 5 (1 self)
One of the most celebrated results of the theory of distributed computing is the impossibility, in an asynchronous system of n processes that communicate through shared memory registers, of solving the set agreement problem, where the processes need to decide on up to n − 1 among their n initial values. In short, the result indicates that the register abstraction is too weak to implement the set agreement one. This paper explores the relation between these abstractions in a message passing system where a register is not a given physical device but is rather itself implemented by processes communicating through message passing. We show that, maybe surprisingly, the information about process failures that is necessary and sufficient to implement a register shared by two particular processes is sufficient but not necessary to implement set agreement. We later generalize this result by considering k-set agreement, where the processes can decide on up to k values, and comparing it with a register shared by any particular subset of 2k processes. We prove that, for 1 ≤ k ≤ n/2, (a) any failure information that is sufficient to implement a register shared by 2k processes is sufficient to implement (n − k)-set agreement, but (b) failure information that is sufficient for (n − k)-set agreement is not sufficient for a register shared by 2k processes. We also prove that (c) failure information that is sufficient for a register shared by 2k processes is not sufficient for ((n − k) − 1)-set agreement.