Results 1 - 10
of
57
Small Byzantine Quorum Systems
- DISTRIBUTED COMPUTING
, 2001
"... In this paper we present two protocols for asynchronous Byzantine Quorum Systems (BQS) built on top of reliable channels---one for self-verifying data and the other for any data. Our protocols tolerate Byzantine failures with fewer servers than existing solutions by eliminating nonessential work in ..."
Abstract
-
Cited by 366 (48 self)
- Add to MetaCart
In this paper we present two protocols for asynchronous Byzantine Quorum Systems (BQS) built on top of reliable channels---one for self-verifying data and the other for any data. Our protocols tolerate Byzantine failures with fewer servers than existing solutions by eliminating nonessential work in the write protocol and by using read and write quorums of different sizes. Since engineering a reliable network layer on an unreliable network is difficult, two other possibilities must be explored. The first is to strengthen the model by allowing synchronous networks that use time-outs to identify failed links or machines. We consider running synchronous and asynchronous Byzantine Quorum protocols over synchronous networks and conclude that, surprisingly, "self-timing" asynchronous Byzantine protocols may offer significant advantages for many synchronous networks when network time-outs are long. We show how to extend an existing Byzantine Quorum protocol to eliminate its dependency on reliable networking and to handle message loss and retransmission explicitly.
Atomic Snapshots of Shared Memory
, 1993
"... . This paper introduces a general formulation of atomic snapshot memory, a shared memory partitioned into words written (updated) by individual processes, or instantaneously read (scanned) in its entirety. This paper presents three wait-free implementations of atomic snapshot memory. The first imple ..."
Abstract
-
Cited by 148 (42 self)
- Add to MetaCart
. This paper introduces a general formulation of atomic snapshot memory, a shared memory partitioned into words written (updated) by individual processes, or instantaneously read (scanned) in its entirety. This paper presents three wait-free implementations of atomic snapshot memory. The first implementation in this paper uses unbounded (integer) fields in these registers, and is particularly easy to understand. The second implementation uses bounded registers. Its correctness proof follows the ideas of the unbounded implementation. Both constructions implement a single-writer snapshot memory, in which each word may be updated by only one process, from single-writer, n-reader registers. The third algorithm implements a multi-writer snapshot memory from atomic n-writer, n-reader registers, again echoing key ideas from the earlier constructions. All operations require \Theta(n 2 ) reads and writes to the component shared registers in the worst case. Categories and Subject Discriptors:...
The asynchronous computability theorem for tresilient tasks
- In Proceedings of the 1993 ACM Symposium on Theory of Computing
, 1993
"... We give necessary and sufficient combinatorial conditions characterizing the computational tasks that can be solved by N asynchronous processes, up to t of which can fail by halting. The range of possible input and output values for an asynchronous task can be associated with a high-dimensional geom ..."
Abstract
-
Cited by 95 (14 self)
- Add to MetaCart
We give necessary and sufficient combinatorial conditions characterizing the computational tasks that can be solved by N asynchronous processes, up to t of which can fail by halting. The range of possible input and output values for an asynchronous task can be associated with a high-dimensional geometric structure called a simplicial complex. Our main theorem characterizes computability y in terms of the topological properties of this complex. Most notably, a given task is computable only if it can be associated with a complex that is simply connected with trivial homology groups. In other words, the complex has “no holes!” Applications of this characterization include the first impossibility results for several long-standing open problems in distributed computing, such as the “renaming ” problem of Attiya et. al., the “k-set agreement ” problem of Chaudhuri, and a generalization of the approximate agreement problem. 1
The Topological Structure of Asynchronous Computability
- JOURNAL OF THE ACM
, 1996
"... We give necessary and sufficient combinatorial conditions characterizing the tasks that can be solved by asynchronous processes, of which all but one can fail, that communicate by reading and writing a shared memory. We introduce a new formalism for tasks, based on notions from classical algebra ..."
Abstract
-
Cited by 94 (11 self)
- Add to MetaCart
We give necessary and sufficient combinatorial conditions characterizing the tasks that can be solved by asynchronous processes, of which all but one can fail, that communicate by reading and writing a shared memory. We introduce a new formalism for tasks, based on notions from classical algebraic and combinatorial topology, in which a task's possible input and output values are each associated with high-dimensional geometric structures called simplicial complexes. We characterize computability in terms of the topological properties of these complexes. This characterization has a surprising geometric interpretation: a task is solvable if and only if the complex representing the task's allowable inputs can be mapped to the complex representing the task's allowable outputs by a function satisfying certain simple regularity properties. Our formalism thus replaces the "operational" notion of a wait-free decision task, expressed in terms of interleaved computations unfolding ...
Wait-Free Data Structures in the Asynchronous PRAM Model
- In Proceedings of the 2nd Annual Symposium on Parallel Algorithms and Architectures
, 2000
"... In the asynchronous PRAM model, processes communicate by atomically reading and writing shared memory locations. This paper investigates the extent to which asynchronous PRAM permits long-lived, highly concurrent data structures. An implementation of a concurrent object is wait-free if every operati ..."
Abstract
-
Cited by 62 (11 self)
- Add to MetaCart
In the asynchronous PRAM model, processes communicate by atomically reading and writing shared memory locations. This paper investigates the extent to which asynchronous PRAM permits long-lived, highly concurrent data structures. An implementation of a concurrent object is wait-free if every operation will complete in a finite number of steps, and it is k-bounded wait-free, for some k > 0, if every operation will complete within k steps. In the first part of this paper, we show that there are objects with wait-free implementations but no k-bounded wait-free implementations for any k, and that there is an infinite hierarchy of objects with implementations that are k-bounded wait-free but not K-bounded wait-free for some K > k. In the second part of the paper, we give an algebraic characterization of a large class of objects that do have wait-free implementations in asynchronous PRAM, as well as a general algorithm for implementing them. Our tools include simple iterative algorithms for wait-free approximate agreement and atomic snapshot.
Real-Time Computing with Lock-Free Shared Objects
- ACM Transactions on Computer Systems
, 1995
"... This paper considers the use of lock-free shared objects within hard real-time systems. As the name suggests, lock-free shared objects are distinguished by the fact that they are not locked. As such, they do not give rise to priority inversions, a key advantage over conventional, lock-based object-s ..."
Abstract
-
Cited by 50 (7 self)
- Add to MetaCart
This paper considers the use of lock-free shared objects within hard real-time systems. As the name suggests, lock-free shared objects are distinguished by the fact that they are not locked. As such, they do not give rise to priority inversions, a key advantage over conventional, lock-based object-sharing approaches. Despite this advantage, it is not immediately apparent that lock-free shared objects can be employed if tasks must adhere to strict timing constraints. In particular, lock-free object implementations permit concurrent operations to interfere with each other, and repeated interferences can cause a given operation to take an arbitrarily long time to complete. The main contribution of this paper is to show that such interferences can be bounded by judicious scheduling. This work pertains to periodic, hard real-time tasks that sharelock-free objects on a uniprocessor. In the first part of the paper, scheduling conditions are derived for such tasks, for both static and dynamic pri...
Time- and Space-Efficient Randomized Consensus
- Journal of Algorithms
, 1992
"... A protocol is presented which solves the randomized consensus problem[9] for shared memory. The protocol uses a total of O(p 2 +n) worst-case expected increment, decrement and read operations on a set of three shared O(logn)-bit counters, where p is the number of active processors and n is the ..."
Abstract
-
Cited by 42 (12 self)
- Add to MetaCart
A protocol is presented which solves the randomized consensus problem[9] for shared memory. The protocol uses a total of O(p 2 +n) worst-case expected increment, decrement and read operations on a set of three shared O(logn)-bit counters, where p is the number of active processors and n is the total number of processors. It requires less space than previous polynomial-time consensus protocols[6, 7], and is faster when not all of the processors participate in the protocol. A modified version of the protocol yields a weak shared coin whose bias is guaranteed to be in the range 1=2 \Sigma ffl regardless of scheduler behavior, and which is the first such protocol for the shared-memory model to guarantee that all processors agree on the outcome of the coin. 1 1.
Are Wait-Free Algorithms Fast?
, 1991
"... The time complexity of wait-free algorithms in "normal" executions, where no failures occur and processes operate at approximately the same speed, is considered. A lower bound of log n on the time complexity of any wait-free algorithm that achieves approximate agreement among n processes is proved. ..."
Abstract
-
Cited by 42 (12 self)
- Add to MetaCart
The time complexity of wait-free algorithms in "normal" executions, where no failures occur and processes operate at approximately the same speed, is considered. A lower bound of log n on the time complexity of any wait-free algorithm that achieves approximate agreement among n processes is proved. In contrast, there exists a non-wait-free algorithm that solves this problem in constant time. This implies an (log n) time separation between the wait-free and non-wait-free computation models. On the positive side, we present an O(log n) time wait-free approximate agreement algorithm; the complexity of this algorithm is within a small constant of the lower bound.
How to Share Concurrent Wait-Free Variables
, 1995
"... Sharing data between multiple asynchronous users---each of which can atomically read and write the data---is a feature which may help to increase the amount of parallelism in distributed systems. An algorithm implementing this feature is presented. The main construction of an n-user atomic variable ..."
Abstract
-
Cited by 39 (8 self)
- Add to MetaCart
Sharing data between multiple asynchronous users---each of which can atomically read and write the data---is a feature which may help to increase the amount of parallelism in distributed systems. An algorithm implementing this feature is presented. The main construction of an n-user atomic variable directly from single-writer, single-reader atomic variables uses O(n) control bits and O(n) accesses per Read/Write running in O(1) parallel time.
Hundreds of Impossibility Results for Distributed Computing
- Distributed Computing
, 2003
"... We survey results from distributed computing that show tasks to be impossible, either outright or within given resource bounds, in various models. The parameters of the models considered include synchrony, fault-tolerance, different communication media, and randomization. The resource bounds refe ..."
Abstract
-
Cited by 32 (4 self)
- Add to MetaCart
We survey results from distributed computing that show tasks to be impossible, either outright or within given resource bounds, in various models. The parameters of the models considered include synchrony, fault-tolerance, different communication media, and randomization. The resource bounds refer to time, space and message complexity. These results are useful in understanding the inherent difficulty of individual problems and in studying the power of different models of distributed computing.

