Closure and Convergence: A Foundation of Fault-Tolerant Computing
- IEEE Transactions on Software Engineering
, 1993
Cited by 103 (28 self)
We give a formal definition of what it means for a system to "tolerate" a class of "faults". The definition consists of two conditions: One, if a fault occurs when the system state is within a set of "legal" states, the resulting state is within some larger set and, if faults continue occurring, the system state remains within that larger set (Closure). And two, if faults stop occurring, the system eventually reaches a state within the legal set (Convergence). We demonstrate the applicability of our definition for specifying and verifying the fault-tolerance properties of a variety of digital and computer systems. Further, using the definition, we obtain a simple classification of fault-tolerant systems and discuss methods for their systematic design.
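The two conditions in the abstract above can be illustrated on a tiny finite-state system. The following is a minimal sketch, not the paper's formalism: it assumes a deterministic program step, a single fault action, and integer states, and every name in it (`is_closed`, `converges`, `tolerates`, the example counter) is hypothetical.

```python
from typing import Callable, Iterable, Set

State = int
Action = Callable[[State], State]


def is_closed(states: Set[State], actions: Iterable[Action]) -> bool:
    """Closure: every action maps every state in the set back into the set."""
    actions = list(actions)
    return all(action(s) in states for s in states for action in actions)


def converges(span: Set[State], legal: Set[State], program: Action) -> bool:
    """Convergence: with faults stopped, the program reaches a legal state
    from every state in the span. Assumes a deterministic program, so at
    most len(span) steps are needed before a state would repeat."""
    for start in span:
        s = start
        for _ in range(len(span)):
            if s in legal:
                break
            s = program(s)
        if s not in legal:
            return False
    return True


def tolerates(legal: Set[State], span: Set[State],
              program: Action, fault: Action) -> bool:
    """Check both conditions for a legal set and a larger fault span."""
    return (
        legal <= span
        and is_closed(legal, [program])        # legal set closed without faults
        and is_closed(span, [program, fault])  # span closed even with faults
        and converges(span, legal, program)    # recovery once faults stop
    )


# Hypothetical example: a counter that should rest at 0.
LEGAL = {0}
SPAN = {0, 1, 2, 3}
step_down = lambda x: max(x - 1, 0)   # program action: move toward 0
bump = lambda x: min(x + 1, 3)        # fault action: perturb upward, capped
stuck = lambda x: x                   # a program that never recovers

ok = tolerates(LEGAL, SPAN, step_down, bump)
not_ok = tolerates(LEGAL, SPAN, stuck, bump)
```

In this sketch `step_down` satisfies both conditions, while `stuck` keeps the span closed but fails convergence.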
Hundreds of Impossibility Results for Distributed Computing
- Distributed Computing
, 2003
Cited by 32 (4 self)
We survey results from distributed computing that show tasks to be impossible, either outright or within given resource bounds, in various models. The parameters of the models considered include synchrony, fault-tolerance, different communication media, and randomization. The resource bounds refer to time, space and message complexity. These results are useful in understanding the inherent difficulty of individual problems and in studying the power of different models of distributed computing.
Conditions on input vectors for consensus solvability in asynchronous distributed systems
- Journal of the ACM
, 2001
Cited by 30 (9 self)
This article introduces and explores the condition-based approach to solving the consensus problem in asynchronous systems. The approach studies conditions that identify sets of input vectors for which it is possible to solve consensus despite the occurrence of up to f process crashes. The first main result defines acceptable conditions and shows that these are exactly the conditions for which a consensus protocol exists. Two examples of realistic acceptable conditions are presented, and proved to be maximal, in the sense that they cannot be extended and remain acceptable. The second main result is a generic consensus shared-memory protocol for any acceptable condition. The protocol always guarantees agreement and validity, and terminates (at least) when the inputs satisfy the condition with which the protocol has been instantiated, or when there are no crashes. An efficient version of the protocol is then designed for the message passing model that works when f < n/2, and it is shown that no such protocol exists when f ≥ n/2. It is also shown how the protocol’s safety can be traded for its liveness.
Solving vector consensus with a wormhole
- IEEE Transactions on Parallel and Distributed Systems
, 2005
Cited by 12 (9 self)
This paper presents a solution to the vector consensus problem for Byzantine asynchronous systems augmented with wormholes. Wormholes prefigure a hybrid distributed system model, embodying the notion of an enhanced part of the system with “good” properties otherwise not guaranteed by the “normal” weak environment. A protocol built for this type of system runs in the asynchronous part, where f out of n ≥ 3f + 1 processes might be corrupted by malicious adversaries. However, sporadically, processes can rely on the services provided by the wormhole for the correct execution of simple operations. One of the nice features of this setting is that it is possible to keep the protocol completely time-free and, in addition, to circumvent the FLP impossibility result by hiding all time-related assumptions in the wormhole. Furthermore, from a performance perspective, it leads to the design of a protocol with a good time complexity. Index Terms: Distributed systems, Byzantine asynchronous protocols, consensus.
The Need for Headers: An Impossibility Result for Communication over Unreliable Channels
, 1990
Cited by 10 (1 self)
It is proved that any protocol that constructs a reliable data link service using a physical channel service necessarily includes in the packets some header information that enables the protocol to treat different packets differently. The physical channel considered is permitted to lose, but not reorder or duplicate, packets. The formal framework used for the proof is the I/O automaton model.
Distributed computing with advice: Information sensitivity of graph coloring
- In 34th International Colloquium on Automata, Languages and Programming (ICALP)
, 2007
Cited by 9 (2 self)
We study the amount of information (advice) about a graph that must be given to its nodes in order to achieve fast distributed computations. The required size of the advice makes it possible to measure the information sensitivity of a network problem. A problem is information sensitive if little advice is enough to solve the problem rapidly (i.e., much faster than in the absence of any advice), whereas it is information insensitive if it requires giving a lot of information to the nodes in order to ensure fast computation of the solution. In this paper, we study the information sensitivity of distributed graph coloring.
Lower Bounds in Distributed Computing
, 2000
Cited by 8 (2 self)
This paper discusses results that say what cannot be computed in certain environments or when insufficient resources are available. A comprehensive survey would require an entire book. As in Nancy Lynch's excellent 1989 paper, "A Hundred Impossibility Proofs for Distributed Computing" [86], we shall restrict ourselves to some of the results we like best or think are most important. Our aim is to give you the flavour of the results and some of the techniques that have been used. We shall also mention some interesting open problems and provide an extensive list of references. The focus will be on results from the past decade.
Operation-valency and the cost of coordination
- In Proceedings of the 22nd Annual ACM Symposium on Principles of Distributed Computing (PODC)
, 2003
Cited by 5 (2 self)
This paper introduces operation-valency, a generalization of the valency proof technique originated by Fischer, Lynch, and Paterson. By focusing on critical events that influence the return values of individual operations rather than on critical events that influence a protocol's single return value, the new technique allows us to derive a collection of realistic lower bounds for lock-free implementations of concurrent objects such as linearizable queues, stacks, sets, hash tables, shared counters, approximate agreement, and more. By realistic we mean that they follow the real-world model introduced by Dwork, Herlihy, and Waarts, counting both memory-references and memory-stalls due to contention, and that they allow the combined use of read, write, and read-modify-write operations available on current machines. By using the operation-valency technique, we derive an Ω(√n) non-cached shared memory accesses lower bound on the worst-case time complexity of lock-free implementations of objects in Influence(n), a wide class of concurrent objects including all of those mentioned above, in which an individual operation can be influenced by all others. We also prove the existence of a fundamental relationship between the space complexity, latency, contention, and "influence level" of any lock-free object implementation. Our results are broad in that they hold for implementations combining read/write memory and any collection of read-modify-write operations, and in that they apply even if shared memory words have unbounded size.
A distributed protocol for dynamic address assignment in mobile ad hoc networks
- IEEE Trans. Mobile Computing
, 2006
Cited by 5 (0 self)
A Mobile Ad hoc NETwork (MANET) is a group of mobile nodes that form a multi-hop wireless network. The topology of the network can change randomly due to unpredictable mobility of nodes and propagation characteristics. Previously, it was assumed that the nodes in the network were assigned IP addresses a priori. This may not be feasible as nodes can enter and leave the network dynamically. A dynamic IP address assignment protocol like DHCP requires centralized servers that may not be present in MANETs. Hence, we propose a distributed protocol for dynamic IP address assignment to nodes in MANETs. The proposed solution guarantees unique IP address assignment under a variety of network conditions including message losses, network partitioning and merging. Simulation results show that the protocol incurs low latency and communication overhead for an IP address assignment. Index Terms: MANET, address allocation, IP-networks.
On The Message Complexity Of Binary Byzantine Agreement Under Crash Failures
- Distributed Computing
, 1992
Cited by 4 (2 self)
The binary Byzantine Agreement problem requires n − 1 receivers to agree on the binary value broadcast by a sender even when some of these n processes may be faulty. We investigate the message complexity of protocols that solve this problem in the case of crash failures. In particular, we derive matching upper and lower bounds on the total, worst and average case number of messages needed in the failure-free executions of such protocols. More specifically, we prove that any protocol that tolerates up to t faulty processes requires a total of at least n + t − 1 messages in its failure-free executions and, therefore, at least ⌈(n + t − 1)/2⌉ messages in the worst case and min(P_0, P_1) · (n + t − 1) messages in the average case, where P_v is the probability that the value of the bit that the sender wants to broadcast is v. We also give protocols that solve the problem using only the minimum number of messages for these three complexity measures. These protoc...
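As a worked illustration of the three expressions quoted in the abstract above, the sketch below simply evaluates them for given n, t, and P_0. It assumes P_1 = 1 − P_0 (the bit is binary); the function name and the sample values are hypothetical and chosen only for the arithmetic.

```python
def failure_free_message_bounds(n: int, t: int, p0: float) -> dict:
    """Evaluate the lower-bound expressions quoted in the abstract.

    n  : number of processes (one sender, n - 1 receivers)
    t  : maximum number of crash failures tolerated
    p0 : probability that the sender's bit is 0 (p1 = 1 - p0 is assumed)
    """
    if not 0.0 <= p0 <= 1.0:
        raise ValueError("p0 must be a probability in [0, 1]")
    total = n + t - 1                  # n + t - 1
    worst = (total + 1) // 2           # ceil((n + t - 1) / 2), integer arithmetic
    average = min(p0, 1.0 - p0) * total  # min(P_0, P_1) * (n + t - 1)
    return {"total": total, "worst": worst, "average": average}


# Sample values (assumptions for illustration only): n = 10, t = 3, P_0 = 0.25.
bounds = failure_free_message_bounds(n=10, t=3, p0=0.25)
```

For these sample values the total bound is 12 messages, the worst-case bound is 6, and the average-case bound is 0.25 · 12 = 3.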

