## Robust gossiping with an application to consensus

Venue: Journal of Computer and System Sciences

Citations: 9 (5 self)

### BibTeX

@ARTICLE{Chlebus_robustgossiping,

author = {Bogdan S. Chlebus and Dariusz R. Kowalski},

title = {Robust gossiping with an application to consensus},

journal = {Journal of Computer and System Sciences},

year = {},

pages = {1262--1281}

}

### Abstract

We study deterministic gossiping in synchronous systems with dynamic crash failures. Each processor is initialized with an input value called a rumor. In the standard gossip problem, the goal of every processor is to learn all the rumors. When processors may crash, this goal needs to be revised, since it is possible, at some point in an execution, that certain rumors are known only to processors that have already crashed. We define gossiping to be completed, for a system with crashes, when every processor knows either the rumor of processor $v$ or that $v$ has already crashed, for any processor $v$. We design gossiping algorithms that are efficient with respect to both time and communication. Let $t < n$ be the number of failures, where $n$ is the number of processors. If $n - t = \Omega(n/\operatorname{polylog} n)$, then one of our algorithms completes gossiping in $O(\log^2 t)$ time and with $O(n \operatorname{polylog} n)$ messages. We develop an algorithm that performs gossiping with $O(n^{1.77})$ messages and in $O(\log^2 n)$ time, in any execution in which at least one processor remains non-faulty. We show a trade-off between time and communication in gossiping algorithms: if the number of messages is at most $O(n \operatorname{polylog} n)$, then the time has to be at least $\Omega\!\left(\frac{\log n}{\log(n \log n) - \log t}\right)$. By way of application, we show that if $n - t = \Omega(n)$, then consensus can be solved in $O(t)$ time and with $O(n \log^2 t)$ messages.
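The completion condition stated in the abstract can be illustrated with a small check. This is a minimal sketch, not the authors' algorithm: the `CRASHED` sentinel, the dictionary-based local views, and the function name `gossiping_completed` are all hypothetical, and the sketch assumes that the condition applies only to processors that have not crashed and that a belief "v has crashed" must be accurate.

```python
from typing import Dict, Set

# Hypothetical sentinel meaning "I know that this processor has crashed".
CRASHED = object()

# A local view maps a processor id to that processor's rumor or to CRASHED.
View = Dict[int, object]


def gossiping_completed(views: Dict[int, View], crashed: Set[int]) -> bool:
    """Return True if every non-crashed processor has, for each processor v,
    either v's rumor or the knowledge that v has crashed."""
    processors = set(views)
    for p, view in views.items():
        if p in crashed:
            continue  # assumption: crashed processors carry no obligation
        for v in processors:
            if v not in view:
                return False  # p knows nothing about v yet
            if view[v] is CRASHED and v not in crashed:
                return False  # p wrongly believes a live processor crashed
    return True


# Example: processor 2 crashed before anyone learned its rumor.
views = {
    0: {0: "a", 1: "b", 2: CRASHED},
    1: {0: "a", 1: "b", 2: CRASHED},
    2: {2: "c"},
}
print(gossiping_completed(views, crashed={2}))
```

In the example, processors 0 and 1 never learn rumor "c", yet gossiping counts as completed because both know that processor 2 has crashed.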

### Citations

1504 | Impossibility of Distributed Consensus with One Faulty Process
- Fischer, Lynch, et al.
- 1985
Citation Context ...tak and Lamport [32]. They showed [29, 32] that number t of faulty processors needs to be smaller than n/3 for a solution to exist, assuming synchrony and Byzantine faults. Fischer, Lynch and Paterson [14] showed that the problem is unsolvable in the asynchronous message-passing setting, even with only one crash failure. Relevance of the consensus problem to fault-tolerant broadcast and other communica...

1302 | The Byzantine generals problem
- Lamport, Shostak, et al.
- 1982
Citation Context ... [6]. See a survey by Pelc [33] for more on the previous research on fault-tolerant broadcasting and gossiping. The problem of consensus was introduced by Pease, Shostak and Lamport [32]. They showed [29, 32] that number t of faulty processors needs to be smaller than n/3 for a solution to exist, assuming synchrony and Byzantine faults. Fischer, Lynch and Paterson [14] showed that the problem is unsolvable...

738 | Epidemic algorithms for replicated database maintenance
- Demers, Greene, et al.
- 1988
Citation Context ...consider dissemination of information in a network similarly as spreading of a rumor or of an infectious disease in a group of people, which has been studied in applied mathematics [4]. Demers et al. [9] introduced so-called epidemic algorithms for updating data bases, in which a processor regularly chooses other processors at random and transmits the rumors; see also the paper by Agrawal, Abbadi and...

547 | Reaching agreement in the presence of faults
- Pease, Shostak, et al.
- 1980
Citation Context ...bus, Diks and Pelc [6]. See a survey by Pelc [33] for more on the previous research on fault-tolerant broadcasting and gossiping. The problem of consensus was introduced by Pease, Shostak and Lamport [32]. They showed [29, 32] that number t of faulty processors needs to be smaller than n/3 for a solution to exist, assuming synchrony and Byzantine faults. Fischer, Lynch and Paterson [14] showed that the...

320 | Ramanujan graphs
- Lubotzky, Phillips, et al.
- 1988
Citation Context ...etween them in G of length at most k. We use specific graphs G and $G^k$ that have special properties. Namely, graphs G are based on constructive Ramanujan graphs given by Lubotzky, Phillips and Sarnak [30], which are expanders. For a positive integer m, graph G(m) denotes such a graph G with exactly m processors. Technically, G(m) simulates the corresponding Ramanujan graph of Θ(m) processors, see [7, ...

318 | Fault-tolerant broadcasts and related problems
- Hadzilacos, Toueg
- 1993
Citation Context ...ronous message-passing setting, even with only one crash failure. Relevance of the consensus problem to fault-tolerant broadcast and other communication problems was discussed by Hadzilacos and Toueg [20]. Fischer and Lynch [13] showed that a synchronous solution to consensus requires t+1 rounds, when t is the tolerable number of failures. Garay and Moses [17] developed an algorithm with polynomial-siz...

268 | The Mathematical Theory of Infectious Diseases and its Applications, 2nd ed
- Bailey
- 1975
Citation Context ...and Unger. One may consider dissemination of information in a network similarly as spreading of a rumor or of an infectious disease in a group of people, which has been studied in applied mathematics [4]. Demers et al. [9] introduced so-called epidemic algorithms for updating data bases, in which a processor regularly chooses other processors at random and transmits the rumors; see also the paper by ...

225 | A gossip-style failure detection service
- Renesse, Minsky, et al.
- 1998
Citation Context ...th Byzantine node failures, in the case when nodes can test other nodes. Application of gossiping to gathering information about occurrences of failures was proposed by van Renesse, Minsky and Hayden [35]. Gossiping with transient stochastic link failures and permanent stochastic node failures was considered by Chlebus, Diks and Pelc [6]. See a survey by Pelc [33] for more on the previous research on ...

200 | Randomized rumor spreading
- Karp, Schindelhauer, et al.
Citation Context ...e also the paper by Agrawal, Abbadi and Steinke [1] for recent work in this direction. Such randomized epidemic algorithms have been systematically studied by Karp, Schindelhauer, Shenker and Vöcking [25]. The problem of exchange of information when nodes do not initially know each other was considered by Harchol-Balter, Leighton and Lewin [21]; in their solution the processors learn about each other ...

146 | Spatial gossip and resource location protocols
- Kempe, Kleinberg, et al.
- 2001
Citation Context ...lution the processors learn about each other in the course of gossiping. Gossip-style algorithms to have processors learn about the nearest resource location were given by Kempe, Kleinberg and Demers [27] and Kempe and Kleinberg [26]. In the prior research on the gossip problem in a failure-prone environment, either link failures or processor failures controlled by oblivious adversaries have been cons...

138 | A lower bound for the time to assure interactive consistency
- Fischer, Lynch
- 1982
Citation Context ...setting, even with only one crash failure. Relevance of the consensus problem to fault-tolerant broadcast and other communication problems was discussed by Hadzilacos and Toueg [20]. Fischer and Lynch [13] showed that a synchronous solution to consensus requires t+1 rounds, when t is the tolerable number of failures. Garay and Moses [17] developed an algorithm with polynomial-size messages and operatin...

107 | Methods and problems of communication in usual networks
- Fraigniaud, Lazard
- 1990
Citation Context ... time, while in the latter case messages to all neighbors can be sent and received concurrently. For surveys of such an approach to broadcasting and gossiping, see the papers by Fraigniaud and Lazard [15], Hedetniemi, Hedetniemi and Liestman [22], Hromkovič, Klasing, Monien and Peine [23], and the book [24] by Hromkovič, Klasing, Pelc, Ruzicka and Unger. One may consider dissemination of information i...

100 | Dissemination of information in interconnection networks (broadcasting & gossiping)
- Hromkovič, Klasing, et al.
- 1996
Citation Context ...ncurrently. For surveys of such an approach to broadcasting and gossiping, see the papers by Fraigniaud and Lazard [15], Hedetniemi, Hedetniemi and Liestman [22], Hromkovič, Klasing, Monien and Peine [23], and the book [24] by Hromkovič, Klasing, Pelc, Ruzicka and Unger. One may consider dissemination of information in a network similarly as spreading of a rumor or of an infectious disease in a group ...

95 | Probability and Computing
- Mitzenmacher, Upfal
- 2005
Citation Context ...d by range messages sent by processor p. In the proof of this fact, an estimate on the probability of deviation from the expected number of successes in a sequence of Bernoulli trials is applied, see [31] for an overview of related topics. This standard estimate is known as the Chernoff bound and is as follows: if $B_k$ is the number of successes in k Bernoulli trials, each with probability x of success, ...

78 | Resource Discovery in Distributed Networks
- Harchol-Balter, Leighton, et al.
Citation Context ...cally studied by Karp, Schindelhauer, Shenker and Vöcking [25]. The problem of exchange of information when nodes do not initially know each other was considered by Harchol-Balter, Leighton and Lewin [21]; in their solution the processors learn about each other in the course of gossiping. Gossip-style algorithms to have processors learn about the nearest resource location were given by Kempe, Kleinber...

73 | Epidemic algorithms in replicated databases
- Agrawal, Abbadi, et al.
- 1997
Citation Context ...ed so-called epidemic algorithms for updating data bases, in which a processor regularly chooses other processors at random and transmits the rumors; see also the paper by Agrawal, Abbadi and Steinke [1] for recent work in this direction. Such randomized epidemic algorithms have been systematically studied by Karp, Schindelhauer, Shenker and Vöcking [25]. The problem of exchange of information when n...

66 | Fully polynomial Byzantine agreement for n > 3t processors in t + 1 rounds
- Garay, Moses
- 1998
Citation Context ...lems was discussed by Hadzilacos and Toueg [20]. Fischer and Lynch [13] showed that a synchronous solution to consensus requires t+1 rounds, when t is the tolerable number of failures. Garay and Moses [17] developed an algorithm with polynomial-size messages and operating in t + 1 rounds, for n > 3t processors subject to Byzantine failures. The message complexity of consensus, when no failures actually...

65 | Elementary number theory, group theory, and Ramanujan graphs
- Davidoff, Sarnak, et al.
- 2003
Citation Context .... For expander G(m) to exist, number $\Delta_0$ needs to have certain number-theoretic properties; in this paper we set $\Delta_0 = 74$. The diameter of G(m) is O(log m). See the book by Davidoff, Sarnak and Valette [8] for an exposition of the construction of Ramanujan graphs and a discussion of their properties. In Definition 1, and elsewhere, notation ln x means the natural logarithm of x. Notation lg x denotes t...

57 | Protocols and Impossibility Results for Gossip-Based Communication Mechanisms
- Kempe, Kleinberg
- 2002
Citation Context ...bout each other in the course of gossiping. Gossip-style algorithms to have processors learn about the nearest resource location were given by Kempe, Kleinberg and Demers [27] and Kempe and Kleinberg [26]. In the prior research on the gossip problem in a failure-prone environment, either link failures or processor failures controlled by oblivious adversaries have been considered. Permanent link failur...

46 | Performing work efficiently in the presence of faults
- Dwork, Halpern, et al.
- 1998
Citation Context ...ally different for crash failures. For a long time, only a trivial linear lower bound Ω(n) on the number of messages has been known and the issue of its optimality was open. Dwork, Halpern and Waarts [11] found a solution with O(n log n) messages but with an exponential time. Finally, Galil, Mayer and Yung [16] developed an algorithm with O(n) messages, thus showing that this amount of messages is opt...

39 | Bounds on information exchange for Byzantine Agreement
- Dolev, Reischuk
- 1985
Citation Context ...ject to Byzantine failures. The message complexity of consensus, when no failures actually occur, was studied by Amdur, Weber and Hadzilacos [2] and by Hadzilacos and Halpern [19]. Dolev and Reischuk [10] studied the message complexity of consensus in the case of Byzantine faults. They distinguished between pure Byzantine faults and a less demanding situation when some (cryptographic) authentication m...

39 | Fault-tolerant broadcasting and gossiping in communication networks, Département d’Informatique, Université du Québec à
- Pelc
- 1996
Citation Context ...sed by van Renesse, Minsky and Hayden [35]. Gossiping with transient stochastic link failures and permanent stochastic node failures was considered by Chlebus, Diks and Pelc [6]. See a survey by Pelc [33] for more on the previous research on fault-tolerant broadcasting and gossiping. The problem of consensus was introduced by Pease, Shostak and Lamport [32]. They showed [29, 32] that number t of fault...

37 | Resolving message complexity of byzantine agreement and beyond
- Galil, Mayer, et al.
- 1995
Citation Context ...messages has been known and the issue of its optimality was open. Dwork, Halpern and Waarts [11] found a solution with O(n log n) messages but with an exponential time. Finally, Galil, Mayer and Yung [16] developed an algorithm with O(n) messages, thus showing that this amount of messages is optimal. The drawback of their solution is that it runs in overlinear time $O(n^{1+\varepsilon})$, for any fixed 0 < ε < 1. ...

36 | Fault tolerance in networks of bounded degree
- Dwork, Pippenger
- 1988
Citation Context ...l [34] showed how an almost-everywhere agreement can be achieved with a linear number of faults in networks of bounded degree, which strengthened a related result by Dwork, Peleg, Pippenger and Upfal [12]. This approach, to use networks of bounded degree and high connectivity to obtain fault-tolerance, was extended by Chlebus, Gasieniec, Kowalski and Shvartsman [7] in their work on the problem to pe...

29 | Tolerating a linear number of faults in networks of bounded degree
- Upfal
- 1992
Citation Context ...is O(s + 1), where s is the actual number of failures in an execution. Galil, Mayer and Yung [16] found an early-stopping solution with $O(n + s n^{\varepsilon})$ communication complexity, for any 0 < ε < 1. Upfal [34] showed how an almost-everywhere agreement can be achieved with a linear number of faults in networks of bounded degree, which strengthened a related result by Dwork, Peleg, Pippenger and Upfal [12]. ...

27 | Message-optimal protocols for Byzantine agreement
- Hadzilacos, Halpern
- 1993
Citation Context ...for n > 3t processors subject to Byzantine failures. The message complexity of consensus, when no failures actually occur, was studied by Amdur, Weber and Hadzilacos [2] and by Hadzilacos and Halpern [19]. Dolev and Reischuk [10] studied the message complexity of consensus in the case of Byzantine faults. They distinguished between pure Byzantine faults and a less demanding situation when some (crypto...

16 | Telephone problems with failures
- Berman, Hawrylycz
- 1986
Citation Context ...in a failure-prone environment, either link failures or processor failures controlled by oblivious adversaries have been considered. Permanent link failures were first studied by Berman and Hawrylycz [5]. Bagchi and Hakimi [3] investigated gossiping in networks with Byzantine node failures, in the case when nodes can test other nodes. Application of gossiping to gathering information about occurrence...

12 | Efficient Gossip and Robust Distributed Computation
- Georgiou, Kowalski, et al.
- 2003
Citation Context ...sages. Communication performance achievable by an (n − 1)-resilient gossiping solution has recently been improved to $O(n^{1+\varepsilon})$, while maintaining $O(\log^2 n)$ time, by Georgiou, Kowalski and Shvartsman [18], where ε > 0 is an arbitrary constant which occurs in the code of the algorithm. These performance bounds were shown to be achievable by a constructive algorithm, whose code can be obtained in time t...

11 | On the message complexity of binary agreement under crash failures
- Amdur, Weber, et al.
- 1992
Citation Context ...es and operating in t + 1 rounds, for n > 3t processors subject to Byzantine failures. The message complexity of consensus, when no failures actually occur, was studied by Amdur, Weber and Hadzilacos [2] and by Hadzilacos and Halpern [19]. Dolev and Reischuk [10] studied the message complexity of consensus in the case of Byzantine faults. They distinguished between pure Byzantine faults and a less de...

10 | Fast Gossiping with Short Unreliable Messages
- Chlebus, Diks, et al.
- 1994
Citation Context ...nces of failures was proposed by van Renesse, Minsky and Hayden [35]. Gossiping with transient stochastic link failures and permanent stochastic node failures was considered by Chlebus, Diks and Pelc [6]. See a survey by Pelc [33] for more on the previous research on fault-tolerant broadcasting and gossiping. The problem of consensus was introduced by Pease, Shostak and Lamport [32]. They showed [29,...

9 | Information dissemination in distributed systems with faulty units
- Bagchi, Hakimi
- 1994
Citation Context ...ronment, either link failures or processor failures controlled by oblivious adversaries have been considered. Permanent link failures were first studied by Berman and Hawrylycz [5]. Bagchi and Hakimi [3] investigated gossiping in networks with Byzantine node failures, in the case when nodes can test other nodes. Application of gossiping to gathering information about occurrences of failures was propo...

4 | Survey of Gossiping and Communication Networks
- Hedetniemi, Hedetniemi, et al.
- 1988
Citation Context ...o all neighbors can be sent and received concurrently. For surveys of such an approach to broadcasting and gossiping, see the papers by Fraigniaud and Lazard [15], Hedetniemi, Hedetniemi and Liestman [22], Hromkovič, Klasing, Monien and Peine [23], and the book [24] by Hromkovič, Klasing, Pelc, Ruzicka and Unger. One may consider dissemination of information in a network similarly as spreading of a ru...

3 | Balancing work and communication in robust cooperative computation
- Kowalski, Shvartsman
Citation Context ...Dwork, Peleg, Pippenger and Upfal [12]. This approach, to use networks of bounded degree and high connectivity to obtain fault-tolerance, was extended by Chlebus, Gasieniec, Kowalski and Shvartsman [7] in their work on the problem to perform independent and similar tasks in a message-passing environment with processor crashes. 2 Technical preliminaries In this section, we specify the distributed se...

3 | Explicit combinatorial structures for cooperative distributed algorithms
- Kowalski, Musial, et al.
Citation Context ...code of the algorithm. These performance bounds were shown to be achievable by a constructive algorithm, whose code can be obtained in time that is polynomial in n, by Kowalski, Musial and Shvartsman [28]. We prove a trade-off between time and communication in gossiping, to gauge how far from optimality our algorithms are, when both time and communication are taken into account. II. We show that, if c...