## The load and availability of Byzantine quorum systems (1997)

Venue: SIAM Journal on Computing

Citations: 47 (17 self)

### BibTeX

@INPROCEEDINGS{Malkhi97theload,

author = {Dahlia Malkhi and Michael K. Reiter and Avishai Wool},

title = {The load and availability of byzantine quorum systems},

booktitle = {SIAM Journal on Computing},

year = {1997},

pages = {249--257}

}

### Abstract

Replicated services accessed via quorums enable each access to be performed at only a subset (quorum) of the servers and achieve consistency across accesses by requiring any two quorums to intersect. Recently, b-masking quorum systems, whose intersections contain at least 2b+1 servers, have been proposed to construct replicated services tolerant of b arbitrary (Byzantine) server failures. In this paper we consider a hybrid fault model allowing benign failures in addition to the Byzantine ones. We present four novel constructions for b-masking quorum systems in this model, each of which has optimal load (the probability of access of the busiest server) or optimal availability (probability of some quorum surviving failures). To show optimality we also prove lower bounds on the load and availability of any b-masking quorum system in this model.
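The three quantities named in the abstract (intersection size, load, and availability) can be made concrete on a small threshold system. The sketch below is only an illustration, not one of the paper's four constructions: it assumes a threshold system of n = 4b+1 servers with quorums of size 3b+1, measures the load induced by a uniform access strategy (not a minimum over all strategies), and computes the crash probability, i.e. the complement of availability, under the assumption that servers crash independently with probability p. All function names are hypothetical.

```python
from itertools import combinations
from math import comb


def threshold_quorums(n, size):
    """All subsets of `size` servers out of n (a size-of-n threshold system)."""
    return [frozenset(c) for c in combinations(range(n), size)]


def min_intersection(quorums):
    """Smallest overlap between any two distinct quorums."""
    return min(len(a & b) for a, b in combinations(quorums, 2))


def uniform_load(quorums, n):
    """Access probability of the busiest server when quorums are picked uniformly."""
    hits = [sum(s in q for q in quorums) for s in range(n)]
    return max(hits) / len(quorums)


def threshold_crash_probability(n, size, p):
    """Probability that no quorum survives, with independent crash probability p."""
    # No quorum survives iff fewer than `size` servers stay up,
    # i.e. more than n - size servers crash.
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(n - size + 1, n + 1))


b = 1
n, size = 4 * b + 1, 3 * b + 1  # assumed example parameters
quorums = threshold_quorums(n, size)

print(min_intersection(quorums) >= 2 * b + 1)  # b-masking intersection condition
print(uniform_load(quorums, n))
print(threshold_crash_probability(n, size, 0.1))
```

The brute-force enumeration is exponential in n, so it is only suitable for checking small examples by hand.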

### Citations

549 | Weighted voting for replicated data
- Gifford
- 1979
Citation Context: ...n. In sections 5–7 we describe our new constructions. We discuss our results in section 8. 2. Related work. Our work borrows from extensive prior work in benignly fault-tolerant quorum systems (e.g., [12, 39, 24, 11, 15, 4, 9, 1, 7, 31, 36]). The notion of availability we use here (crash probability) is well known in reliability theory [5] and has been applied extensively in the analysis of quorum systems (cf. [4, 34, 35] and the refere...

419 | Byzantine quorum systems
- Malkhi, Reiter
- 1997
Citation Context: ...uorums may intersect in a subset containing faulty servers only, which may deviate arbitrarily and undetectably from their assigned protocol. Malkhi and Reiter thus introduced masking quorum systems [25], in which each pair of quorums intersects in sufficiently many servers to mask out the behavior of faulty servers. More precisely, a b-masking quorum system is one in which any two quorums intersect ...

333 | A Majority Consensus Approach to Concurrency Control for Multiple Copy Databases
- Thomas
- 1979

224 | A sqrt(n) algorithm for mutual exclusion in decentralized systems
- Maekawa
- 1985

179 | How to assign votes in a distributed system
- Garcia-Molina, Barbara
- 1985

178 | Combinatorial Theory
- Hall
- 1967
Citation Context: ...em is a composition of a finite projective plane (FPP) over a threshold system (Thresh). The first component of a boostFPP system is a finite projective plane of order q (a good reference on FPP's is [Hal86]). It is known that FPP's exist for $q = p^r$ when p is prime. Such an FPP has $n_F = q^2 + q + 1$ elements, and quorums of size $c(\mathrm{FPP}) = q + 1$. This is a regular quorum system, i.e., it has intersections ...

152 | A quorum-consensus replication method for abstract data types
- Herlihy
- 1986

122 | Percolation Theory for Mathematicians
- Kesten
- 1982
Citation Context: ...the only construction we have for which $F_p \to 0$ as $n \to \infty$ when the individual crash probability p is arbitrarily close to 1/2. We are able to prove this behavior of $F_p$ using results from percolation theory [18, 13]. Remark. The system we present here is based on a triangular lattice, with elements corresponding to vertices, as in [41, 6]. We have also constructed a second system which is based on the square lat...

117 | The critical probability of bond percolation on the square lattice equals 1/2
- Kesten
- 1980
Citation Context: ...properties than graphs with $p > p_c$. For example, Z with $p < p_c$ has a single connected (open) component of infinite size. When $p > p_c$ there is no such component. For site percolation on the triangle $p_c = 1/2$ [17]. The following theorem shows that when the probability p for a closed vertex is below the critical probability, the probability of having long open paths tends to 1 exponentially fast. Recall that LR...

116 | The Grid Protocol: A High Performance Scheme for Maintaining Replicated Data
- Cheung, Ammar, Ahamad

103 | Statistical Theory of Reliability and Life
- Barlow, Proschan
- 1975
Citation Context: ...e prior work in benignly fault-tolerant quorum systems (e.g., [12, 39, 24, 11, 15, 4, 9, 1, 7, 31, 36]). The notion of availability we use here (crash probability) is well known in reliability theory [5] and has been applied extensively in the analysis of quorum systems (cf. [4, 34, 35] and the references therein). The load of a quorum system was first defined and analyzed in [31], which proved a low...

102 | Essai sur l’application de l’analyse à la probabilité des décisions rendues à la pluralité des voix. Paris: L’Imprimerie Royale
- Condorcet
Citation Context: ...ash(Q)). We would like $F_p$ to be as small as possible. A desirable asymptotic behavior of $F_p$ is that $F_p \to 0$ when $n \to \infty$ for all $p < 1/2$, and such an $F_p$ is called Condorcet (after the Condorcet jury theorem [8]). 4. Building blocks. In this section, we prove several theorems which will be our basic tools in what follows. First we prove lower bounds on the load and availability of b-masking quorum systems, a...

97 | Availability in Partitioned Replicated Databases
- Abbadi, Toueg
- 1986

94 | Secure and scalable replication in Phalanx
- Malkhi, Reiter
- 1998
Citation Context: ...ition 3.5 ensures that read operations can mask out any faulty behavior of up to b servers. Examples of protocols implementing various data abstractions using b-masking quorum systems can be found in [25, 26, 27]. Lemma 3.6. Let Q be a quorum system. Then Q is b-masking if both the following conditions hold: (1) $MT(Q) \ge b + 1$; (2) $IS(Q) \ge 2b + 1$. Proof. A...

91 | Hierarchical quorum consensus: A new algorithm for managing replicated data
- Kumar
- 1991
Citation Context: ... [Fig. 2. An RT(4, 3) system of depth h = 2, with one quorum shaded.] ...majority constructions of [29], the HQC system of [19] is an RT(3, 2) system, and in fact the threshold system of [25] can be viewed as a trivial RT(4b + 1, 3b + 1) system with depth h = 1. As an example throughout this section we will use the RT(4, 3) sy...

91 | The Load, Capacity, and Availability of Quorum Systems
- Naor, Wool
- 1998

81 | An efficient and fault-tolerant solution for distributed mutual exclusion
- Agrawal, Abbadi
- 1991

64 | Coincidence of critical points in percolation problems
- Menshikov
- 1986
Citation Context: ...rtex is below the critical probability, the probability of having long open paths tends to 1 exponentially fast. Recall that LR is the event “there exists an open LR path in the $\sqrt{n} \times \sqrt{n}$ grid.” Then [30] (see also [13, p. 287]) implies the following. Theorem B.1. If $p < 1/2$, then $P_p(LR) \ge 1 - e^{-\psi(p)\sqrt{n}}$, for some $\psi(p) > 0$ independent of n. Remark. The dependence of ψ on p is such that ψ(p) → 0 when p...

56 | The reliability of vote mechanisms
- Barbara, Garcia-Molina
- 1987

50 | On a sharp transition from area law to perimeter law in a system of random surfaces
- Aizenman, Chayes, et al.
- 1983
Citation Context: ... If LR is the event “there exists an open left-right path in a rectangle D,” then it follows that $I_r(LR)$ is the event “there are at least r + 1 disjoint open left-right paths in D.” Theorem B.3 (see [2]). Let E be an increasing event and let r be a positive integer. Then $1 - P_p(I_r(E)) \le \left(\frac{1-p}{p'-p}\right)^r [1 - P_{p'}(E)]$ whenever $0 \le p < p' \le 1$. The theorem amounts to the assertion that if E is likel...

45 | How to securely replicate services
- Reiter, Birman
- 1994
Citation Context: ...any b servers may experience Byzantine failures, that work gave two constructions. We compare those constructions to ours in section 8. Hybrid failure models have been considered in other works (e.g., [10, 22, 23, 38]). 3. Preliminaries. In this section we introduce notation and definitions used in the remainder of the paper. Much of the notation introduced in this section is summarized in Table 1 for quick refere...

43 | A Formally Verified Algorithm for Interactive Consistency Under a Hybrid Fault Model, Fault-Tolerant Computing Symposium
- Lincoln, Rushby
- 1993

32 | Crumbling walls: a class of practical and efficient quorum systems
- Peleg, Wool

25 | A Fault Tolerant Algorithm for Replicated Data Management
- Rangarajan, Setia, et al.
- 1995
Citation Context: ...ents. However, this is not an artifact of overestimates in our analysis. Rather, it is a result of the property that the crash probability of FPP is higher than p, and in fact $F_p(\mathrm{FPP}) \to 1$ as shown by [37, 40]. In this light it is not surprising that boostFPP does not have an optimal crash probability. • The requirement $p < 1/4$ is essential for this system; if $p > 1/4$, then in fact $F_p(\mathrm{boostFPP}) \to 1$ as $n \to \infty$. 7. T...

24 | The vulnerability of vote assignments
- Barbara, Garcia-Molina
- 1986
Citation Context: ... f of a quorum system provides one measure of how many crash failures a quorum system is guaranteed to survive, and indeed this measure has been used in the past to differentiate among quorum systems [3]. However, it is possible that an f-resilient quorum system, though vulnerable to a few failure configurations of f + 1 failures, can survive many configurations of more than f failures. One way to me...

24 | A high availability sqrt(n) hierarchical grid algorithm for replicated data
- Kumar, Cheung
- 1991
Citation Context: ...its poor asymptotic crash probability. If crashes occur with some constant probability p, then any configuration of crashes with at least one crash per row disables the system. Therefore, as shown by [20, 40], $F_p(\text{M-Grid}) \ge \left(1 - (1-p)^{\sqrt{n}}\right)^{\sqrt{n}} \to 1$ as $n \to \infty$. 5.2. RT systems. An RT system RT(k, ℓ) of depth h is built by taking a simple building block, which is an ℓ-of-k threshold system (with k > ℓ > k/2), and re...

23 | A continuum of failure models for distributed computing
- Garay, Perry
- 1992

22 | A performance study of general grid structures for replicated data
- Kumar, Rabinovich, et al.
- 1993
Citation Context: ...optimality of our constructions, we generalize this lower bound to $\Omega(\sqrt{b/n})$ for b-masking quorum systems. Grids, which form the basis for our multigrid (denoted M-Grid) construction, were proposed in [24, 7, 21, 25]. The technique of quorum composition, which we use in our recursive threshold (RT) and boosted finite projective planes (boostFPP) constructions, has been studied in [29, 33, 32] under various names ...

17 | Load balancing in quorum systems
- Marcus
- 1992
Citation Context: ...d not of the protocol using it. Examples of load calculations can be found in [40]. As an aside, we note that not every quorum system can have a strategy that induces the same load on each server. In [16] it is shown that for some quorum systems it is impossible to balance the load perfectly. Recall that c(Q) denotes the cardinality of the smallest quorum in Q. The next result will be useful to us in ...

14 | Formal verification of an interactive consistency algorithm for the Draper FTP architecture under a hybrid fault model
- Lincoln, Rushby
- 1994

11 | Quorum structures in distributed systems - Neilsen - 1992 |

10 | Planar quorums
- Bazzi
- 1996
Citation Context: ...h as “coterie join” and “recursive majority.” Our multipath (M-Path) construction generalizes the system of [41], coupled with the analysis of the Paths construction of [31], and the recent system of [6]. Several constructions of masking quorum systems were given in [25] for a variety of failure models. For the model we consider here (i.e., any b servers may experience Byzantine failures) that work gav...

10 | Quorum systems for distributed control protocols
- Wool
- 1996
Citation Context: ...nly in the case that no failures occur. A strength of this definition is that the load is a property of a quorum system and not of the protocol using it. Examples of load calculations can be found in [40]. As an aside, we note that not every quorum system can have a strategy that induces the same load on each server. In [16] it is shown that for some quorum systems it is impossible to balance the load...

9 | Coterie join algorithm
- Neilsen, Mizuno
- 1992
Citation Context: ... were proposed in [24, 7, 21, 25]. The technique of quorum composition, which we use in our recursive threshold (RT) and boosted finite projective planes (boostFPP) constructions, has been studied in [29, 33, 32] under various names such as “coterie join” and “recursive majority.” Our multipath (M-Path) construction generalizes the system of [41], coupled with the analysis of the Paths construction of [31], a...

9 | The triangular lattice protocol: A highly fault tolerant protocol for replicated data
- Wu, Belford
- 1992
Citation Context: ...ve planes (boostFPP) constructions, has been studied in [29, 33, 32] under various names such as “coterie join” and “recursive majority.” Our multipath (M-Path) construction generalizes the system of [41], coupled with the analysis of the Paths construction of [31], and the recent system of [6]. Several constructions of masking quorum systems were given in [25] for a variety of failure models. For the...

8 | Probabilistic Byzantine Quorum Systems - Malkhi, Reiter, et al. - 2008 |

7 | The availability of crumbling wall quorum systems
- Peleg, Wool
Citation Context: ...1, 15, 4, 9, 1, 7, 31, 36]). The notion of availability we use here (crash probability) is well known in reliability theory [5] and has been applied extensively in the analysis of quorum systems (cf. [4, 34, 35] and the references therein). The load of a quorum system was first defined and analyzed in [31], which proved a lower bound of $\Omega(1/\sqrt{n})$ on the load of any quorum system (and, a fortiori, any maskin...

6 | Construction methods for quorum systems
- Marcus, Peleg
- 1992

4 | Survivable consensus objects
- Malkhi, Reiter
- 1998

1 | also available online from http://www.research.att.com/library/trs/TRs/98/98.7/98.7.1.body.ps.gz
- Malkhi, Reiter, et al.
- 1998
Citation Context: ...eously: since necessarily f ≤ c(Q), Theorem 4.1 implies that f ≤ nL(Q), i.e., when load is low then so is resilience, and when resilience is high then so is load. In order to break this trade-off, in [28] we propose relaxing the intersection property of masking quorum systems, so that “quorums” chosen according to a specific strategy intersect each other in enough correct servers to maintain correctne...


1 | Combinatorial Theory - Percolation - 1986 |


1 | Computing and Information Sciences - thesis, Dept - 1962 |
