## Verifying distributed erasure-coded data (2007)

Venue: Proceedings of the 26th ACM Symposium on Principles of Distributed Computing

Citations: 13 (1 self)

### BibTeX

@INPROCEEDINGS{Hendricks07verifyingdistributed,

author = {James Hendricks},

title = {Verifying distributed erasure-coded data},

booktitle = {Proceedings of the 26th ACM Symposium on Principles of Distributed Computing},

year = {2007},

pages = {163--168},

publisher = {ACM Press}

}

### Abstract

Erasure coding can reduce the space and bandwidth overheads of redundancy in fault-tolerant data storage and delivery systems. But it introduces the fundamental difficulty of ensuring that all erasure-coded fragments correspond to the same block of data. Without such assurance, a different block may be reconstructed from different subsets of fragments. This paper develops a technique for providing this assurance without the bandwidth and computational overheads associated with current approaches. The core idea is to distribute with each fragment what we call homomorphic fingerprints. These fingerprints preserve the structure of the erasure code and allow each fragment to be independently verified as corresponding to a specific block. We demonstrate homomorphic fingerprinting functions that are secure, efficient, and compact.
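The homomorphic property described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a prime field of size `P` instead of an extension field, a toy Vandermonde-style linear code, and hypothetical helper names (`fingerprint`, `encode`, `encode_fingerprints`). The point it shows is that fingerprinting the encoded fragments gives the same result as encoding the fingerprints.

```python
import secrets

P = 2**61 - 1  # assumed prime field size for this sketch (not from the paper)

def fingerprint(r, fragment):
    """Evaluation fingerprint: read the fragment as polynomial
    coefficients (lowest degree first) and evaluate at r (Horner's rule)."""
    acc = 0
    for coeff in reversed(fragment):
        acc = (acc * r + coeff) % P
    return acc

def encode(data, n):
    """Toy linear code: fragment i (1-based) is sum_j i^j * data[j], elementwise."""
    m, length = len(data), len(data[0])
    out = []
    for i in range(1, n + 1):
        coeffs = [pow(i, j, P) for j in range(m)]
        out.append([sum(c * d[k] for c, d in zip(coeffs, data)) % P
                    for k in range(length)])
    return out

def encode_fingerprints(fps, n):
    """Apply the same linear code to the m fingerprints themselves."""
    m = len(fps)
    return [sum(pow(i, j, P) * fps[j] for j in range(m)) % P
            for i in range(1, n + 1)]

# Homomorphism check on random data with a random evaluation point.
data = [[secrets.randbelow(P) for _ in range(4)] for _ in range(2)]
r = secrets.randbelow(P)
fragments = encode(data, n=4)
assert [fingerprint(r, f) for f in fragments] == \
    encode_fingerprints([fingerprint(r, d) for d in data], n=4)
```

Because the fingerprint is linear in the fragment, a verifier holding only the m short fingerprints can predict the fingerprint of any of the n fragments without seeing the other fragments.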

### Citations

1349 | Random oracles are practical: A paradigm for designing efficient protocols
- Bellare, Rogaway
- 1993
Citation Context ...printing function is chosen at random with respect to the fragments being fingerprinted. This “random” selection can be deterministic with the appropriate application of a cryptographic hash function [3]. If data is represented carefully, the remainder from division by a random irreducible polynomial [24] or the evaluation of a polynomial at a random point preserve the needed algebraic structure. The...

944 | OceanStore: An architecture for global-scale persistent storage
- Kubiatowicz, Bindel, et al.
Citation Context ... is 20 microseconds. After this computation, this implementation achieves a throughput of 410 megabytes per second. 6. OTHER PROTOCOLS m-of-n erasure coding is used in many distributed systems (e.g., [1, 6, 8, 13, 18, 28]), because it reduces storage, network bandwidth, and I/O bandwidth. The savings approaches a factor of m when compared to replication. The division and evaluation fingerprinting functions are homomor...

457 | Efficient dispersal of information for security, load balancing, and fault tolerance
- Rabin
- 1989

448 | SplitStream: High-Bandwidth Multicast in a Cooperative Environment
- Castro, Druschel, et al.
- 2003
Citation Context ... is 20 microseconds. After this computation, this implementation achieves a throughput of 410 megabytes per second. 6. OTHER PROTOCOLS m-of-n erasure coding is used in many distributed systems (e.g., [1, 6, 8, 13, 18, 28]), because it reduces storage, network bandwidth, and I/O bandwidth. The savings approaches a factor of m when compared to replication. The division and evaluation fingerprinting functions are homomor...

368 | Polynomial codes over certain finite fields
- Reed, Solomon
- 1960
Citation Context .... Thus, (n − m) of the fragments can be unavailable (e.g., due to corruption or server failure) without loss of access. Example erasure coding schemes with these properties include Reed-Solomon codes [26] and Rabin’s Information Dispersal Algorithm [25]. ...

233 | Fingerprinting by random polynomials
- Rabin
- 1981
Citation Context ... selection can be deterministic with the appropriate application of a cryptographic hash function [3]. If data is represented carefully, the remainder from division by a random irreducible polynomial [24] or the evaluation of a polynomial at a random point preserve the needed algebraic structure. The resulting fingerprints are secure, efficient, and compact. The rest of this paper is organized as foll...

232 | A practical scheme for non-interactive verifiable secret sharing
- Feldman
- 1987
Citation Context ... In [4], algebraic properties are leveraged to permit fast updates of Rabin fingerprints of data structures such as trees. More distantly related to this technique is verifiable secret sharing (e.g., [9, 10, 23, 31]), which allows correct participants to verify that a secret was shared among them consistently. The secrecy of the shared value, however, which must be preserved throughout the share distribution and...

203 | Verifiable secret sharing and achieving simultaneity in the presence of faults
- Chor, Goldwasser, et al.
- 1985
Citation Context ... In [4], algebraic properties are leveraged to permit fast updates of Rabin fingerprints of data structures such as trees. More distantly related to this technique is verifiable secret sharing (e.g., [9, 10, 23, 31]), which allows correct participants to verify that a secret was shared among them consistently. The secrecy of the shared value, however, which must be preserved throughout the share distribution and...

120 | Publicly verifiable secret sharing
- Stadler
- 1996
Citation Context ... In [4], algebraic properties are leveraged to permit fast updates of Rabin fingerprints of data structures such as trees. More distantly related to this technique is verifiable secret sharing (e.g., [9, 10, 23, 31]), which allows correct participants to verify that a secret was shared among them consistently. The secrecy of the shared value, however, which must be preserved throughout the share distribution and...

118 | LFSR based hashing and authentication
- Krawczyk
- 1994
Citation Context ...t trusted to be consistent without verification. 7. RELATED WORK A common cryptographic application of universal hashing is for message authentication codes (MACs) [22]. An early proposal by Krawczyk [16] included a MAC similar to Rabin’s fingerprints. Shoup presented faster variants [30] along with implementation suggestions to optimize performance. Nevelsteen compares several other variants [22]. Ho...

108 | Randomized and deterministic simulations of PRAMs by parallel machines with restricted granularity of parallel memories
- Mehlhorn, Vishkin
- 1994
Citation Context ...ut $r \in \mathcal{K}$ uniformly at random. Then the function $\mathrm{fp}(r,d) : \mathcal{K} \times F_{q^k}^{\delta} \to F_{q^k}^{\gamma}$ defined by $\mathrm{fp}(r, d(y,x))$: $s(x) \leftarrow S(r)$; return $d(s(x), x)$ is an $\varepsilon$-fingerprinting function for $\varepsilon = \frac{\delta/\gamma}{q^{k\gamma}}$. PROOF. As in [20], this is because any $\lceil \delta/\gamma \rceil$ points fully determine a polynomial of degree less than $\delta/\gamma$ over a field. Hence, any two distinct polynomials of degree less than $\delta/\gamma$ share fewer than $\delta/\gamma$ points. Because...

107 | Fab: building distributed enterprise disk arrays from commodity components
- Saito, Frolund, et al.
- 2004
Citation Context ... is 20 microseconds. After this computation, this implementation achieves a throughput of 410 megabytes per second. 6. OTHER PROTOCOLS m-of-n erasure coding is used in many distributed systems (e.g., [1, 6, 8, 13, 18, 28]), because it reduces storage, network bandwidth, and I/O bandwidth. The savings approaches a factor of m when compared to replication. The division and evaluation fingerprinting functions are homomor...

91 | On-the-fly verification of rateless erasure codes for efficient content distribution
- Krohn, Freedman, et al.
- 2004
Citation Context ...recomputation. The “repairable” protocol can also benefit, but requires further modifications. Homomorphic fingerprinting may also provide benefits to erasure-coded broadcast [8], content distribution [17], and similar applications, if the encoding is not trusted to be consistent without verification. 7. RELATED WORK A common cryptographic application of universal hashing is for message authentication ...

90 | Some applications of Rabin’s fingerprinting method
- Broder
- 1993
Citation Context ...abin used these properties to update the fingerprint of a file [24]. In [29], a similar technique is used by a disk scrubber to check the consistency of erasure-coded data in a benign environment. In [4], algebraic properties are leveraged to permit fast updates of Rabin fingerprints of data structures such as trees. More distantly related to this technique is verifiable secret sharing (e.g., [9, 10,...

80 | Efficient Byzantine-tolerant erasure-coded storage
- Goodson, Wylie, et al.
Citation Context ... bandwidth overheads are no better than for replication. In the second approach, clients verify all n fragments when they perform a read to ensure that no other client could observe a different value [13]. In this approach, each fragment is accompanied by a cross-checksum [12, 15], which consists of a hash of each of the n fragments. A reader verifies the cross-checksum by reconstructing a block from ...
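The cross-checksum read path described in this context can be sketched as follows. This is an illustrative sketch only: the 2-of-3 XOR parity code, the use of SHA-256, and the helper names (`encode`, `decode`, `cross_checksum`, `reader_verify`) are assumptions chosen for brevity, not details taken from the paper or from [12, 13, 15].

```python
import hashlib

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def encode(block):
    """Toy 2-of-3 code: two halves plus their XOR parity (even-length block)."""
    half = len(block) // 2
    a, b = block[:half], block[half:]
    return [a, b, xor(a, b)]

def decode(frags):
    """Rebuild the block from any two fragments, given as {index: bytes}."""
    if 0 in frags and 1 in frags:
        a, b = frags[0], frags[1]
    elif 0 in frags:
        a = frags[0]
        b = xor(a, frags[2])
    else:
        b = frags[1]
        a = xor(b, frags[2])
    return a + b

def cross_checksum(fragments):
    """One hash per fragment, for all n fragments."""
    return [hashlib.sha256(f).digest() for f in fragments]

def reader_verify(cc, frags):
    """Decode from m fragments, re-encode all n, and compare every hash."""
    block = decode(frags)
    return block if cross_checksum(encode(block)) == cc else None

# Honest writer: any two fragments verify and decode to the same block.
frags = encode(b"erasure!")
cc = cross_checksum(frags)
assert reader_verify(cc, {0: frags[0], 2: frags[2]}) == b"erasure!"

# Faulty writer: the parity fragment does not match the data fragments.
bad = [frags[0], frags[1], b"\x00" * 4]
assert reader_verify(cross_checksum(bad), {0: bad[0], 1: bad[1]}) is None
```

Note that `reader_verify` re-encodes and hashes all n fragments on every read, which is the computational overhead the abstract refers to.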

75 | Cryptographic hash-function basics: Definitions, implications, and separations for preimage resistance, second-preimage resistance, and collision resistance
- Rogaway, Shrimpton
- 2004
Citation Context ...cture that we call a fingerprinted cross-checksum. Before considering the contents of a fingerprinted cross-checksum, recall the following definition of a collision-resistant hash function (e.g., see [27]). DEFINITION 3.1. A family of hash functions $\{\mathrm{hash}_K : \{0,1\}^{*} \to \{0,1\}^{\lambda}\}_{K \in \mathcal{K}'}$ is $(\tau, \varepsilon')$-collision resistant if for every probabilistic algorithm $A$ that runs in time $\tau$, $\Pr[\, d' \neq d \wedge \mathrm{hash}_K(d') =$ ...

73 | Incremental cryptography: The case of hashing and signing
- Bellare, Goldreich, et al.
- 1994
Citation Context ...plementation suggestions to optimize performance. Nevelsteen compares several other variants [22]. Homomorphic fingerprinting functions share homomorphic properties with incremental hashing functions [2]. Incremental hashing, however, is substantially slower because it is based on number-theoretic primitives. The homomorphic properties of incremental hashing are exploited in [17], which applies these ...

69 | Distributed provers with applications to undeniable signatures
- Pedersen
- 1991

67 | On fast and provably secure message authentication based on universal hashing
- Shoup
- 1996
Citation Context ...ility that $p(x)$ is one of these polynomials is at most $\frac{\delta}{q^{k\gamma} - q^{k\gamma/2}}$. Division fingerprinting is a generalization of Rabin fingerprinting. Both are fast due to fast implementations of $P_2$ [24] and $P_{q^k}$ [30]. Let $E_{q^{k\gamma}} = F_{q^k}[x]/p(x)$ denote the extension field of polynomials with coefficients in $F_{q^k}$ of degree less than $\gamma$, with “+” defined as normal and “·” defined modulo a constant monic degree-$\gamma$ irreducib...

57 | Online codes
- Maymounkov
- 2002
Citation Context ...ar erasure codes over $F_{q^k}$. A common field is $F_{2^8}$ such that field elements are bytes. Rabin fingerprinting is homomorphic over many erasure codes based solely on exclusive-or, such as Online Codes [19] and parity. Homomorphic fingerprinting provides benefits to erasure-coded Byzantine fault-tolerant storage systems [6, 13]. Section 4 demonstrated how the AVID protocol [5], used in [6], can exploit ...

39 | Universal classes of hash functions (extended abstract)
- Carter, Wegman
- 1977
Citation Context ...wo common approaches described above could be used without the bandwidth or computation overheads, respectively. The fingerprinting functions we propose belong to a family of universal hash functions [7], chosen to preserve the underlying algebraic constraints of the fragments. A particular fingerprinting function is chosen at random with respect to the fragments being fingerprinted. This “random” se...

31 | Distributed fingerprints and secure information dispersal
- Krawczyk
- 1993
Citation Context ...proach, clients verify all n fragments when they perform a read to ensure that no other client could observe a different value [13]. In this approach, each fragment is accompanied by a cross-checksum [12, 15], which consists of a hash of each of the n fragments. A reader verifies the cross-checksum by reconstructing a block from m fragments and then recomputing the other (n − m) fragments and comparing th...

29 | Optimal Resilience for Erasure-Coded Byzantine Distributed Storage
- Cachin, Tessaro

29 | Distributed pseudo-random functions and KDCs
- Naor, Pinkas, et al.
- 1999
Citation Context ...ues that are significantly heavier-weight than considered here. It is worth mentioning that a random oracle, as in Section 3, can be replaced with an evaluation of a distributed pseudo-random function [21] in a protocol such as AVID-FP. This construction has the benefit of requiring only standard cryptographic assumptions. 8. CONCLUSION Homomorphic fingerprinting enables efficient verification that fra...

26 | Software performance of universal hash functions
- Nevelsteen, Preneel
Citation Context ...ing known as the division and evaluation hashes can be used for message authentication. They are two of the fastest hashes, producing the smallest output and requiring the fewest bits of random input [22]. 2.2 Homomorphism. Throughout this paper, let $b \cdot d$ denote the application of “·” by a scalar $b \in F$ to each element in a vector $d \in F^{\sigma}$ of $\sigma$ elements of $F$. DEFINITION 2.5. A fingerprinting function fp...

15 | Securely replicating authentication services
- Gong
- 1989
Citation Context ...proach, clients verify all n fragments when they perform a read to ensure that no other client could observe a different value [13]. In this approach, each fragment is accompanied by a cross-checksum [12, 15], which consists of a hash of each of the n fragments. A reader verifies the cross-checksum by reconstructing a block from m fragments and then recomputing the other (n − m) fragments and comparing th...

10 | Using Erasure Codes Efficiently for Storage in a Distributed System
- Aguilera, Janakiraman, et al.
- 2005
Citation Context ...cross-checksum. DEFINITION 3.3. Let fpcc be a fingerprinted cross-checksum. A fragment $d \in F^{\delta}$ is consistent with fpcc for index $i$, $1 \le i \le n$, if $\mathrm{fpcc.cc}[i] = \mathrm{hash}(d)$ and $\mathrm{fp}(r, d) = \mathrm{encode}_i^{\gamma}(\mathrm{fpcc.fp}[1], \ldots, \mathrm{fpcc.fp}[m])$, where $r = \mathrm{random\_oracle}(\mathrm{fpcc.cc}[1], \ldots, \mathrm{fpcc.cc}[n])$. THEOREM 3.4. Let $A$ be a probabilistic algorithm that runs in time $\tau$, makes $\chi$ queries to random oracle, and produces an m-of-n fpcc ...
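The consistency test in Definition 3.3 can be sketched in a few lines. Everything concrete here is an assumption made for illustration: the prime field `P`, the non-systematic toy linear code (so `fpcc["fp"]` holds fingerprints of the m source pieces, not of the first m fragments), SHA-256 standing in for both `hash` and the random oracle, and the helper names.

```python
import hashlib

P = 2**61 - 1  # assumed prime field; the paper works over extension fields

def h(fragment):
    """Hash a fragment (a list of field elements below 2**64)."""
    return hashlib.sha256(b"".join(x.to_bytes(8, "big") for x in fragment)).digest()

def fp(r, fragment):
    """Evaluation fingerprint of the fragment at point r (Horner's rule)."""
    acc = 0
    for c in reversed(fragment):
        acc = (acc * r + c) % P
    return acc

def coeffs(i, m):
    """Row i (1-based) of the toy encoding matrix."""
    return [pow(i, j, P) for j in range(m)]

def encode(data, n):
    m = len(data)
    return [[sum(c * d[k] for c, d in zip(coeffs(i, m), data)) % P
             for k in range(len(data[0]))] for i in range(1, n + 1)]

def random_oracle(cc):
    """Stand-in for the random oracle: hash all n fragment hashes into a point r."""
    return int.from_bytes(hashlib.sha256(b"".join(cc)).digest(), "big") % P

def make_fpcc(data, n):
    frags = encode(data, n)
    cc = [h(f) for f in frags]
    r = random_oracle(cc)
    return frags, {"cc": cc, "fp": [fp(r, d) for d in data]}

def is_consistent(fpcc, i, d):
    """Both conditions of the definition, for 1-based index i."""
    r = random_oracle(fpcc["cc"])
    m = len(fpcc["fp"])
    expected = sum(c * f for c, f in zip(coeffs(i, m), fpcc["fp"])) % P
    return fpcc["cc"][i - 1] == h(d) and fp(r, d) == expected

frags, fpcc = make_fpcc([[1, 2, 3], [4, 5, 6]], n=4)
assert all(is_consistent(fpcc, i, f) for i, f in enumerate(frags, start=1))
```

Each server can run `is_consistent` on its own fragment alone, which is what lets verification proceed without access to the whole block.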

10 | The safety and liveness properties of a protocol family for versatile survivable storage infrastructures
- Goodson, Wylie, et al.
- 2004
Citation Context ...-tolerant storage systems [6, 13]. Section 4 demonstrated how the AVID protocol [5], used in [6], can exploit homomorphic fingerprinting to be more bandwidth efficient. Variants of the PASIS protocol [13, 14] can also exploit homomorphic fingerprinting. In the “non-repairable” protocol a writer sends fragments along with a cross-checksum to each server; a reader returns a block after finding sufficient se...

8 | Asynchronous Verifiable Information Dispersal
- Cachin, Tessaro
- 2005
Citation Context ... and distribute data correctly use one of two approaches. In the first approach, servers are provided the entire block of data, allowing them to agree on the contents and generate their own fragments [5, 6]. Savings are achieved for storage, but bandwidth overheads are no better than for replication. In the second approach, clients verify all n fragments when they perform a read to ensure that no other ...

6 | Verification of parity data in large scale storage systems
- Schwarz
- 2004
Citation Context ...in a peer-to-peer content distribution network. The algebraic properties of certain universal hashes has been examined before. Rabin used these properties to update the fingerprint of a file [24]. In [29], a similar technique is used by a disk scrubber to check the consistency of erasure-coded data in a benign environment. In [4], algebraic properties are leveraged to permit fast updates of Rabin fing...

2 | “SHA1, SHA2, HMAC and Key Derivation in C,” http://fp.gladman.plus.com/cryptography_technology/sha/index.htm, accessed 27
- Gladman
- 2008
Citation Context ...(x). This table will contain $2^8$ entries of $\gamma$ bytes each, for a total of 4 kB for a 128-bit fingerprint, and it can be computed before the random value r is selected. Gladman’s implementation of SHA-1 [11] achieves a throughput of 110 megabytes per second on a 3 GHz Intel Pentium D. On this machine, the time to compute lookup tables for the evaluation fingerprint implementation presented here is 20 mic...
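The table size and the throughput gap quoted in this context can be checked with a short calculation. It reads the entry count as $2^8 = 256$ (one entry per byte value) and takes $\gamma = 16$ bytes for a 128-bit fingerprint; the ratio is simply the two throughput figures quoted in the citation contexts on this page (410 MB/s for the fingerprint implementation, 110 MB/s for SHA-1).

$$
2^{8} \times 16\ \text{bytes} = 4096\ \text{bytes} = 4\ \text{kB},
\qquad
\frac{410\ \text{MB/s}}{110\ \text{MB/s}} \approx 3.7
$$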

2 | On irreducible polynomials in Galois fields
- Travis
- 1963
Citation Context ...) is an $\varepsilon$-fingerprinting function for $\varepsilon = \frac{\delta}{q^{k\gamma} - q^{k\gamma/2}} \approx \frac{\delta}{q^{k\gamma}}$. PROOF. As in [24], this is because there are at least $\frac{q^{k\gamma} - q^{k\gamma/2}}{\gamma}$ monic degree-$\gamma$ irreducible polynomials with coefficients in $F_{q^k}$ [32], of which any nonzero degree-$\delta$ polynomial with coefficients in $F_{q^k}$ may have at most $\lfloor \delta/\gamma \rfloor$ factors of degree $\gamma$. Consider the difference of any two distinct polynomials with matching fingerprints. Le...
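To give a sense of scale for the division fingerprinting bound quoted above, here is a worked instance under assumed parameters: byte-sized field elements ($q^k = 2^8$), a 128-bit fingerprint ($\gamma = 16$), and a hypothetical fragment of $\delta = 2^{20}$ field elements (1 MiB). Only the first two values appear in the citation contexts on this page; the fragment size is chosen purely for illustration.

$$
\varepsilon = \frac{\delta}{q^{k\gamma} - q^{k\gamma/2}}
= \frac{2^{20}}{2^{128} - 2^{64}}
\approx \frac{2^{20}}{2^{128}} = 2^{-108}
$$

Under these assumptions, two distinct fragments collide under a randomly chosen fingerprinting function with probability of roughly $2^{-108}$.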