## The Bloomier Filter: An Efficient Data Structure for Static Support Lookup Tables (2004)

Venue: In Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)

Citations: 64 (0 self)

### BibTeX

@INPROCEEDINGS{Chazelle04thebloomier,

author = {Bernard Chazelle and Joe Kilian and Ronitt Rubinfeld and Ayellet Tal},

title = {The Bloomier Filter: An Efficient Data Structure for Static Support Lookup Tables},

booktitle = {Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)},

year = {2004},

pages = {30--39}

}

### Abstract

We introduce the Bloomier filter, a data structure for compactly encoding a function with static support in order to support approximate evaluation queries. Our construction generalizes the classical Bloom filter, an ingenious hashing scheme heavily used in networks and databases, whose main attribute -- space efficiency -- is achieved at the expense of a tiny false-positive rate. Whereas Bloom filters can handle only set membership queries, our Bloomier filters can deal with arbitrary functions. We give several designs varying in simplicity and optimality, and we provide lower bounds to prove the (near) optimality of our constructions.
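
To make the contrast in the abstract concrete, here is a minimal Python sketch of a classical Bloom filter, followed by a deliberately naive way to approximate a function lookup with one Bloom filter per value. This is an illustration only and not one of the constructions from the paper. The class `BloomFilter`, the helper `approx_lookup`, the salted SHA-256 hashing, the sizes `m` and `k`, and the `table` of keys are all assumptions introduced for the example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter with m bits and k hash functions (illustrative)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray(m)  # one byte per bit: clear rather than compact

    def _positions(self, item: str):
        # Assumption: salted SHA-256 stands in for k independent hash functions.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: str) -> bool:
        # Always True for inserted items; may be True for others (false positive).
        return all(self.bits[pos] for pos in self._positions(item))


def approx_lookup(filters, key):
    """Naive function encoding: one Bloom filter per possible value.

    Returns the value if exactly one filter claims the key, otherwise None
    (no filter matched, or several matched and the answer is ambiguous).
    """
    hits = [value for value, bf in filters.items() if key in bf]
    return hits[0] if len(hits) == 1 else None


# Hypothetical static function: 40 keys, each mapped to one of two values.
table = {f"key{i}": ("red" if i % 2 == 0 else "blue") for i in range(40)}
filters = {value: BloomFilter(m=512, k=3) for value in ("red", "blue")}
for key, value in table.items():
    filters[value].add(key)

for key, value in table.items():
    # No false negatives: the filter for the true value always matches.
    assert key in filters[value]
    # So the naive lookup is either correct or ambiguous, never a wrong value.
    assert approx_lookup(filters, key) in (value, None)
```

For keys in the support, the naive scheme can only fail by being ambiguous, which happens when a key is a false positive in the other value's filter. Keys outside the support may be assigned a spurious value with small probability, which is the kind of error the abstract's "approximate evaluation queries" allows.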

### Citations

1475 | Space/time trade-offs in hash coding with allowable errors
- Bloom
- 1970
Citation Context ...er these issues for the more prosaic example of Bloom filters, described below. Historical background Bloom filters yield an extremely compact data structure that supports membership queries to a set [1]. Their space requirements fall significantly below the information theoretic lower bounds for error-free data structures. They achieve their efficiency at the cost of a small false positive rate (ite...

691 | Summary cache: a scalable wide-area web cache sharing protocol
- Fan, Cao, et al.
- 2000
Citation Context ... differential files access, and to compute joins and semijoins [7, 11, 14, 18, 20, 24]. Bloom filters are also used for approximating membership checking of password data structures [21], web caching [10, 27], and spell checking [22]. Several variants of Bloom filters have been proposed. Attenuated Bloom filters [26] use arrays of Bloom filters to store shortest path distance information. Spectral Bloom f...

353 | Network applications of Bloom filters: A survey
- Broder, Mitzenmacher
- 2004
Citation Context ...e set are always recognized as being in the set). Bloom filters are widely used in practice when storage is at a premium and an occasional false positive is tolerable. They have many uses in networks [2]: for collaborating in overlay and peer-to-peer networks [5, 8, 17], resource routing [15, 26], packet routing [12, 30], and measurement infrastructures [9, 29]. Bloom filters are used in distributed ...

319 | New directions in traffic measurement and accounting
- Estan, Varghese
- 2002
Citation Context ... tolerable. They have many uses in networks [2]: for collaborating in overlay and peer-to-peer networks [5, 8, 17], resource routing [15, 26], packet routing [12, 30], and measurement infrastructures [9, 29]. Bloom filters are used in distributed databases to support iceberg queries, differential files access, and to compute joins and semijoins [7, 11, 14, 18, 20, 24]. Bloom filters are also used for app...

280 | Expander codes
- Sipser, Spielman
- 1996
Citation Context ...ment of Bloom filters into a cascading pipeline. This yields a practical solution, which is also theoretically near-optimal. To optimize the data structure, we change tack and pursue, in the spirit of [4, 6, 19, 28], an algebraic approach based on the expander-like properties of random hash functions. As with Bloom filters, we assume that we can use "ideal" hash functions. We analyze our algorithms in this model...

219 | Storing a sparse table with O(1) worst case access time
- Fredman, Komlós, et al.
- 1984
Citation Context ...d storage used is equal to E Σ_{i=0}^{α} 2 ki kni = kn Σ_{i=0}^{α} 2 −(ki −k)/(k−1) = O(km). Note that, if N is polynomial in n, we can stop the recursion when ni is about n/ log n and then use perfect hashing [3, 13]. This requires constant time and O(n) bits of extra storage. To summarize, with high probability a random set of hash functions provides a Bloomier filter with the following characteristics: (i) the ...

198 | Compressed Bloom Filters
- Mitzenmacher
- 2001
Citation Context ...it but rather a small counter. Insertions and deletions to the filter increment or decrement the counters respectively. When the filter is intended to be passed as a message, compressed Bloom filters [23] may be used instead, where parameters can be adjusted to the desired tradeoff between size and false-positive rate. We note that a standard technique for eliminating a very small number of troublesom...

155 | PlanetP: Using gossiping to build content addressable peer-to-peer information sharing communities
- Cuenca-Acuna, Peery, et al.
- 2003
Citation Context ...ers are widely used in practice when storage is at a premium and an occasional false positive is tolerable. They have many uses in networks [2]: for collaborating in overlay and peer-to-peer networks [5, 8, 17], resource routing [15, 26], packet routing [12, 30], and measurement infrastructures [9, 29]. Bloom filters are used in distributed databases to support iceberg queries, differential files access, an...

127 | Probabilistic Location and Routing
- Rhea, Kubiatowicz
Citation Context ...ce when storage is at a premium and an occasional false positive is tolerable. They have many uses in networks [2]: for collaborating in overlay and peer-to-peer networks [5, 8, 17], resource routing [15, 26], packet routing [12, 30], and measurement infrastructures [9, 29]. Bloom filters are used in distributed databases to support iceberg queries, differential files access, and to compute joins and semi...

97 | Stochastic Fair Blue: A Queue Management Algorithm for Enforcing Fairness
- Feng, Kandlur, et al.
- 2001
Citation Context ...remium and an occasional false positive is tolerable. They have many uses in networks [2]: for collaborating in overlay and peer-to-peer networks [5, 8, 17], resource routing [15, 26], packet routing [12, 30], and measurement infrastructures [9, 29]. Bloom filters are used in distributed databases to support iceberg queries, differential files access, and to compute joins and semijoins [7, 11, 14, 18, 20,...

64 | Membership in constant time and almost-minimum space
- Brodnik, Munro
- 1999
Citation Context ...(nr log N) (which is achieved by merely listing the values of all of the elements in the set) and, in the 0/1 case, the O(n log(N/n)) bound achieved by the perfect hashing method of Brodnik and Munro [3]. (Of course, unlike ours, neither of these methods ever errs.) Bloomier filters are further generalized to handle dynamic updates. One can query and update function values in constant time while keep...

54 | Are bitvectors optimal?
- Buhrman, Miltersen, et al.
- 2002
Citation Context ...ment of Bloom filters into a cascading pipeline. This yields a practical solution, which is also theoretically near-optimal. To optimize the data structure, we change tack and pursue, in the spirit of [4, 6, 19, 28], an algebraic approach based on the expander-like properties of random hash functions. As with Bloom filters, we assume that we can use "ideal" hash functions. We analyze our algorithms in this model...

43 | Development of a spelling list
- McIlroy
- 1982
Citation Context ...nd to compute joins and semijoins [7, 11, 14, 18, 20, 24]. Bloom filters are also used for approximating membership checking of password data structures [21], web caching [10, 27], and spell checking [22]. Several variants of Bloom filters have been proposed. Attenuated Bloom filters [26] use arrays of Bloom filters to store shortest path distance information. Spectral Bloom filters [7] extend the dat...

41 | Families of k-independent sets
- Kleitman, Spencer
- 1973
Citation Context ...ting set of vectors z_c is (N, n)-universal (meaning that the restrictions of the vectors z_c to any given choice of n coordinate positions produce all possible 2^n patterns). By Kleitman and Spencer [16], the number of such vectors is known to be Ω(2^n log N). For the upper bound, we use the existence of a (N, 2n)-universal set of vectors of size O(n 2^n log N) (also established in [16]) and turn all z...

32 | Geographical region summary service for geographical routing
- Hsiao
- 2001
Citation Context ...ce when storage is at a premium and an occasional false positive is tolerable. They have many uses in networks [2]: for collaborating in overlay and peer-to-peer networks [5, 8, 17], resource routing [15, 26], packet routing [12, 30], and measurement infrastructures [9, 29]. Bloom filters are used in distributed databases to support iceberg queries, differential files access, and to compute joins and semi...

32 | Optimal semijoins for distributed database systems
- Mullin
- 1990
Citation Context ...et routing [12, 30], and measurement infrastructures [9, 29]. Bloom filters are used in distributed databases to support iceberg queries, differential files access, and to compute joins and semijoins [7, 11, 14, 18, 20, 24]. Bloom filters are also used for approximating membership checking of password data structures [21], web caching [10, 27], and spell checking [22]. Several variants of Bloom filters have been propose...

31 | Applications of matrix methods to the theory of lower bounds in computational complexity
- Razborov
- 1990
Citation Context ...e of a (N, 2n)-universal set of vectors of size O(n 2^n log N) (also established in [16]) and turn all zeroes into minus ones. (Alternatively, we can use Razborov's bound on the size of separating sets [25].) Each node is colored by picking a vector from the universal set that matches the ones and minus ones of the vector associated with that node. ♦ Going back to the randomized model of Bloomier fi...

28 | Randomness Conductors and Constant degree Expansions beyond the degree 2
- Capalbo, Reingold, et al.
- 2002
Citation Context ...ment of Bloom filters into a cascading pipeline. This yields a practical solution, which is also theoretically near-optimal. To optimize the data structure, we change tack and pursue, in the spirit of [4, 6, 19, 28], an algebraic approach based on the expander-like properties of random hash functions. As with Bloom filters, we assume that we can use "ideal" hash functions. We analyze our algorithms in this model...

28 | An Algorithm for Approximate Membership Checking with Application to Password Security
- Manber, Wu
- 1994
Citation Context ...t iceberg queries, differential files access, and to compute joins and semijoins [7, 11, 14, 18, 20, 24]. Bloom filters are also used for approximating membership checking of password data structures [21], web caching [10, 27], and spell checking [22]. Several variants of Bloom filters have been proposed. Attenuated Bloom filters [26] use arrays of Bloom filters to store shortest path distance informa...

23 | Self-organization in peer-to-peer systems
- Ledlie, Taylor, et al.
Citation Context ...ers are widely used in practice when storage is at a premium and an occasional false positive is tolerable. They have many uses in networks [2]: for collaborating in overlay and peer-to-peer networks [5, 8, 17], resource routing [15, 26], packet routing [12, 30], and measurement infrastructures [9, 29]. Bloom filters are used in distributed databases to support iceberg queries, differential files access, an...

12 | Informed content delivery over adaptive overlay networks
- Byers, Considine, et al.
- 2004
Citation Context ...ers are widely used in practice when storage is at a premium and an occasional false positive is tolerable. They have many uses in networks [2]: for collaborating in overlay and peer-to-peer networks [5, 8, 17], resource routing [15, 26], packet routing [12, 30], and measurement infrastructures [9, 29]. Bloom filters are used in distributed databases to support iceberg queries, differential files access, an...

12 | Designing a Bloom Filter for Differential File Access
- Gremillion
- 1982
Citation Context ...et routing [12, 30], and measurement infrastructures [9, 29]. Bloom filters are used in distributed databases to support iceberg queries, differential files access, and to compute joins and semijoins [7, 11, 14, 18, 20, 24]. Bloom filters are also used for approximating membership checking of password data structures [21], web caching [10, 27], and spell checking [22]. Several variants of Bloom filters have been propose...

8 | Perf join: An alternative to two-way semijoin and bloomjoin
- Li, Ross
- 1995
Citation Context ...et routing [12, 30], and measurement infrastructures [9, 29]. Bloom filters are used in distributed databases to support iceberg queries, differential files access, and to compute joins and semijoins [7, 11, 14, 18, 20, 24]. Bloom filters are also used for approximating membership checking of password data structures [21], web caching [10, 27], and spell checking [22]. Several variants of Bloom filters have been propose...

4 | R* optimizer validation and performance for distributed queries
- Mackert, Lohman
- 1986

4 | Forwarding without loops
- Whitaker, Wetherall
- 2002
Citation Context ...remium and an occasional false positive is tolerable. They have many uses in networks [2]: for collaborating in overlay and peer-to-peer networks [5, 8, 17], resource routing [15, 26], packet routing [12, 30], and measurement infrastructures [9, 29]. Bloom filters are used in distributed databases to support iceberg queries, differential files access, and to compute joins and semijoins [7, 11, 14, 18, 20,...

2 | LT codes, in Proc. 43rd Annu
- Luby
- 2002

1 | Computing iceberg queries efficiently
- Fang, Shivakumar, et al.
- 1998