## Some Open Questions Related to Cuckoo Hashing


### BibTeX

@MISC{Mitzenmacher_someopen,

author = {Michael Mitzenmacher},

title = {Some Open Questions Related to Cuckoo Hashing},

year = {}

}


### Abstract

The purpose of this brief note is to describe recent work in the area of cuckoo hashing, including a clear description of several open problems, with the hope of spurring further research.

### Citations

738 |
The Art of Computer Programming, Volume 3, Sorting and Searching
- Knuth
- 1998
Citation Context: ...prevalence of hash-based algorithms and data structures in networking and other areas. At the same time, the field of hashing, which has enjoyed a long and rich history in computer science (see e.g., [26]), has also enjoyed something of a theoretical renaissance. Arguably, this burst of activity began with the demonstration of the power of multiple choices: by giving each item multiple possible hash l...

708 |
Universal classes of hash functions
- Carter, Wegman
- 1977
Citation Context: ...y random, even though this is unrealistic. But in practice, such analysis generally turns out to be accurate, even when weak hash functions, such as pairwise independent (or universal) hash functions [9], are used. The proposed resolution was to model the data as coming from a random source, where the i-th item X_i has at least some k bits of entropy (specifically, Rényi entropy) conditioned on the pr...

261 | Balanced allocations
- Azar, Broder, et al.
- 1999
Citation Context: ...power of multiple choices: by giving each item multiple possible hash locations, and storing it in the least loaded, remarkably balanced loads can be obtained, yielding quite efficient lookup schemes [4, 7, 21, 28, 35]. An extension of this idea, cuckoo hashing, further allows items to be moved among their multiple choices to better avoid collisions, improving memory utilization even further. In this brief note I pla...

216 |
Storing a sparse table with O(1) worst case access time
- FREDMAN, KOMLÓS, et al.
- 1984
Citation Context: ...ollisions seems quite powerful, although it is not commonly studied in theoretical work. (Interestingly, though, one can think of the seminal work on perfect hashing of Fredman, Komlós, and Szemerédi [18] in this context.) The issue of the right scale of the additional space seems to be an interesting question. For example, in other work, we have alternatively suggested using a CAM as a queue for pend...

132 | Cuckoo hashing
- Pagh, Rodler
- 2001
Citation Context: ...ith an invited talk for the 2009 ESA conference in Denmark. The topic seems apropos; the paper introducing cuckoo hashing by Pagh and Rodler appeared in the 2001 ESA conference, also held in Denmark! [31, 32] Also for this reason, the focus here will be primarily on theoretical results and problems. There is of course also recently a great deal of interesting work in hashing combining theory and practice,...

88 |
Efficient PRAM simulation on a distributed memory machine
- Karp, Luby, et al.
- 1992
Citation Context: ...power of multiple choices: by giving each item multiple possible hash locations, and storing it in the least loaded, remarkably balanced loads can be obtained, yielding quite efficient lookup schemes [4, 7, 21, 28, 35]. An extension of this idea, cuckoo hashing, further allows items to be moved among their multiple choices to better avoid collisions, improving memory utilization even further. In this brief note I pla...

83 | How asymmetry helps load balancing
- Vöcking
- 1999
Citation Context: ...power of multiple choices: by giving each item multiple possible hash locations, and storing it in the least loaded, remarkably balanced loads can be obtained, yielding quite efficient lookup schemes [4, 7, 21, 28, 35]. An extension of this idea, cuckoo hashing, further allows items to be moved among their multiple choices to better avoid collisions, improving memory utilization even further. In this brief note I pla...

69 | Using multiple hash functions to improve IP lookups
- Broder, Mitzenmacher
- 2001

57 | Parallel Randomized Load Balancing
- Adler, Chakrabarti, et al.
- 1995
Citation Context: ...his random partitioning trades additional space for efficiency. For details, see [2]. While there is a fair amount of historical work on parallel hashing and load balancing schemes (see, for example, [1, 28]), the significant advances made in the last decade in terms of analysis and understanding of the power to move items suggest that we can obtain both stronger results and tighter analyses in theory f...

47 | Space efficient hash tables with worst case constant access time
- Fotakis, Pagh, et al.
Citation Context: ... originally introduced with just two choices per item and buckets of unit capacity, it was naturally generalized to situations with more than two choices per item and more than one item per bucket [15, 17]. These variations share the properties that they require checking only O(1) memory locations even in the worst case. Hence, in general, we refer to the entire range of variations as cuckoo hashing, a...

47 |
On universal classes of fast high performance hash functions, their time-space tradeoff, and their applications
- Siegel
- 1989
Citation Context: ...an appropriate constant c, the analysis showing expected constant time per operation continues to hold. Pagh and Rodler [32] in fact showed that a hash function family derived from the work of Siegel [34] with limited independence suffices for cuckoo hashing in the case where d = 2. However, these hash functions still appear to be too complex to be utilized in practice. They also experimented with wea...

36 | Why simple hash functions work: exploiting the entropy in a data stream
- Mitzenmacher, Vadhan
- 2008
Citation Context: ...independent to obtain constant expected time per operation. An alternative direction, taken by Mitzenmacher and Vadhan, started with the question of why simple hash functions work so well in practice [29]. As mentioned, when analyzing hash-related data structures such as cuckoo hashing, one commonly assumes that the underlying hash functions are completely random, even though this is unrealistic. But ...

30 | Balanced allocation and dictionaries with tightly packed constant size bins
- Dietzfelbinger, Weidling
- 2005
Citation Context: ... originally introduced with just two choices per item and buckets of unit capacity, it was naturally generalized to situations with more than two choices per item and more than one item per bucket [15, 17]. These variations share the properties that they require checking only O(1) memory locations even in the worst case. Hence, in general, we refer to the entire range of variations as cuckoo hashing, a...

29 | Poly-logarithmic independence fools AC0 circuits
- Braverman
- 2009
Citation Context: ...unctions still appear to be too complex to be utilized in practice. They also experimented with weaker hash functions. Recent advances in the area include the work of [3], where a result by Braverman [6] is used to show that the analysis of cuckoo hashing with a queue holds even with only polylogarithmically-wise independent hash functions. Cohen and Kane [11] demonstrate that 5-independence (which i...

20 | More Robust Hashing: Cuckoo Hashing with a Stash. To appear
- Kirsch, Mitzenmacher, et al.
- 2008
Citation Context: ... is unsuitable for many applications. The failure rate is smaller with more choices per item [17] or more items per bucket [15], but the high failure probability still remains a potential problem. In [24] we show that one needs only a small, constant-sized stash to greatly reduce the probability of a failure. A stash should be thought of as a small, fully associative memory that allows an arbitrary l...

17 | The power of one move: Hashing schemes for hardware
- Kirsch
- 2008
Citation Context: ... this conjecture has recently been proven in [3] (see also the similar [12]). Finally, in other work, we have considered variants that allow only one move of an item in a hash table on each insertion [23]. The motivation for this work was to consider the benefits of making the minimum possible change to multiple-choice hashing, which is already being used in some hardware solutions, in order to convinc...

16 | Real-time parallel hashing on the GPU
- Alcantara, Sharf, et al.
- 2009
Citation Context: ...ng hash tables and related data structures, inspired by the development of multi-core processors and other mainstream hardware that allows parallelization, such as graphics processor units (GPUs). In [2], we design a practical parallel scheme for constructing hash tables on GPUs motivated in part by cuckoo hashing techniques. The setting is offline, with all items available. Essentially, items perfor...

14 | Succinct data structures for retrieval and approximate membership (extended abstract)
- Dietzfelbinger, Pagh
- 2008
Citation Context: ...considerations are needed, an upper bound can be calculated [5]. Lower bounds have been achieved, based on a new approach for designing dictionary and retrieval structures, based on matrix techniques [13]. (See also [33].) These techniques are quite interesting and highly recommended but a full description is beyond the scope of this short note; essentially, one utilizes a full-rank matrix with at mos...

14 | An optimal bloom filter replacement based on matrix solving
- Porat
- 2009
Citation Context: ...re needed, an upper bound can be calculated [5]. Lower bounds have been achieved, based on a new approach for designing dictionary and retrieval structures, based on matrix techniques [13]. (See also [33].) These techniques are quite interesting and highly recommended but a full description is beyond the scope of this short note; essentially, one utilizes a full-rank matrix with at most d ones per col...

12 |
The k-orientability thresholds for G_{n,p}
- Fernholz, Ramachandran
Citation Context: ...to orient each edge so that no vertex has degree more than k. Hence the problem corresponds to the threshold for k-orientability on random graphs, which provides a framework for finding the threshold [8, 16]. Because in the offline case there is no moving of items needed, as items are simply placed, whether these loads can be achieved by a natural cuckoo hashing variant in the online setting remains open....

12 |
The Power of Two Choices: A Survey of Techniques and Results
- Mitzenmacher, Richa, et al.
- 2001

11 |
Using a Queue to De-amortize Cuckoo Hashing in Hardware
- Kirsch, Mitzenmacher
- 2007
Citation Context: ...cale of the additional space seems to be an interesting question. For example, in other work, we have alternatively suggested using a CAM as a queue for pending move operations in a cuckoo hash table [22]. The advantage of this approach is that it gives an effective de-amortization of cuckoo hash inserts: by queueing operations, we can arrange for inserts to have worst-case constant time (corresponding to ...

10 | De-amortized cuckoo hashing: Provable worst-case performance and experimental results
- Arbitman, Naor, et al.
- 2009
Citation Context: ...setting that the queue size is required to scale like O(log n), corresponding to a maximum size achieved by a queue over O(n) steps. For the case of d = 2, this conjecture has recently been proven in [3] (see also the similar [12]). Finally, in other work, we have considered variants that allow only one move of an item in a hash table on each insertion [23]. The motivation for this work was to consid...

9 | Hash-based techniques for high-speed packet processing
- Kirsch, Mitzenmacher, et al.
- 2009
Citation Context: ...ill be primarily on theoretical results and problems. There is of course also recently a great deal of interesting work in hashing combining theory and practice, as detailed for example in the survey [25]. ⋆ Supported in part by NSF grants CNS-0721491 and research grants from the Cisco University Research Program, Yahoo! University Research Program, and Google University Research Program. 2 Background...

9 | History-independent cuckoo hashing
- Naor, Segev, et al.
- 2008
Citation Context: ...ns with millions of users. The original motivation was for potential applications to routers, and applications of this result to devices using history-independent hash tables have also been suggested [30]. This idea of allowing a small amount of additional space to handle collisions seems quite powerful, although it is not commonly studied in theoretical work. (Interestingly, though, one can think of ...

6 | Bipartite Random Graphs and Cuckoo Hashing
- Kutzelnigg
- 2006
Citation Context: ...eral, we refer to the entire range of variations as cuckoo hashing, and clarify in context when necessary. For cuckoo hashing the case of d = 2 choices with one item per bucket is now well understood [32, 27]; the cases with more choices and more items per bucket have left many remaining open questions [15, 17]. The case of d = 2 is so well understood because there is a direct correspondence to random grap...

5 |
The random graph threshold for k-orientability and a fast algorithm for optimal multiple-choice allocation
- Cain, Sanders, et al.
Citation Context: ...to orient each edge so that no vertex has degree more than k. Hence the problem corresponds to the threshold for k-orientability on random graphs, which provides a framework for finding the threshold [8, 16]. Because in the offline case there is no moving of items needed, as items are simply placed, whether these loads can be achieved by a natural cuckoo hashing variant in the online setting remains open....

4 |
Tight bounds for hashing block sources
- Chung, Vadhan
- 2008
Citation Context: ...cient randomness in the data. The implications of this model apply to cuckoo hashing as well as other hashing-based algorithms and data structures. Improvements on the bounds of [29] are developed in [10]. As shown by Dietzfelbinger and Schellbach, however, one cannot use this insight blindly. They demonstrate that natural families of universal hash functions, namely multiplicative hash functions and ...

4 | An analysis of random-walk cuckoo hashing
- Frieze, Melsted, et al.
Citation Context: ... with high probability over the choices of the cuckoo hashing algorithm any insertion will, with high probability, take polylogarithmic time under suitable loads for large enough numbers of choices d [19]. The argument breaks into a pair of steps: first, most buckets have an augmenting path of length at most O(log log n) to an empty bucket; and second, the graph representing the cuckoo hashing process...

3 |
Bounds on the independence required for cuckoo hashing
- Cohen, Kane
- 2009
Citation Context: ... work of [3], where a result by Braverman [6] is used to show that the analysis of cuckoo hashing with a queue holds even with only polylogarithmically-wise independent hash functions. Cohen and Kane [11] demonstrate that 5-independence (which is slightly different than but close to 5-wise independence) is insufficient for constant amortized cost per operation for cuckoo hashing with d = 2 choices, bu...

3 | Two-way chaining with reassignment
- Dalal, Devroye, et al.
Citation Context: ...e is required to scale like O(log n), corresponding to a maximum size achieved by a queue over O(n) steps. For the case of d = 2, this conjecture has recently been proven in [3] (see also the similar [12]). Finally, in other work, we have considered variants that allow only one move of an item in a hash table on each insertion [23]. The motivation for this work was to consider the benefits of making t...

3 | On risks of using cuckoo hashing with simple universal hash classes
- Dietzfelbinger, Schellbach
- 2009
Citation Context: ...ions, namely multiplicative hash functions and standard linear hash functions over a prime field, fail even for fully random key sets, when the key set is sufficiently dense over the universe of keys [14]. In such cases, there is not sufficient entropy for the results of [29] to hold, so there is no contradiction. The implications of these results to practical settings certainly appear to be a worthy ...

1 |
Balanced allocations: Balls-into-bins revisited and chains-into-bins. CDAM Research Report Series
- Batu, Berenbrink, et al.
- 2007
Citation Context: ...er bounds on the threshold can be found by again viewing the problem as an orientation problem on random hypergraphs, and while some additional considerations are needed, an upper bound can be calculated [5]. Lower bounds have been achieved, based on a new approach for designing dictionary and retrieval structures, based on matrix techniques [13]. (See also [33].) These techniques are quite interesting a...
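
### Illustrative sketch

The citation contexts above describe cuckoo hashing as giving each item multiple possible hash locations, moving items among those locations to avoid collisions, and optionally keeping a small, constant-sized stash for items that cannot be placed. The following Python sketch illustrates that idea for d = 2 choices and unit-capacity buckets. It is not code from the note: the class name `CuckooTable`, its parameters (`size`, `max_kicks`, `stash_size`, `seed`), and the use of salted built-in `hash` as a stand-in for two independent hash functions are all assumptions made for illustration.

```python
import random


class CuckooTable:
    """Hypothetical d = 2 cuckoo hash table with a small stash (keys must not be None)."""

    def __init__(self, size=64, max_kicks=32, stash_size=4, seed=0):
        self.size = size
        self.max_kicks = max_kicks
        self.stash_size = stash_size
        rng = random.Random(seed)
        # Assumption: two random salts stand in for two independent hash functions.
        self.salts = (rng.getrandbits(64), rng.getrandbits(64))
        self.tables = [[None] * size, [None] * size]
        self.stash = []

    def _slot(self, which, key):
        return hash((self.salts[which], key)) % self.size

    def contains(self, key):
        # A lookup probes one slot in each table, plus the constant-size stash.
        in_tables = any(self.tables[i][self._slot(i, key)] == key for i in (0, 1))
        return in_tables or key in self.stash

    def insert(self, key):
        if self.contains(key):
            return
        which = 0
        for _ in range(self.max_kicks):
            pos = self._slot(which, key)
            # Place the current key and pick up whatever occupied the slot.
            key, self.tables[which][pos] = self.tables[which][pos], key
            if key is None:
                return  # the slot was empty, so nothing was evicted
            which = 1 - which  # the evicted key tries its other table next
        # Too many moves: park the still-homeless key in the stash if it fits.
        if len(self.stash) < self.stash_size:
            self.stash.append(key)
            return
        # A production table would rehash with new functions here; this sketch
        # only reports the failure, leaving `key` unplaced.
        raise RuntimeError(f"rehash needed; key {key!r} is unplaced")
```

A lookup in this sketch touches at most two table slots and the stash, which mirrors the O(1) worst-case lookup property mentioned in the contexts. The eviction cap and stash size are arbitrary illustrative values, not thresholds taken from the cited analyses.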