## Efficient hashing with lookups in two memory accesses

Venue: 16th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)

Citations: 18 (3 self)

### BibTeX

@INPROCEEDINGS{Panigrahy_efficienthashing,
  author = {Rina Panigrahy},
  title = {Efficient hashing with lookups in two memory accesses},
  booktitle = {Proceedings of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)},
  pages = {830--839}
}

### Abstract

The study of hashing is closely related to the analysis of balls and bins. Azar et al. [1] showed that if, instead of using a single hash function, we randomly hash a ball into two bins and place it in the smaller of the two, the maximum load on the bins drops dramatically. This leads to the concept of two-way hashing, where the largest bucket contains O(log log n) balls with high probability; a hash lookup now searches both buckets an item hashes to. Since an item may be placed in one of two buckets, we can potentially move an item after it has been initially placed to reduce the maximum load. Using this fact, we present a simple, practical hashing scheme that maintains a maximum load of 2, with high probability, while achieving high memory utilization. In fact, with n buckets, even if the space for two items is pre-allocated per bucket, as may be desirable in hardware implementations, more than n items can be stored, giving high memory utilization. Assuming truly random hash functions, we prove the following properties of our hashing scheme.

- Each lookup takes two random memory accesses and reads at most two items per access.
- Each insert takes O(log n) time and up to log log n + O(1) moves with high probability, and constant time in expectation.
- The scheme maintains 83.75% memory utilization without requiring dynamic allocation during inserts.

We also analyze the trade-off between the number of moves performed during inserts and the maximum load on a bucket. By performing at most h moves, we can maintain a maximum load of O(h·log((log log n)/h)). So even by performing one move, we achieve a better bound than by performing no moves at all.
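The scheme the abstract describes can be sketched in code. The following is a simplified illustration, not the paper's exact algorithm: the class name `TwoChoiceTable`, the use of Python's built-in `hash` with random salts in place of truly random hash functions, and the unbounded breadth-first search over move chains are all assumptions of this sketch. Each key hashes to two buckets of capacity two; a lookup reads both buckets, and an insert either uses the emptier bucket or searches for a chain of moves that frees a slot.

```python
import random
from collections import deque

class TwoChoiceTable:
    """Two-way hashing with bucket capacity 2 and moves on insert.

    Hedged sketch: truly random hash functions are simulated with
    hash() plus salts, and the move search is an unbounded BFS.
    Keys are assumed distinct.
    """

    def __init__(self, n_buckets, capacity=2, seed=0):
        self.n = n_buckets
        self.cap = capacity
        self.buckets = [[] for _ in range(n_buckets)]
        rng = random.Random(seed)
        self.salts = (rng.getrandbits(64), rng.getrandbits(64))

    def _choices(self, key):
        # Each key hashes to two candidate buckets.
        return tuple(hash((key, s)) % self.n for s in self.salts)

    def lookup(self, key):
        # Two memory accesses, reading at most `cap` items per access.
        return any(key in self.buckets[b] for b in self._choices(key))

    def insert(self, key):
        b1, b2 = self._choices(key)
        # Fast path: place the key in the emptier of its two buckets.
        for b in sorted((b1, b2), key=lambda b: len(self.buckets[b])):
            if len(self.buckets[b]) < self.cap:
                self.buckets[b].append(key)
                return True
        # Both buckets full: BFS for a chain of moves freeing a slot.
        # parent[b] = (prev, item): `item` sits in `prev`, alternates to b.
        parent = {b1: None, b2: None}
        queue = deque([b1, b2])
        while queue:
            b = queue.popleft()
            if len(self.buckets[b]) < self.cap:
                # Unwind the chain, moving each item to its alternate
                # bucket, then place the new key in the freed slot.
                while parent[b] is not None:
                    prev, item = parent[b]
                    self.buckets[prev].remove(item)
                    self.buckets[b].append(item)
                    b = prev
                self.buckets[b].append(key)
                return True
            for item in self.buckets[b]:
                c1, c2 = self._choices(item)
                alt = c2 if c1 == b else c1
                if alt not in parent:
                    parent[alt] = (b, item)
                    queue.append(alt)
        return False  # no chain of moves helps; a real table would resize
```

The BFS finds the shortest chain of moves whenever one exists, which loosely mirrors the move/load trade-off the paper analyzes; the paper's own algorithm and its 83.75% utilization bound are not reproduced here.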

### Citations

266 | Balanced allocations
- Azar, Broder, et al.
Citation Context: ...is that, asymptotically, if n balls are thrown into n bins independently and randomly then the largest bin has (1 + o(1)) ln n / ln ln n balls, with high probability. Azar et al. [1] showed that instead of using a single hash function, if we randomly hash a ball into two bins and place it in the smaller of the two, then this dramatically lowers the maximum load on bins. This lead...
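The "power of two choices" effect this citation refers to is easy to observe empirically. The sketch below is an illustration of the balls-and-bins experiment, not code from any cited paper; the function name and parameters are ours. Each ball picks d bins uniformly at random and lands in the least loaded one.

```python
import random

def max_load(n_balls, n_bins, d, rng):
    """Throw n_balls into n_bins, each ball taking d uniform random
    choices and landing in the least loaded; return the maximum load.
    (Illustrative simulation; naming is an assumption of this sketch.)"""
    loads = [0] * n_bins
    for _ in range(n_balls):
        best = min((rng.randrange(n_bins) for _ in range(d)),
                   key=loads.__getitem__)
        loads[best] += 1
    return max(loads)
```

With d = 1 the maximum load typically lands near ln n / ln ln n, while d = 2 drops it to the O(log log n) regime described above.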

133 | Every monotone graph property has a sharp threshold
- Friedgut, Kalai
- 1996
Citation Context: ...the threshold value have none. This is because the existence of a dense subgraph is a monotone property, and all such properties were shown to display a sharp threshold behavior by Friedgut and Kalai [14]. A closely related property, the existence of a k-core in random graphs, has been studied extensively and the threshold values have been pinned down exactly. A k-core is a maximal nonempty subgraph w...

112 | Sudden emergence of a giant k-core in a random graph
- Pittel, Spencer, et al.
- 1996
Citation Context: ...ore in random graphs, has been studied extensively and the threshold values have been pinned down exactly. A k-core is a maximal nonempty subgraph where every node has degree at least k. Pittel et al [21] showed that for the existence of a 3-core the critical value is about 3.35. Note that existence of a subgraph with density greater than 2 implies existence of a 3-core. This is because by iteratively...
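The iterative-removal argument this snippet alludes to is exactly how a k-core is computed: repeatedly peel vertices of degree below k until none remain. A minimal sketch under our own naming, not code from any of the cited papers:

```python
from collections import defaultdict

def k_core(edges, k):
    """Vertex set of the k-core: the maximal subgraph in which every
    node has degree at least k, found by iteratively peeling nodes of
    degree below k. (Function name and representation are ours.)"""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            # Degree counted within the surviving subgraph only.
            if len(adj[v] & alive) < k:
                alive.remove(v)
                changed = True
    return alive
```

For example, a triangle with a pendant vertex has a 2-core equal to the triangle, and a triangle alone has no 3-core, matching the density observation in the snippet.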

104 | The power of two random choices: A survey of the techniques and results - Mitzenmacher, Richa, et al. - 2000 |

59 | Parallel randomized load balancing
- Adler, Chakrabarti, et al.
- 1995
Citation Context: ...by Dietzfelbinger et al. in [8] and [10]. In practice, however, these algorithms are more complex to implement than cuckoo hashing. Extensive work has been done in the area of parallel balls and bins [2] and the related study of algorithms to emulate shared memory machines (as for example, PRAMs) on distributed memory machines (DMMs) [11] [5] [18] [22]. This setting involves a parallel game of plac...

48 | Space efficient hash tables with worst case constant access time
- Fotakis, Pagh, et al.
Citation Context: ..., however, they allow only one item per bucket. With two hash tables this requires 100% memory overhead. They also show that the amortized insert time with cuckoo hashing is a constant. Fotakis et al [16] generalized the method to d-ary hashing, using d hash tables, and truly random hash functions, but still allowing only one item per bucket. They showed that with ε memory overhead, one can support ha...

40 | A new universal class of hash functions and dynamic hashing in real time
- Dietzfelbinger, Meyer auf der Heide
- 1990
Citation Context: ...cludes the first static dictionary data structure with constant lookup time by Fredman, Komlos and Szemeredi [15] that was generalized to a dynamic data structure by Dietzfelbinger et al. in [8] and [10]. In practice, however, these algorithms are more complex to implement than cuckoo hashing. Extensive work has been done in the area of parallel balls and bins [2] and the related study of algorithms ...

33 | Exploiting Storage Redundancy to Speed up Randomized Shared Memory Simulations
- Meyer auf der Heide, Scheideler, et al.
- 1996
Citation Context: ...has been done in the area of parallel balls and bins [2] and the related study of algorithms to emulate shared memory machines (as for example, PRAMs) on distributed memory machines (DMMs) [11] [5] [18] [22]. This setting involves a parallel game of placing balls in bins (the so-called collision game) where all n balls participate in rounds of parallel attempts to assign balls to bins. In each round...

23 | Storing a sparse table with O(1) worst case access time
- Fredman, Komlós, Szemerédi
- 1984
Citation Context: ...than about 90/e, inserts can be performed in constant expected time. Other related work includes the first static dictionary data structure with constant lookup time by Fredman, Komlos and Szemeredi [15] that was generalized to a dynamic data structure by Dietzfelbinger et al. in [8] and [10]. In practice, however, these algorithms are more complex to implement than cuckoo hashing. Extensive work has...

22 | Almost random graphs with simple hash functions
- Dietzfelbinger, Woelfel
Citation Context: ...ut are O(log n)-way independent. Setting h = O(log log n) implies that we can maintain a constant maximum bucket size even if O(log n)-way independent hash functions are used. Several recent works [19] [12] demonstrate how such functions can be evaluated in constant time and implemented efficiently without using much storage. This idea of moving items has been used earlier in cuckoo hashing [20], howeve...
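The k-wise independent hash functions this context discusses can be illustrated with the classic polynomial construction: a uniformly random polynomial of degree k−1 over a prime field gives a k-wise independent family. This is the textbook sketch, not the constant-time, low-storage constructions of [19] and [12]; the function name and the assumption that keys are integers below the prime are ours.

```python
import random

def make_k_wise_hash(k, n_buckets, seed=0, prime=(1 << 61) - 1):
    """One member of a k-wise independent family: a random polynomial
    of degree k-1 over Z_p, reduced to the bucket range. Textbook
    construction (sketch); keys are assumed to be ints below `prime`."""
    rng = random.Random(seed)
    coeffs = [rng.randrange(prime) for _ in range(k)]

    def h(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule, all mod p
            acc = (acc * x + c) % prime
        return acc % n_buckets

    return h
```

Because the prime is much larger than the bucket count, the final `% n_buckets` reduction introduces only a negligible bias.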

21 | Shared memory simulations with triple-logarithmic delay
- Czumaj, Heide, et al.
- 1995
Citation Context: ...ork has been done in the area of parallel balls and bins [2] and the related study of algorithms to emulate shared memory machines (as for example, PRAMs) on distributed memory machines (DMMs) [11] [5] [18] [22]. This setting involves a parallel game of placing balls in bins (the so-called collision game) where all n balls participate in rounds of parallel attempts to assign balls to bins. In each ...

20 | E.: On balls and bins with deletions - Cole, Frieze, et al. - 1998 |

19 | Perfectly balanced allocation - Czumaj, Riley, et al. - 2003 |

19 | Cuckoo Hashing: Further Analysis - Devroye, Morin |

17 | Uniform hashing in constant time and linear space
- Pagh, Pagh
Citation Context: ...dom but are O(log n)-way independent. Setting h = O(log log n) implies that we can maintain a constant maximum bucket size even if O(log n)-way independent hash functions are used. Several recent works [19] [12] demonstrate how such functions can be evaluated in constant time and implemented efficiently without using much storage. This idea of moving items has been used earlier in cuckoo hashing [20], h...

9 | Randomized allocation processes. Random Structures and Algorithms 18 (2001): 297–331
- Czumaj, Stemann
Citation Context: ...as been initially placed to reduce maximum load. While it was known that if all the random choices are given in advance, balls could be assigned to bins with a maximum load of 2 with high probability [6], we show that this can be achieved online while supporting hash update operations. In fact, even more than n, up to 1.67n, items can be stored in n buckets, with a maximum load of two items, by per...

5 | Fast concurrent access to parallel disks. Algorithmica 35(1): 21–55, 2003. A preliminary version appeared in SODA 2000
- Sanders, Egner, et al.
Citation Context: ...een done in the area of parallel balls and bins [2] and the related study of algorithms to emulate shared memory machines (as for example, PRAMs) on distributed memory machines (DMMs) [11] [5] [18] [22]. This setting involves a parallel game of placing balls in bins (the so-called collision game) where all n balls participate in rounds of parallel attempts to assign balls to bins. In each round, you...

3 | Cache-friendly dictionary implementations with constant lookup time and small space overhead
- Dietzfelbinger, Weidling
Citation Context: .... At the same time, we should point out that our stated memory utilization is proved assuming that the hash functions used are truly random. Another recent closely related but as yet unpublished work [13] studies the same algorithm as ours but for larger bucket sizes. They show that with buckets of size O(1/ε), and two hash tables, a dictionary data structure can be maintained with ε fraction space ov...

3 | Using Multiple Hash Functions to Improve IP
- Broder, Mitzenmacher
- 2001
Citation Context: ...he hash lookup will now search in both the buckets an item hashes to. So dramatic is this improvement that it can be used in practice to efficiently implement hash lookups in packet routing hardware [3]. The two hash lookups can be parallelized by placing two different hash tables in separate memory components. ∗ Cisco Systems, San Jose, CA 95134. E-mail: rinap@cisco.com. Note that since an item m...

1 | Dynamic perfect hashing: upper and lower bounds
- Dietzfelbinger, Karlin, Mehlhorn, Meyer auf der Heide, et al.
- 1994
Citation Context: ... work includes the first static dictionary data structure with constant lookup time by Fredman, Komlos and Szemeredi [15] that was generalized to a dynamic data structure by Dietzfelbinger et al. in [8] and [10]. In practice, however, these algorithms are more complex to implement than cuckoo hashing. Extensive work has been done in the area of parallel balls and bins [2] and the related study of al...

1 | Cuckoo Hashing. Journal of Algorithms 51 (2004), pp. 122–144. A preliminary version appeared - Pagh, Rodler - 2001 |