## LINEAR PROBING WITH 5-WISE INDEPENDENCE

Citations: 1 (1 self)

### BibTeX

@MISC{Pagh_linearprobing,

author = {Anna Pagh and Rasmus Pagh and Milan Ružić},

title = {LINEAR PROBING WITH 5-WISE INDEPENDENCE},

year = {}

}

### Abstract

Hashing with linear probing dates back to the 1950s, and is among the most studied algorithms for storing (key, value) pairs. In recent years it has become one of the most important hash table organizations since it uses the cache of modern computers very well. Unfortunately, previous analyses rely either on complicated and space-consuming hash functions, or on the unrealistic assumption of free access to a hash function with random and independent function values. Carter and Wegman, in their seminal paper on universal hashing, raised the question of extending their analysis to linear probing. However, we show in this paper that linear probing using a 2-wise independent hash function may have expected logarithmic cost per operation. Recently, Pătrașcu and Thorup have shown that 3- and 4-wise independent hash functions may also give rise to logarithmic expected query time. On the positive side, we show that 5-wise independence is enough to ensure constant expected time per operation. This resolves the question of finding a space- and time-efficient hash function that provably ensures good performance for hashing with linear probing.
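To make the setting of the abstract concrete, the following is a minimal Python sketch of a linear probing table whose hash function is drawn from a 5-wise independent family. It is an illustration under stated assumptions, not the authors' implementation. It uses the polynomial family quoted in the citation contexts on this page (random degree k − 1 polynomials modulo a prime p). The prime `P`, the names `make_poly_hash` and `LinearProbingTable`, the final reduction modulo the table size, and the policy of always keeping one slot vacant are all choices made here for illustration. Keys are assumed to be non-negative integers smaller than `P`.

```python
import random

# Assumption: a Mersenne prime larger than every key we intend to store.
P = (1 << 61) - 1


def make_poly_hash(k, r, rng=random):
    """Draw a random degree-(k-1) polynomial over Z_P and reduce its value to [0, r).

    Random polynomials of degree k-1 over a prime field form a k-wise
    independent family; the final "% r" makes the output only approximately
    uniform, which is why P should be much larger than r.
    """
    coeffs = [rng.randrange(P) for _ in range(k)]

    def h(x):
        acc = 0
        for c in coeffs:  # Horner's rule, all arithmetic modulo P
            acc = (acc * x + c) % P
        return acc % r

    return h


class LinearProbingTable:
    """Open-addressing table with linear probing (illustrative sketch only)."""

    def __init__(self, capacity, rng=random):
        self.slots = [None] * capacity
        self.h = make_poly_hash(5, capacity, rng)  # k = 5: 5-wise independence
        self.size = 0

    def _probe(self, key):
        """Return the index holding key, or the first vacant index after h(key)."""
        i = self.h(key)
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % len(self.slots)
        return i

    def insert(self, key, value):
        i = self._probe(key)
        if self.slots[i] is None:
            # Keep at least one slot vacant so that _probe always terminates.
            if self.size + 1 >= len(self.slots):
                raise OverflowError("table full")
            self.size += 1
        self.slots[i] = (key, value)

    def lookup(self, key):
        entry = self.slots[self._probe(key)]
        return None if entry is None else entry[1]
```

A table created with `LinearProbingTable(64)` supports `insert(key, value)` and `lookup(key)`; the constant expected time bound discussed in the abstract concerns tables kept at a constant load factor below 1, which this sketch leaves to the caller.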

### Citations

723 | Universal classes of hash functions
- Carter, Wegman
- 1979
Citation Context: ...d choice of hash function, Heileman and Luo [5] advise against linear probing for general-purpose use. 1.2. Analysis using limited randomness. In 1977, Carter and Wegman’s notion of universal hashing [3] initiated a new era in the design of hashing algorithms, where explicit and efficient ways of choosing hash functions replaced the assumption of full randomness. The big insight was that in many case...

364 | New Hash Functions and Their Use in Authentication and Set Equality
- Wegman, Carter
- 1981
Citation Context: ...nce it is mentioned here together with the double hashing probe sequence, we believe that it refers to linear probing. Polynomial hash functions. Carter and Wegman [19] observed that the family of degree k − 1 polynomials in any finite field is k-wise independent. Specifically, for any prime p we may use the field defined by arithmetic modulo p to get a family of fu...

142 | Cuckoo hashing
- Pagh, Rodler
- 2004
Citation Context: ...attern, which means that inspecting several consecutive memory locations does not take significantly more time than inspecting a single, randomly chosen memory location. In fact, experimental studies [1, 5, 10] have found linear probing to be the fastest hash table organization for hash tables that are moderately filled (30-70%). While linear probing operations are known to require more instructions than th...

111 | Sorting and Searching - Knuth - 1973 |

88 | A complexity theory of efficient parallel algorithms - Kruskal, Rudolph, et al. - 1988 |

61 | Tabulation based 4-universal hashing with applications to second moment estimation
- Thorup, Zhang
- 2004
Citation Context: ...mily is in the range [α, (1 + r/p)α]. By choosing p much larger than r we can make ᾱ arbitrarily close to α. Tabulation-based hash functions. A k-wise independent family proposed by Thorup and Zhang [18] has uniformly distributed function values in [r], and thus ᾱ = α. The construction for 5-wise independence is particularly appealing when keys are short (e.g., 32 bits). If we interpret a key as a t...

30 | Balanced allocation and dictionaries with tightly packed constant size bins
- Dietzfelbinger, Weidling
- 2005
Citation Context: ... result is only of theoretical interest since the associated constants are very large. A construction with similar properties but smaller constants has later been given by Dietzfelbinger and Weidling [4]. A significant drawback of both methods, besides rather complex function evaluation, is the use of random accesses to the memory locations holding the hash function description. This means that we lo...

28 | On universal classes of extremely random constant-time hash functions
- Siegel
- 2004
Citation Context: ...-wise independence is sufficient to achieve essentially the same performance as in the fully random case. (We use n to denote the number of keys inserted into the hash table.) Another paper by Siegel [15] shows that evaluation of a hash function from an O(log n)-wise independent family requires time Ω(log n) unless the space used to describe the function is n^Ω(1). A family of functions is given that ...

18 | The Analysis of Closed Hashing under Limited Randomness (Extended Abstract)
- Schmidt, Siegel
- 1990
Citation Context: ...eys and empty locations. However, Knuth used the term to refer to linear [...] linear probing relying only on limited randomness was given by Siegel and Schmidt in [14, 16]. Specifically, they show that O(log n)-wise independence is sufficient to achieve essentially the same performance as in the fully random case. (We use n to denote the number of keys inserted into th...

14 | Linear probing with constant independence
- Pagh, Pagh, et al.
- 2007
Citation Context: ...ase the key is not present in the hash table. Below we have illustrated insertion (and retrieval) of a key x in a hash table, where other keys are shown as grey balls. [Figure: h(x), x] ∗ This paper is based on [9] that previously appeared in SIAM Journal on Computing. † IT University of Copenhagen, Denmark. An implicit assumption made here is that a key fits in a single ha...

12 | Graph and hashing algorithms for modern architectures: Design and performance
- Black, Martel, et al.
- 1998

12 | Strongly history-independent hashing with applications
- Blelloch, Golovin
Citation Context: ...or small k > 3, in terms of evaluation time [18]. 1.5. Subsequent work. The work of the present paper has been built upon in designing hash tables with additional considerations. Blelloch and Golovin [2] described a linear probing hash table implementation that is strongly history independent. Thorup [17] studied how to get efficient compositions of hash functions for linear probing when the domain o...

10 | How caching affects hashing
- Heileman, Luo
- 2005

10 | The amazing power of pairwise independence
- Wigderson
- 1994
Citation Context: ...) are independent and uniformly random (over the choice of a and b). This “2-wise independence” turns out to be sufficient to guarantee many of the properties possessed by fully random hash functions [3, 20]. More generally, researchers have considered k-wise independence where any k values of the hash function are independent. In their seminal paper, Carter and Wegman state it as an open problem to “Ext...

8 | String hashing for linear probing
- Thorup
- 2009
Citation Context: ...sity of Copenhagen, Denmark. An implicit assumption made here is that a key fits in a single hash table location; for generalization to variable-length keys see [17]. Deletion of keys can be performed by moving keys back in the probe sequence in a greedy fashion (ensuring that no key x is moved to before h(x)), until no such move is possible (when a vacant array ...

6 | Notes on "open" addressing
- Knuth
- 1963
Citation Context: ...icient memory access pattern makes it very fast in practice. 1.1. Early analysis and heuristic implementations. Linear probing dates back to 1954, but was first analyzed by Knuth in a 1963 memorandum [6] now considered to be the birth of the area of analysis of algorithms [11]. Even if, say, half of the hash table is empty, it is not clear a priori that it will be fast to find a vacant location for a...

6 | On the k-Independence Required by Linear Probing and Minwise Independence
- Patrascu, Thorup
Citation Context: ...abulation-based hashing [13]. On the lower bound side, Pătrașcu and Thorup showed that there exist 3- and 4-wise independent hash function constructions that result in logarithmic time per operation [12]. This means that there is no hope to improve our results to require lower independence. They also show even worse performance for single operations under 2-wise independence than exhibited in this pa...

6 | Closed hashing is computable and optimally randomizable with universal hash functions
- Siegel, Schmidt
- 1995

3 | Special issue on average case analysis of algorithms
- Prodinger, S
- 1998
Citation Context: ...nalysis and heuristic implementations. Linear probing dates back to 1954, but was first analyzed by Knuth in a 1963 memorandum [6] now considered to be the birth of the area of analysis of algorithms [11]. Even if, say, half of the hash table is empty, it is not clear a priori that it will be fast to find a vacant location for a new key x. The reason is that h(x) may lie in a long interval of occupied...