The Power of Simple Tabulation Hashing. Mihai Pǎtraşcu, AT&T Labs (2011)
BibTeX
@MISC{Thorup11thepower,
author = {Mihai Pǎtraşcu and Mikkel Thorup},
title = {The Power of Simple Tabulation Hashing},
year = {2011}
}
Abstract
Randomized algorithms are often enjoyed for their simplicity, but the hash functions used to yield the desired theoretical guarantees are often neither simple nor practical. Here we show that the simplest possible tabulation hashing provides unexpectedly strong guarantees. The scheme itself dates back to Carter and Wegman (STOC'77). Keys are viewed as consisting of c characters. We initialize c tables T1, ..., Tc mapping characters to random hash codes. A key x = (x1, ..., xc) is hashed to T1[x1] ⊕ ··· ⊕ Tc[xc], where ⊕ denotes xor. While this scheme is not even 4-independent, we show that it provides many of the guarantees that are normally obtained via higher independence, e.g., Chernoff-type concentration, min-wise hashing for estimating set intersection, and cuckoo hashing.

An important target of the analysis of algorithms is to determine whether there exist practical schemes that enjoy mathematical guarantees on performance. Hashing and hash tables are among the most common inner loops in real-world computation, and are even built-in "unit cost" operations in high-level programming languages that offer associative arrays. Often,
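
The scheme described in the abstract can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the class name `SimpleTabulationHash`, the default parameters (c = 4 characters of 8 bits each, 32-bit hash codes), and the use of Python's `random` module to fill the tables are all assumptions made for the example.

```python
import random


class SimpleTabulationHash:
    """Sketch of simple tabulation hashing for fixed-width integer keys.

    Assumed layout (not from the source): a key is split into c characters
    of char_bits bits each, least significant character first.
    """

    def __init__(self, c=4, char_bits=8, out_bits=32, seed=None):
        rng = random.Random(seed)
        self.c = c
        self.char_bits = char_bits
        self.mask = (1 << char_bits) - 1
        # One table per character position; each maps every possible
        # character value to an independent random out_bits-bit hash code.
        self.tables = [
            [rng.getrandbits(out_bits) for _ in range(1 << char_bits)]
            for _ in range(c)
        ]

    def __call__(self, key):
        if not 0 <= key < 1 << (self.c * self.char_bits):
            raise ValueError("key does not fit in c characters")
        h = 0
        for i in range(self.c):
            # Extract character x_i and xor in its table entry T_i[x_i].
            ch = (key >> (i * self.char_bits)) & self.mask
            h ^= self.tables[i][ch]
        return h


if __name__ == "__main__":
    h = SimpleTabulationHash(seed=1)
    print(hex(h(0x12345678)))
```

With these assumed parameters the tables hold 4 × 256 entries, small enough to stay in fast cache. The sketch also makes the lack of 4-independence easy to see: for any characters a0, a1, b0, b1, the four keys that combine a0 or a1 in one position with b0 or b1 in another have hash values whose xor is always zero, because each table entry appears an even number of times.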







