Results 1 -
7 of
7
Space Efficient Hash Tables With Worst Case Constant Access Time
- In STACS
, 2003
"... We generalize Cuckoo Hashing [23] to d-ary Cuckoo Hashing and show how this yields a simple hash table data structure that stores n elements in (1 + ffl) n memory cells, for any constant ffl ? 0. Assuming uniform hashing, accessing or deleting table entries takes at most d = O(ln ffl ) probes ..."
Abstract
-
Cited by 34 (4 self)
- Add to MetaCart
We generalize Cuckoo Hashing [23] to d-ary Cuckoo Hashing and show how this yields a simple hash table data structure that stores n elements in (1 + ffl) n memory cells, for any constant ffl ? 0. Assuming uniform hashing, accessing or deleting table entries takes at most d = O(ln ffl ) probes and the expected amortized insertion time is constant. This is the first dictionary that has worst case constant access time and expected constant update time, works with (1 + ffl) n space, and supports satellite information. Experiments indicate that d = 4 choices suffice for ffl 0:03. We also describe variants of the data structure that allow the use of hash functions that can be evaluted in constant time.
Why simple hash functions work: Exploiting the entropy in a data stream
- In Proceedings of the 19th Annual ACM-SIAM Symposium on Discrete Algorithms
, 2008
"... Hashing is fundamental to many algorithms and data structures widely used in practice. For theoretical analysis of hashing, there have been two main approaches. First, one can assume that the hash function is truly random, mapping each data item independently and uniformly to the range. This idealiz ..."
Abstract
-
Cited by 27 (6 self)
- Add to MetaCart
Hashing is fundamental to many algorithms and data structures widely used in practice. For theoretical analysis of hashing, there have been two main approaches. First, one can assume that the hash function is truly random, mapping each data item independently and uniformly to the range. This idealized model is unrealistic because a truly random hash function requires an exponential number of bits to describe. Alternatively, one can provide rigorous bounds on performance when explicit families of hash functions are used, such as 2-universal or O(1)-wise independent families. For such families, performance guarantees are often noticeably weaker than for ideal hashing. In practice, however, it is commonly observed that weak hash functions, including 2-universal hash functions, perform as predicted by the idealized analysis for truly random hash functions. In this paper, we try to explain this phenomenon. We demonstrate that the strong performance of universal hash functions in practice can arise naturally from a combination of the randomness of the hash function and the data. Specifically, following the large body of literature on random sources and randomness extraction, we model the data as coming from a “block source, ” whereby
Efficient hashing with lookups in two memory accesses, in: 16th
- SODA, ACM-SIAM
"... The study of hashing is closely related to the analysis of balls and bins. Azar et. al. [1] showed that instead of using a single hash function if we randomly hash a ball into two bins and place it in the smaller of the two, then this dramatically lowers the maximum load on bins. This leads to the c ..."
Abstract
-
Cited by 13 (2 self)
- Add to MetaCart
The study of hashing is closely related to the analysis of balls and bins. Azar et. al. [1] showed that instead of using a single hash function if we randomly hash a ball into two bins and place it in the smaller of the two, then this dramatically lowers the maximum load on bins. This leads to the concept of two-way hashing where the largest bucket contains O(log log n) balls with high probability. The hash look up will now search in both the buckets an item hashes to. Since an item may be placed in one of two buckets, we could potentially move an item after it has been initially placed to reduce maximum load. Using this fact, we present a simple, practical hashing scheme that maintains a maximum load of 2, with high probability, while achieving high memory utilization. In fact, with n buckets, even if the space for two items are pre-allocated per bucket, as may be desirable in hardware implementations, more than n items can be stored giving a high memory utilization. Assuming truly random hash functions, we prove the following properties for our hashing scheme. • Each lookup takes two random memory accesses, and reads at most two items per access. • Each insert takes O(log n) time and up to log log n+ O(1) moves, with high probability, and constant time in expectation. • Maintains 83.75 % memory utilization, without requiring dynamic allocation during inserts. We also analyze the trade-off between the number of moves performed during inserts and the maximum load on a bucket. By performing at most h moves, we can maintain a maximum load of O(hlogl((~og~og:n/h)). So, even by performing one move, we achieve a better bound than by performing no moves at all. 1
Strongly history-independent hashing with applications
- In Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science
, 2007
"... We present a strongly history independent (SHI) hash table that supports search in O(1) worst-case time, and insert and delete in O(1) expected time using O(n) data space. This matches the bounds for dynamic perfect hashing, and improves on the best previous results by Naor and Teague on history ind ..."
Abstract
-
Cited by 7 (2 self)
- Add to MetaCart
We present a strongly history independent (SHI) hash table that supports search in O(1) worst-case time, and insert and delete in O(1) expected time using O(n) data space. This matches the bounds for dynamic perfect hashing, and improves on the best previous results by Naor and Teague on history independent hashing, which were either weakly history independent, or only supported insertion and search (no delete) each in O(1) expected time. The results can be used to construct many other SHI data structures. We show straightforward constructions for SHI ordered dictionaries: for n keys from {1,..., n k} searches take O(log log n) worst-case time and updates (insertions and deletions) O(log log n) expected time, and for keys in the comparison model searches take O(log n) worst-case time and updates O(log n) expected time. We also describe a SHI data structure for the order-maintenance problem. It supports comparisons in O(1) worst-case time, and updates in O(1) expected time. All structures use O(n) data space. 1
History-Independent Cuckoo Hashing
"... Cuckoo hashing is an efficient and practical dynamic dictionary. It provides expected amortized constant update time, worst case constant lookup time, and good memory utilization. Various experiments demonstrated that cuckoo hashing is highly suitable for modern computer architectures and distribute ..."
Abstract
-
Cited by 6 (4 self)
- Add to MetaCart
Cuckoo hashing is an efficient and practical dynamic dictionary. It provides expected amortized constant update time, worst case constant lookup time, and good memory utilization. Various experiments demonstrated that cuckoo hashing is highly suitable for modern computer architectures and distributed settings, and offers significant improvements compared to other schemes. In this work we construct a practical history-independent dynamic dictionary based on cuckoo hashing. In a history-independent data structure, the memory representation at any point in time yields no information on the specific sequence of insertions and deletions that led to its current content, other than the content itself. Such a property is significant when preventing unintended leakage of information, and was also found useful in several algorithmic settings. Our construction enjoys most of the attractive properties of cuckoo hashing. In particular, no dynamic memory allocation is required, updates are performed in expected amortized constant time, and membership queries are performed in worst case constant time. Moreover, with high probability, the lookup procedure queries only two memory entries which are independent and can be queried in parallel. The approach underlying our construction is to enforce a canonical memory representation on cuckoo hashing. That is, up to the initial randomness, each set of elements has a unique memory representation.
Incremental hashing in state space search
- In Workshop ”New Results in Planning, Scheduling and Design
, 2004
"... Abstract. State memorization is essential for state-space search to avoid redundant expansions and hashing serves as a method to, address store and retrieve states efficiently. In this paper we introduce incremental state hashing to compute hash values in constant time. The method will be most effec ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
Abstract. State memorization is essential for state-space search to avoid redundant expansions and hashing serves as a method to, address store and retrieve states efficiently. In this paper we introduce incremental state hashing to compute hash values in constant time. The method will be most effective in guided depth-first search traversals of state space graphs, like in IDA*, where the computation of the set of successors and their heuristic estimates is extremely fast: heuristic values are often computed incrementally or retrieved from pre-computed pattern database tables, and backtracking keeps the changes in the state representation vector during the exploration small. The approach quickly decides if a given state is not present in a hash table, and accelerates successful search. It can further accelerate perfect hashing for pattern storage and look-up. If, for a better coverage of the state space, partial search methods without collision resolving is used, we establish another benefit for incremental state hashing. We exemplify our considerations in the (n 2 − 1)-Puzzle, in action planning, and conduct experiments in Atomix. 1

