Cuckoo hashing
Journal of Algorithms, 2001
Cited by 199 (7 self)
We present a simple dictionary with worst case constant lookup time, equaling the theoretical performance of the classic dynamic perfect hashing scheme of Dietzfelbinger et al. (Dynamic perfect hashing: Upper and lower bounds. SIAM J. Comput., 23(4):738–761, 1994). The space usage is similar to that of binary search trees, i.e., three words per key on average. Besides being conceptually much simpler than previous dynamic dictionaries with worst case constant lookup time, our data structure is interesting in that it does not use perfect hashing, but rather a variant of open addressing where keys can be moved back in their probe sequences. An implementation inspired by our algorithm, but using weaker hash functions, is found to be quite practical. It is competitive with the best known dictionaries having an average case (but no nontrivial worst case) guarantee.
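The abstract's central idea, keys that can be moved between a small number of candidate positions, can be illustrated with a minimal two-table sketch. This is not the paper's algorithm as published: the class name `CuckooTable`, the `max_kicks` eviction limit, the salted use of Python's built-in `hash`, and the doubling rehash policy are all assumptions made for illustration.

```python
import random


class CuckooTable:
    """Illustrative two-table cuckoo hashing sketch (keys must not be None)."""

    def __init__(self, capacity=16, max_kicks=32, seed=0):
        self.capacity = capacity      # slots per table (assumed parameter)
        self.max_kicks = max_kicks    # eviction limit before rehashing (assumed)
        self.rng = random.Random(seed)
        self._reset()

    def _reset(self):
        # Fresh empty tables and fresh salts, i.e. two new hash functions.
        self.tables = [[None] * self.capacity, [None] * self.capacity]
        self.salts = [self.rng.getrandbits(64), self.rng.getrandbits(64)]

    def _slot(self, i, key):
        # Stand-in hash function for table i; not a class analysed in the paper.
        return hash((self.salts[i], key)) % self.capacity

    def lookup(self, key):
        # Worst-case constant time: at most two cells are inspected.
        return any(self.tables[i][self._slot(i, key)] == key for i in (0, 1))

    def insert(self, key):
        if self.lookup(key):
            return
        i = 0
        for _ in range(self.max_kicks):
            pos = self._slot(i, key)
            # Place the key and pick up whatever occupied the cell.
            key, self.tables[i][pos] = self.tables[i][pos], key
            if key is None:
                return
            i = 1 - i  # the evicted key tries its cell in the other table
        self._rehash(key)

    def _rehash(self, pending):
        # Too many evictions: grow, draw new hash functions, reinsert everything.
        keys = [k for t in self.tables for k in t if k is not None] + [pending]
        self.capacity *= 2
        self._reset()
        for k in keys:
            self.insert(k)
```

In this sketch a lookup never touches more than two cells, which is the property the abstract emphasizes; only insertion may trigger a chain of evictions or a full rehash.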
On risks of using cuckoo hashing with simple universal hash classes
In Proc. 20th ACM/SIAM Symposium on Discrete Algorithms (SODA), 2009
Cited by 7 (0 self)
Cuckoo hashing, introduced by Pagh and Rodler [10], is a dynamic dictionary data structure for storing a set S of n keys from a universe U, with constant lookup time and amortized expected constant insertion time. For the analysis, space (2+ε)n and Ω(log n)-wise independence of the hash functions is sufficient. In experiments mentioned in [10], several weaker hash classes worked well; however, a certain simple multiplicative hash family worked badly. In this paper, we prove that the failure probability is high when cuckoo hashing is run with the multiplicative class or with the very common class of linear hash functions over a prime field, even if space 4n is provided. The key set S is fully random, but it must be relatively dense in the universe U of all keys (like |S| ≥ |U|^(11/12)). The bad behavior and the fact that this effect depends on the density of S in U can also be observed in experiments. The result transfers to larger universes if the keys are chosen from a suitable smaller domain. Viewed from a different perspective, our result illustrates that care must be taken when applying a recent result of Mitzenmacher and Vadhan ([12], SODA 2008) proving good behavior of universal hash classes in combination with key sets that have some entropy. Their result is applicable to cuckoo hashing. A technical hypothesis in [12], namely the assumption that either the “collision probability” or the “maximum probability” is small, translates into the condition that |S| is relatively small in comparison to |U|. Our result shows that the result from [12] on 2-universal classes ceases to hold if |S|/|U| is not small enough, even for very common 2-universal hash classes and fully random key sets.
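For orientation, a class of "linear hash functions over a prime field" is commonly written as h(x) = ((a·x + b) mod p) mod m. The sketch below assumes that textbook form with a ≠ 0; the exact parametrization studied in the paper may differ, and the function names are hypothetical. It enumerates the whole class for a small prime and computes the exact fraction of functions under which two fixed keys collide.

```python
from fractions import Fraction


def linear_hash(a, b, p, m, x):
    """h_{a,b}(x) = ((a*x + b) mod p) mod m, with p prime and 0 <= x < p."""
    return ((a * x + b) % p) % m


def collision_fraction(x, y, p, m):
    """Exact fraction of pairs (a, b), a != 0, for which x and y collide."""
    collisions = sum(
        linear_hash(a, b, p, m, x) == linear_hash(a, b, p, m, y)
        for a in range(1, p)
        for b in range(p)
    )
    return Fraction(collisions, (p - 1) * p)
```

For distinct keys below p this fraction is at most 1/m, which is the pairwise (universality) guarantee. The abstract's point is that such a guarantee alone does not ensure good cuckoo hashing behavior when the key set is dense in the universe.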
The limits of buffering: A tight lower bound for dynamic membership in the external memory model
In Proc. ACM Symposium on Theory of Computing, 2010
Cited by 6 (2 self)
We study the dynamic membership (or dynamic dictionary) problem, which is one of the most fundamental problems in data structures. We study the problem in the external memory model with cell size b bits and cache size m bits. We prove that if the amortized cost of updates is at most 0.999 (or any other constant < 1), then the query cost must be Ω(log_{b log n}(n/m)), where n is the number of elements in the dictionary. In contrast, when the update time is allowed to be 1 + o(1), then a bit vector or hash table gives query time O(1). Thus, this is a threshold phenomenon for data structures. This lower bound answers a folklore conjecture of the external memory community. Since almost any data structure task can solve membership, our lower bound implies a dichotomy between two alternatives: (i) make the amortized update time at least 1 (so the data structure does not buffer, and we lose one of the main potential advantages of the cache), or (ii) make the query time at least roughly logarithmic in n. Our result holds even when the updates and queries are chosen uniformly at random and there are no deletions; it holds for randomized data structures, holds when the universe size is O(n), and does not make any restrictive assumptions such as indivisibility. All of the lower bounds we prove hold regardless of the space consumption of the data structure, while the upper bounds only need linear space. The lower bound has some striking implications for external memory data structures. It shows that the query complexities of many problems such as 1D-range counting, predecessor, rank-select, and many others, are all the same
Lossy Dictionaries
In ESA ’01: Proceedings of the 9th Annual European Symposium on Algorithms, 2001
Cited by 5 (2 self)
Bloom filtering is an important technique for space efficient storage of a conservative approximation of a set S. The set stored may have up to some specified number of false positive members, but all elements of S are included. In this paper we consider lossy dictionaries that are also allowed to have false negatives, i.e., leave out elements of S. The aim is to maximize the weight of included keys within a given space constraint. This relaxation allows a very fast and simple data structure making almost optimal use of memory. Being more time efficient than Bloom filters, we believe our data structure to be well suited for replacing Bloom filters in some applications. Also, the fact that our data structure supports information associated to keys paves the way for new uses, as illustrated by an application in lossy image compression.
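As background for the comparison made in this abstract, the following is a minimal sketch of an ordinary Bloom filter, the baseline with false positives but no false negatives. It is not the lossy dictionary proposed in the paper. The class name, the default sizes, and the use of SHA-256 with an index prefix to derive several hash positions are assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter: may report false positives, never false negatives."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive n_hashes positions by hashing the item with an index prefix.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True for every added item; may also be True for items never added.
        return all(self.bits[pos] for pos in self._positions(item))
```

Every inserted element is always reported as present, which is the "conservative approximation" property the abstract describes; the lossy dictionaries of the paper relax exactly this by also permitting false negatives.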
On the Cell Probe Complexity of Dynamic Membership
Cited by 3 (1 self)
We study the dynamic membership problem, one of the most fundamental data structure problems, in the cell probe model with an arbitrary cell size. We consider a cell probe model equipped with a cache that consists of at least a constant number of cells; reading or writing the cache is free of charge. For nearly all common data structures, it is known that with sufficiently large cells together with the cache, we can significantly lower the amortized update cost to o(1). In this paper, we show that this is not the case for the dynamic membership problem. Specifically, for any deterministic membership data structure under a random input sequence, if the expected average query cost is no more than 1+δ for some small constant δ, we prove that the expected amortized update cost must be at least Ω(1), namely, it does not benefit from large block writes (and a cache). The space the structure uses is irrelevant to this lower bound. We also extend this lower bound to randomized membership structures, by using a variant of Yao’s minimax principle. Finally, we show that the structure cannot do better even if it is allowed to answer a query mistakenly with a small constant probability.
TWO-WAY CHAINING WITH REASSIGNMENT
Cited by 3 (1 self)
We present an algorithm for hashing ⌊αn⌋ elements into a table with n separate chains that requires O(1) deterministic worst-case insert time, and O(1) expected worst-case search time for constant α. We exploit the connection between two-way chaining and random graph theory in our techniques.
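Plain two-way chaining, the scheme this paper builds on, gives each key two candidate chains and appends it to the shorter one. The sketch below shows only that baseline; the reassignment step that the paper contributes is not described in the abstract and is not implemented here. The class name and the salted use of Python's built-in `hash` are assumptions.

```python
import random


class TwoWayChaining:
    """Baseline two-way chaining, without the paper's reassignment step."""

    def __init__(self, n_chains, seed=0):
        rng = random.Random(seed)
        # Two salts stand in for two independent hash functions (assumption).
        self.salts = (rng.getrandbits(64), rng.getrandbits(64))
        self.chains = [[] for _ in range(n_chains)]

    def _choices(self, key):
        n = len(self.chains)
        return [hash((s, key)) % n for s in self.salts]

    def search(self, key):
        # A key can only live in one of its two candidate chains.
        return any(key in self.chains[c] for c in self._choices(key))

    def insert(self, key):
        if self.search(key):
            return
        i, j = self._choices(key)
        # Append to the currently shorter of the two candidate chains.
        target = i if len(self.chains[i]) <= len(self.chains[j]) else j
        self.chains[target].append(key)

    def max_chain_length(self):
        return max(len(c) for c in self.chains)
```

Search cost in this baseline is bounded by the combined length of the two candidate chains, so `max_chain_length` is the quantity an analysis of the scheme would track.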
Explicit Deterministic Constructions for Membership in the Bitprobe Model
Cited by 2 (0 self)
We look at time-space tradeoffs for the static membership problem in the bitprobe model. The problem is to represent a set of size up to n from a universe of size m using a small number of bits so that given an element of the universe, its membership in the set can be determined with as few bit probes to the representation as possible.
One-Probe Search
2002
Cited by 2 (0 self)
We consider dictionaries that perform lookups by probing a single word of memory, knowing only the size of the data structure. We describe a randomized dictionary where a lookup returns the correct answer with probability 1 − ɛ, and otherwise returns “don’t know”. The lookup procedure uses an expander graph to select the memory location to probe. Recent explicit expander constructions are shown to yield space usage far smaller than what would be required using a deterministic lookup procedure. Our data structure supports efficient deterministic updates, exhibiting new probabilistic guarantees on dictionary running time.
On the k-orientability of random graphs
2009
Cited by 1 (0 self)
Let G(n, m) be an undirected random graph with n vertices and m multiedges that may include loops, where each edge is realized by choosing its two vertices independently and uniformly at random with replacement from the set of all n vertices. The random graph G(n, m) is said to be k-orientable, where k ≥ 2 is an integer, if there exists an orientation of the edges such that the maximum out-degree is at most k. Let c_k = sup {c: G(n, cn) is k-orientable w.h.p.}. We prove that for k large enough, 1 − 2^k exp(−k + 1 + e^(−k/4)) < c_k/k < 1 − exp