Results 1–10 of 17
Cuckoo hashing
 Journal of Algorithms
, 2001
Abstract

Cited by 124 (6 self)
We present a simple dictionary with worst case constant lookup time, equaling the theoretical performance of the classic dynamic perfect hashing scheme of Dietzfelbinger et al. (Dynamic perfect hashing: Upper and lower bounds. SIAM J. Comput., 23(4):738–761, 1994). The space usage is similar to that of binary search trees, i.e., three words per key on average. Besides being conceptually much simpler than previous dynamic dictionaries with worst case constant lookup time, our data structure is interesting in that it does not use perfect hashing, but rather a variant of open addressing where keys can be moved back in their probe sequences. An implementation inspired by our algorithm, but using weaker hash functions, is found to be quite practical. It is competitive with the best known dictionaries having an average case (but no nontrivial worst case) guarantee. Key words: data structures, dictionaries, information retrieval, searching, hashing, experiments.
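The "keys can be moved back in their probe sequences" idea in this abstract is cuckoo hashing's eviction chain: two tables, two hash functions, and a bounded sequence of displacements on insert. A minimal Python sketch, assuming salted uses of Python's built-in hash in place of the paper's stronger hash families (the class name and constants are illustrative, not from the paper):

```python
import random

class CuckooHashTable:
    """Minimal cuckoo hashing sketch: two tables, two hash functions.

    Lookup probes at most two cells, giving worst-case O(1) lookups.
    Salted built-in hashes stand in for the paper's hash families.
    """

    def __init__(self, capacity=16):
        self.capacity = capacity
        self.tables = [[None] * capacity, [None] * capacity]
        self.size = 0
        self._reseed()

    def _reseed(self):
        self.seeds = (random.getrandbits(64), random.getrandbits(64))

    def _slot(self, which, key):
        return hash((self.seeds[which], key)) % self.capacity

    def lookup(self, key):
        return (self.tables[0][self._slot(0, key)] == key
                or self.tables[1][self._slot(1, key)] == key)

    def insert(self, key):
        if self.lookup(key):
            return
        for _ in range(32):          # bounded eviction chain
            for which in (0, 1):
                i = self._slot(which, key)
                # Swap the incoming key with the cell's occupant.
                key, self.tables[which][i] = self.tables[which][i], key
                if key is None:      # landed in an empty cell
                    self.size += 1
                    return
        self._rehash(key)            # likely a cycle: rebuild, fresh seeds

    def _rehash(self, pending):
        keys = [k for t in self.tables for k in t if k is not None]
        keys.append(pending)
        self.capacity *= 2
        self.tables = [[None] * self.capacity, [None] * self.capacity]
        self.size = 0
        self._reseed()
        for k in keys:
            self.insert(k)
```

Lookup inspects exactly two cells, which is the worst-case O(1) guarantee; inserts amortize the occasional rebuild.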
Non-Expansive Hashing
 In Proc. 28th STOC
, 1996
Abstract

Cited by 30 (0 self)
In a nonexpansive hashing scheme, similar inputs are stored in memory locations which are close. We develop a nonexpansive hashing scheme wherein any set of size O(R^(1−ε)) from a large universe may be stored in a memory of size R (for any ε > 0 and R > R_0(ε)), and where retrieval takes O(1) operations. We explain how to use nonexpansive hashing schemes for efficient storage and retrieval of noisy data. A dynamic version of this hashing scheme is presented as well.
Isolation, Matching, and Counting: Uniform and Nonuniform Upper Bounds
 Journal of Computer and System Sciences
, 1998
Abstract

Cited by 22 (4 self)
We show that the perfect matching problem is in the complexity class SPL (in the nonuniform setting). This provides a better upper bound on the complexity of the matching problem, as well as providing motivation for studying the complexity class SPL. Using similar techniques, we show that counting the number of accepting paths of a nondeterministic logspace machine can be done in NL/poly, if the number of paths is small. This clarifies the complexity of the class LogFew (defined and studied in [BDHM91]). Using derandomization techniques, we then improve this to show that this counting problem is in NL. Determining if our other theorems hold in the uniform setting remains an open problem. The material in this paper appeared in preliminary form in papers in the Proceedings of the IEEE Conference on Computational Complexity, 1998, and in the Proceedings of the Workshop on Randomized Algorithms, Brno, 1998.
Low Redundancy in Static Dictionaries with O(1) Worst Case Lookup Time
 IN PROCEEDINGS OF THE 26TH INTERNATIONAL COLLOQUIUM ON AUTOMATA, LANGUAGES AND PROGRAMMING (ICALP '99)
, 1999
Abstract

Cited by 21 (5 self)
A static dictionary is a data structure for storing subsets of a finite universe U, so that membership queries can be answered efficiently. We study this problem in a unit cost RAM model with word size Θ(log |U|), and show that for n-element subsets, constant worst case query time can be obtained using B + O(log log |U|) + o(n) bits of storage, where B = ⌈log₂ (|U| choose n)⌉ is the minimum number of bits needed to represent all such subsets. For |U| = n · log^O(1) n the dictionary supports constant time rank queries.
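The information-theoretic bound B = ⌈log₂ (|U| choose n)⌉ from this abstract is easy to evaluate directly. A small Python check (the universe size and n below are illustrative numbers, not from the paper) compares it with the naive n · ⌈log₂ |U|⌉ bits of storing each key explicitly:

```python
from math import comb, ceil, log2

def info_theoretic_bits(universe_size, n):
    """B = ceil(log2(C(|U|, n))): minimum bits to represent an
    n-element subset of a universe of size |U|.
    (Float rounding can be off by one very near powers of two;
    fine for a back-of-the-envelope comparison.)"""
    return ceil(log2(comb(universe_size, n)))

U, n = 2**20, 1000                  # illustrative: 20-bit keys, 1000 of them
B = info_theoretic_bits(U, n)
naive = n * ceil(log2(U))           # storing each key explicitly: 20 bits/key
print(B, naive)                     # B is well below the naive bound
```

The succinct structures in the abstract get within lower-order terms of B, versus the roughly 2x-larger explicit representation here.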
Low Redundancy in Dictionaries with O(1) Worst Case Lookup Time
 IN PROC. 26TH INTERNATIONAL COLLOQUIUM ON AUTOMATA, LANGUAGES AND PROGRAMMING (ICALP)
, 1998
Abstract

Cited by 18 (0 self)
A static dictionary is a data structure for storing subsets of a finite universe U, so that membership queries can be answered efficiently. We study this problem in a unit cost RAM model with word size Θ(log |U|), and show that for n-element subsets, constant worst case query time can be obtained using B + O(log log |U|) + o(n) bits of storage, where B = ⌈log₂ (|U| choose n)⌉ is the minimum number of bits needed to represent all such subsets. The solution for dense subsets uses B + O(|U| log log |U| / log |U|) bits of storage, and supports constant time rank queries. In a dynamic setting, allowing insertions and deletions, our techniques give an O(B) bit space usage.
Efficient hashing with lookups in two memory accesses
 In Proc. 16th SODA, ACM-SIAM
Abstract

Cited by 16 (3 self)
The study of hashing is closely related to the analysis of balls and bins. Azar et al. [1] showed that if, instead of using a single hash function, we randomly hash a ball into two bins and place it in the smaller of the two, then this dramatically lowers the maximum load on bins. This leads to the concept of two-way hashing, where the largest bucket contains O(log log n) balls with high probability. The hash lookup will now search in both the buckets an item hashes to. Since an item may be placed in one of two buckets, we could potentially move an item after it has been initially placed to reduce maximum load. Using this fact, we present a simple, practical hashing scheme that maintains a maximum load of 2, with high probability, while achieving high memory utilization. In fact, with n buckets, even if the space for two items is preallocated per bucket, as may be desirable in hardware implementations, more than n items can be stored, giving a high memory utilization. Assuming truly random hash functions, we prove the following properties for our hashing scheme.
• Each lookup takes two random memory accesses, and reads at most two items per access.
• Each insert takes O(log n) time and up to log log n + O(1) moves, with high probability, and constant time in expectation.
• Maintains 83.75% memory utilization, without requiring dynamic allocation during inserts.
We also analyze the tradeoff between the number of moves performed during inserts and the maximum load on a bucket. By performing at most h moves, we can maintain a maximum load of O(log((log log n)/h)). So, even by performing one move, we achieve a better bound than by performing no moves at all.
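The insert-with-moves idea can be sketched in a few lines: each bucket preallocates two slots, an item hashes to two candidate buckets, and a full pair triggers a bounded chain of evictions. This is a hedged illustration, not the paper's scheme; the function names, the eviction policy, and the salted built-in hash are all assumptions:

```python
BUCKET_CAP = 2  # two preallocated item slots per bucket

def slots(item, seeds, n):
    """The item's two candidate buckets (salted built-in hash as a
    stand-in for truly random hash functions)."""
    return [hash((s, item)) % n for s in seeds]

def insert(buckets, seeds, item, max_evictions=32):
    """Place item in the lighter of its two buckets; if both are full,
    evict an occupant into item's slot and reinsert the evictee, for a
    bounded number of moves."""
    n = len(buckets)
    for _ in range(max_evictions):
        a, b = slots(item, seeds, n)
        target = a if len(buckets[a]) <= len(buckets[b]) else b
        if len(buckets[target]) < BUCKET_CAP:
            buckets[target].append(item)
            return True
        # Both candidates full: take a slot, carry the evictee onward.
        victim = buckets[target].pop(0)
        buckets[target].append(item)
        item = victim
    return False  # gave up; a real implementation would rehash or grow

# Usage: 300 items into 256 buckets (about 59% of the 512 slots).
buckets = [[] for _ in range(256)]
inserted = sum(insert(buckets, (1, 2), x) for x in range(300))  # seeds (1, 2) are arbitrary salts
print(inserted, max(len(b) for b in buckets))
```

No bucket ever exceeds two items, so a lookup reads at most two items from each of its two candidate buckets, matching the access pattern in the bullets above.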
Approximate Dictionary Queries
, 1996
Abstract

Cited by 13 (2 self)
Given a set of n binary strings of length m each, we consider the problem of answering d-queries: given a binary query string α of length m, a d-query is to report if there exists a string in the set within Hamming distance d of α. We present a data structure of size O(nm) supporting 1-queries in time O(m) and the reporting of all strings within Hamming distance 1 of α in time O(m). The data structure can be constructed in time O(nm). A slightly modified version of the data structure supports the insertion of new strings in amortized time O(m).
1 Introduction
Let W = {w_1, …, w_n} be a set of n binary strings of length m each, i.e. w_i ∈ {0, 1}^m. The set W is called the dictionary. We are interested in answering d-queries, i.e. for any query string α ∈ {0, 1}^m to decide if there is a string w_i in W within Hamming distance d of α. Minsky and Papert originally raised this problem in [12]. Recently a sequence of papers have considered how to solve thi...
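The 1-query semantics can be illustrated with a naive hash-set baseline: probe α itself and each of its m single-bit flips. This makes O(m) set probes, but each probe hashes an m-bit string, so it is roughly O(m²) work; the paper's structure achieves O(m) total. The sketch below is only a semantic reference, not the paper's data structure:

```python
def build(dictionary_words):
    """Store the dictionary W in a hash set: O(nm) space for n strings
    of length m, matching the abstract's space bound."""
    return set(dictionary_words)

def one_query(index, alpha):
    """Report whether some stored string is within Hamming distance 1
    of alpha, by probing alpha and all m of its single-bit flips."""
    if alpha in index:
        return True
    flip = {'0': '1', '1': '0'}
    return any(alpha[:i] + flip[alpha[i]] + alpha[i + 1:] in index
               for i in range(len(alpha)))

index = build({"0011", "1100"})
print(one_query(index, "0010"))  # True: distance 1 from "0011"
print(one_query(index, "0101"))  # False: distance >= 2 from both
```

Reporting all strings at distance 1, as in the abstract, would simply collect the probes that hit instead of returning at the first one.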
Perfect hashing for strings: Formalization and Algorithms
 IN PROC 7TH CPM
, 1996
Abstract

Cited by 10 (2 self)
Numbers and strings are two objects manipulated by most programs. Hashing has been well-studied for numbers and it has been effective in practice. In contrast, basic hashing issues for strings remain largely unexplored. In this paper, we identify and formulate the core hashing problem for strings that we call substring hashing. Our main technical results are highly efficient sequential/parallel (CRCW PRAM) Las Vegas-type algorithms that determine a perfect hash function for substring hashing. For example, given a binary string of length n, one of our algorithms finds a perfect hash function in O(log n) time, O(n) work, and O(n) space; the hash value for any substring can then be computed in O(log log n) time using a single processor. Our approach relies on a novel use of the suffix tree of a string. In implementing our approach, we design optimal parallel algorithms for the problem of determining weighted ancestors on an edge-weighted tree that may be of independent interest.
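For contrast with the paper's perfect, suffix-tree-based scheme, the standard non-perfect baseline for substring hashing is a polynomial rolling hash: O(n) preprocessing, O(1) per substring query, but with a small collision probability. The modulus and base below are illustrative choices, and this is explicitly not the paper's method:

```python
MOD = (1 << 61) - 1   # a Mersenne prime modulus (illustrative choice)
BASE = 131            # illustrative base

def preprocess(s):
    """Prefix hashes and base powers for s: O(n) time and space."""
    n = len(s)
    pref = [0] * (n + 1)
    pw = [1] * (n + 1)
    for i, ch in enumerate(s):
        pref[i + 1] = (pref[i] * BASE + ord(ch)) % MOD
        pw[i + 1] = (pw[i] * BASE) % MOD
    return pref, pw

def substring_hash(pref, pw, i, j):
    """Hash of s[i:j] in O(1), from the precomputed tables."""
    return (pref[j] - pref[i] * pw[j - i]) % MOD

pref, pw = preprocess("abracadabra")
# Equal substrings always get equal hashes; unequal ones collide
# only with small probability (unlike the paper's perfect scheme).
print(substring_hash(pref, pw, 0, 4) == substring_hash(pref, pw, 7, 11))  # both "abra"
```

The paper's contribution is precisely to remove the collision probability while keeping fast substring evaluation.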
Faster Deterministic Dictionaries
 In 11th Annual ACM Symposium on Discrete Algorithms (SODA)
, 1999
Abstract

Cited by 9 (5 self)
We consider static dictionaries over the universe U = {0, …, 2^w − 1} on a unit-cost RAM with word size w. Construction of a static dictionary with linear space consumption and constant lookup time can be done in linear expected time by a randomized algorithm. In contrast, the best previous deterministic algorithm for constructing such a dictionary with n elements runs in time O(n^(1+ε)) for ε > 0. This paper narrows the gap between deterministic and randomized algorithms exponentially, from a factor of n^ε to an O(log n) factor. The algorithm is weakly nonuniform, i.e. requires certain precomputed constants dependent on w. A byproduct of the result is a lookup time vs. insertion time tradeoff for dynamic dictionaries, which is optimal for a certain class of deterministic hashing schemes.
Balanced Allocation on Graphs
 In Proc. 7th Symposium on Discrete Algorithms (SODA)
, 2006
Abstract

Cited by 9 (2 self)
It is well known that if n balls are inserted into n bins, with high probability, the bin with maximum load contains (1 + o(1)) log n / log log n balls. Azar, Broder, Karlin, and Upfal [1] showed that instead of choosing one bin, if d ≥ 2 bins are chosen at random and the ball inserted into the least loaded of the d bins, the maximum load reduces drastically to log log n / log d + O(1). In this paper, we study the two choice balls and bins process when balls are not allowed to choose any two random bins, but only bins that are connected by an edge in an underlying graph. We show that for n balls and n bins, if the graph is almost regular with degree n^ε, where ε is not too small, the previous bounds on the maximum load continue to hold. Precisely, the maximum load is ...
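The classical d-choice process that this paper generalizes is easy to simulate. The sketch below covers only the unrestricted case (every pair of bins allowed, i.e., the complete graph, not the paper's bounded-degree setting) and shows the drop in maximum load from d = 1 to d = 2:

```python
import random

def max_load(n, d, rng):
    """Throw n balls into n bins; each ball draws d bins uniformly at
    random and goes to the least loaded of them (ties broken by the
    first minimum). Returns the maximum bin load."""
    bins = [0] * n
    for _ in range(n):
        picks = [rng.randrange(n) for _ in range(d)]
        best = min(picks, key=lambda i: bins[i])
        bins[best] += 1
    return max(bins)

# One choice vs. two choices for n = 100,000 balls and bins:
single = max_load(100_000, 1, random.Random(0))
double = max_load(100_000, 2, random.Random(1))
print(single, double)  # two choices gives a markedly smaller maximum
```

The graph-restricted process in the paper replaces the two independent uniform draws with the endpoints of a random edge of the underlying graph.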