## An Optimal Algorithm for Generating Minimal Perfect Hash Functions (1992)

Venue: Information Processing Letters

Citations: 45 (1 self)

### BibTeX

@ARTICLE{Czech92anoptimal,

author = {Zbigniew J. Czech and George Havas and Bohdan S. Majewski},

title = {An Optimal Algorithm for Generating Minimal Perfect Hash Functions},

journal = {Information Processing Letters},

year = {1992},

volume = {43},

pages = {257--264}

}

### Abstract

A new algorithm for generating order preserving minimal perfect hash functions is presented. The algorithm is probabilistic, involving generation of random graphs. It uses expected linear time and requires a linear number of words to represent the hash function, and thus is optimal up to constant factors. It runs very fast in practice.

Keywords: Data structures, probabilistic algorithms, analysis of algorithms, hashing, random graphs

### Citations

2073 | On the evolution of random graphs
- Erdős, Rényi
- 1960
Citation Context ...phs with no self-loops ($k = 1$) or multiple edges ($k = 2$), however it may be extended to cover them. Then, the probability of having an acyclic graph tends towards $\exp\left(-\sum_{k=1}^{n} 2^k/(2kc^k)\right)$ [12]. Since, for $c > 2$, $\lim_{n\to\infty} \sum_{k=1}^{n} 2^k/(2kc^k) = \frac{1}{2}\ln\left(\frac{c}{c-2}\right)$, the probability of getting an acyclic graph tends towards $p_a^{\infty} = \sqrt{\frac{c-2}{c}}$. For $c \le 2$, $p_a^{\infty} = 0$. Thus, for $c > 2$ the p...
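The limit quoted in this context can be checked numerically. The following is a minimal sketch, not code from the paper: the function names are illustrative, and the expected-iterations helper rests on the assumption that successive graph-generation attempts are independent, so the number of attempts is geometric with mean $1/p_a^{\infty}$.

```python
import math


def acyclic_probability_limit(c: float) -> float:
    """Limiting probability sqrt((c - 2) / c) that a random graph with
    m edges and n = c*m vertices is acyclic; 0 when c <= 2."""
    if c <= 2:
        return 0.0
    return math.sqrt((c - 2) / c)


def cycle_sum(c: float, terms: int) -> float:
    """Partial sum of 2^k / (2 k c^k) for k = 1..terms, the expected
    cycle counts quoted in the context."""
    return sum((2.0 / c) ** k / (2 * k) for k in range(1, terms + 1))


def expected_attempts(c: float) -> float:
    """Mean number of attempts until an acyclic graph appears.
    Assumption (not from the source): attempts are independent."""
    p = acyclic_probability_limit(c)
    return math.inf if p == 0.0 else 1.0 / p


if __name__ == "__main__":
    for c in (2.09, 3.0, 4.0):
        closed = acyclic_probability_limit(c)
        series = math.exp(-cycle_sum(c, 2000))
        print(f"c={c}: closed form {closed:.4f}, series {series:.4f}, "
              f"expected attempts {expected_attempts(c):.2f}")
```

Larger values of $c$ raise the success probability per attempt at the cost of a larger table for $g$, which is the trade-off the choice $n = cm$ controls.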

1957 | Random Graphs
- Bollobás
- 2001

390 |
The Art of Computer Programming (vol 3): Sorting and Searching
- Knuth
- 1973
Citation Context ...om the standard Unix dictionary all words shorter than 3 characters, longer than 18 characters or containing characters other than letters. For each experiment the words were selected using shuffling [23]. For $m > 24692$, artificial sets of random words were generated. The values of m, iterations, mapping, assignment and total are the number of words, average number of iterations in the mapping step, t...

217 |
Storing a sparse table with O(1) worst case access time
- Fredman, Komlós, et al.
- 1984
Citation Context ...e recent independent developments appear in [13, 14, 16]. Various algorithms with different time complexities have been presented for constructing perfect or minimal perfect hash functions, including [3, 4, 5, 6, 7, 8, 17, 10, 19, 20, 22, 30]. In 1985 Sager proposed the mincycle algorithm [28] which uses graph considerations. The author claimed that the mincycle algorithm has complexity $O(m^4)$. Based on this algorithm other solutions hav...

167 |
Handbook of algorithms and data structures
- Gonnet, Baeza-Yates
- 1991
Citation Context ...items from a static set, such as reserved words in programming languages, command names in operating systems, commonly used words in natural languages, etc. An overview of perfect hashing is given in [18], §3.3.16 and the area is surveyed in [25]. Some recent independent developments appear in [13, 14, 16]. Various algorithms with different time complexities have been presented for constructing perfec...

110 |
Worst-case analysis of set union algorithms
- Tarjan, van Leeuwen
- 1984
Citation Context ... [Figure 4: Contents of the mapping tables: a) during the first iteration; b) during the second iteration] algorithm [29] to do so. This results in a theoretically inferior solution, as the best set union algorithms have worst-case complexity $O(n + m\,\alpha(n, n))$, where $\alpha(n, n)$ is the functional inverse of Ackermann's funct...
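The context is truncated, but it appears to concern using a set union algorithm to detect whether the generated graph contains a cycle. Assuming that reading, a minimal sketch of union-find cycle detection follows; the function name `has_cycle` and the vertex numbering `0..n-1` are illustrative choices, not taken from the paper.

```python
from typing import Iterable, Tuple


def has_cycle(n: int, edges: Iterable[Tuple[int, int]]) -> bool:
    """Return True if the undirected graph on vertices 0..n-1 has a cycle.

    Uses disjoint-set union with path halving and union by size.
    Self-loops and repeated edges are reported as cycles.
    """
    parent = list(range(n))
    size = [1] * n

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return True  # u and v were already connected
        if size[ru] < size[rv]:
            ru, rv = rv, ru
        parent[rv] = ru  # attach the smaller tree under the larger
        size[ru] += size[rv]
    return False


if __name__ == "__main__":
    print(has_cycle(4, [(0, 1), (1, 2), (2, 3)]))  # a path
    print(has_cycle(3, [(0, 1), (1, 2), (2, 0)]))  # a triangle
```

A graph traversal can answer the same question in time linear in $n + m$, which is presumably why the context calls the set union route theoretically inferior.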

41 |
Practical minimal perfect hash functions for large databases
- Fox, Heath, et al.
- 1992
Citation Context ...ating systems, commonly used words in natural languages, etc. An overview of perfect hashing is given in [18], §3.3.16 and the area is surveyed in [25]. Some recent independent developments appear in [13, 14, 16]. Various algorithms with different time complexities have been presented for constructing perfect or minimal perfect hash functions, including [3, 4, 5, 6, 7, 8, 17, 10, 19, 20, 22, 30]. In 1985 Sage...

40 |
A new universal class of hash functions and dynamic hashing in real time
- Dietzfelbinger, Meyer auf der Heide
- 1990
Citation Context ...e recent independent developments appear in [13, 14, 16]. Various algorithms with different time complexities have been presented for constructing perfect or minimal perfect hash functions, including [3, 4, 5, 6, 7, 8, 17, 10, 19, 20, 22, 30]. In 1985 Sager proposed the mincycle algorithm [28] which uses graph considerations. The author claimed that the mincycle algorithm has complexity $O(m^4)$. Based on this algorithm other solutions hav...

29 | Order-preserving minimal perfect hash functions and information retrieval
- Fox, Chen, et al.
Citation Context ...ating systems, commonly used words in natural languages, etc. An overview of perfect hashing is given in [18], §3.3.16 and the area is surveyed in [25]. Some recent independent developments appear in [13, 14, 16]. Various algorithms with different time complexities have been presented for constructing perfect or minimal perfect hash functions, including [3, 4, 5, 6, 7, 8, 17, 10, 19, 20, 22, 30]. In 1985 Sage...

29 |
The expected linearity of a simple equivalence algorithm
- Knuth, Schönhage
- 1978
Citation Context ...ithms have worst-case complexity $O(n + m\,\alpha(n, n))$, where $\alpha(n, n)$ is the functional inverse of Ackermann's function. However, linear time performance of set union algorithms is expected on the average [24, 2, 31], and, as the authors of [29] point out "for all practical purposes, $\alpha(m, n)$ is a constant no larger than four.") Because of the cycle, the mapping process has to be repeated. The contents of tables ...

21 |
Minimal perfect hash functions made simple
- Cichelli
- 1980
Citation Context ...e recent independent developments appear in [13, 14, 16]. Various algorithms with different time complexities have been presented for constructing perfect or minimal perfect hash functions, including [3, 4, 5, 6, 7, 8, 17, 10, 19, 20, 22, 30]. In 1985 Sager proposed the mincycle algorithm [28] which uses graph considerations. The author claimed that the mincycle algorithm has complexity $O(m^4)$. Based on this algorithm other solutions hav...

20 | A Faster Algorithm for Constructing Minimal Perfect Hash Functions
- Fox, Chen, et al.
- 1992
Citation Context ...ating systems, commonly used words in natural languages, etc. An overview of perfect hashing is given in [18], §3.3.16 and the area is surveyed in [25]. Some recent independent developments appear in [13, 14, 16]. Various algorithms with different time complexities have been presented for constructing perfect or minimal perfect hash functions, including [3, 4, 5, 6, 7, 8, 17, 10, 19, 20, 22, 30]. In 1985 Sage...

16 |
A polynomial time generator for minimal perfect hash functions
- Sager
Citation Context ...t time complexities have been presented for constructing perfect or minimal perfect hash functions, including [3, 4, 5, 6, 7, 8, 17, 10, 19, 20, 22, 30]. In 1985 Sager proposed the mincycle algorithm [28] which uses graph considerations. The author claimed that the mincycle algorithm has complexity $O(m^4)$. Based on this algorithm other solutions have been developed [9, 14, 15, 16], with mainly experi...

12 |
A Versatile Data Structure For Edge-Oriented Graph Algorithms
- Ebert
- 1987
Citation Context ...lgorithm took 763.07 seconds to generate a minimal perfect hash function for 524288 keys on a Sequent machine. In the implementation of the algorithm we used an edge-oriented representation of graphs [11]. This allowed us to handle edges as concrete objects, represented by integers, and not as pairs of vertices. Because of this, the space complexity of the algorithm is linear in the number of words to...

8 |
Reciprocal hashing: A method for generating minimal perfect hashing functions
- Jaeschke
- 1981

8 |
Hashing for dynamic and static internal tables
- Lewis, Cook
- 1988
Citation Context ...words in programming languages, command names in operating systems, commonly used words in natural languages, etc. An overview of perfect hashing is given in [18], §3.3.16 and the area is surveyed in [25]. Some recent independent developments appear in [13, 14, 16]. Various algorithms with different time complexities have been presented for constructing perfect or minimal perfect hash functions, inclu...

7 |
The study of an ordered minimal perfect hashing scheme
- Chang
- 1984

5 |
Perfect hashing using sparse matrix packing
- Brain, Tharp
- 1990

5 |
An algebraic approach to Cichelli's perfect hashing
- Gori, Soda
- 1989

5 |
A family of generators of minimal perfect hash functions
- Majewski, Wormald, et al.
- 1992
Citation Context ...k. To obtain a high probability of generating an acyclic graph in an iteration we must deal with very sparse graphs. We choose $n = cm$, for some constant $c$. Detailed probabilistic arguments appear in [26]. Briefly, they proceed as follows. For random labeled graphs with $m$ edges and $n = cm$ vertices as $n \to \infty$, the expected number of cycles of length $k$ tends towards $2^k/(2kc^k)$ [1, p. 98]. This result i...

5 |
On the expected performance of path compression algorithms
- Yao
- 1985
Citation Context ...ithms have worst-case complexity $O(n + m\,\alpha(n, n))$, where $\alpha(n, n)$ is the functional inverse of Ackermann's function. However, linear time performance of set union algorithms is expected on the average [24, 2, 31], and, as the authors of [29] point out "for all practical purposes, $\alpha(m, n)$ is a constant no larger than four.") Because of the cycle, the mapping process has to be repeated. The contents of tables ...

4 |
An interactive system for finding perfect hash functions
- Cercone, Boates, et al.
- 1985

4 |
A letter-oriented minimal perfect hashing scheme
- Chang, Lee
- 1986

4 |
Optimal algorithms for minimal perfect hashing
- Havas, Majewski
- 1992
Citation Context ...1]. We show that the expected time complexity is $O(m)$. The space required to store the generated function is $O(m \log m)$ bits, which is optimal for order preserving minimal perfect hash functions (see [21]). [Section 2: The new algorithm] Consider the following problem. For a given undirected graph $G = (V, E)$, $|E| = m$, $|V| = n$ find a function $g : V \to [0, m - 1]$ such that the function $h : E \to [0, m -$...

3 |
On the expected behaviour of disjoint set union algorithms
- Bollobás, Simon
- 1985
Citation Context ...ithms have worst-case complexity $O(n + m\,\alpha(n, n))$, where $\alpha(n, n)$ is the functional inverse of Ackermann's function. However, linear time performance of set union algorithms is expected on the average [24, 2, 31], and, as the authors of [29] point out "for all practical purposes, $\alpha(m, n)$ is a constant no larger than four.") Because of the cycle, the mapping process has to be repeated. The contents of tables ...

3 |
Generating a minimal perfect hashing function in $O(m^2)$ time
- Czech, Majewski
- 1992
Citation Context ...roposed the mincycle algorithm [28] which uses graph considerations. The author claimed that the mincycle algorithm has complexity $O(m^4)$. Based on this algorithm other solutions have been developed [9, 14, 15, 16], with mainly experimental evidence of time performance. We present a new algorithm based on random graphs for finding minimal perfect hash functions of the form $h(w) = \bigl(g(f_1(w)) + g(f_2(w))\bigr) \bmod$ ...

3 |
An O(n log n) algorithm for finding minimal perfect hash functions
- Fox, Heath, et al.
- 1989
Citation Context ...roposed the mincycle algorithm [28] which uses graph considerations. The author claimed that the mincycle algorithm has complexity $O(m^4)$. Based on this algorithm other solutions have been developed [9, 14, 15, 16], with mainly experimental evidence of time performance. We present a new algorithm based on random graphs for finding minimal perfect hash functions of the form $h(w) = \bigl(g(f_1(w)) + g(f_2(w))\bigr) \bmod$ ...

3 |
Finding minimal perfect hash functions
- Haggard, Karplus
- 1986

3 |
Minimal perfect hashing in polynomial time
- Winters
- 1990

2 |
Near-perfect hashing of large word sets. Software: Practice and Experience
- Brain, Tharp
- 1989

2 |
A new method for generating minimal perfect hash functions
- Sager
- 1984
Citation Context ...ed with the edge $e = (u, w)$ is $h(e)$, set $g(w)$ to $(h(e) - g(u)) \bmod m$. Apply the above method to each component of $G$. Pseudocode is given in Fig. 2, which solves a problem like that addressed in [27]. (Notice that we have reversed our original problem, by defining the values of the function $h$ first and then searching for suitable values for function $g$.) To prove the correctness of the method it i...
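The assignment rule quoted in this context can be illustrated with a short traversal. This is a sketch under stated assumptions rather than the paper's Fig. 2: the start of the procedure is truncated in the context, so giving each component's starting vertex the value 0 is an assumption, as are the name `assign_g` and the convention that edge number `i` should hash to `i`.

```python
from collections import defaultdict
from typing import List, Optional, Tuple


def assign_g(n: int, edges: List[Tuple[int, int]], m: int) -> List[int]:
    """Assign g so that (g[u] + g[w]) % m == i for the i-th edge (u, w).

    Assumes the graph on vertices 0..n-1 is acyclic (no self-loops, no
    repeated edges); on a cyclic graph the result may be inconsistent.
    """
    adjacency = defaultdict(list)
    for i, (u, w) in enumerate(edges):
        adjacency[u].append((w, i))  # i plays the role of h(e)
        adjacency[w].append((u, i))

    g: List[Optional[int]] = [None] * n
    for root in range(n):            # handle each component in turn
        if g[root] is not None:
            continue
        g[root] = 0                  # assumed starting value
        stack = [root]
        while stack:
            u = stack.pop()
            for w, h_e in adjacency[u]:
                if g[w] is None:
                    g[w] = (h_e - g[u]) % m  # rule from the context
                    stack.append(w)
    return [value if value is not None else 0 for value in g]


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (3, 4)]
    g = assign_g(6, edges, len(edges))
    print([(g[u] + g[w]) % len(edges) for u, w in edges])
```

Each vertex is visited once and each edge is examined twice, so the pass is linear in $n + m$, in line with the expected linear time stated in the abstract.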