## Optimal Dynamic Hash Tables

### BibTeX

@MISC{Kanizo_optimaldynamic,

author = {Yossi Kanizo and David Hay and Isaac Keslassy},

title = {Optimal Dynamic Hash Tables},

year = {}

}

### Abstract

Hash-based data structures, which use randomization to represent a list of elements efficiently, are among the most widely used data structures in networking applications, where both time and fast memory are scarce resources. This paper investigates the realistic scenario in which elements are not only added to the data structure but also deleted. We show that when the memory is bounded, dynamic hash tables with deletions behave significantly worse than their static counterparts. This is in contrast with previous results showing that when the memory is not bounded, the two models behave practically the same. We provide tight upper and lower bounds on the achievable overflow fraction of the scheme under various models and system parameters. Then, we propose two architectures using CAMs and TCAMs that allow us to mitigate this decrease in performance. Our analytical results are confirmed using simulations with real-life traces and real hash functions.
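The gap between static and dynamic tables described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' model or code: it assumes a single uniform hash function, `m` buckets of size `h`, an unbounded overflow list, and a dynamic workload in which a uniformly random element is deleted and a fresh one is inserted at each step, with no element ever moved back from the overflow list. The function names and parameters are hypothetical.

```python
import random


def static_overflow_fraction(n, m, h, rng):
    """Insert n elements once; return the fraction that overflowed."""
    buckets = [0] * m
    overflow = 0
    for _ in range(n):
        b = rng.randrange(m)          # single uniform hash choice (assumption)
        if buckets[b] < h:
            buckets[b] += 1
        else:
            overflow += 1             # bucket full: element goes to the overflow list
    return overflow / n


def dynamic_overflow_fraction(n, m, h, steps, rng):
    """Keep n elements alive; each step deletes a random element and inserts a new one."""
    buckets = [0] * m

    def insert():
        b = rng.randrange(m)
        if buckets[b] < h:
            buckets[b] += 1
            return b                  # stored in bucket b
        return -1                     # stored in the overflow list

    # location[i] is the bucket of element i, or -1 if it sits in the overflow list
    location = [insert() for _ in range(n)]
    for _ in range(steps):
        i = rng.randrange(n)          # random departure (assumption)
        if location[i] >= 0:
            buckets[location[i]] -= 1
        location[i] = insert()        # fresh arrival; overflow is never moved back
    return sum(1 for loc in location if loc < 0) / n


if __name__ == "__main__":
    rng = random.Random(1)
    n, m, h = 2000, 500, 4            # load n / (m * h) = 1
    print("static :", static_overflow_fraction(n, m, h, rng))
    print("dynamic:", dynamic_overflow_fraction(n, m, h, 20 * n, rng))
```

In this toy setting the dynamic run typically ends with a noticeably larger overflow fraction than the static one, because a slot freed by a departure is not refilled by elements already sitting in the overflow list.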

### Citations

617 | Applied Probabilities and Queues
- Asmussen
Citation Context ...ccupancy j to occupancy k is $Q^i_{jk} = \begin{cases} \frac{n-j}{m} & k = j+1,\ k \ge 1, \\ j \cdot \left(1 - \frac{1}{m}\right) & k = j-1,\ k \le h-1, \\ -\left(\frac{n-j}{m} + j \cdot \left(1 - \frac{1}{m}\right)\right) & k = j. \end{cases}$ $\{X^i_t\}_{t \ge 0}$ is ergodic (e.g., from Corollary 2.5 in p. 74 of [15]), and therefore its distribution converges to the stationary distribution $\pi^n$. By ergodicity, the distribution of p(t) converges to $\pi^n$ as well. Furthermore, since the transition rates from any j ... |

601 | Queueing Systems
- Kleinrock
- 1975
Citation Context ...list converges to $\gamma^n_{\text{SIMPLE}}$ as well; and therefore, the expected size of the overflow list is ($\gamma^n_{\text{SIMPLE}} \cdot n$). It is interesting to note that $\pi^n$ follows the well-known Engset distribution [13], [14]. In fact, using $\frac{1}{m-1} = \frac{1/m}{1-1/m}$ and multiplying both the numerator and denominator of Equation (1) by $\left(1 - \frac{1}{m}\right)^n$, we can rewrite $\pi^n_k$ as $\pi^n_k = \frac{\binom{n}{k} \left(\frac{1}{m}\right)^k \left(1 - \frac{1}{m}\right)^{n-k}}{\sum_{l=0}^{h} \binom{n}{l} \left(\frac{1}{m}\right)^l \left(1 - \frac{1}{m}\right)^{\cdots}}$ ... |
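The stationary distribution quoted in this citation context can be evaluated numerically. The sketch below assumes, as the extracted formula appears to read, that $\pi^n_k$ is a binomial(n, 1/m) distribution truncated at the bucket size h; the truncated exponent in the denominator is taken to be n − l so that the probabilities sum to one. The function name is hypothetical and the code is not taken from the paper.

```python
from math import comb


def truncated_binomial(n, m, h):
    """Return [pi_0, ..., pi_h] for a binomial(n, 1/m) truncated at h (assumed form)."""
    p = 1.0 / m
    top = min(h, n)                   # occupancy cannot exceed n elements
    weights = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(top + 1)]
    total = sum(weights)              # normalizing sum over l = 0..h
    return [w / total for w in weights]


if __name__ == "__main__":
    pi = truncated_binomial(n=2000, m=500, h=4)
    for k, prob in enumerate(pi):
        print(k, round(prob, 4))
```

When h is at least n, no truncation takes place and the result is the plain binomial distribution, which is a quick sanity check on the normalization.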

208 | The power of two choices in randomized load balancing
- Mitzenmacher
Citation Context ...crete and continuous finite models behaves like the fluid models. In addition, in simulations, we found that the scaled systems always converged fast to their fluid model, as later shown. We refer to [12] for a more complete discussion of the sufficient conditions for the convergence to the fluid-limit fixed-point solution. III. SIMPLE - A SINGLE-CHOICE HASHING SCHEME A. Scheme Description In order to... |

127 | Introduction to queueing theory
- Cooper
- 1990
Citation Context ...rflow list converges to $\gamma^n_{\text{SIMPLE}}$ as well; and therefore, the expected size of the overflow list is ($\gamma^n_{\text{SIMPLE}} \cdot n$). It is interesting to note that $\pi^n$ follows the well-known Engset distribution [13], [14]. In fact, using $\frac{1}{m-1} = \frac{1/m}{1-1/m}$ and multiplying both the numerator and denominator of Equation (1) by $\left(1 - \frac{1}{m}\right)^n$, we can rewrite $\pi^n_k$ as $\pi^n_k = \frac{\binom{n}{k} \left(\frac{1}{m}\right)^k \left(1 - \frac{1}{m}\right)^{n-k}}{\sum_{l=0}^{h} \binom{n}{l} \left(\frac{1}{m}\right)^l \left(1 - \frac{1}{m}\right)^{\cdots}}$ ... |

102 | The power of two random choices: a survey of techniques and results - Mitzenmacher, Richa, et al. |

84 | How Asymmetry Helps Load Balancing
- Vöcking
- 1999
Citation Context ... (Figure residue: x-axis "load"; caption fragment "Overflow fraction in a hash table based on d-random".) ...the asymmetric d-left algorithm, the static case and the dynamic case yield again the same bound on the maximum bucket size [5], [6]. However, network designers should not assume that both systems always behave in the same way. In this paper, we actually show that with bounded bucket sizes, the dynamic system can behave signi... |

83 | Expected length of the longest probe sequence in hash code searching
- Gonnet
- 1981
Citation Context ...udy. For instance, in the static case in which n elements are uniformly hashed into n infinite buckets, the maximum bucket size is known to be approximately log n/log log n with high probability [2], [3]. The dynamic case yields the same result, assuming random elements depart and arrive while keeping n elements in the hash table. Likewise, when inserting each element in the least-loaded of two rando... |
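The static, unbounded-bucket behavior mentioned in this citation context can be observed with a short balls-into-bins experiment. This is a minimal sketch under assumptions: n elements are placed into n unbounded buckets, each element sampling `d` buckets uniformly at random and joining the least loaded one. The function name `max_load` and the parameter `d` are hypothetical.

```python
import random


def max_load(n, d, rng):
    """Place n elements into n unbounded buckets using d random choices each."""
    buckets = [0] * n
    for _ in range(n):
        choices = [rng.randrange(n) for _ in range(d)]
        best = min(choices, key=lambda b: buckets[b])   # least-loaded candidate
        buckets[best] += 1
    return max(buckets)


if __name__ == "__main__":
    rng = random.Random(0)
    n = 10_000
    print("one choice :", max_load(n, 1, rng))
    print("two choices:", max_load(n, 2, rng))
```

With one choice the maximum bucket size grows roughly like log n / log log n, while two choices usually give a visibly smaller maximum for the same n.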

60 | Content-Addressable Memory (CAM) Circuits and Architectures: A Tutorial and Survey
- Pagiantzis, Sheikholeslami
- 2006
Citation Context ...nt in the overflow γ for ω = 0.1, while the impact on a is limited. B. Moving Back Elements On Departures Let's now introduce the EFFICIENT-DELETION scheme, which relies on using a TCAM (ternary CAM) [20] instead ... (Fig. 6 panels: (a) Average number of memory accesses a versus ω; (b) Overflow fraction γ versus ω; legend: model, simulation.) Fig. 6. Performance... |

47 | Multilevel adaptive hashing
- Broder, Karlin
- 1990
Citation Context ... well as experiments with real hash functions applied on real-life traces. II. PROBLEM STATEMENT A. Terminology and Notations This paper considers multiple choice hash schemes with a stash [7], [10], [11]. Such schemes consist of two data structures: (i) A hash table of total memory size mh, partitioned to m buckets of size h; (ii) An overflow list, usually stored in an expensive content-addressable m... |

18 | The asymptotics of selecting the shortest of two, improved
- Mitzenmacher, Vöcking
- 1999
Citation Context ... (Figure residue: x-axis "load"; caption fragment "Overflow fraction in a hash table based on d-random".) ...the asymmetric d-left algorithm, the static case and the dynamic case yield again the same bound on the maximum bucket size [5], [6]. However, network designers should not assume that both systems always behave in the same way. In this paper, we actually show that with bounded bucket sizes, the dynamic system can behave significan... |

11 | The Convexity of the Mean Queue Size of the M/M/C Queue with Respect to the Traffic Intensity
- Grassmann
- 1987
Citation Context ... (Figure residue: legend "dynamic", "static"; axes γ versus a.) Fig. 4. Overflow fraction as a function of the average memory access rate a. f is known to be strictly concave [17]–[19] (the concavity also follows from Lemma 1). Therefore $\sum_{i \in I} \alpha_i \cdot f\left(r \cdot \frac{\beta_i}{\alpha_i}\right) \overset{(a)}{\le} f\left(\sum_{i \in I} \alpha_i \cdot \frac{\beta_i}{\alpha_i} \cdot r\right) = f\left(r \cdot \sum_{i \in I} \beta_i\right) \overset{(b)}{=} f(r)$, where (a) uses concavity and $\sum_{i \in I} \alpha_i = 1$, and (b) uses ... |

10 | Convexity properties of the Erlang loss formula
- Harel
- 1990
Citation Context ... (Figure residue: legend "dynamic", "static"; axes γ versus a.) Fig. 4. Overflow fraction as a function of the average memory access rate a. f is known to be strictly concave [17]–[19] (the concavity also follows from Lemma 1). Therefore $\sum_{i \in I} \alpha_i \cdot f\left(r \cdot \frac{\beta_i}{\alpha_i}\right) \overset{(a)}{\le} f\left(\sum_{i \in I} \alpha_i \cdot \frac{\beta_i}{\alpha_i} \cdot r\right) = f\left(r \cdot \sum_{i \in I} \beta_i\right) \overset{(b)}{=} f(r)$, where (a) uses concavity and $\sum_{i \in I} \alpha_i = 1$, and (b) uses ... |

9 | Hash-based techniques for high-speed packet processing
- Kirsch, Mitzenmacher, et al.
- 2009
Citation Context ...cannot afford to rebuild the hash table to prevent collisions. However, such dynamic hash tables have been relatively little studied so far. First, because they are well-known for being hard to model [1]. Second, because most studies have found the same behavior in dynamic hash tables and static hash tables, thus making them less interesting to study. For instance, in the static case in which n eleme... |

9 | Qualitative properties of the Erlang blocking model with heterogeneous user requirements. Queueing Syst
- Nain
- 1990
Citation Context ...ributed hash function, as defined in Section II, and then the more general case with several hash functions using different subtable-based distributions. The proof relies on the following result from [16]. Consider an Erlang blocking model with N servers, and suppose that the arrival rate depends on the system. Let $\lambda_k$ be the arrival rate when there are k transmissions in progress, k = 0, 1, ..., N−1. The... |

7 | Peacock hashing: Deterministic and updatable hashing for high performance networking - Kumar, Turner, et al. - 2008 |

6 | Optimal fast hashing
- Kanizo, Hay, et al.
- 2009
Citation Context ...in disproportionate losses in the device. The network designer will need to provide a solution to this increased overflow rate. However, while there is a large literature on the static case [1], [2], [7], there is little related work on modeling and reducing the overflow rate in the dynamic case with fixed-size buckets. One solution is to store clues in a higher-speed smaller memory [8]. Another poss... |

5 | The convexity of loss rate in an Erlang loss system and sojourn in an Erlang delay system with respect to arrival and service rates - Krishnan - 1990 |

3 | On the performance of multiple choice hash tables with moves on deletes and inserts
- Kirsch, Mitzenmacher
Citation Context ... buckets. One solution is to store clues in a higher-speed smaller memory [8]. Another possibility is to move elements from one bucket to another, during either element arrivals or element departures [9]. In this paper, we will analyze the use of an overflow list, possibly implemented as a CAM, and likewise move elements back from the CAM during either element arrivals or element departures. B. Motiv... |