Results 1  10
of
11
More Robust Hashing: Cuckoo Hashing with a Stash
 IN PROCEEDINGS OF THE 16TH ANNUAL EUROPEAN SYMPOSIUM ON ALGORITHMS (ESA
, 2008
"... Cuckoo hashing holds great potential as a highperformance hashing scheme for real applications. Up to this point, the greatest drawback of cuckoo hashing appears to be that there is a polynomially small but practically significant probability that a failure occurs during the insertion of an item, r ..."
Abstract

Cited by 20 (5 self)
 Add to MetaCart
(Show Context)
Cuckoo hashing holds great potential as a highperformance hashing scheme for real applications. Up to this point, the greatest drawback of cuckoo hashing appears to be that there is a polynomially small but practically significant probability that a failure occurs during the insertion of an item, requiring an expensive rehashing of all items in the table. In this paper, we show that this failure probability can be dramatically reduced by the addition of a very small constantsized stash. We demonstrate both analytically and through simulations that stashes of size equivalent to only three or four items yield tremendous improvements, enhancing cuckoo hashing’s practical viability in both hardware and software. Our analysis naturally extends previous analyses of multiple cuckoo hashing variants, and the approach may prove useful in further related schemes.
Oblivious RAM Revisited
"... We reinvestigate the oblivious RAM concept introduced by Goldreich and Ostrovsky, which enables a client, that can store locally only a constant amount of data, to store remotely n data items, and access them while hiding the identities of the items which are being accessed. Oblivious RAM is often c ..."
Abstract

Cited by 11 (0 self)
 Add to MetaCart
We reinvestigate the oblivious RAM concept introduced by Goldreich and Ostrovsky, which enables a client, that can store locally only a constant amount of data, to store remotely n data items, and access them while hiding the identities of the items which are being accessed. Oblivious RAM is often cited as a powerful tool, which can be used, for example, for search on encrypted data or for preventing cache attacks. However, oblivious RAM it is also commonly considered to be impractical due to its overhead, which is asymptotically efficient but is quite high: each data request is replaced by O(log 4 n) requests, or by O(log 3 n) requests where the constant in the “O ” notation is a few thousands. In addition, O(n log n) external memory is required in order to store the n data items. We redesign the oblivious RAM protocol using modern tools, namely Cuckoo hashing and a new oblivious sorting algorithm. The resulting protocol uses only O(n) external memory, and replaces each data request by only O(log 2 n) requests (with a small constant). This analysis is validated by experiments that we ran. Keywords: Secure twoparty computation, oblivious RAM.
Deamortized Cuckoo Hashing: Provable WorstCase Performance and Experimental Results
"... Cuckoo hashing is a highly practical dynamic dictionary: it provides amortized constant insertion time, worst case constant deletion time and lookup time, and good memory utilization. However, with a noticeable probability during the insertion of n elements some insertion requires Ω(log n) time. Whe ..."
Abstract

Cited by 10 (3 self)
 Add to MetaCart
Cuckoo hashing is a highly practical dynamic dictionary: it provides amortized constant insertion time, worst case constant deletion time and lookup time, and good memory utilization. However, with a noticeable probability during the insertion of n elements some insertion requires Ω(log n) time. Whereas such an amortized guarantee may be suitable for some applications, in other applications (such as highperformance routing) this is highly undesirable. Kirsch and Mitzenmacher (Allerton ’07) proposed a deamortization of cuckoo hashing using queueing techniques that preserve its attractive properties. They demonstrated a significant improvement to the worst case performance of cuckoo hashing via experimental results, but left open the problem of constructing a scheme with provable properties. In this work we present a deamortization of cuckoo hashing that provably guarantees constant worst case operations. Specifically, for any sequence of polynomially many operations, with overwhelming probability over the randomness of the initialization phase, each operation is performed in constant time. In addition, we present a general approach for proving that the performance guarantees are preserved when using hash functions with limited independence
HashBased Techniques for HighSpeed Packet Processing
"... Abstract Hashing is an extremely useful technique for a variety of highspeed packetprocessing applications in routers. In this chapter, we survey much of the recent work in this area, paying particular attention to the interaction between theoretical and applied research. We assume very little bac ..."
Abstract

Cited by 9 (1 self)
 Add to MetaCart
(Show Context)
Abstract Hashing is an extremely useful technique for a variety of highspeed packetprocessing applications in routers. In this chapter, we survey much of the recent work in this area, paying particular attention to the interaction between theoretical and applied research. We assume very little background in either the theory or applications of hashing, reviewing the fundamentals as necessary. 1
Maximum matchings in random bipartite graphs and the space utilization of cuckoo hashtables
, 2009
"... We study the the following question in Random Graphs. We are given two disjoint sets L, R with L  = n = αm and R  = m. We construct a random graph G by allowing each x ∈ L to choose d random neighbours in R. The question discussed is as to the size µ(G) of the largest matching in G. When consi ..."
Abstract

Cited by 9 (0 self)
 Add to MetaCart
We study the the following question in Random Graphs. We are given two disjoint sets L, R with L  = n = αm and R  = m. We construct a random graph G by allowing each x ∈ L to choose d random neighbours in R. The question discussed is as to the size µ(G) of the largest matching in G. When considered in the context of Cuckoo Hashing, one key question is as to when is µ(G) = n whp? We answer this question exactly when d is at least three. We also establish a precise threshold for when Phase 1 of the KarpSipser Greedy matching algorithm suffices to compute a maximum matching whp.
Backyard Cuckoo Hashing: Constant WorstCase Operations with a Succinct Representation
, 2010
"... The performance of a dynamic dictionary is measured mainly by its update time, lookup time, and space consumption. In terms of update time and lookup time there are known constructions that guarantee constanttime operations in the worst case with high probability, and in terms of space consumption ..."
Abstract

Cited by 7 (3 self)
 Add to MetaCart
The performance of a dynamic dictionary is measured mainly by its update time, lookup time, and space consumption. In terms of update time and lookup time there are known constructions that guarantee constanttime operations in the worst case with high probability, and in terms of space consumption there are known constructions that use essentially optimal space. In this paper we settle two fundamental open problems: • We construct the first dynamic dictionary that enjoys the best of both worlds: we present a twolevel variant of cuckoo hashing that stores n elements using (1+ϵ)n memory words, and guarantees constanttime operations in the worst case with high probability. Specifically, for any ϵ = Ω((log log n / log n) 1/2) and for any sequence of polynomially many operations, with high probability over the randomness of the initialization phase, all operations are performed in constant time which is independent of ϵ. The construction is based on augmenting cuckoo hashing with a “backyard ” that handles a large fraction of the elements, together with a deamortized perfect hashing scheme for eliminating the dependency on ϵ.
An Analysis of RandomWalk Cuckoo Hashing
"... In this paper, we provide a polylogarithmic bound that holds with high probability on the insertion time for cuckoo hashing under the randomwalk insertion method. Cuckoo hashing provides a useful methodology for building practical, highperformance hash tables. The essential idea of cuckoo hashing ..."
Abstract

Cited by 4 (2 self)
 Add to MetaCart
(Show Context)
In this paper, we provide a polylogarithmic bound that holds with high probability on the insertion time for cuckoo hashing under the randomwalk insertion method. Cuckoo hashing provides a useful methodology for building practical, highperformance hash tables. The essential idea of cuckoo hashing is to combine the power of schemes that allow multiple hash locations for an item with the power to dynamically change the location of an item among its possible locations. Previous work on the case where the number of choices is larger than two has required a breadthfirst search analysis, which is both inefficient in practice and currently has only a polynomial high probability upper bound on the insertion time. Here we significantly advance the state of the art by proving a polylogarithmic bound on the more efficient randomwalk method, where items repeatedly kick out random blocking items until a free location for an item is found. 1
Some Open Questions Related to Cuckoo Hashing
"... Abstract. The purpose of this brief note is to describe recent work in the area of cuckoo hashing, including a clear description of several open problems, with the hope of spurring further research. 1 ..."
Abstract

Cited by 4 (1 self)
 Add to MetaCart
(Show Context)
Abstract. The purpose of this brief note is to describe recent work in the area of cuckoo hashing, including a clear description of several open problems, with the hope of spurring further research. 1
HashBased Data Structures for Extreme Conditions
, 2008
"... This thesis is about the design and analysis of Bloom filter and multiple choice hash table variants for application settings with extreme resource requirements. We employ a very flexible methodology, combining theoretical, numerical, and empirical techniques to obtain constructions that are both an ..."
Abstract
 Add to MetaCart
This thesis is about the design and analysis of Bloom filter and multiple choice hash table variants for application settings with extreme resource requirements. We employ a very flexible methodology, combining theoretical, numerical, and empirical techniques to obtain constructions that are both analyzable and practical. First, we show that a wide class of Bloom filter variants can be effectively implemented using very easily computable combinations of only two fully random hash functions. From a theoretical perspective, these results show that Bloom filters and related data structures can often be substantially derandomized with essentially no loss in performance. From a practical perspective, this derandomization allows for a significant speedup in certain query intensive applications. The rest of this work focuses on designing spaceefficient, openaddressed, multiple choice hash tables for implementation in highperformance router hardware. Using multiple hash functions conserves space, but requires every hash table operation to consider multiple hash buckets, forcing a tradeoff between the slow speed of examining these buckets serially
Research Statement
"... Overview. My main research interest is the application of algorithms and data structures to realworld problems. My primary strengths are analyzing scenarios involving random processes, randomized algorithms, or other situations amenable to probabilistic thinking. Currently, the core of my work is o ..."
Abstract
 Add to MetaCart
(Show Context)
Overview. My main research interest is the application of algorithms and data structures to realworld problems. My primary strengths are analyzing scenarios involving random processes, randomized algorithms, or other situations amenable to probabilistic thinking. Currently, the core of my work is on the design and analysis of hashingbased data structures for highperformance router hardware, specifically Bloom filter and multiple choice hash table variants. While primarily theoretical in nature, this research has a very applied focus, and is actually directly supported by Cisco Systems. However, I am also drawn to more abstract problems with a substantial probabilistic component, including the stochastic analysis of longterm computer processes, decisionmaking under uncertainty, and information theory. Going forward, my research will continue in this vein, spanning the spectrum from largely theoretical results with a compelling motivation for future applications, to concrete algorithm engineering for current or emergent technologies. HashingBased Data Structures. Most of my research so far has been on the design and analysis of hashingbased data structures, particularly Bloom filter and multiple choice hash table variants, for specific application settings. (A Bloom filter is a very simple, spaceefficient data structure for set membership; see, e.g., [4].) All of this work is joint with my advisor, Michael Mitzenmacher. A large fraction of this research focuses on designing efficient, openaddressed hash tables for implementation