Results 1  10
of
26
PrivacyPreserving Access of Outsourced Data via Oblivious RAM Simulation. ArXiv eprints, April 2011. Eprint 1007.1259v2
"... Suppose a client, Alice, has outsourced her data to an external storage provider, Bob, because he has capacity for her massive data set, of size n, whereas her private storage is much smaller—say, of size O(n1/r), for some constant r> 1. Alice trusts Bob to maintain her data, but she would like t ..."
Abstract

Cited by 66 (9 self)
 Add to MetaCart
Suppose a client, Alice, has outsourced her data to an external storage provider, Bob, because he has capacity for her massive data set, of size n, whereas her private storage is much smaller—say, of size O(n1/r), for some constant r> 1. Alice trusts Bob to maintain her data, but she would like to keep its contents private. She can encrypt her data, of course, but she also wishes to keep her access patterns hidden from Bob as well. We describe schemes for the oblivious RAM simulation problem with a small logarithmic or polylogarithmic amortized increase in access times, with a very high probability of success, while keeping the external storage to be of size O(n). To achieve this, our algorithmic contributions include a parallel MapReduce cuckoohashing algorithm and an externalmemory dataoblivious sorting algorithm.
More Robust Hashing: Cuckoo Hashing with a Stash
 IN PROCEEDINGS OF THE 16TH ANNUAL EUROPEAN SYMPOSIUM ON ALGORITHMS (ESA
, 2008
"... Cuckoo hashing holds great potential as a highperformance hashing scheme for real applications. Up to this point, the greatest drawback of cuckoo hashing appears to be that there is a polynomially small but practically significant probability that a failure occurs during the insertion of an item, r ..."
Abstract

Cited by 39 (14 self)
 Add to MetaCart
(Show Context)
Cuckoo hashing holds great potential as a highperformance hashing scheme for real applications. Up to this point, the greatest drawback of cuckoo hashing appears to be that there is a polynomially small but practically significant probability that a failure occurs during the insertion of an item, requiring an expensive rehashing of all items in the table. In this paper, we show that this failure probability can be dramatically reduced by the addition of a very small constantsized stash. We demonstrate both analytically and through simulations that stashes of size equivalent to only three or four items yield tremendous improvements, enhancing cuckoo hashing’s practical viability in both hardware and software. Our analysis naturally extends previous analyses of multiple cuckoo hashing variants, and the approach may prove useful in further related schemes.
Realtime parallel hashing on the gpu
 In ACM SIGGRAPH Asia 2009 papers, SIGGRAPH ’09
, 2009
"... Figure 1: Overview of our construction for a voxelized Lucy model, colored by mapping x, y, and z coordinates to red, green, and blue respectively (far left). The 3.5 million voxels (left) are input as 32bit keys and placed into buckets of ≤ 512 items, averaging 409 each (center). Each bucket then ..."
Abstract

Cited by 26 (6 self)
 Add to MetaCart
Figure 1: Overview of our construction for a voxelized Lucy model, colored by mapping x, y, and z coordinates to red, green, and blue respectively (far left). The 3.5 million voxels (left) are input as 32bit keys and placed into buckets of ≤ 512 items, averaging 409 each (center). Each bucket then builds a cuckoo hash with three subtables and stores them in a larger structure with 5 million entries (right). Closeups follow the progress of a single bucket, showing the keys allocated to it (center; the bucket is linear and wraps around left to right) and each of its completed cuckoo subtables (right). Finding any key requires checking only three possible locations. We demonstrate an efficient dataparallel algorithm for building large hash tables of millions of elements in realtime. We consider two parallel algorithms for the construction: a classical sparse perfect hashing approach, and cuckoo hashing, which packs elements densely by allowing an element to be stored in one of multiple possible locations. Our construction is a hybrid approach that uses both algorithms. We measure the construction time, access time, and memory usage of our implementations and demonstrate realtime performance on large datasets: for 5 million keyvalue pairs, we construct a hash table in 35.7 ms using 1.42 times as much memory as the input data itself, and we can access all the elements in that hash table in 15.3 ms. For comparison, sorting the same data requires 36.6 ms, but accessing all the elements via binary search requires 79.5 ms. Furthermore, we show how our hashing methods can be applied to two graphics applications: 3D surface intersection for moving data and geometric hashing for image matching.
Efficient hashing with lookups in two memory accesses, in: 16th
 SODA, ACMSIAM
"... The study of hashing is closely related to the analysis of balls and bins. Azar et. al. [1] showed that instead of using a single hash function if we randomly hash a ball into two bins and place it in the smaller of the two, then this dramatically lowers the maximum load on bins. This leads to the c ..."
Abstract

Cited by 19 (3 self)
 Add to MetaCart
The study of hashing is closely related to the analysis of balls and bins. Azar et. al. [1] showed that instead of using a single hash function if we randomly hash a ball into two bins and place it in the smaller of the two, then this dramatically lowers the maximum load on bins. This leads to the concept of twoway hashing where the largest bucket contains O(log log n) balls with high probability. The hash look up will now search in both the buckets an item hashes to. Since an item may be placed in one of two buckets, we could potentially move an item after it has been initially placed to reduce maximum load. Using this fact, we present a simple, practical hashing scheme that maintains a maximum load of 2, with high probability, while achieving high memory utilization. In fact, with n buckets, even if the space for two items are preallocated per bucket, as may be desirable in hardware implementations, more than n items can be stored giving a high memory utilization. Assuming truly random hash functions, we prove the following properties for our hashing scheme. • Each lookup takes two random memory accesses, and reads at most two items per access. • Each insert takes O(log n) time and up to log log n+ O(1) moves, with high probability, and constant time in expectation. • Maintains 83.75 % memory utilization, without requiring dynamic allocation during inserts. We also analyze the tradeoff between the number of moves performed during inserts and the maximum load on a bucket. By performing at most h moves, we can maintain a maximum load of O(hlogl((~og~og:n/h)). So, even by performing one move, we achieve a better bound than by performing no moves at all. 1
Balanced Allocation on Graphs
 In Proc. 7th Symposium on Discrete Algorithms (SODA
, 2006
"... It is well known that if n balls are inserted into n bins, with high probability, the bin with maximum load contains (1 + o(1))log n / loglog n balls. Azar, Broder, Karlin, and Upfal [1] showed that instead of choosing one bin, if d ≥ 2 bins are chosen at random and the ball inserted into the least ..."
Abstract

Cited by 17 (2 self)
 Add to MetaCart
It is well known that if n balls are inserted into n bins, with high probability, the bin with maximum load contains (1 + o(1))log n / loglog n balls. Azar, Broder, Karlin, and Upfal [1] showed that instead of choosing one bin, if d ≥ 2 bins are chosen at random and the ball inserted into the least loaded of the d bins, the maximum load reduces drastically to log log n / log d+O(1). In this paper, we study the two choice balls and bins process when balls are not allowed to choose any two random bins, but only bins that are connected by an edge in an underlying graph. We show that for n balls and n bins, if the graph is almost regular with degree n ǫ, where ǫ is not too small, the previous bounds on the maximum load continue to hold. Precisely, the maximum load is
HashBased Techniques for HighSpeed Packet Processing
"... Abstract Hashing is an extremely useful technique for a variety of highspeed packetprocessing applications in routers. In this chapter, we survey much of the recent work in this area, paying particular attention to the interaction between theoretical and applied research. We assume very little bac ..."
Abstract

Cited by 15 (2 self)
 Add to MetaCart
(Show Context)
Abstract Hashing is an extremely useful technique for a variety of highspeed packetprocessing applications in routers. In this chapter, we survey much of the recent work in this area, paying particular attention to the interaction between theoretical and applied research. We assume very little background in either the theory or applications of hashing, reviewing the fundamentals as necessary. 1
Invertible Bloom Lookup Tables
 In Proceedings of the 49th Allerton Conference
, 2011
"... We present a version of the Bloom filter data structure that supports not only the insertion, deletion, and lookup of keyvalue pairs, but also allows a complete listing of the pairs it contains with high probability, as long the number of keyvalue pairs is below a designed threshold. Our structure ..."
Abstract

Cited by 14 (6 self)
 Add to MetaCart
(Show Context)
We present a version of the Bloom filter data structure that supports not only the insertion, deletion, and lookup of keyvalue pairs, but also allows a complete listing of the pairs it contains with high probability, as long the number of keyvalue pairs is below a designed threshold. Our structure allows the number of keyvalue pairs to greatly exceed this threshold during normal operation. Exceeding the threshold simply temporarily prevents content listing and reduces the probability of a successful lookup. If entries are later deleted to return the structure below the threshold, everything again functions appropriately. We also show that simple variations of our structure are robust to certain standard errors, such as the deletion of a key without a corresponding insertion or the insertion of two distinct values for a key. The properties of our structure make it suitable for several applications, including database and networking applications that we highlight. 1
Maximum matchings in random bipartite graphs and the space utilization of cuckoo hashtables
, 2009
"... We study the the following question in Random Graphs. We are given two disjoint sets L, R with L  = n = αm and R  = m. We construct a random graph G by allowing each x ∈ L to choose d random neighbours in R. The question discussed is as to the size µ(G) of the largest matching in G. When consi ..."
Abstract

Cited by 13 (0 self)
 Add to MetaCart
(Show Context)
We study the the following question in Random Graphs. We are given two disjoint sets L, R with L  = n = αm and R  = m. We construct a random graph G by allowing each x ∈ L to choose d random neighbours in R. The question discussed is as to the size µ(G) of the largest matching in G. When considered in the context of Cuckoo Hashing, one key question is as to when is µ(G) = n whp? We answer this question exactly when d is at least three. We also establish a precise threshold for when Phase 1 of the KarpSipser Greedy matching algorithm suffices to compute a maximum matching whp.
An Analysis of RandomWalk Cuckoo Hashing
"... In this paper, we provide a polylogarithmic bound that holds with high probability on the insertion time for cuckoo hashing under the randomwalk insertion method. Cuckoo hashing provides a useful methodology for building practical, highperformance hash tables. The essential idea of cuckoo hashing ..."
Abstract

Cited by 9 (3 self)
 Add to MetaCart
(Show Context)
In this paper, we provide a polylogarithmic bound that holds with high probability on the insertion time for cuckoo hashing under the randomwalk insertion method. Cuckoo hashing provides a useful methodology for building practical, highperformance hash tables. The essential idea of cuckoo hashing is to combine the power of schemes that allow multiple hash locations for an item with the power to dynamically change the location of an item among its possible locations. Previous work on the case where the number of choices is larger than two has required a breadthfirst search analysis, which is both inefficient in practice and currently has only a polynomial high probability upper bound on the insertion time. Here we significantly advance the state of the art by proving a polylogarithmic bound on the more efficient randomwalk method, where items repeatedly kick out random blocking items until a free location for an item is found. 1
Bipartite Random Graphs and Cuckoo Hashing
 In Proceedings of the Fourth Colloquium on Mathematics and Computer Science
, 2006
"... The aim of this paper is to extend the analysis of Cuckoo Hashing of Devroye and Morin in 2003. In particular we make several asymptotic results much more precise. We show, that the probability that the construction of a hash table succeeds, is asymptotically 1 − c(ε)/m + O(1/m 2) for some explicit ..."
Abstract

Cited by 7 (1 self)
 Add to MetaCart
The aim of this paper is to extend the analysis of Cuckoo Hashing of Devroye and Morin in 2003. In particular we make several asymptotic results much more precise. We show, that the probability that the construction of a hash table succeeds, is asymptotically 1 − c(ε)/m + O(1/m 2) for some explicit c(ε), where m denotes the size of each of the two tables, n = m(1 − ε) is the number of keys and ε ∈ (0, 1). The analysis rests on a generating function approach to the so called Cuckoo Graph, a random bipartite graph. We apply a double saddle point method to obtain asymptotic results covering tree sizes, the number of cycles and the probability that no complex component occurs.