Results 1  10
of
13
Cuckoo hashing
 Journal of Algorithms
, 2001
"... We present a simple dictionary with worst case constant lookup time, equaling the theoretical performance of the classic dynamic perfect hashing scheme of Dietzfelbinger et al. (Dynamic perfect hashing: Upper and lower bounds. SIAM J. Comput., 23(4):738–761, 1994). The space usage is similar to that ..."
Abstract

Cited by 124 (6 self)
 Add to MetaCart
We present a simple dictionary with worst case constant lookup time, equaling the theoretical performance of the classic dynamic perfect hashing scheme of Dietzfelbinger et al. (Dynamic perfect hashing: Upper and lower bounds. SIAM J. Comput., 23(4):738–761, 1994). The space usage is similar to that of binary search trees, i.e., three words per key on average. Besides being conceptually much simpler than previous dynamic dictionaries with worst case constant lookup time, our data structure is interesting in that it does not use perfect hashing, but rather a variant of open addressing where keys can be moved back in their probe sequences. An implementation inspired by our algorithm, but using weaker hash functions, is found to be quite practical. It is competitive with the best known dictionaries having an average case (but no nontrivial worst case) guarantee. Key Words: data structures, dictionaries, information retrieval, searching, hashing, experiments * Partially supported by the Future and Emerging Technologies programme of the EU
Space Efficient Hash Tables With Worst Case Constant Access Time
 In STACS
, 2003
"... We generalize Cuckoo Hashing [23] to dary Cuckoo Hashing and show how this yields a simple hash table data structure that stores n elements in (1 + ffl) n memory cells, for any constant ffl ? 0. Assuming uniform hashing, accessing or deleting table entries takes at most d = O(ln ffl ) probes ..."
Abstract

Cited by 47 (4 self)
 Add to MetaCart
We generalize Cuckoo Hashing [23] to dary Cuckoo Hashing and show how this yields a simple hash table data structure that stores n elements in (1 + ffl) n memory cells, for any constant ffl ? 0. Assuming uniform hashing, accessing or deleting table entries takes at most d = O(ln ffl ) probes and the expected amortized insertion time is constant. This is the first dictionary that has worst case constant access time and expected constant update time, works with (1 + ffl) n space, and supports satellite information. Experiments indicate that d = 4 choices suffice for ffl 0:03. We also describe variants of the data structure that allow the use of hash functions that can be evaluted in constant time.
On the (in)security of hashbased oblivious ram and a new balancing scheme
"... With the gaining popularity of remote storage (e.g. in the Cloud), we consider the setting where a small, protected local machine wishes to access data on a large, untrusted remote machine. This setting was introduced in the RAM model in the context of software protection by Goldreich and Ostrovsky. ..."
Abstract

Cited by 16 (2 self)
 Add to MetaCart
With the gaining popularity of remote storage (e.g. in the Cloud), we consider the setting where a small, protected local machine wishes to access data on a large, untrusted remote machine. This setting was introduced in the RAM model in the context of software protection by Goldreich and Ostrovsky. A secure Oblivious RAM simulation allows for a client, with small (e.g., constant size) protected memory, to hide not only the data but also the sequence of locations it accesses (both reads and writes) in the unprotected memory of size n. Our main results are as follows: • We analyze several schemes from the literature, observing a repeated design flaw that leaks information on the memory access pattern. For some of these schemes, the leakage is actually nonnegligible, while for others it is negligible. • On the positive side, we present a new secure oblivious RAM scheme, extending a recent scheme by Goodrich and Mitzenmacher. Our scheme uses only O(1) local memory, and its (amortized) overhead is O(log 2 n / log log n), outperforming the previouslybest O(log 2 n) overhead (among schemes where the client only uses O(1) additional local memory).
Linear probing with constant independence
 In STOC ’07: Proceedings of the thirtyninth annual ACM symposium on Theory of computing
, 2007
"... Hashing with linear probing dates back to the 1950s, and is among the most studied algorithms. In recent years it has become one of the most important hash table organizations since it uses the cache of modern computers very well. Unfortunately, previous analyses rely either on complicated and space ..."
Abstract

Cited by 15 (2 self)
 Add to MetaCart
Hashing with linear probing dates back to the 1950s, and is among the most studied algorithms. In recent years it has become one of the most important hash table organizations since it uses the cache of modern computers very well. Unfortunately, previous analyses rely either on complicated and space consuming hash functions, or on the unrealistic assumption of free access to a truly random hash function. Already Carter and Wegman, in their seminal paper on universal hashing, raised the question of extending their analysis to linear probing. However, we show in this paper that linear probing using a pairwise independent family may have expected logarithmic cost per operation. On the positive side, we show that 5wise independence is enough to ensure constant expected time per operation. This resolves the question of finding a space and time efficient hash function that provably ensures good performance for linear probing.
R.: Distributed oblivious ram for secure twoparty computation. Cryptology ePrint Archive
"... Secure twoparty computation protocol allows two players, Alice with secret input x and Bob with secret input y, to jointly execute an arbitrary program π(x, y) such that only the output of the program is revealed to one or both players. In the two player setting, under computational assumptions mos ..."
Abstract

Cited by 3 (1 self)
 Add to MetaCart
Secure twoparty computation protocol allows two players, Alice with secret input x and Bob with secret input y, to jointly execute an arbitrary program π(x, y) such that only the output of the program is revealed to one or both players. In the two player setting, under computational assumptions most approaches require to first “unroll ” π into a circuit, and then execute this circuit in a private manner. Without unrolling the circuit, an alternative method was proposed by Ostrovsky and Shoup (in STOC 1997) [28] with both players simulating the CPU of an oblivious RAM machine using “offtheshelf ” secure twoparty computation to perform CPU simulation with atomic instructions implemented by circuits and relying on one of the players to implement encrypted memory. The simulation of the CPU must be done through circuitbased secure twoparty computation, thus CPU “memory ” in the Oblivious RAM simulation must be minimized, as otherwise it impacts simulation of each step of the computation. Luckily, there are multiple Oblivious RAM solutions that require O(1) CPU memory in the security parameter. The OstrovskyShoup compiler suffers from two drawbacks: • The best running time of Oblivious RAM simulation with O(1) memory is still prohibitive and requires O(log 2 t / log log t) overhead for running programs of length t by Kushilevitz, Lu, and Ostrovsky (in
Notes about dcuckoo hashing
, 2006
"... We show an upper bound for the minimal size set of keys which cannot be inserted into a dcuckoo hashing table independently of the hashing functions and the insertion algorithm. We also discuss the probability for a successful insertion into such a hash table. 1 ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
We show an upper bound for the minimal size set of keys which cannot be inserted into a dcuckoo hashing table independently of the hashing functions and the insertion algorithm. We also discuss the probability for a successful insertion into such a hash table. 1
Succincter
"... We can represent an array of n values from {0, 1, 2} using ⌈n log 2 3 ⌉ bits (arithmetic coding), but then we cannot retrieve a single element efficiently. Instead, we can encode every block of t elements using ⌈t log 2 3 ⌉ bits, and bound the retrieval time by t. This gives a linear tradeoff betwe ..."
Abstract
 Add to MetaCart
We can represent an array of n values from {0, 1, 2} using ⌈n log 2 3 ⌉ bits (arithmetic coding), but then we cannot retrieve a single element efficiently. Instead, we can encode every block of t elements using ⌈t log 2 3 ⌉ bits, and bound the retrieval time by t. This gives a linear tradeoff between the redundancy of the representation and the query time. In fact, this type of linear tradeoff is ubiquitous in known succinct data structures, and in data compression. The folk wisdom is that if we want to waste one bit per block, the encoding is so constrained that it cannot help the query in any way. Thus, the only thing a query can do is to read the entire block and unpack it. We break this limitation and show how to use recursion to improve redundancy. It turns out that if a block is encoded with two (!) bits of redundancy, we can decode a single element, and answer many other interesting queries, in time logarithmic in the block size. Our technique allows us to revisit classic problems in succinct data structures, and give surprising new upper bounds. We also construct a locallydecodable version of arithmetic coding.
the Federal Information Security Management Act [3], the
"... Regulatory frameworks impose a wide range of policies in finance, life sciences, healthcare and the government. ..."
Abstract
 Add to MetaCart
Regulatory frameworks impose a wide range of policies in finance, life sciences, healthcare and the government.
Language modeling for . . .
, 2009
"... With the increasing focus of speech recognition and natural language processing applications on domains with limited amount of indomain training data, enhanced system performance often relies on approaches involving model adaptation and combination. In such domains, language models are often constr ..."
Abstract
 Add to MetaCart
With the increasing focus of speech recognition and natural language processing applications on domains with limited amount of indomain training data, enhanced system performance often relies on approaches involving model adaptation and combination. In such domains, language models are often constructed by interpolating component models trained from partially matched corpora. Instead of simple linear interpolation, we introduce a generalized linear interpolation technique that computes contextdependent mixture weights from features that correlate with the component confidence and relevance for each ngram context. Since the ngrams from partially matched corpora may not be of equal relevance to the target domain, we propose an ngram weighting scheme to adjust the component ngram probabilities based on features derived from readily available corpus segmentation and metadata to deemphasize outofdomain ngrams. In scenarios without any matched data for a development set, we examine unsupervised and active learning techniques for tuning the interpolation and weighting parameters. Results on a lecture transcription task using the proposed generalized linear interpolation and ngram weighting techniques yield up to a 1.4 % absolute word error rate reduction