Results 11-20 of 27
Dispersing Hash Functions
In Proceedings of the 4th International Workshop on Randomization and Approximation Techniques in Computer Science (RANDOM ’00), volume 8 of Proceedings in Informatics, 2000
Cited by 3 (3 self)
A new hashing primitive is introduced: dispersing hash functions. A family of hash functions F is dispersing if, for any set S of a certain size and random h ∈ F, the expected value of |S| − |h[S]| is not much larger than the expectation if h had been chosen at random from the set of all functions. We give upper and lower bounds on the size of dispersing families that are tight up to a logarithmic factor. Previously studied families of this kind, for example universal families, are significantly larger than the smallest dispersing families, making them less suitable for derandomization. We present several applications of dispersing families to derandomization (fast element distinctness, set inclusion, and static dictionary initialization). Also, a tight relationship between dispersing families and extractors, which may be of independent interest, is exhibited. We also investigate the related issue of program size for hash functions which are nearly perfect. In particular, we exhibit a dramatic increase in program size for hash functions more dispersing than a random function.
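The quantity |S| − |h[S]| in the definition counts the elements of S that are lost to collisions under h. The following is a minimal Python sketch, not taken from the paper, that computes this loss for one function and estimates the baseline the definition compares against, namely its expectation when h is a uniformly random function into r values. All function names and the parameters n, r, trials, and seed are illustrative assumptions.

```python
import random


def collision_loss(keys, h):
    """Return |S| - |h[S]|: how many keys are lost to collisions under h."""
    return len(keys) - len({h(x) for x in keys})


def expected_loss_random_function(n, r):
    """Closed-form E[|S| - |h[S]|] for a uniformly random function into r values.

    A fixed value is hit by at least one of n keys with probability
    1 - (1 - 1/r)^n, so the expected image size is r * (1 - (1 - 1/r)^n).
    """
    return n - r * (1.0 - (1.0 - 1.0 / r) ** n)


def simulated_loss(n, r, trials=2000, seed=0):
    """Estimate the same expectation by sampling random functions (assumed setup)."""
    rng = random.Random(seed)
    keys = list(range(n))
    total = 0
    for _ in range(trials):
        # A fresh random function on the n keys, stored as a lookup table.
        table = {x: rng.randrange(r) for x in keys}
        total += collision_loss(keys, table.__getitem__)
    return total / trials


if __name__ == "__main__":
    print(expected_loss_random_function(50, 100))
    print(simulated_loss(50, 100))
```

A family would be checked against this baseline by averaging collision_loss over its members for each set S of interest; the sketch only covers the random-function side of the comparison.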
A New Tradeoff for Deterministic Dictionaries
2000
Cited by 1 (0 self)
We consider dictionaries over the universe U = {0,1}^w on a unit-cost RAM with word size w and a standard instruction set. We present a linear space deterministic dictionary with membership queries in time (log log n)^O(1) and updates in time (log n)^O(1), where n is the size of the set stored. This is the first such data structure to simultaneously achieve query time (log n)^o(1) and update time O(2^((log n)^c)) for a constant c < 1.
1 Introduction
Among the most fundamental data structures is the dictionary. A dictionary stores a subset S of a universe U, offering membership queries of the form "x ∈ S?". The result of a membership query is either 'no' or a piece of satellite data associated with x. Updates of the set are supported via insertion and deletion of single elements. Several performance measures are of interest for dictionaries: the amount of space used, the time needed to answer queries, and the time needed to perform updates. The most efficient dictionar...
Lower Bound Techniques for Data Structures
2008
Cited by 1 (0 self)
We describe new techniques for proving lower bounds on data-structure problems, with the following broad consequences:
• the first Ω(lg n) lower bound for any dynamic problem, improving on a bound that had been standing since 1989;
• for static data structures, the first separation between linear and polynomial space. Specifically, for some problems that have constant query time when polynomial space is allowed, we can show Ω(lg n / lg lg n) bounds when the space is O(n · polylog n).
Using these techniques, we analyze a variety of central data-structure problems, and obtain improved lower bounds for the following:
• the partial-sums problem (a fundamental application of augmented binary search trees);
• the predecessor problem (which is equivalent to IP lookup in Internet routers);
• dynamic trees and dynamic connectivity;
• orthogonal range stabbing;
• orthogonal range counting, and orthogonal range reporting;
• the partial match problem (searching with wildcards);
• (1 + ε)-approximate near neighbor on the hypercube;
• approximate nearest neighbor in the ℓ∞ metric.
Our new techniques lead to surprisingly non-technical proofs. For several problems, we obtain simpler proofs for bounds that were already known.
One-Probe Search
Cited by 1 (0 self)
We consider dictionaries that perform lookups by probing a single word of memory, knowing only the size of the data structure. We describe a randomized dictionary where a lookup returns the correct answer with probability 1 − ε, and otherwise returns “don’t know”. The lookup procedure uses an expander graph to select the memory location to probe. Recent explicit expander constructions are shown to yield space usage far smaller than what would be required using a deterministic lookup procedure. Our data structure supports efficient deterministic updates, exhibiting new probabilistic guarantees on dictionary running time.
Sketching and Streaming High-Dimensional Vectors
2011
Cited by 1 (0 self)
A sketch of a dataset is a small-space data structure supporting some pre-specified set of queries (and possibly updates) while consuming space substantially sublinear in the space required to actually store all the data. Furthermore, it is often desirable, or required by the application, that the sketch itself be computable by a small-space algorithm given just one pass over the data, a so-called streaming algorithm. Sketching and streaming have found numerous applications in network traffic monitoring, data mining, trend detection, sensor networks, and databases. In this thesis, I describe several new contributions in the area of sketching and streaming algorithms.
• The first space-optimal streaming algorithm for the distinct elements problem. Our algorithm also achieves O(1) update and reporting times.
• A streaming algorithm for Hamming norm estimation in the turnstile model which achieves the best known space complexity.
Fast Local Searches and Updates in Bounded Universes
Cited by 1 (0 self)
Given a bounded universe {0, 1, ..., U − 1}, we show how to perform (successor) searches in O(log log ∆) expected time and updates in O(log log ∆) expected amortized time, where ∆ is the rank difference between the element being searched for and its successor in the structure. This unifies the results of traditional bounded universe structures (which support successor searches in O(log log U) time) and hashing (which supports exact searches in O(1) time). We also show how these results can be extended to answer approximate nearest neighbour queries in low dimensions.
EFFICIENT UNIFORM GRIDS FOR COLLISION HANDLING IN MEDICAL SIMULATORS
Keywords: spatial data structures; collision detection; collision response; cell indexing; spatial hashing; cuckoo hashing
We investigate spatial acceleration structures within collision handling in scenarios with "worst-case" spatial layout. These are scenarios where lots of collisions and interactions persist over large time intervals. We focus on acceleration structures based on uniform grids and assess their efficiency in construction, update and query. Z-curves as a technique for the mapping of spatial locality to uniform grids are analyzed to improve the cache-hit rate. Approximate solutions based on the grid representation are considered and discussed in the context of time-critical collision handling. The findings are applied to a deformable collision framework. Experiments are performed on scenarios that are typical for medical simulators. They often exhibit the "worst-case" spatial layout mentioned above.
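As an illustration of mapping uniform-grid cells onto a Z-curve, here is a minimal Python sketch. It is not the paper's implementation: the function names, the 10 bits per axis, and the assumption of non-negative coordinates are all choices made for this example. Cells are indexed by interleaving the bits of their integer coordinates (a Morton code), and the occupied cells are stored in Z-curve order, so that cells adjacent along the curve are also adjacent in the stored order.

```python
def morton3(ix, iy, iz, bits=10):
    """Interleave the low `bits` bits of three non-negative cell indices."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (3 * b)
        code |= ((iy >> b) & 1) << (3 * b + 1)
        code |= ((iz >> b) & 1) << (3 * b + 2)
    return code


def cell_of(point, cell_size):
    """Map a 3D point with non-negative coordinates to its integer grid cell."""
    return tuple(int(c // cell_size) for c in point)


def build_grid(points, cell_size):
    """Group point indices by the Morton code of their cell, in Z-curve order."""
    grid = {}
    for idx, p in enumerate(points):
        grid.setdefault(morton3(*cell_of(p, cell_size)), []).append(idx)
    return dict(sorted(grid.items()))


if __name__ == "__main__":
    pts = [(0.1, 0.1, 0.1), (0.2, 0.3, 0.4), (1.5, 0.0, 0.0)]
    print(build_grid(pts, 1.0))
```

Points that fall into the same cell become collision candidates for one another; a production version would typically use a flat array sorted by Morton code rather than a Python dictionary.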
Succincter
We can represent an array of n values from {0, 1, 2} using ⌈n log_2 3⌉ bits (arithmetic coding), but then we cannot retrieve a single element efficiently. Instead, we can encode every block of t elements using ⌈t log_2 3⌉ bits, and bound the retrieval time by t. This gives a linear tradeoff between the redundancy of the representation and the query time. In fact, this type of linear tradeoff is ubiquitous in known succinct data structures, and in data compression. The folk wisdom is that if we want to waste one bit per block, the encoding is so constrained that it cannot help the query in any way. Thus, the only thing a query can do is to read the entire block and unpack it. We break this limitation and show how to use recursion to improve redundancy. It turns out that if a block is encoded with two (!) bits of redundancy, we can decode a single element, and answer many other interesting queries, in time logarithmic in the block size. Our technique allows us to revisit classic problems in succinct data structures, and give surprising new upper bounds. We also construct a locally decodable version of arithmetic coding.
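The baseline block encoding described at the start of the abstract (not the paper's recursive technique) can be sketched in Python as follows. Each block of t values from {0, 1, 2} is stored as one base-3 number, which fits in ⌈t log_2 3⌉ bits; reading one element means unpacking that number. The function names are illustrative assumptions.

```python
import math


def bits_per_block(t):
    """Bits needed for a block of t trits: ceil(t * log2(3))."""
    return math.ceil(t * math.log2(3))


def encode_block(trits):
    """Pack a list of values from {0, 1, 2} into one integer, trits[0] least significant."""
    value = 0
    for d in reversed(trits):
        value = value * 3 + d
    return value


def decode_element(value, i):
    """Recover element i of a block by unpacking the base-3 number."""
    return (value // 3 ** i) % 3


def total_bits(n, t):
    """Space for n trits split into blocks of t, with each block padded to whole bits."""
    return math.ceil(n / t) * bits_per_block(t)


if __name__ == "__main__":
    block = [1, 2, 0, 2, 1]
    packed = encode_block(block)
    print(packed, bits_per_block(len(block)))
    print([decode_element(packed, i) for i in range(len(block))])
    print(total_bits(20, 1), total_bits(20, 5))
```

Larger blocks waste fewer bits in total, because the rounding loss of less than one bit is paid once per block, but each lookup must then handle a longer base-3 number. This is the linear tradeoff the abstract refers to.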
Computational Geometry through the Information Lens
2007
revisits classic problems in computational geometry from the modern algorithmic
Figure 1: Diagram of a Perfect Static Hash Table Analysis
2009
Fredman, Komlós & Szemerédi [1] discovered a way to construct hash functions which are guaranteed to give no collisions for a set of input values (keys) which are defined at the time that the hash function is constructed. These hash functions can be evaluated in O(1) time and take O(n) space to store. This is called 'static' hashing, as it relies on knowing the values to be hashed ahead of time. Perfect static hashing has limited uses due to the requirement of knowing all the keys in advance, but it is occasionally used for things like recognising keywords in programming languages, or recognising command line arguments.
Construction:
1. Create a hash function which produces ≤ n collisions on the given keys. We can find such a hash function using only a weakly universal hash function family in an expected constant number of trials.
2. Take each bin b_i containing multiple items (i.e., bins with collisions) and generate another hash function which gives no collisions for the items in that bin. These second-level hash tables are given n_i^2 space, where n_i is the number of items in bin b_i. This n_i^2 space means that it is possible to ensure that there will be no collisions in the second-level tables in another expected constant number of trials.
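The two construction steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysed implementation: keys are assumed to be non-negative integers below the prime P, the hash family h(x) = ((a·x + b) mod P) mod m is a standard choice swapped in for the unspecified "weakly universal" family, "collisions" are counted as colliding pairs, and the names build_fks and contains are hypothetical.

```python
import random

P = (1 << 61) - 1  # a Mersenne prime; keys are assumed to lie in [0, P)


def make_hash(m, rng):
    """Draw h(x) = ((a*x + b) mod P) mod m from the assumed hash family."""
    a = rng.randrange(1, P)
    b = rng.randrange(P)
    return lambda x: ((a * x + b) % P) % m


def build_fks(keys, seed=0):
    rng = random.Random(seed)
    keys = list(set(keys))
    n = max(1, len(keys))

    # Step 1: retry until the first-level function has at most n colliding pairs.
    while True:
        h = make_hash(n, rng)
        bins = [[] for _ in range(n)]
        for k in keys:
            bins[h(k)].append(k)
        if sum(len(b) * (len(b) - 1) // 2 for b in bins) <= n:
            break

    # Step 2: give bin i a table of n_i^2 slots and retry until it is collision-free.
    tables = []
    for b in bins:
        size = len(b) ** 2
        while True:
            g = make_hash(size, rng) if size else None
            slots = [None] * size
            ok = True
            for k in b:
                j = g(k)
                if slots[j] is not None:
                    ok = False
                    break
                slots[j] = k
            if ok:
                break
        tables.append((g, slots))
    return h, tables


def contains(structure, key):
    """Look up a key with two hash evaluations and one comparison."""
    h, tables = structure
    g, slots = tables[h(key)]
    return bool(slots) and slots[g(key)] == key


if __name__ == "__main__":
    s = build_fks([3, 17, 42, 1000003, 99, 0, 256])
    print(contains(s, 42), contains(s, 5))
```

With at most n colliding pairs at the first level, the second-level tables together use at most 3n slots, since the sum of n_i^2 equals n plus twice the number of colliding pairs.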