Results 11  20
of
40
Dispersing Hash Functions
 In Proceedings of the 4th International Workshop on Randomization and Approximation Techniques in Computer Science (RANDOM ’00), volume 8 of Proceedings in Informatics
, 2000
"... A new hashing primitive is introduced: dispersing hash functions. A family of hash functions F is dispersing if, for any set S of a certain size and random h ∈ F, the expected value of S  − h[S]  is not much larger than the expectancy if h had been chosen at random from the set of all functions ..."
Abstract

Cited by 3 (3 self)
 Add to MetaCart
A new hashing primitive is introduced: dispersing hash functions. A family of hash functions F is dispersing if, for any set S of a certain size and random h ∈ F, the expected value of S  − h[S]  is not much larger than the expectancy if h had been chosen at random from the set of all functions. We give tight, up to a logarithmic factor, upper and lower bounds on the size of dispersing families. Such families previously studied, for example universal families, are significantly larger than the smallest dispersing families, making them less suitable for derandomization. We present several applications of dispersing families to derandomization (fast element distinctness, set inclusion, and static dictionary initialization). Also, a tight relationship between dispersing families and extractors, which may be of independent interest, is exhibited. We also investigate the related issue of program size for hash functions which are nearly perfect. In particular, we exhibit a dramatic increase in program size for hash functions more dispersing than a random function. 1
Sketching and Streaming HighDimensional Vectors
, 2011
"... A sketch of a dataset is a smallspace data structure supporting some prespecified set of queries (and possibly updates) while consuming space substantially sublinear in the space required to actually store all the data. Furthermore, it is often desirable, or required by the application, that the sk ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
A sketch of a dataset is a smallspace data structure supporting some prespecified set of queries (and possibly updates) while consuming space substantially sublinear in the space required to actually store all the data. Furthermore, it is often desirable, or required by the application, that the sketch itself be computable by a smallspace algorithm given just one pass over the data, a socalled streaming algorithm. Sketching and streaming have found numerous applications in network traffic monitoring, data mining, trend detection, sensor networks, and databases. In this thesis, I describe several new contributions in the area of sketching and streaming algorithms. • The first spaceoptimal streaming algorithm for the distinct elements problem. Our algorithm also achieves O(1) update and reporting times. • A streaming algorithm for Hamming norm estimation in the turnstile model which achieves the best known space complexity.
A New Tradeoff for Deterministic Dictionaries
, 2000
"... . We consider dictionaries over the universe U = f0; 1g w on a unitcost RAM with word size w and a standard instruction set. We present a linear space deterministic dictionary with membership queries in time (log log n) O(1) and updates in time (log n) O(1) , where n is the size of the se ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
. We consider dictionaries over the universe U = f0; 1g w on a unitcost RAM with word size w and a standard instruction set. We present a linear space deterministic dictionary with membership queries in time (log log n) O(1) and updates in time (log n) O(1) , where n is the size of the set stored. This is the rst such data structure to simultaneously achieve query time (log n) o(1) and update time O(2 (log n) c ) for a constant c < 1. 1 Introduction Among the most fundamental data structures is the dictionary. A dictionary stores a subset S of a universe U , oering membership queries of the form \x 2 S?". The result of a membership query is either 'no' or a piece of satellite data associated with x. Updates of the set are supported via insertion and deletion of single elements. Several performance measures are of interest for dictionaries: The amount of space used, the time needed to answer queries, and the time needed to perform updates. The most ecient dictionar...
OneProbe Search
, 2002
"... We consider dictionaries that perform lookups by probing a single word of memory, knowing only the size of the data structure. We describe a randomized dictionary where a lookup returns the correct answer with probability 1 − ɛ, and otherwise returns “don’t know”. The lookup procedure uses an expa ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
We consider dictionaries that perform lookups by probing a single word of memory, knowing only the size of the data structure. We describe a randomized dictionary where a lookup returns the correct answer with probability 1 − ɛ, and otherwise returns “don’t know”. The lookup procedure uses an expander graph to select the memory location to probe. Recent explicit expander constructions are shown to yield space usage far smaller than what would be required using a deterministic lookup procedure. Our data structure supports efficient deterministic updates, exhibiting new probabilistic guarantees on dictionary running time.
Lower Bound Techniques for Data Structures
, 2008
"... We describe new techniques for proving lower bounds on datastructure problems, with the following broad consequences:
â¢ the first Î©(lgn) lower bound for any dynamic problem, improving on a bound that had been standing since 1989;
â¢ for static data structures, the first separation between linea ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
We describe new techniques for proving lower bounds on datastructure problems, with the following broad consequences:
â¢ the first Î©(lgn) lower bound for any dynamic problem, improving on a bound that had been standing since 1989;
â¢ for static data structures, the first separation between linear and polynomial space. Specifically, for some problems that have constant query time when polynomial space is allowed, we can show Î©(lg n/ lg lg n) bounds when the space is O(n Â· polylog n).
Using these techniques, we analyze a variety of central datastructure problems, and obtain improved lower bounds for the following:
â¢ the partialsums problem (a fundamental application of augmented binary search trees);
â¢ the predecessor problem (which is equivalent to IP lookup in Internet routers);
â¢ dynamic trees and dynamic connectivity;
â¢ orthogonal range stabbing;
â¢ orthogonal range counting, and orthogonal range reporting;
â¢ the partial match problem (searching with wildcards);
â¢ (1 + Îµ)approximate near neighbor on the hypercube;
â¢ approximate nearest neighbor in the lâ metric.
Our new techniques lead to surprisingly nontechnical proofs. For several problems, we obtain simpler proofs for bounds that were already known.
Fast Local Searches and Updates in Bounded Universes
"... Given a bounded universe {0, 1,..., U −1}, we show how to perform (successor) searches in O(log log ∆) expected time and updates in O(log log ∆) expected amortized time, where ∆ is the rank difference between the element being searched for and its successor in the structure. This unifies the results ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
Given a bounded universe {0, 1,..., U −1}, we show how to perform (successor) searches in O(log log ∆) expected time and updates in O(log log ∆) expected amortized time, where ∆ is the rank difference between the element being searched for and its successor in the structure. This unifies the results of traditional bounded universe structures (which support successor searches in O(log log U) time) and hashing (which supports exact searches in O(1) time). We also show how these results can be extended to answer approximate nearest neighbour queries in low dimensions. 1
Fixed Universe Computational Geometry
"... We present a survey of algorithms and data structures that solve computational geometry problems under the assumption the input is drawn from a fixed universe. We show that many geometric problems can be solved faster in a fixed universe on the RAM. We overview the use of sorting, xfast tries and s ..."
Abstract
 Add to MetaCart
We present a survey of algorithms and data structures that solve computational geometry problems under the assumption the input is drawn from a fixed universe. We show that many geometric problems can be solved faster in a fixed universe on the RAM. We overview the use of sorting, xfast tries and stratified trees and combine these techniques to give efficient solutions to the orthogonal point location, orthogonal range query, and approximate nearest neighbor problems that beat their the algebraic decision tree lower bounds. 1
Optimal Hitting Sets for Combinatorial Shapes
, 2012
"... We consider the problem of constructing explicit Hitting sets for Combinatorial Shapes, a class of statistical tests first studied by Gopalan, Meka, Reingold, and Zuckerman (STOC 2011). These generalize many wellstudied classes of tests, including symmetric functions and combinatorial rectangles. G ..."
Abstract
 Add to MetaCart
We consider the problem of constructing explicit Hitting sets for Combinatorial Shapes, a class of statistical tests first studied by Gopalan, Meka, Reingold, and Zuckerman (STOC 2011). These generalize many wellstudied classes of tests, including symmetric functions and combinatorial rectangles. Generalizing results of Linial, Luby, Saks, and Zuckerman (Combinatorica 1997) and Rabani and Shpilka (SICOMP 2010), we construct hitting sets for Combinatorial Shapes of size polynomial in the alphabet, dimension, and the inverse of the error parameter. This is optimal up to polynomial factors. The best previous hitting sets came from the Pseudorandom Generator construction of Gopalan et al., and in particular had size that was quasipolynomial in the inverse of the error parameter. Our construction builds on natural variants of the constructions of Linial et al. and Rabani and Shpilka. In the process, we construct fractional perfect hash families and hitting sets for combinatorial rectangles with stronger guarantees. These might be of independent interest. 1
Theory and Practice of Monotone Minimal Perfect Hashing DJAMAL BELAZZOUGUI
"... Minimal perfect hash functions have been shown to be useful to compress data in several data management tasks. In particular, orderpreserving minimal perfect hash functions [12] have been used to retrieve the position of a key in a given list of keys: however, the ability to preserve any given orde ..."
Abstract
 Add to MetaCart
Minimal perfect hash functions have been shown to be useful to compress data in several data management tasks. In particular, orderpreserving minimal perfect hash functions [12] have been used to retrieve the position of a key in a given list of keys: however, the ability to preserve any given order leads to an unavoidable.n log n / lower bound on the number of bits required to store the function. Recently, it was observed [1] that very frequently the keys to be hashed are sorted in their intrinsic (i.e., lexicographical) order. This is typically the case of dictionaries of search engines, list of URLs of web graphs, etc. We refer to this restricted version of the problem as monotone minimal perfect hashing. We analyse experimentally the data structures proposed in [1], and along our way we propose some new methods that, albeit asymptotically equivalent or worse, perform very well in practice, and provide a balance between access speed, ease of construction, and space usage. 1