Lower Bound Techniques for Data Structures, 2008
Cited by 1 (0 self)
We describe new techniques for proving lower bounds on data-structure problems, with the following broad consequences:
• the first Ω(lg n) lower bound for any dynamic problem, improving on a bound that had been standing since 1989;
• for static data structures, the first separation between linear and polynomial space. Specifically, for some problems that have constant query time when polynomial space is allowed, we can show Ω(lg n / lg lg n) bounds when the space is O(n · polylog n).
Using these techniques, we analyze a variety of central data-structure problems, and obtain improved lower bounds for the following:
• the partial-sums problem (a fundamental application of augmented binary search trees);
• the predecessor problem (which is equivalent to IP lookup in Internet routers);
• dynamic trees and dynamic connectivity;
• orthogonal range stabbing;
• orthogonal range counting, and orthogonal range reporting;
• the partial match problem (searching with wildcards);
• (1 + ε)-approximate near neighbor on the hypercube;
• approximate nearest neighbor in the ℓ∞ metric.
Our new techniques lead to surprisingly nontechnical proofs. For several problems, we obtain simpler proofs for bounds that were already known.
Succincter
We can represent an array of n values from {0, 1, 2} using ⌈n log₂ 3⌉ bits (arithmetic coding), but then we cannot retrieve a single element efficiently. Instead, we can encode every block of t elements using ⌈t log₂ 3⌉ bits, and bound the retrieval time by t. This gives a linear tradeoff between the redundancy of the representation and the query time. In fact, this type of linear tradeoff is ubiquitous in known succinct data structures, and in data compression. The folk wisdom is that if we want to waste one bit per block, the encoding is so constrained that it cannot help the query in any way. Thus, the only thing a query can do is to read the entire block and unpack it. We break this limitation and show how to use recursion to improve redundancy. It turns out that if a block is encoded with two (!) bits of redundancy, we can decode a single element, and answer many other interesting queries, in time logarithmic in the block size. Our technique allows us to revisit classic problems in succinct data structures, and give surprising new upper bounds. We also construct a locally decodable version of arithmetic coding.
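The baseline block scheme described in this abstract (not the paper's recursive technique) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `bits_per_block`, `encode_block`, `encode`, `get`, and `total_bits` are hypothetical, each block is held as a Python integer standing in for a ⌈t log₂ 3⌉-bit code word, and retrieval unpacks the block digit by digit to model the O(t) query cost.

```python
def bits_per_block(t: int) -> int:
    """Bits needed to store t trits: ceil(t * log2(3)), computed exactly.

    A block of t trits takes one of 3**t values, so the largest code word
    is 3**t - 1; its bit length equals ceil(t * log2(3)) for t >= 1.
    """
    return (3 ** t - 1).bit_length()


def encode_block(trits):
    """Pack a block of trits into one integer, read as a base-3 number.

    The first trit of the block becomes the least significant digit.
    """
    value = 0
    for x in reversed(trits):
        value = value * 3 + x
    return value


def encode(arr, t):
    """Split arr into blocks of t trits and encode each block separately."""
    return [encode_block(arr[s:s + t]) for s in range(0, len(arr), t)]


def get(blocks, t, i):
    """Retrieve element i by unpacking its block: up to t division steps."""
    value = blocks[i // t]
    for _ in range(i % t):
        value //= 3  # discard one lower-order trit per step
    return value % 3


def total_bits(n: int, t: int) -> int:
    """Total bits for n trits stored in blocks of t (last block may be short)."""
    full, rest = divmod(n, t)
    return full * bits_per_block(t) + (bits_per_block(rest) if rest else 0)


if __name__ == "__main__":
    arr = [0, 1, 2, 2, 1, 0, 1]
    blocks = encode(arr, t=3)
    print([get(blocks, 3, i) for i in range(len(arr))])
    for t in (1, 5):
        print(t, total_bits(10, t))
```

Larger blocks waste fewer bits to rounding but make each lookup unpack more digits, which is the linear tradeoff between redundancy and query time that the abstract describes.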
Google Research Award Proposal: Data Structures
Data structures are essential components of computer systems in general and Google in particular. We believe this area of research is in an auspicious position where practical and theoretical goals are well aligned, implying that deep algorithmic ideas can also have significant practical impact. We illustrate with a few examples from our past research, which address problems of universal value, and should have important applications in real systems.
Cache-oblivious B-trees: B-trees are a fundamental tool for representing large sets of data in external memory. But what is “external memory”? Modern computers have complicated memory hierarchies, including L1 cache, L2 cache, main memory, disk, and often network storage. Even if one decides to concentrate on one level of the hierarchy, choosing the optimal branching factor involves nontrivial tuning. A surprising, clean alternative is to design a B-tree which works in the optimal O(log_B n) time without knowing the memory block size B! Then the B-tree will work optimally on all levels of the memory hierarchy simultaneously. Our initial paper [BDFC05] showing that this is possible has been very influential in the further study of cache-obliviousness.
Bloomier filters: Suppose we want to represent a set S of items, and answer queries of the form