Faster Deterministic Sorting and Searching in Linear Space
, 1995
Cited by 37 (7 self)
We present a significant improvement on linear space deterministic sorting and searching. On a unit-cost RAM with word size w, an ordered set of n w-bit keys (viewed as binary strings or integers) can be maintained in O(min{√(log n), log n / log w + log log n, log w · log log n}) time per operation, including insert, delete, member search, and neighbour search. The cost for searching is worst-case while the cost for updates is amortized. For range queries, there is an additional cost of reporting the found keys. As an application, n keys can be sorted in linear space at a worst-case cost of O(n √(log n)). The best previous method for deterministic sorting and searching in linear space has been the fusion trees, which support queries in O(log n / log log n) amortized time and sorting in O(n log n / log log n) worst-case time. We also make two minor observations on adapting our data structure to the input distribution and on the complexity of perfect hashing.
Time-space trade-offs for predecessor search
 In Proc. 38th ACM Sympos. Theory Comput
, 2006
Cited by 36 (5 self)
We develop a new technique for proving cell-probe lower bounds for static data structures. Previous lower bounds used a reduction to communication games, which was known not to be tight by counting arguments. We give the first lower bound for an explicit problem which breaks this communication complexity barrier. In addition, our bounds give the first separation between polynomial and near linear space. Such a separation is inherently impossible by communication complexity. Using our lower bound technique and new upper bound constructions, we obtain tight bounds for searching predecessors among a static set of integers. Given a set Y of n integers of ℓ bits each, the goal is to efficiently find predecessor(x) = max {y ∈ Y | y ≤ x}. For this purpose, we represent Y on a RAM with word length w using S words of space. Defining a = lg(S/n) + lg w, we show that the optimal search time is, up to constant factors: min { log_w n, lg(ℓ − lg n ...
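As a point of reference for the problem statement above, the following is a minimal sketch of predecessor search over a static sorted array using plain binary search. It is not the data structure from the paper: it takes O(log n) comparisons and ignores the word length w and space S entirely. The function name `predecessor` and the sample set are illustrative assumptions.

```python
from bisect import bisect_right

def predecessor(sorted_keys, x):
    """Return max{y in sorted_keys : y <= x}, or None if every key exceeds x.

    Baseline only: binary search on a sorted list, not the word-RAM
    structures whose optimal trade-offs the abstract describes.
    """
    i = bisect_right(sorted_keys, x)  # number of keys that are <= x
    return sorted_keys[i - 1] if i > 0 else None

Y = sorted([3, 8, 15, 42])
print(predecessor(Y, 10))  # 8
print(predecessor(Y, 2))   # None
```

A key that is itself in the set is its own predecessor, since the definition uses y ≤ x.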
Logarithmic lower bounds in the cell-probe model
 SIAM Journal on Computing
Cited by 34 (4 self)
We develop a new technique for proving cell-probe lower bounds on dynamic data structures. This enables us to prove Ω(lg n) bounds, breaking a long-standing barrier of Ω(lg n / lg lg n). We can also prove the first Ω(lg_B n) lower bound in the external memory model, without assumptions on the data structure. We use our technique to prove better bounds for the partial-sums problem, dynamic connectivity and (by reductions) other dynamic graph problems. Our proofs are surprisingly simple and clean. The bounds we obtain are often optimal, and lead to a nearly complete understanding of the problems. We also present new matching upper bounds for the partial-sums problem. Key words: cell-probe complexity, lower bounds, data structures, dynamic graph problems, partial-sums problem. AMS subject classification: 68Q17
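To make the partial-sums problem concrete, here is a minimal sketch of a Fenwick (binary indexed) tree, a classic structure that supports both point updates and prefix sums in O(lg n) time. It is a textbook technique, not the upper-bound construction from the paper; the class and method names are illustrative assumptions.

```python
class FenwickTree:
    """Maintains an array of n numbers under point updates and prefix sums."""

    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)  # 1-indexed internal array

    def update(self, i, delta):
        """Add delta to element i (0-indexed)."""
        i += 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i  # jump to the next node whose range covers i

    def prefix_sum(self, k):
        """Return the sum of the first k elements (indices 0..k-1)."""
        total = 0
        while k > 0:
            total += self.tree[k]
            k -= k & -k  # strip the lowest set bit
        return total

ft = FenwickTree(5)
for index, value in enumerate([5, 3, 7, 9, 6]):
    ft.update(index, value)
print(ft.prefix_sum(3))  # 15
```

Both operations touch at most about lg n nodes, which is the cost that the Ω(lg n) lower bound in the abstract shows cannot be improved in general.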
Cache-oblivious algorithms and data structures
 IN LECTURE NOTES FROM THE EEF SUMMER SCHOOL ON MASSIVE DATA SETS
, 2002
Cited by 34 (3 self)
A recent direction in the design of cache-efficient and disk-efficient algorithms and data structures is the notion of cache obliviousness, introduced by Frigo, Leiserson, Prokop, and Ramachandran in 1999. Cache-oblivious algorithms perform well on a multilevel memory hierarchy without knowing any parameters of the hierarchy, only knowing the existence of a hierarchy. Equivalently, a single cache-oblivious algorithm is efficient on all memory hierarchies simultaneously. While such results might seem impossible, a recent body of work has developed cache-oblivious algorithms and data structures that perform as well or nearly as well as standard external-memory structures which require knowledge of the cache/memory size and block transfer size. Here we describe several of these results with the intent of elucidating the techniques behind their design. Perhaps the most exciting of these results are the data structures, which form general building blocks immediately
Fully-functional succinct trees
 In Proc. 21st SODA
, 2010
Cited by 33 (13 self)
We propose new succinct representations of ordinal trees, which have been studied extensively. It is known that any n-node static tree can be represented in 2n + o(n) bits and a large number of operations on the tree can be supported in constant time under the word-RAM model. However, existing data structures are not satisfactory in both theory and practice because (1) the lower-order term is Ω(n log log n / log n), which cannot be neglected in practice, (2) the hidden constant is also large, (3) the data structures are complicated and difficult to implement, and (4) the techniques do not extend to dynamic trees supporting insertions and deletions of nodes. We propose a simple and flexible data structure, called the range min-max tree, that reduces the large number of relevant tree operations considered in the literature to a few primitives, which are carried out in constant time on sufficiently small trees. The result is then extended to trees of arbitrary size, achieving 2n + O(n/polylog(n)) bits of space. The redundancy is significantly lower than in any previous proposal, and the data structure is easily implemented. Furthermore, using the same framework, we derive the first fully-functional dynamic succinct trees.
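The 2n-bit figure in the abstract corresponds to the well-known balanced-parentheses encoding of an ordinal tree, which the sketch below illustrates. The linear scan in `find_close` is only a stand-in for the fast excess search that a structure such as the range min-max tree provides; it is not the paper's data structure. The nested-list input format and all function names are assumptions made for illustration.

```python
def encode(node):
    """Encode an ordinal tree as balanced parentheses (1 = open, 0 = close).

    A node is given as the list of its children, so [] is a leaf.
    An n-node tree yields exactly 2n bits.
    """
    bits = [1]
    for child in node:
        bits.extend(encode(child))
    bits.append(0)
    return bits

def find_close(bits, i):
    """Return the position of the close parenthesis matching the open one at i.

    Scans forward tracking the excess (opens minus closes); the match is
    where the excess first returns to zero. O(n) here, by design a naive baseline.
    """
    excess = 0
    for j in range(i, len(bits)):
        excess += 1 if bits[j] else -1
        if excess == 0:
            return j
    raise ValueError("unbalanced sequence")

def subtree_size(bits, i):
    """Number of nodes in the subtree whose open parenthesis is at position i."""
    return (find_close(bits, i) - i + 1) // 2

# Root with two children: an internal node with two leaves, and a leaf.
bits = encode([[[], []], []])
print("".join("(" if b else ")" for b in bits))  # ((()())())
print(subtree_size(bits, 1))                     # 3
```

Many tree operations reduce to this kind of excess search, which is why speeding up that one primitive pays off across the whole interface.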
Suffix Trees on Words
, 1995
Cited by 31 (2 self)
We present an intrinsic generalization of the suffix tree, designed to index a string of length n which has a natural partitioning into m multi-character substrings or words. The word suffix tree represents only the m suffixes that start at word boundaries. These boundaries are determined by delimiters, whose definition depends on the application. Since traditional suffix tree construction algorithms rely heavily on the fact that all suffixes are inserted, construction of a word suffix tree is nontrivial, in particular when only O(m) construction space is allowed. We solve this problem, presenting an algorithm with O(n) expected running time. In general, construction cost is Ω(n) due to the need to scan the entire input. In applications that require strict node ordering, an additional cost of sorting O(m') characters arises, where m' is the number of distinct words. This is a significant improvement over previous solutions. In some cases, when the alphabet is small, we may assume that the n characters in the input string occupy o(n) machine words. We show that this can allow a word suffix tree to be built in sublinear time.
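To illustrate which suffixes a word-level index covers, the sketch below collects only the suffixes that begin at word boundaries and sorts them. It builds a sorted list of start positions (a word suffix array) by naive comparison sorting, which is a simpler stand-in and does not achieve the O(n) expected time or O(m) space of the construction described above. The single-space delimiter and the function names are assumptions.

```python
def word_suffix_starts(text, delimiter=" "):
    """Return the start positions of suffixes that begin at a word boundary."""
    starts = [0] if text else []
    starts += [i + 1 for i, ch in enumerate(text)
               if ch == delimiter and i + 1 < len(text)]
    return starts

def word_suffix_array(text, delimiter=" "):
    """Sort the word-aligned suffixes lexicographically (naive comparison sort)."""
    return sorted(word_suffix_starts(text, delimiter), key=lambda i: text[i:])

text = "to be or not to be"
print(word_suffix_starts(text))  # [0, 3, 6, 9, 13, 16]
print(word_suffix_array(text))   # [16, 3, 9, 6, 13, 0]
```

Here n = 18 characters but only m = 6 suffixes are indexed, which is the saving the word suffix tree is after.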
Space-Efficient and Fast Algorithms for Multidimensional Dominance Reporting and Counting
 PROCEEDINGS OF THE 15TH ISAAC, VOLUME 3341 OF LECTURE NOTES IN COMPUTER SCIENCE
, 2004
Cited by 29 (1 self)
We present linear-space sublogarithmic algorithms for handling the 3-dimensional dominance reporting and the 2-dimensional dominance counting problems. Under the RAM model as described in [M. L. Fredman and D. E. Willard. “Surpassing the information theoretic bound with fusion trees”, Journal of Computer and System Sciences, 47:424–436, 1993], our algorithms achieve O(log n / log log n + f) query time for the 3-dimensional dominance reporting problem, where f is the output size, and O(log n / log log n) query time for the 2-dimensional dominance counting problem. We extend these results to any constant dimension d ≥ 3, achieving O(n (log n / log log n)^(d−3)) space and O((log n / log log n)^(d−2) + f) query time for the reporting case and O(n (log n / log log n)^(d−2)) space and O((log n / log log n)^(d−1)) query time for the counting case.
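For readers unfamiliar with the two query types, the following brute-force sketch states them directly, using the standard definition that a query point q dominates p when every coordinate of p is at most the matching coordinate of q. It answers each query in O(n) time with no preprocessing, so it is only a correctness baseline and not the sublogarithmic structure from the paper. The function names are illustrative assumptions.

```python
def dominates(q, p):
    """True if q dominates p, i.e. p[i] <= q[i] in every coordinate."""
    return all(pi <= qi for pi, qi in zip(p, q))

def dominance_report(points, q):
    """Reporting query: list every point dominated by q."""
    return [p for p in points if dominates(q, p)]

def dominance_count(points, q):
    """Counting query: number of points dominated by q."""
    return sum(1 for p in points if dominates(q, p))

points_3d = [(1, 2, 3), (4, 1, 1), (2, 2, 2), (5, 5, 5)]
print(dominance_report(points_3d, (3, 3, 3)))  # [(1, 2, 3), (2, 2, 2)]

points_2d = [(1, 1), (2, 3), (3, 2)]
print(dominance_count(points_2d, (2, 3)))      # 2
```

The reporting cost necessarily includes the output size f, which is why the bounds above carry a "+ f" term only in the reporting case.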
Cell probe complexity - a survey
 In 19th Conference on the Foundations of Software Technology and Theoretical Computer Science (FSTTCS), 1999. Advances in Data Structures Workshop
Cited by 28 (0 self)
The cell probe model is a general, combinatorial model of data structures. We give a survey of known results about the cell probe complexity of static and dynamic data structure problems, with an emphasis on techniques for proving lower bounds.
Cache-Oblivious String B-trees
 IN: PROC. OF PRINCIPLES OF DATABASE SYSTEMS
, 2006
Cited by 26 (5 self)
B-trees are the data structure of choice for maintaining searchable data on disk. However, B-trees perform suboptimally
• when keys are long or of variable length,
• when keys are compressed, even when using front compression, the standard B-tree compression scheme,
• for range queries, and
• with respect to memory effects such as disk prefetching.
This paper presents a cache-oblivious string B-tree (COSB-tree) data structure that is efficient in all these ways:
• The COSB-tree searches asymptotically optimally and inserts and deletes nearly optimally.
• It maintains an index whose size is proportional to the front-compressed size of the dictionary. Furthermore, unlike standard front-compressed strings, keys can be decompressed in a memory-efficient manner.
• It performs range queries with no extra disk seeks; in contrast, B-trees incur disk seeks when skipping from leaf block to leaf block.
• It utilizes all levels of a memory hierarchy efficiently and makes good use of disk locality by using cache-oblivious layout strategies.
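Front compression, mentioned above as the standard B-tree compression scheme, can be sketched as follows: each key in sorted order is stored as the length of the prefix it shares with the previous key plus the remaining suffix. This is a minimal illustration of plain front coding, not the COSB-tree's layout; the function names and the (length, suffix) pair format are assumptions.

```python
def front_encode(sorted_keys):
    """Front-code a sorted list of strings as (shared_prefix_len, suffix) pairs."""
    encoded = []
    prev = ""
    for key in sorted_keys:
        lcp = 0
        limit = min(len(prev), len(key))
        while lcp < limit and prev[lcp] == key[lcp]:
            lcp += 1  # extend the common prefix with the previous key
        encoded.append((lcp, key[lcp:]))
        prev = key
    return encoded

def front_decode(encoded):
    """Rebuild the keys; each key depends on the one decoded just before it."""
    keys = []
    prev = ""
    for lcp, suffix in encoded:
        prev = prev[:lcp] + suffix
        keys.append(prev)
    return keys

keys = ["car", "card", "care", "cat"]
encoded = front_encode(keys)
print(encoded)  # [(0, 'car'), (3, 'd'), (3, 'e'), (2, 't')]
```

Because every entry is defined relative to its predecessor, recovering one key in this plain scheme means replaying the entries before it, which is the decompression cost the abstract contrasts against.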
An Optimal Algorithm for the Distinct Elements Problem
Cited by 26 (3 self)
We give the first optimal algorithm for estimating the number of distinct elements in a data stream, closing a long line of theoretical research on this problem begun by Flajolet and Martin in their seminal paper in FOCS 1983. This problem has applications to query optimization, Internet routing, network topology, and data mining. For a stream of indices in {1, ..., n}, our algorithm computes a (1 ± ε)-approximation using an optimal O(ε^−2 + log n) bits of space with 2/3 success probability, where 0 < ε < 1 is given. This probability can be amplified by independent repetition. Furthermore, our algorithm processes each stream update in O(1) worst-case time, and can report an estimate at any point midstream in O(1) worst-case time, thus settling both the space and time complexities simultaneously.
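To show what a small-space distinct-elements estimator looks like, here is a minimal sketch of the classic k-minimum-values (KMV) approach: hash every item to a number in [0, 1), keep only the k smallest distinct hash values, and estimate the count from the k-th smallest. This is a simpler, older technique and not the optimal algorithm of the paper; it uses O(k) stored hashes rather than O(ε^−2 + log n) bits, and its update time is not O(1). The SHA-256-based hash, the default k, and the function names are all assumptions.

```python
import hashlib
import heapq

def _hash01(item):
    """Map an item to a pseudo-uniform value in [0, 1) (illustrative hash choice)."""
    digest = hashlib.sha256(repr(item).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def kmv_estimate(stream, k=256):
    """Estimate the number of distinct items using the k smallest hash values."""
    heap = []        # max-heap (via negation) of the k smallest distinct hashes
    members = set()  # the same hashes, for duplicate detection
    for item in stream:
        h = _hash01(item)
        if h in members:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            evicted = -heapq.heappushpop(heap, -h)  # drop the current largest
            members.add(h)
            members.discard(evicted)
    if len(heap) < k:
        return len(heap)         # fewer than k distinct items: count is exact
    return (k - 1) / (-heap[0])  # k-th smallest hash is roughly k / (distinct count)

print(kmv_estimate([1, 2, 3, 2, 1]))  # 3
```

Repeated items never change the retained hash set, so the estimate depends only on which distinct items appeared, and a larger k trades more space for lower variance.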