Results 1  10
of
17
Logarithmic lower bounds in the cellprobe model
 SIAM Journal on Computing
"... Abstract. We develop a new technique for proving cellprobe lower bounds on dynamic data structures. This enables us to prove Ω(lg n) bounds, breaking a longstanding barrier of Ω(lg n/lg lg n). We can also prove the first Ω(lgB n) lower bound in the external memory model, without assumptions on the ..."
Abstract

Cited by 34 (4 self)
 Add to MetaCart
Abstract. We develop a new technique for proving cellprobe lower bounds on dynamic data structures. This enables us to prove Ω(lg n) bounds, breaking a longstanding barrier of Ω(lg n/lg lg n). We can also prove the first Ω(lgB n) lower bound in the external memory model, without assumptions on the data structure. We use our technique to prove better bounds for the partialsums problem, dynamic connectivity and (by reductions) other dynamic graph problems. Our proofs are surprisingly simple and clean. The bounds we obtain are often optimal, and lead to a nearly complete understanding of the problems. We also present new matching upper bounds for the partialsums problem. Key words. cellprobe complexity, lower bounds, data structures, dynamic graph problems, partialsums problem AMS subject classification. 68Q17
Rank and select revisited and extended
 Workshop on SpaceConscious Algorithms, University of
, 2006
"... The deep connection between the BurrowsWheeler transform (BWT) and the socalled rank and select data structures for symbol sequences is the basis of most successful approaches to compressed text indexing. Rank of a symbol at a given position equals the number of times the symbol appears in the corr ..."
Abstract

Cited by 33 (17 self)
 Add to MetaCart
The deep connection between the BurrowsWheeler transform (BWT) and the socalled rank and select data structures for symbol sequences is the basis of most successful approaches to compressed text indexing. Rank of a symbol at a given position equals the number of times the symbol appears in the corresponding prefix of the sequence. Select is the inverse, retrieving the positions of the symbol occurrences. It has been shown that improvements to rank/select algorithms, in combination with the BWT, turn into improved compressed text indexes. This paper is devoted to alternative implementations and extensions of rank and select data structures. First, we show that one can use gap encoding techniques to obtain constant time rank and select queries in essentially the same space as what is achieved by the best current direct solution (and sometimes less). Second, we extend symbol rank and select to substring rank and select, giving several space/time tradeoffs for the problem. An application of these queries is in positionrestricted substring searching, where one can specify the range in the text where the search is restricted to, and only occurrences residing in that range are to be reported. In addition, arbitrary occurrences are reported in text position order. Several byproducts of our results display connections with searchable partial sums, Chazelle’s twodimensional data structures, and Grossi et al.’s wavelet trees.
Fullyfunctional succinct trees
 In Proc. 21st SODA
, 2010
"... We propose new succinct representations of ordinal trees, which have been studied extensively. It is known that any nnode static tree can be represented in 2n + o(n) bits and a large number of operations on the tree can be supported in constant time under the wordRAM model. However existing data s ..."
Abstract

Cited by 33 (12 self)
 Add to MetaCart
We propose new succinct representations of ordinal trees, which have been studied extensively. It is known that any nnode static tree can be represented in 2n + o(n) bits and a large number of operations on the tree can be supported in constant time under the wordRAM model. However existing data structures are not satisfactory in both theory and practice because (1) the lowerorder term is Ω(nlog log n / log n), which cannot be neglected in practice, (2) the hidden constant is also large, (3) the data structures are complicated and difficult to implement, and (4) the techniques do not extend to dynamic trees supporting insertions and deletions of nodes. We propose a simple and flexible data structure, called the range minmax tree, that reduces the large number of relevant tree operations considered in the literature to a few primitives, which are carried out in constant time on sufficiently small trees. The result is then extended to trees of arbitrary size, achieving 2n + O(n/polylog(n)) bits of space. The redundancy is significantly lower than in any previous proposal, and the data structure is easily implemented. Furthermore, using the same framework, we derive the first fullyfunctional dynamic succinct trees. 1
Practical rank/select queries over arbitrary sequences
 In Proc. 15th SPIRE, LNCS 5280
, 2008
"... Abstract. We present a practical study on the compact representation of sequences supporting rank, select, and access queries. While there are several theoretical solutions to the problem, only a few have been tried out, and there is little idea on how the others would perform, especially in the cas ..."
Abstract

Cited by 32 (23 self)
 Add to MetaCart
Abstract. We present a practical study on the compact representation of sequences supporting rank, select, and access queries. While there are several theoretical solutions to the problem, only a few have been tried out, and there is little idea on how the others would perform, especially in the case of sequences with very large alphabets. We first present a new practical implementation of the compressed representation for bit sequences proposed by Raman, Raman, and Rao [SODA 2002], that is competitive with the existing ones when the sequences are not too compressible. It also has nice local compression properties, and we show that this makes it an excellent tool for compressed text indexing in combination with the BurrowsWheeler transform. This shows the practicality of a recent theoretical proposal [Mäkinen and Navarro, SPIRE 2007], achieving spaces never seen before. Second, for general sequences, we tune wavelet trees for the case of very large alphabets, by removing their pointer information. We show that this gives an excellent solution for representing a sequence within zeroorder entropy space, in cases where the large alphabet poses a serious challenge to typical encoding methods. We also present the first implementation of Golynski et al.’s representation [SODA 2006], which offers another interesting time/space tradeoff. 1
Implicit compression boosting with applications to selfindexing
 In Proc. SPIRE'07, LNCS 4726
, 2007
"... Abstract. Compression boosting (Ferragina & Manzini, SODA 2004) is a new technique to enhance zeroth order entropy compressors ’ performance to kth order entropy. It works by constructing the BurrowsWheeler transform of the input text, finding optimal partitioning of the transform, and then compre ..."
Abstract

Cited by 29 (16 self)
 Add to MetaCart
Abstract. Compression boosting (Ferragina & Manzini, SODA 2004) is a new technique to enhance zeroth order entropy compressors ’ performance to kth order entropy. It works by constructing the BurrowsWheeler transform of the input text, finding optimal partitioning of the transform, and then compressing each piece using an arbitrary zeroth order compressor. The optimal partitioning has the property that the achieved compression is boosted to kth order entropy, for any k. The technique has an application to text indexing: Essentially, building a wavelet tree (Grossi et al., SODA 2003) for each piece in the partitioning yields a kth order compressed fulltext selfindex providing efficient substring searches on the indexed text (Ferragina et al., SPIRE 2004). In this paper, we show that using explicit compression boosting with wavelet trees is not necessary; our new analysis reveals that the size of the wavelet tree built for the complete BurrowsWheeler transformed text is, in essence, the sum of those built for the pieces in the optimal partitioning. Hence, the technique provides a way to do compression boosting implicitly, with a trivial linear time algorithm, but fixed to a specific zeroth order compressor (Raman et al., SODA 2002). In addition to having these consequences on compression and static fulltext selfindexes, the analysis shows that a recent dynamic zeroth order compressed selfindex (Mäkinen & Navarro, CPM 2006) occupies in fact space proportional to kth order entropy. 1
Fullyfunctional static and dynamic succinct trees
, 2010
"... We propose new succinct representations of ordinal trees, which have been studied extensively. It is known that any nnode static tree can be represented in 2n + o(n) bits and various operations on the tree can be supported in constant time under the wordRAM model. However the data structures are c ..."
Abstract

Cited by 18 (11 self)
 Add to MetaCart
We propose new succinct representations of ordinal trees, which have been studied extensively. It is known that any nnode static tree can be represented in 2n + o(n) bits and various operations on the tree can be supported in constant time under the wordRAM model. However the data structures are complicated and difficult to dynamize. We propose a simple and flexible data structure, called the range minmax tree, that reduces the large number of relevant tree operations considered in the literature, to a few primitives that are carried out in constant time on sufficiently small trees. The result is extended to trees of arbitrary size, achieving 2n + O(n/polylog(n)) bits of space. The redundancy is significantly lower than any previous proposal. For the dynamic case, where insertion/deletion of nodes is allowed, the existing data structures support very limited operations. Our data structure builds on the range minmax tree to achieve 2n + O(n / log n) bits of space and O(log n) time for all the operations. We also propose an improved data structure using 2n+O(n loglog n / logn) bits and improving the time to O(log n / loglog n) for most operations.
Lower bounds for dynamic connectivity
 STOC
, 2004
"... We prove an Ω(lg n) cellprobe lower bound on maintaining connectivity in dynamic graphs, as well as a more general tradeoff between updates and queries. Our bound holds even if the graph is formed by disjoint paths, and thus also applies to trees and plane graphs. The bound is known to be tight fo ..."
Abstract

Cited by 15 (0 self)
 Add to MetaCart
We prove an Ω(lg n) cellprobe lower bound on maintaining connectivity in dynamic graphs, as well as a more general tradeoff between updates and queries. Our bound holds even if the graph is formed by disjoint paths, and thus also applies to trees and plane graphs. The bound is known to be tight for these restricted cases, proving optimality of these data structures (e.g., Sleator and Tarjan’s dynamic trees). Our tradeoff is known to be tight for trees, and the best two data structures for dynamic connectivity in general graphs are points on our tradeoff curve. In this sense these two data structures are optimal, and this tightness serves as strong evidence that our lower bounds are the best possible. From a more theoretical perspective, our result is the first logarithmic cellprobe lower bound for any problem in the natural class of dynamic language membership problems, breaking the long standing record of Ω(lg n / lg lg n). In this sense, our result is the first datastructure lower bound that is “truly ” logarithmic, i.e., logarithmic in the problem size counted in bits. Obtaining such a bound is listed as one of three major challenges for future research by Miltersen [13] (the other two challenges remain unsolved). Our techniques form a general framework for proving cellprobe lower bounds on dynamic data structures. We show how our framework also applies to the partialsums problem to obtain a nearly complete understanding of the problem in cellprobe and algebraic models, solving several previously posed open problems.
Balanced Parentheses Strike Back
"... An ordinal tree is an arbitrary rooted tree where the children of each node are ordered. Succinct representations for ordinal trees with efficient query support have been extensively studied. The best previously known result is due to Geary, Raman, and Raman [SODA 2004, pages 1–10]. The number of bi ..."
Abstract

Cited by 7 (0 self)
 Add to MetaCart
An ordinal tree is an arbitrary rooted tree where the children of each node are ordered. Succinct representations for ordinal trees with efficient query support have been extensively studied. The best previously known result is due to Geary, Raman, and Raman [SODA 2004, pages 1–10]. The number of bits required by their representation for an nnode ordinal tree T is 2n + o(n), whose firstorder term is informationtheoretically optimal. Their representation supports a large set of O(1)time queries on T. Based upon a balanced string of 2n parentheses, we give an improved 2n + o(n)bit representation for T. Our improvement is two fold: Firstly, the set of O(1)time queries supported by our representation is a proper superset of that supported by the representation of Geary, Raman, and Raman. Secondly, it is also much easier for our representation to support new queries by simply adding new auxiliary strings.
A Framework for Dynamizing Succinct Data Structures ⋆
"... Abstract. We present a framework to dynamize succinct data structures, to encourage their use over nonsuccinct versions in a wide variety of important application areas. Our framework can dynamize most stateoftheart succinct data structures for dictionaries, ordinal trees, labeled trees, and text ..."
Abstract

Cited by 5 (1 self)
 Add to MetaCart
Abstract. We present a framework to dynamize succinct data structures, to encourage their use over nonsuccinct versions in a wide variety of important application areas. Our framework can dynamize most stateoftheart succinct data structures for dictionaries, ordinal trees, labeled trees, and text collections. Of particular note is its direct application to XML indexing structures that answer subpath queries [2]. Our framework focuses on achieving informationtheoretically optimal space along with nearoptimal update/query bounds. As the main part of our work, we consider the following problem central to text indexing: Given a text T over an alphabet Σ, construct a compressed data structure answering the queries char(i), rank s(i), and select s(i) for a symbol s ∈ Σ. Many data structures consider these queries for static text T [5, 3, 16, 4]. We build on these results and give the best known query bounds for the dynamic version of this problem, supporting arbitrary insertions and deletions of symbols in T. Specifically, with an amortized update time of O(n ɛ), any static succinct data structure D for T, taking t(n) time for queries, can be converted by our framework into a dynamic succinct data structure that supports rank s(i), select s(i), and char(i) queries in O(t(n) + log log n) time, for any constant ɛ> 0. When Σ  = polylog(n), we achieve O(1) query times. Our update/query bounds are nearoptimal with respect to the lower bounds from [13]. 1