Succinct Trees in Practice
Cited by 37 (20 self)
We implement and compare the major current techniques for representing general trees in succinct form. This is important because a general tree of n nodes is usually represented in pointer form, requiring O(n log n) bits, whereas the succinct representations we study require just 2n + o(n) bits and carry out many sophisticated operations in constant time. Yet, there is no exhaustive study in the literature comparing the practical magnitudes of the o(n)-space and the O(1)-time terms. The techniques can be classified into three broad trends: those based on BP (balanced parentheses in preorder), those based on DFUDS (depth-first unary degree sequence), and those based on LOUDS (level-ordered unary degree sequence). BP and DFUDS require a balanced parentheses representation that supports the core operations
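As an illustration of the three encodings named in this abstract, the following minimal sketch builds the BP, DFUDS, and LOUDS bit sequences for a small tree. It is not taken from the paper: the nested-list tree format and the function names `bp`, `dfuds`, and `louds` are assumptions made for this example, and the o(n)-bit index structures that give constant-time navigation are omitted.

```python
from collections import deque

# Assumption for this sketch: a tree node is a Python list of its children,
# so [] is a leaf and [[], [[], []], []] is a root with three children.

def bp(node):
    """Balanced parentheses in preorder: '(' on entering a node, ')' on leaving it."""
    return "(" + "".join(bp(child) for child in node) + ")"

def dfuds(root):
    """Depth-first unary degree sequence: in preorder, a node of degree d
    is written as d opening parentheses followed by one closing parenthesis.
    A single leading '(' makes the whole sequence balanced."""
    def visit(node):
        return "(" * len(node) + ")" + "".join(visit(child) for child in node)
    return "(" + visit(root)

def louds(root):
    """Level-ordered unary degree sequence: in breadth-first order, a node
    of degree d is written as d ones followed by a zero. The leading '10'
    stands for an artificial super-root whose only child is the real root."""
    bits = ["10"]
    queue = deque([root])
    while queue:
        node = queue.popleft()
        bits.append("1" * len(node) + "0")
        queue.extend(node)
    return "".join(bits)

tree = [[], [[], []], []]   # 6 nodes
print(bp(tree))             # 12 symbols = 2n bits
print(dfuds(tree))          # 12 symbols = 2n bits
print(louds(tree))          # 13 bits = 2n + 1
```

Each sequence uses about two bits per node, which is where the 2n term in the space bound comes from.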
New algorithms on wavelet trees and applications to information retrieval
 Theoretical Computer Science
, 2012
Grammar Compressed Sequences with Rank/Select Support
Cited by 5 (3 self)
Sequence representations supporting not only direct access to their symbols, but also rank/select operations, are a fundamental building block in many compressed data structures. In several recent applications, the need to represent highly repetitive sequences arises, where statistical compression is ineffective. We introduce grammar-based representations for repetitive sequences, which use up to 10% of the space needed by representations based on statistical compression, and support direct access and rank/select operations within tens of microseconds.
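For readers unfamiliar with the three operations, the sketch below gives plain linear-time reference definitions of access, rank, and select. It only fixes their meaning and is not the grammar-based structure of the paper. The indexing convention (0-based positions, rank over a half-open prefix, 1-based occurrence number for select) is an assumption of this example.

```python
def access(seq, i):
    """Return the symbol at position i (0-based)."""
    return seq[i]

def rank(seq, c, i):
    """Return the number of occurrences of symbol c in seq[0:i]."""
    return sum(1 for x in seq[:i] if x == c)

def select(seq, c, j):
    """Return the position of the j-th occurrence of c (j >= 1)."""
    seen = 0
    for pos, x in enumerate(seq):
        if x == c:
            seen += 1
            if seen == j:
                return pos
    raise ValueError("fewer than j occurrences of the symbol")

s = "abracadabra"
print(access(s, 4))       # symbol at position 4
print(rank(s, "a", 4))    # occurrences of 'a' in "abra"
print(select(s, "a", 3))  # position of the third 'a'
```

Under this convention rank and select are inverses of each other: `rank(s, c, select(s, c, j))` equals `j - 1` whenever the j-th occurrence exists.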
The wavelet matrix: An efficient wavelet tree for large alphabets
 Information Systems
Cited by 1 (1 self)
The wavelet tree is a flexible data structure that permits representing sequences S[1, n] of symbols over an alphabet of size σ, within compressed space and supporting a wide range of operations on S. When σ is significant compared to n, current wavelet tree representations incur noticeable space or time overheads. In this article we introduce the wavelet matrix, an alternative representation for large alphabets that retains all the properties of wavelet trees but is significantly faster. We also show how the wavelet matrix can be compressed up to the zero-order entropy of the sequence without sacrificing, and actually improving, its time performance. Our experimental results show that the wavelet matrix outperforms all the wavelet tree variants along the space/time tradeoff map.
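The idea of a wavelet matrix can be sketched in a few lines: one bit vector per bit of the symbols, with the sequence stably reordered at each level so that symbols with a 0 bit come before symbols with a 1 bit. The code below is an illustrative, uncompressed version under stated assumptions: symbols are non-negative integers below 2**bits, plain Python lists with prefix sums stand in for the rank-enabled bit vectors of the article, and the class and method names are invented for this example.

```python
class WaveletMatrix:
    def __init__(self, seq, bits):
        # One entry per level, from the most significant bit down:
        # (bit vector, prefix counts of ones, total number of zeros).
        self.levels = []
        cur = list(seq)
        for shift in range(bits - 1, -1, -1):
            bv = [(x >> shift) & 1 for x in cur]
            ones = [0]
            for b in bv:
                ones.append(ones[-1] + b)
            self.levels.append((bv, ones, len(bv) - ones[-1]))
            # Stable partition: symbols with bit 0 first, then those with bit 1.
            cur = ([x for x in cur if not (x >> shift) & 1]
                   + [x for x in cur if (x >> shift) & 1])

    def access(self, i):
        """Return the symbol at position i (0-based)."""
        value = 0
        for bv, ones, zeros in self.levels:
            b = bv[i]
            value = (value << 1) | b
            # Map i to its position in the next level.
            i = zeros + ones[i] if b else i - ones[i]
        return value

    def rank(self, c, i):
        """Return the number of occurrences of symbol c in seq[0:i]."""
        start = 0
        top = len(self.levels) - 1
        for level, (bv, ones, zeros) in enumerate(self.levels):
            if (c >> (top - level)) & 1:
                start, i = zeros + ones[start], zeros + ones[i]
            else:
                start, i = start - ones[start], i - ones[i]
        return i - start

wm = WaveletMatrix([3, 0, 2, 1, 3, 3, 0], bits=2)
print([wm.access(i) for i in range(7)])  # reproduces the input sequence
print(wm.rank(3, 6))                     # occurrences of 3 among the first 6 symbols
```

Both operations visit each level once, so they cost one bit-vector lookup per bit of the symbol regardless of how many distinct symbols occur.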
Compaction and Compression General Terms: Algorithms
Sequence representations supporting the queries access, select, and rank are at the core of many data structures. There is a considerable gap between the various upper bounds and the few lower bounds known for such representations, and how they relate to the space used. In this article, we prove a strong lower bound for rank, which holds for rather permissive assumptions on the space used, and give matching upper bounds that require only a compressed representation of the sequence. Within this compressed space, the operations access and select can be solved in constant or almost-constant time, which is optimal for large alphabets. Our new upper bounds dominate all of the previous work in the time/space map.
New Algorithms on Wavelet Trees and Applications to Information Retrieval
, 2011
Wavelet trees are widely used in the representation of sequences, permutations, text collections, binary relations, discrete points, and other succinct data structures. We show, however, that this still falls short of exploiting all of the virtues of this versatile data structure. In particular we show how to use wavelet trees to solve fundamental algorithmic problems such as range quantile queries, range next value queries, and range intersection queries. We explore several applications of these queries in Information Retrieval, in particular document retrieval in hierarchical and temporal documents, and in the representation of inverted lists.
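A range quantile query asks for the k-th smallest value in a range of the sequence. The sketch below shows the usual top-down descent on a pointer-based wavelet tree over integers. It is a simplified illustration rather than the paper's implementation: prefix-count lists replace the rank-enabled bit vectors, the input is assumed to be non-empty, and the names `WaveletTree` and `quantile` are chosen for this example.

```python
class WaveletTree:
    def __init__(self, seq, lo=None, hi=None):
        if lo is None:
            lo, hi = min(seq), max(seq)
        self.lo, self.hi = lo, hi
        self.left = self.right = None
        if lo == hi or not seq:
            return  # leaf, or an empty node that is never queried
        mid = (lo + hi) // 2
        # zeros[i] = how many of the first i symbols go to the left child.
        self.zeros = [0]
        for x in seq:
            self.zeros.append(self.zeros[-1] + (x <= mid))
        self.left = WaveletTree([x for x in seq if x <= mid], lo, mid)
        self.right = WaveletTree([x for x in seq if x > mid], mid + 1, hi)

    def quantile(self, i, j, k):
        """Return the k-th smallest value (k >= 1) among seq[i:j]."""
        if self.lo == self.hi:
            return self.lo
        zi, zj = self.zeros[i], self.zeros[j]
        in_left = zj - zi  # symbols of the range that went to the left child
        if k <= in_left:
            return self.left.quantile(zi, zj, k)
        return self.right.quantile(i - zi, j - zj, k - in_left)

data = [3, 1, 4, 1, 5, 9, 2, 6]
wt = WaveletTree(data)
print(wt.quantile(2, 7, 2))  # second smallest value of data[2:7]
```

The descent touches one node per level, so the number of steps depends on the height of the tree and not on the length of the queried range.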
Efficient Compressed Wavelet Trees over Large Alphabets
, 2014
The wavelet tree is a flexible data structure that permits representing sequences S[1, n] of symbols over an alphabet of size σ, within compressed space and supporting a wide range of operations on S. When σ is significant compared to n, current wavelet tree representations incur noticeable space or time overheads. In this article we introduce the wavelet matrix, an alternative representation for large alphabets that retains all the properties of wavelet trees but is significantly faster. We also show how the wavelet matrix can be compressed up to the zero-order entropy of the sequence without sacrificing, and actually improving, its time performance. Our experimental results show that the wavelet matrix outperforms all the wavelet tree variants along the space/time tradeoff map.
Optimal Lower and Upper Bounds for Representing Sequences
Sequence representations supporting queries access, select and rank are at the core of many data structures. There is a considerable gap between the various upper bounds and the few lower bounds known for such representations, and how they relate to the space used. In this article we prove a strong lower bound for rank, which holds for rather permissive assumptions on the space used, and give matching upper bounds that require only a compressed representation of the sequence. Within this compressed space, operations access and select can be solved in constant or almost-constant time, which is optimal for large alphabets. Our new upper bounds dominate all of the previous work in the time/space map.