Wavelet Trees for All
Cited by 32 (12 self)
The wavelet tree is a versatile data structure that serves a number of purposes, from string processing to geometry. It can be regarded as a device that represents a sequence, a reordering, or a grid of points. In addition, its space adapts to various entropy measures of the data it encodes, enabling compressed representations. New competitive solutions to a number of problems, based on wavelet trees, are appearing every year. In this survey we give an overview of wavelet trees and the surprising number of applications in which we have found them useful: basic and weighted point grids, sets of rectangles, strings, permutations, binary relations, graphs, inverted indexes, document retrieval indexes, full-text indexes, XML indexes, and general numeric sequences.
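As an illustration of the "represents a sequence" view described above, here is a minimal sketch of a pointer-based wavelet tree, not taken from the survey. It assumes a non-empty sequence of integers and uses plain prefix-count lists in place of the succinct rank/select bitmaps a real implementation would use, so it shows the navigation logic only, not the compressed space. The class and method names (`WaveletTree`, `access`, `rank`) are hypothetical.

```python
class WaveletTree:
    """Wavelet tree over integers in [lo, hi] (illustrative, not succinct)."""

    def __init__(self, seq, lo=None, hi=None):
        seq = list(seq)
        # The top-level call assumes a non-empty sequence.
        self.lo = min(seq) if lo is None else lo
        self.hi = max(seq) if hi is None else hi
        self.left = self.right = None
        # ones[i] = how many of the first i symbols go to the right child.
        self.ones = [0]
        if self.lo == self.hi or not seq:
            return  # leaf, or empty subtree
        mid = (self.lo + self.hi) // 2
        for x in seq:
            self.ones.append(self.ones[-1] + (x > mid))
        self.left = WaveletTree([x for x in seq if x <= mid], self.lo, mid)
        self.right = WaveletTree([x for x in seq if x > mid], mid + 1, self.hi)

    def access(self, i):
        """Return seq[i] by following the bit stored for position i at each level."""
        node = self
        while node.lo != node.hi:
            if node.ones[i + 1] - node.ones[i]:   # bit is 1: go right
                i = node.ones[i]
                node = node.right
            else:                                 # bit is 0: go left
                i = i - node.ones[i]
                node = node.left
        return node.lo

    def rank(self, c, i):
        """Return the number of occurrences of c in seq[0:i]."""
        if not self.lo <= c <= self.hi:
            return 0
        node = self
        while node.lo != node.hi:
            if node.left is None:                 # empty subtree: c never occurs
                return 0
            mid = (node.lo + node.hi) // 2
            if c <= mid:
                i = i - node.ones[i]
                node = node.left
            else:
                i = node.ones[i]
                node = node.right
        return i


s = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
wt = WaveletTree(s)
print(wt.access(5), wt.rank(5, 7))
```

Both operations visit one node per level, so with constant-time rank on the bitmaps they take time proportional to the tree height.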
New algorithms on wavelet trees and applications to information retrieval
Theoretical Computer Science, 2012
Space-Efficient Data-Analysis Queries on Grids
Cited by 13 (9 self)
We consider various data-analysis queries on two-dimensional points. We give new space/time tradeoffs over previous work on semigroup and group queries such as sum, average, variance, minimum and maximum. We also introduce new solutions to queries rarely considered in the literature, such as two-dimensional quantiles, majorities, successor/predecessor and mode queries. We address both static and dynamic scenarios.
The Wavelet Matrix
Cited by 9 (4 self)
The wavelet tree (Grossi et al., SODA 2003) is nowadays a popular succinct data structure for text indexes, discrete grids, and many other applications. When it has many nodes, a levelwise representation proposed by Mäkinen and Navarro (LATIN 2006) is preferable. We propose a different arrangement of the levelwise data, so that the bitmaps are shuffled in a different way. The result can no longer be called a wavelet tree, and we dub it the wavelet matrix. We demonstrate that the wavelet matrix is simpler to build, simpler to query, and faster in practice than the levelwise wavelet tree. This has a direct impact on many applications that use the levelwise wavelet tree for different purposes.
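The levelwise rearrangement can be sketched as follows. This is an illustrative reading of the commonly described wavelet matrix layout, not the authors' code: at each level, from the most significant bit down, one bitmap stores the current bit of every symbol, and the symbols are then stably partitioned with all 0-bit symbols first. It assumes non-negative integers, and uses prefix-count lists where a real implementation would use succinct bitmaps. The names `WaveletMatrix`, `access`, and `rank` are hypothetical.

```python
class WaveletMatrix:
    """Wavelet matrix over non-negative integers (illustrative, not succinct)."""

    def __init__(self, seq, bits=None):
        seq = list(seq)
        self.n = len(seq)
        # Number of levels: enough bits for the largest symbol (at least 1).
        self.bits = bits if bits is not None else max(1, max(seq, default=0).bit_length())
        self.ones = []   # per level: prefix counts of 1 bits
        self.zeros = []  # per level: total number of 0 bits
        for level in range(self.bits):
            shift = self.bits - 1 - level
            prefix = [0]
            for x in seq:
                prefix.append(prefix[-1] + ((x >> shift) & 1))
            self.ones.append(prefix)
            self.zeros.append(self.n - prefix[-1])
            # Stable partition: symbols with a 0 bit first, then those with a 1 bit.
            seq = ([x for x in seq if not (x >> shift) & 1]
                   + [x for x in seq if (x >> shift) & 1])

    def access(self, i):
        """Return the symbol originally at position i."""
        value = 0
        for ones, z in zip(self.ones, self.zeros):
            bit = ones[i + 1] - ones[i]
            value = (value << 1) | bit
            i = z + ones[i] if bit else i - ones[i]
        return value

    def rank(self, c, i):
        """Return the number of occurrences of c among the first i symbols."""
        if c < 0 or c >> self.bits:
            return 0
        start = 0
        for level, (ones, z) in enumerate(zip(self.ones, self.zeros)):
            if (c >> (self.bits - 1 - level)) & 1:
                start, i = z + ones[start], z + ones[i]
            else:
                start, i = start - ones[start], i - ones[i]
        return i - start


s = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
wm = WaveletMatrix(s)
print(wm.access(5), wm.rank(5, 7))
```

Each level needs only its bitmap and a single zero count, with no per-node boundaries to track, which is one way to see why the structure is described as simpler to build and query.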
Space-Efficient Representations of Rectangle Datasets Supporting Orthogonal Range Querying
2012
Cited by 2 (1 self)
The increasing use of geographic search engines manifests the interest of Internet users in geo-located resources and, in general, in geographic information. This has emphasized the importance of the development of efficient indexes over large geographic databases. The most common simplification of geographic objects used for indexing purposes is a two-dimensional rectangle. Furthermore, one of the primitive operations that must be supported by every geographic index structure is the orthogonal range query, which retrieves all the geographic objects that have at least one point in common with a rectangular query region. In this work, we study several space-efficient representations of rectangle datasets that can be used in the development of geographic indexes supporting orthogonal range queries.
The wavelet matrix: An efficient wavelet tree for large alphabets
Information Systems
Cited by 1 (1 self)
The wavelet tree is a flexible data structure that permits representing sequences S[1, n] of symbols over an alphabet of size σ, within compressed space and supporting a wide range of operations on S. When σ is significant compared to n, current wavelet tree representations incur noticeable space or time overheads. In this article we introduce the wavelet matrix, an alternative representation for large alphabets that retains all the properties of wavelet trees but is significantly faster. We also show how the wavelet matrix can be compressed up to the zero-order entropy of the sequence without sacrificing, and actually improving, its time performance. Our experimental results show that the wavelet matrix outperforms all the wavelet tree variants along the space/time tradeoff map.
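For reference, the zero-order entropy mentioned above is the standard quantity H0(S) = sum over symbols c of (n_c / n) * log2(n / n_c), where n_c is the number of occurrences of c in S. A small sketch of computing it follows; the function name is hypothetical and the code is only meant to make the space bound concrete (about n * H0(S) bits for the sequence), not to reproduce anything from the article.

```python
import math
from collections import Counter


def zero_order_entropy(seq):
    """Return H0(seq) in bits per symbol (0.0 for an empty sequence)."""
    n = len(seq)
    return sum(count / n * math.log2(n / count) for count in Counter(seq).values())


s = "abracadabra"
h0 = zero_order_entropy(s)
print(f"H0 = {h0:.3f} bits/symbol, n*H0 = {len(s) * h0:.1f} bits")
```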
New Algorithms on Wavelet Trees and Applications to Information Retrieval
2011
Wavelet trees are widely used in the representation of sequences, permutations, text collections, binary relations, discrete points, and other succinct data structures. We show, however, that this still falls short of exploiting all of the virtues of this versatile data structure. In particular we show how to use wavelet trees to solve fundamental algorithmic problems such as range quantile queries, range next value queries, and range intersection queries. We explore several applications of these queries in Information Retrieval, in particular document retrieval in hierarchical and temporal documents, and in the representation of inverted lists.
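A range quantile query asks for the k-th smallest value in a range of the sequence. The usual wavelet tree approach can be sketched as below: at each node, count how many range elements go left, and descend left or right accordingly. This is a minimal sketch under stated assumptions, not the paper's implementation: integer symbols, prefix-count lists instead of succinct bitmaps, and hypothetical names (`build`, `range_quantile`).

```python
def build(seq, lo, hi):
    """Build a node (lo, hi, ones, left, right) for symbols in [lo, hi].

    ones[i] counts how many of the first i symbols go to the right child.
    Leaves and empty subtrees carry no prefix counts or children.
    """
    if lo == hi or not seq:
        return (lo, hi, None, None, None)
    mid = (lo + hi) // 2
    ones = [0]
    for x in seq:
        ones.append(ones[-1] + (x > mid))
    return (lo, hi, ones,
            build([x for x in seq if x <= mid], lo, mid),
            build([x for x in seq if x > mid], mid + 1, hi))


def range_quantile(node, i, j, k):
    """Return the k-th smallest value of seq[i:j] (k = 0 is the minimum).

    Requires 0 <= i < j <= len(seq) and 0 <= k < j - i.
    """
    lo, hi, ones, left, right = node
    while lo != hi:
        zeros_in_range = (j - i) - (ones[j] - ones[i])
        if k < zeros_in_range:
            # The answer is among the symbols sent to the left child.
            i, j = i - ones[i], j - ones[j]
            node = left
        else:
            # Skip the left-going symbols and continue on the right.
            k -= zeros_in_range
            i, j = ones[i], ones[j]
            node = right
        lo, hi, ones, left, right = node
    return lo


s = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
root = build(s, min(s), max(s))
print(range_quantile(root, 2, 8, 0), range_quantile(root, 2, 8, 5))
```

The query touches one node per level, so its cost depends on the alphabet size rather than on the length of the range.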
Space-Efficient Representations of Rectangle Datasets Supporting Orthogonal Range Querying
The increasing use of geographic search engines manifests the interest of Internet users in geo-located resources and, in general, in geographic information. This has emphasized the importance of the development of efficient indexes over large geographic databases. The most common simplification of geographic objects used for indexing purposes is a two-dimensional rectangle. Furthermore, one of the primitive operations that must be supported by every geographic index structure is the orthogonal range query, which retrieves all the geographic objects that have at least one point in common with a rectangular query region. In this work, we study several space-efficient representations of rectangle datasets that can be used in the development of geographic indexes supporting orthogonal range queries.
Efficient Compressed Wavelet Trees over Large Alphabets
2014
The wavelet tree is a flexible data structure that permits representing sequences S[1, n] of symbols over an alphabet of size σ, within compressed space and supporting a wide range of operations on S. When σ is significant compared to n, current wavelet tree representations incur noticeable space or time overheads. In this article we introduce the wavelet matrix, an alternative representation for large alphabets that retains all the properties of wavelet trees but is significantly faster. We also show how the wavelet matrix can be compressed up to the zero-order entropy of the sequence without sacrificing, and actually improving, its time performance. Our experimental results show that the wavelet matrix outperforms all the wavelet tree variants along the space/time tradeoff map.