Results 1 - 3 of 3
Space-efficient construction of Lempel-Ziv compressed text indexes
, 2009
Cited by 3 (1 self)
Abstract. A compressed full-text self-index is a data structure that replaces a text and in addition gives indexed access to it, while taking space proportional to the compressed text size. This is very important nowadays, since one can accommodate the index of very large texts entirely in main memory, avoiding the slower access to secondary storage. In particular, the LZ-index [G. Navarro, Journal of Discrete Algorithms, 2004] stands out for its good performance at extracting text passages and locating pattern occurrences. Given a text T[1..u] over an alphabet of size σ, the LZ-index requires 4uH_k(T) + o(u log σ) bits of space, where H_k(T) is the k-th order empirical entropy of T. Although in practice the LZ-index needs 1.0-1.5 times the text size, its construction requires much more main memory (around 5 times the text size), which limits its applicability only to not so large texts. In this paper we present a space-efficient algorithm to construct the LZ-index in O(u(log σ + log log u)) time and requiring 4uH_k(T) + o(u log σ) bits of space. Our experimental results show that our method is efficient in practice, needing an amount of memory close to that of the final index, and outperforming by far the construction time of other compressed indexes. We also adapt our algorithm to construct some recent reduced versions of the LZ-index, showing that these can also be built without using extra space on top of that required by the final index. We study an alternative model in which we are given only a limited amount of main memory to carry out the indexing process (less than that required by the final index). We show how to build all the LZ-index alternatives in
Lightweight data indexing and compression in external memory
 In Proc. 8th Latin American Symposium on Theoretical Informatics (LATIN
, 2010
Cited by 3 (0 self)
Abstract. In this paper we describe algorithms for computing the BWT and for building (compressed) indexes in external memory. The innovative feature of our algorithms is that they are lightweight in the sense that, for an input of size n, they use only n bits of disk working space while all previous approaches use Θ(n log n) bits of disk working space. Moreover, our algorithms access disk data only via sequential scans, thus they take full advantage of modern disk features that make sequential disk accesses much faster than random accesses. We also present a scan-based algorithm for inverting the BWT that uses Θ(n) bits of working space, and a lightweight internal-memory algorithm for computing the BWT which is the fastest in the literature when the available working space is o(n) bits. Finally, we prove lower bounds on the complexity of computing and inverting the BWT via sequential scans in terms of the classic product: internal-memory space × number of passes over the disk data.
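For readers unfamiliar with the transform this abstract refers to, the following is a minimal in-memory sketch of computing and inverting the BWT. It is a textbook illustration only, not the lightweight external-memory algorithm of the paper: it sorts all rotations in RAM and makes no attempt to limit working space. The function names `bwt` and `inverse_bwt` and the `$` end marker are assumptions chosen for this example.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column.

    Assumption for this sketch: `sentinel` sorts strictly below every
    character of `text`, so the rotation starting with it is row 0.
    """
    if any(c <= sentinel for c in text):
        raise ValueError("sentinel must be smaller than every text character")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last):
    """Invert the BWT with the LF-mapping (last column to first column).

    Assumes `last` was produced by `bwt` above, i.e. the sentinel is the
    smallest character and therefore starts row 0.
    """
    n = len(last)
    # A stable sort of positions by character yields the first column;
    # order[j] is the position in `last` of the j-th first-column character.
    order = sorted(range(n), key=lambda i: last[i])
    lf = [0] * n
    for first_row, last_row in enumerate(order):
        lf[last_row] = first_row
    # Row 0 is sentinel + text, so last[0] is the final text character.
    # Following LF walks the text backwards one character at a time.
    out = []
    row = 0
    for _ in range(n - 1):
        out.append(last[row])
        row = lf[row]
    return "".join(reversed(out))


transformed = bwt("banana")
restored = inverse_bwt(transformed)
```

The inversion visits rows in an order dictated by the LF-mapping, which is a random-access pattern; the abstract's interest in scan-based inversion is precisely about avoiding that pattern on disk.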
In-Place 2-d Nearest Neighbor Search
, 2007
Cited by 1 (1 self)
Abstract. We revisit a classic problem in computational geometry: preprocessing a planar n-point set to answer nearest neighbor queries. In SoCG 2004, Brönnimann, Chan, and Chen showed that it is possible to design an efficient data structure that takes no extra space at all other than the input array holding a permutation of the points. The best query time known for such "in-place data structures" is O(log^2 n). In this paper, we break the O(log^2 n) barrier by providing a method that answers nearest neighbor queries in time O((log n)^(log_{3/2} 2) log log n) = O(log