## Position-restricted substring searching (2006)


Venue: Lecture Notes in Computer Science

Citations: 19 (3 self)

### BibTeX

@INPROCEEDINGS{Mäkinen06position-restrictedsubstring,

author = {Veli Mäkinen and Gonzalo Navarro},

title = {Position-restricted substring searching},

booktitle = {Lecture Notes in Computer Science},

year = {2006},

pages = {703--714},

publisher = {Springer}

}


### Abstract

A full-text index is a data structure built over a text string T[1, n]. The most basic functionality provided is (a) counting how many times a pattern string P[1, m] appears in T and (b) locating all those occ positions. There exist several indexes that solve (a) in O(m) time and (b) in O(occ) time. In this paper we propose two new queries, (c) counting how many times P[1, m] appears in T[l, r] and (d) locating all those occ_{l,r} positions. These can be solved using (a) and (b), but this requires O(occ) time. We present two solutions to (c) and (d) in this paper. The first is an index that requires O(n log n) bits of space and answers (c) in O(m + log n) time and (d) in O(log n) time per occurrence (that is, O(occ_{l,r} log n) time overall). A variant of the first solution answers (c) in O(m + log log n) time and (d) in constant time per occurrence, but requires O(n log^{1+ε} n) bits of space for any constant ε > 0. The second solution requires O(nm log σ) bits of space, solving (c) in O(m⌈log σ/log log n⌉) time and (d) in O(m⌈log σ/log log n⌉) time per
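The baseline mentioned in the abstract, answering (c) and (d) by running (a) and (b) and then discarding occurrences outside T[l, r], can be sketched as follows. This is an illustrative sketch only, not the paper's index: the function names are hypothetical, the suffix array is built naively, and it assumes that an occurrence counts only if it lies entirely inside T[l, r] (1-based, inclusive). Its cost grows with the total number of occurrences in T, which is the O(occ) overhead the proposed indexes are designed to avoid.

```python
def build_suffix_array(text):
    """Naive suffix array (0-based start positions); for illustration only."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def suffix_range(text, sa, pattern):
    """Return the half-open range [lo, hi) of suffix-array slots whose
    suffixes start with `pattern`, found by two binary searches."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first slot whose m-character prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first slot whose m-character prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return start, lo


def locate_restricted(text, sa, pattern, l, r):
    """Query (d), baseline: 1-based start positions of occurrences of
    `pattern` that lie entirely inside T[l, r] (assumed semantics)."""
    m = len(pattern)
    lo, hi = suffix_range(text, sa, pattern)
    # Scan every occurrence in T and keep those inside the window.
    return sorted(p + 1 for p in sa[lo:hi] if l <= p + 1 and p + m <= r)


def count_restricted(text, sa, pattern, l, r):
    """Query (c), baseline: number of occurrences inside T[l, r]."""
    return len(locate_restricted(text, sa, pattern, l, r))


if __name__ == "__main__":
    text = "abracadabra"
    sa = build_suffix_array(text)
    print(locate_restricted(text, sa, "abra", 1, 11))  # [1, 8]
    print(locate_restricted(text, sa, "abra", 2, 11))  # [8]
    print(count_restricted(text, sa, "a", 4, 8))       # 3
```

The filtering step touches every occurrence of the pattern in the whole text even when the window contains few or none of them, which is why a dedicated structure for restricted queries can pay off.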

### Citations

646 | Suffix arrays: a new method for on-line string searches
- Manber, Myers
- 1993
Citation Context ...ns in T. There are several classical full-text indexes requiring O(n log n) bits of space which can answer counting queries in O(m) time (like suffix trees [2]) or O(m + log n) time (like suffix arrays [14]). Both locate each occurrence in constant time once the counting is done. Similar complexities are obtained with modern compressed data structures [5, 10, 7], requiring space nH_k(T) + o(n log σ) bits, w...

193 | High-order entropy-compressed text indexes
- Grossi, Gupta, et al.
- 2003
Citation Context ...ees [2]) or O(m + log n) time (like suffix arrays [14]). Both locate each occurrence in constant time once the counting is done. Similar complexities are obtained with modern compressed data structures [5, 10, 7], requiring space nH_k(T) + o(n log σ) bits, where H_k(T) ≤ log σ is the k-th order empirical entropy of T. In this paper we introduce a new problem, position restricted substring searching, which consi...

193 | Succinct indexable dictionaries with applications to encoding k-ary trees, prefix sums and multisets
- Raman, Raman, et al.
Citation Context ...sequences, apart from the simple n + o(n) bits data structures [12, 4, 16] (footnote 3: in this paper log stands for log_2), there are others that answer rank and select in constant time using nH_0(S) + o(n) bits [18]. A natural generalization of the above problem is substring rank and select. For a string s, rank_s(S, i) is the number of occurrences of s in S[1, i], and select_s(S, j) is the starting position of th...

180 | Opportunistic Data Structures with Application
- Ferragina, Manzini
- 2000
Citation Context ...ees [2]) or O(m + log n) time (like suffix arrays [14]). Both locate each occurrence in constant time once the counting is done. Similar complexities are obtained with modern compressed data structures [5, 10, 7], requiring space nH_k(T) + o(n log σ) bits, where H_k(T) ≤ log σ is the k-th order empirical entropy of T. In this paper we introduce a new problem, position restricted substring searching, which consi...

173 | Compressed full-text indexes
- Navarro, Mäkinen
Citation Context ...ry in constant time per occurrence. Smaller and slower. Alternatively, it is possible to replace the suffix array A and its lcp information by any of the wealth of existing compressed data structures [17]. For example, by using the LZ-index of Ferragina and Manzini [6] we obtain n log n(1 + o(1)) + O(nH_k(T) log^γ n) bits of space (for any γ > 0 and any k = O(log σ log n)) and the same time complexities...

170 | Space-efficient static trees and graphs
- Jacobson
- 1989
Citation Context ..., 8]. They can also be solved in O(log σ) time using wavelet trees [10, 11]. For the case of binary sequences, apart from the simple n + o(n) bits data structures [12, 4, 16] (footnote 3: in this paper log stands for log_2), there are others that answer rank and select in constant time using nH_0(S) + o(n) bits [18]. A natural generalization of the above problem is substring rank and select. For a string s, rank_s(S, i) i...

132 | A functional approach to data structures and its use in multidimensional searching
- Chazelle
- 1988
Citation Context ...a counting query plus the time to locate one occurrence (type (d)). As a byproduct, we present a space-efficient implementation of a well-known two-dimensional range search data structure by Chazelle [3]. We show in particular how the fractional cascading information (which is simulated rather than stored in Chazelle’s data structure) can be represented by constant-time rank queries on bit arrays. We...

132 | An analysis of the Burrows-Wheeler transform
- Manzini
Citation Context ...rns of length m ≤ log_σ n. Actually, we show that this structure can be smaller for compressible texts, taking n ∑_{k=0}^{t−1} H_k(T) instead of nt log σ, where H_k(T) is the k-th order empirical entropy of T [15, 10]. This is a lower bound to the number of bits per character achievable by any compressor that considers contexts of length k to model T. Structure. Our structure indexes the positions of all the t-gra...

116 | The myriad virtues of subword trees
- Apostolico
- 1985
Citation Context ...nces of P in T; (b) locate those occ positions in T. There are several classical full-text indexes requiring O(n log n) bits of space which can answer counting queries in O(m) time (like suffix trees [2]) or O(m + log n) time (like suffix arrays [14]). Both locate each occurrence in constant time once the counting is done. Similar complexities are obtained with modern compressed data structures [5, 10,...

112 | Compressed representations of sequences and full-text indexes
- Ferragina, Manzini, et al.
Citation Context ...h queries can be answered in constant time using data structures that require nH_0(S) + o(n) bits of space if the alphabet of the sequence is σ = O(polylog(n)), or in O(log σ / log log n) time in general [9, 8]. They can also be solved in O(log σ) time using wavelet trees [10, 11]. For the case of binary sequences, apart from (footnote 3: in this paper log stands for log_2) the simple n + o(n) bits data structures [12,...

93 | Compact Pat Trees
- Clark
- 1996
Citation Context ..., 8]. They can also be solved in O(log σ) time using wavelet trees [10, 11]. For the case of binary sequences, apart from the simple n + o(n) bits data structures [12, 4, 16] (footnote 3: in this paper log stands for log_2), there are others that answer rank and select in constant time using nH_0(S) + o(n) bits [18]. A natural generalization of the above problem is substring rank and select. For a string s, rank_s(S, i) i...

68 | New data structures for orthogonal range searching
- Alstrup, Brodal, et al.
- 2000
Citation Context ... A) such that A[i] is an occurrence. Larger and faster. It is possible to improve the locating time to O(1) by using slightly more space. Instead of the structure of Section 2, that of Alstrup et al. [1] can be used to index the points (i, A[i]). This structure retrieves the occ_{l,r} occurrences of a range query in O(log log n + occ_{l,r}) time. In exchange, it needs O(n log^{1+ε} n) bits of space, for any ...

43 | When indexing equals compression: experiments with compressing suffix arrays and applications
- Grossi, Gupta, et al.
- 2004

43 | An alphabet-friendly FM-index
- Ferragina, Manzini, et al.
- 2004
Citation Context ...ees [2]) or O(m + log n) time (like suffix arrays [14]). Both locate each occurrence in constant time once the counting is done. Similar complexities are obtained with modern compressed data structures [5, 10, 7], requiring space nH_k(T) + o(n log σ) bits, where H_k(T) ≤ log σ is the k-th order empirical entropy of T. In this paper we introduce a new problem, position restricted substring searching, which consi...

27 | Repetition-based text indexes
- Kärkkäinen
- 1999
Citation Context ...e’s data structure. 2 Two-Dimensional Range Searching In this section we describe a range search data structure to query by rectangular areas. The structure is a succinct variant of one from Chazelle [3, 13] where we have completely removed binary searching and fractional cascading and have replaced them by constant-time rank queries over bit arrays. Given a set of points in [1, n] × [1, n], the data str...

22 | On compressing and indexing data
- Ferragina, Manzini
Citation Context ...ely, it is possible to replace the suffix array A and its lcp information by any of the wealth of existing compressed data structures [17]. For example, by using the LZ-index of Ferragina and Manzini [6] we obtain n log n(1 + o(1)) + O(nH_k(T) log^γ n) bits of space (for any γ > 0 and any k = O(log σ log n)) and the same time complexities. On the other hand, we can use the alphabet-friendly FM-index of...

12 | Succinct representation of sequences
- Navarro, Ferragina, et al.
- 2004
Citation Context ...h queries can be answered in constant time using data structures that require nH_0(S) + o(n) bits of space if the alphabet of the sequence is σ = O(polylog(n)), or in O(log σ / log log n) time in general [9, 8]. They can also be solved in O(log σ) time using wavelet trees [10, 11]. For the case of binary sequences, apart from (footnote 3: in this paper log stands for log_2) the simple n + o(n) bits data structures [12,...