Results 1 - 5 of 5
A linear size index for approximate pattern matching
In Proc. 17th Annual Symposium on Combinatorial Pattern Matching, 2006
Cited by 13 (1 self)
Abstract. This paper revisits the problem of indexing a text S[1..n] to support searching substrings in S that match a given pattern P[1..m] with at most k errors. A naive solution either has a worst-case matching time complexity of Ω(m^k) or requires Ω(n^k) space. Devising a solution with better performance has been a challenge until Cole et al. [5] showed an O(n log^k n)-space index that can support k-error matching in O(m + occ + log^k n log log n) time, where occ is the number of occurrences. Motivated by the indexing of DNA, we investigate in this paper the feasibility of devising a linear-size index that still has a time complexity linear in m. In particular, we give an O(n)-space index that supports k-error matching in O(m + occ + (log n)^(k(k+1)) log log n) worst-case time. Furthermore, the index can be compressed from O(n) words into O(n) bits with a slight increase in the time complexity.
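To make the k-error matching problem in this abstract concrete, the sketch below reports where approximate occurrences of P end in S. It is not the index from the paper: it is the classical online dynamic-programming scan (usually attributed to Sellers), which uses no index and takes O(nm) time. The function name `k_error_end_positions` is a hypothetical choice for this illustration, and "errors" is assumed to mean edit distance (insertions, deletions, substitutions).

```python
def k_error_end_positions(text, pattern, k):
    """Return the 1-based positions j such that some substring of `text`
    ending at j is within edit distance k of `pattern`.

    Illustrative only: an index-free O(n*m) scan, not the indexed
    solution described in the abstract.
    """
    m = len(pattern)
    # col[i] = smallest edit distance between pattern[:i] and any substring
    # of the text ending at the current text position.
    col = list(range(m + 1))
    ends = []
    for j, c in enumerate(text, start=1):
        prev_diag = col[0]  # previous column's value for row i-1
        col[0] = 0          # an occurrence may start at any text position
        for i in range(1, m + 1):
            prev_row = col[i]  # previous column's value for row i
            cost = 0 if pattern[i - 1] == c else 1
            col[i] = min(prev_diag + cost,  # match or substitution
                         prev_row + 1,      # extra text character
                         col[i - 1] + 1)    # missing pattern character
            prev_diag = prev_row
        if col[m] <= k:
            ends.append(j)
    return ends


print(k_error_end_positions("abcdefg", "cde", 1))
```

With k = 0 the scan reduces to exact matching; raising k admits end positions whose best-aligned substring differs from P by up to k edits.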
Approximate String Matching with Lempel-Ziv Compressed Indexes
2007
Cited by 5 (4 self)
A compressed full-text self-index for a text T is a data structure that requires reduced space and is able to search for patterns P in T. Furthermore, the structure can reproduce any substring of T, thus it actually replaces T. Despite the explosion of interest in self-indexes in recent years, there has not been much progress on search functionalities beyond the basic exact search. In this paper we focus on indexed approximate string matching (ASM), which is of great interest, say, in computational biology applications. We present an ASM algorithm that works on top of a Lempel-Ziv self-index. We consider the so-called hybrid indexes, which are the best in practice for this problem. We show that a Lempel-Ziv index can be seen as an extension of the classical q-samples index. We give new insights on this type of index, which can be of independent interest, and then apply them to the Lempel-Ziv index. We show experimentally that our algorithm has a competitive performance and provides a useful space-time tradeoff compared to classical indexes.
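The classical q-samples idea mentioned in this abstract can be sketched as a filter-then-verify search. The code below is a generic illustration, not the Lempel-Ziv algorithm of the paper: it indexes non-overlapping q-grams sampled every h text positions (assuming h >= q), uses exact hits of pattern q-grams to pick candidate windows, and verifies each window by dynamic programming. All function names are hypothetical. The filter is only guaranteed not to miss occurrences under the assumption that floor((m - k - q + 1) / h) >= k + 1, so that every occurrence contains at least k + 1 samples and k errors cannot alter all of them.

```python
from collections import defaultdict


def build_q_samples_index(text, q, h):
    """Map each q-gram sampled every h positions to its 0-based start positions."""
    index = defaultdict(list)
    for pos in range(0, len(text) - q + 1, h):
        index[text[pos:pos + q]].append(pos)
    return index


def min_errors_in_window(window, pattern):
    """Smallest edit distance between `pattern` and any substring of `window`."""
    m = len(pattern)
    col = list(range(m + 1))
    best = col[m]
    for c in window:
        prev_diag, col[0] = col[0], 0
        for i in range(1, m + 1):
            prev_row = col[i]
            col[i] = min(prev_diag + (pattern[i - 1] != c),
                         prev_row + 1,
                         col[i - 1] + 1)
            prev_diag = prev_row
        best = min(best, col[m])
    return best


def search(text, pattern, q, h, k):
    """Return sorted (start, end) text windows that contain a k-error occurrence."""
    m = len(pattern)
    index = build_q_samples_index(text, q, h)
    windows = set()
    for i in range(m - q + 1):
        # A sample equal to pattern[i:i+q] suggests an occurrence starting
        # near pos - i; widen the window by k on each side to absorb errors.
        for pos in index.get(pattern[i:i + q], ()):
            start = max(0, pos - i - k)
            end = min(len(text), pos - i + m + k)
            windows.add((start, end))
    return [(s, e) for s, e in sorted(windows)
            if min_errors_in_window(text[s:e], pattern) <= k]


print(search("the quick brown fox", "quick brawn", q=2, h=2, k=1))
```

Sampling only every h-th q-gram makes the index smaller than a full q-gram index, at the price of requiring longer patterns (or fewer errors) for the filter to remain lossless.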
Approximate String Matching with Ziv-Lempel Compressed Indexes
Abstract. A compressed full-text self-index for a text T is a data structure that requires reduced space and is able to search for patterns P in T. Furthermore, the structure can reproduce any substring of T, thus it actually replaces T. Despite the explosion of interest in self-indexes in recent years, there has not been much progress on search functionalities beyond the basic exact search. In this paper we focus on indexed approximate string matching (ASM), which is of great interest, say, in computational biology applications. We present an ASM algorithm that works on top of a Lempel-Ziv self-index. We consider the so-called hybrid indexes, which are the best in practice for this problem. We show that a Lempel-Ziv index can be seen as an extension of the classical q-samples index. We give new insights on this type of index, which can be of independent interest, and then apply them to the Ziv-Lempel index. We show experimentally that our algorithm has a competitive performance and provides a useful space-time tradeoff compared to classical indexes.
1 Introduction and Related Work
Approximate string matching (ASM) is an important problem that arises in applications related to text searching, pattern recognition, signal processing, and computational biology,