## Restricted Transposition Invariant Approximate String Matching Under Edit Distance

### Citations

832 | The string-to-string correction problem. - Wagner, Fischer - 1974 |
Citation Context ...]. We will assume for the remaining part of the paper that the notion of edit distance refers to the Levenshtein distance (e.g. ed(A, B) = ed_L(A, B)). [Section 2: Dynamic programming] The classic O(mn) solution [19] for computing ed(A, B) is to fill an (m + 1) × (n + 1) dynamic programming table D, where the cell D[i, j] will eventually hold the value ed(A_{1..i}, B_{1..j}). Under Levenshtein distance it works as follow...
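
The table-filling scheme quoted in this context can be sketched as follows. This is a minimal illustration that assumes the standard unit-cost Levenshtein recurrence, since the excerpt is cut off before stating it; the function name `edit_distance` is a hypothetical choice and does not come from the paper.

```python
def edit_distance(a, b):
    """Fill an (m + 1) x (n + 1) table D where D[i][j] = ed(a[:i], b[:j])."""
    m, n = len(a), len(b)
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        D[i][0] = i          # delete the first i characters of a
    for j in range(n + 1):
        D[0][j] = j          # insert the first j characters of b
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            D[i][j] = min(
                D[i - 1][j - 1] + cost,  # match or substitution
                D[i - 1][j] + 1,         # deletion
                D[i][j - 1] + 1,         # insertion
            )
    return D[m][n]
```

The sketch takes O(mn) time and space, in line with the bound quoted in the excerpt.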

598 | A guided tour to approximate string matching. - Navarro - 2001 |

297 | On the theory and computation of evolutionary distances. - Sellers - 1974 |
Citation Context ...nce for indel distance is very similar. The dynamic programming method can be modified to conduct approximate string matching by changing the boundary initialization rule D[0, j] = j into D[0, j] = 0 [16]. [Section 2.1: Greedy filling order] Let diagonal q be the up-left to low-right diagonal in D whose cells D[i, j] satisfy j − i = q. Ukkonen [17] proposed a greedy algorithm for computing edit distance. The cell...
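
The boundary change described in this context (D[0, j] = 0 instead of D[0, j] = j) can be illustrated with a short sketch. It assumes the unit-cost Levenshtein recurrence; the function name `approx_match_ends` and the threshold parameter `k` are hypothetical names introduced for the example.

```python
def approx_match_ends(a, b, k):
    """Return the end positions j (1-based) in b where some substring of b
    ending at j is within edit distance k of a."""
    m, n = len(a), len(b)
    prev = [0] * (n + 1)              # row 0: D[0][j] = 0, a match may start anywhere
    for i in range(1, m + 1):
        cur = [i] + [0] * n           # column 0 keeps the usual rule D[i][0] = i
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j - 1] + cost, prev[j] + 1, cur[j - 1] + 1)
        prev = cur
    # prev is now row m: prev[j] is the best distance of a to a substring ending at j
    return [j for j in range(n + 1) if prev[j] <= k]
```

Only two rows are kept at a time, which is a space optimisation of the sketch rather than something stated in the excerpt.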

225 | Algorithms for approximate string matching. - Ukkonen - 1985 |
Citation Context ...ing the boundary initialization rule D[0, j] = j into D[0, j] = 0 [16]. [Section 2.1: Greedy filling order] Let diagonal q be the up-left to low-right diagonal in D whose cells D[i, j] satisfy j − i = q. Ukkonen [17] proposed a greedy algorithm for computing edit distance. The cells in D are filled in the order of increasing distance values 0, 1, .... The algorithm is based on the well-known facts that the valu...

218 | Algorithms for the longest common subsequence problem. - Hirschberg - 1977 |
Citation Context ...nts D[i, j] where A_i = B_j. Let M(t) = {(i, j) | A_i + t = B_j} be the set of matching points under transposition t. A single set M(t) can be represented in linear space and generated in O(n log n) time [5]. By following [13], the sets M(t) can be computed for all relevant transpositions in O(σ + mn) time. There are |M(t)| = O(mn) matching points under a given transposition t. The overall number of matc...
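
The definition M(t) = {(i, j) | A_i + t = B_j} can be made concrete with a small sketch. It groups index pairs by the difference B_j − A_i using a dictionary, which is an assumption made for the illustration and not the method of [5] or [13]; the function name `matching_sets` is hypothetical.

```python
from collections import defaultdict

def matching_sets(a, b):
    """Map each relevant transposition t to its list of matching points
    (i, j), 1-based, with a[i] + t == b[j]. Inputs are integer sequences."""
    sets = defaultdict(list)
    for i, x in enumerate(a, start=1):
        for j, y in enumerate(b, start=1):
            sets[y - x].append((i, j))   # the pair (i, j) matches under t = y - x
    return dict(sets)
```

Every pair (i, j) lands in exactly one set, so the sets produced by this sketch hold mn points in total, and only transpositions with at least one match appear as keys.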

212 | Scaling and related techniques for geometry problems. - Gabow, Bentley, et al. - 1984 |
Citation Context ...,3). The corresponding values D[i′, j′] + max{i − i′, j − j′} are 6, 5, 3, 2, and 3, respectively. The minimum value 2 corresponds to (i′, j′) = (3, 4). This leads into setting D[5, 6] = D[3, 4] + 2 − 1 = 3. [Footnote 1: A transposition is relevant if it leads into at least one match between A and B.] Galil and Park [4], by following the framework of Eppstein et al. [2], discussed a scheme that is able ...

132 | Binary codes capable of correcting spurious insertions and deletions of ones. - Levenshtein - 1965 |

124 | Fast parallel and serial approximate string matching. - Landau, Vishkin - 1989 |

104 | Preserving order in a forest in less than logarithmic time and linear space. - van Emde Boas - 1977 |

69 | Sublinear approximate string matching and biological applications. - Chang, Lawler - 1994 |
Citation Context ...and lcp(i, j) is the length of the longest common prefix between A_{i..m} and B_{j..n}. Fig. 1 shows an example. The value lcp(i, j) can be computed in constant time by using the method of Chang and Lawler [1], which requires O(n) time preprocessing. Hence the greedy algorithm is able to compute d = ed(A, B) in O(n + d^2) time, as at most O(d) diagonals are processed and a single diagonal involves O(d) co...

53 | Sparse dynamic programming I: Linear cost functions. - Eppstein, Galil, et al. - 1992 |
Citation Context ... [Figure 1: example tables for the strings SPIRE and STAIR] Fig. 1. Assume the values L[1, −1] = 2, L[1, 0] = 2, and L[1, 1] = 1 are already known. (Left) When computing the value L[2, 0], we have q = 0, s = max{L[1, −1], L[1, 0] + 1, L[1, 1] + 1} = 3, and lcp(s + 1, q + s + 1) = lcp(4, 4) = 0. Hence L[2, 0] = 3. (Right) When computing the value L[2, 1], we have q = 1, s = max{L[1, 0]...
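
The recurrence in this caption, s = max{L[d − 1, q − 1], L[d − 1, q] + 1, L[d − 1, q + 1] + 1} followed by an lcp extension, can be sketched as below. This is a simplified illustration and not the cited implementation: `lcp` is computed by direct character comparison instead of the constant-time method of Chang and Lawler, so the sketch does not reach the O(n + d^2) bound. The clamping of s to the table boundary and the names `greedy_edit_distance` and `NEG` are assumptions of the example.

```python
def greedy_edit_distance(a, b):
    """Greedy diagonal computation of ed(a, b). L[q] holds the furthest row
    reached on diagonal q (j - i = q) with the current distance d."""
    m, n = len(a), len(b)
    NEG = -(m + n + 2)                       # stands for "undefined" entries

    def lcp(i, j):
        # Naive longest common prefix of a[i:] and b[j:] (0-based offsets).
        k = 0
        while i + k < m and j + k < n and a[i + k] == b[j + k]:
            k += 1
        return k

    prev, d = {}, 0
    while True:
        cur = {}
        for q in range(max(-d, -m), min(d, n) + 1):
            if d == 0:
                s = 0
            else:
                s = max(prev.get(q - 1, NEG),        # step from diagonal q - 1
                        prev.get(q, NEG) + 1,        # step along diagonal q
                        prev.get(q + 1, NEG) + 1)    # step from diagonal q + 1
            s = min(s, m, n - q)                     # stay inside the table
            cur[q] = s + lcp(s, s + q)               # slide along free matches
        if cur.get(n - m) == m:                      # reached cell (m, n)
            return d
        prev, d = cur, d + 1
```

With A = SPIRE and B = STAIR the sketch passes through the intermediate values quoted in the caption (L[1, −1] = 2, L[1, 0] = 2, L[1, 1] = 1, L[2, 0] = 3) and stops at d = 3.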

49 | A Priority Queue in Which Initialization and Queue Operations Take O(log log D) Time. - Johnson - 1982 |
Citation Context ...4), and (4,3). The corresponding values D[i′, j′] + max{i − i′, j − j′} are 6, 5, 3, 2, and 3, respectively. The minimum value 2 corresponds to (i′, j′) = (3, 4). This leads into setting D[5, 6] = D[3, 4] + 2 − 1 = 3. [Footnote 1: A transposition is relevant if it leads into at least one match between A and B.] Galil and Park [4], by following the framework of Eppstein et al. [2], discussed a scheme tha...

47 | A comparison of approximate string matching algorithms. - Jokinen, Tarhio, et al. - 1996 |
Citation Context ...hnique for restricting the computation with transposition invariant edit distance. The first building block is the following Lemma that is essentially similar to the idea of so-called counting filter [7, 14]. Lemma 2. Let A and B be two strings and D be a corresponding dynamic programming table that has been filled as described in section 2. The condition D[i, j] ≤ k can hold only if the substring B_{j−h+1...

33 | Including interval encoding into edit distance based music comparison and retrieval. - Lemstrom, Ukkonen - 2000 |
Citation Context ...tween A and B, and searching for approximate occurrences of A inside B. We consider the classic Levenshtein distance, but the discussion is applicable also to indel distance. A relatively new variant [8] of string matching, motivated initially by the nature of string matching in music, is to allow transposition invariance for A. This means allowing A to be “shifted” by adding some fixed integer t to ...

32 | Dynamic programming with convexity, concavity and sparsity. - Galil, Park - 1992 |
Citation Context ...i′, j′) ≺ (i, j) mean that i′ < i and j′ < j, and we also say that in this case (i′, j′) precedes (i, j). The following sparse scheme for Levenshtein distance is adapted from Galil and Park [4]. First we initialize D[0, 0] = 0. Then each value D[i, j] where A_i = B_j is computed recursively by setting D[i, j] = min{D[i′, j′] + max{i − i′, j − j′} | (i′, j′) ∈ M(t) ∪ (0, 0) ∧ (i′ ...

29 | Transposition Invariant String Matching - Mäkinen, Navarro, et al. |
Citation Context ...is means allowing A to be “shifted” by adding some fixed integer t to the values of all its characters: the underlying string matching task must then consider all possible values of t. Mäkinen et al. [12, 13] have recently proposed O(mn log log m) and O(dn log log m) algorithms for transposition invariant edit distance computation, where d is the transposition invariant distance between A and B, and an O(...
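
To make the notion of transposition invariance concrete, the following sketch computes the distance by brute force: it tries every transposition t that produces at least one match and runs a plain O(mn) edit distance for each. This naive baseline is an assumption of the example and is not the O(mn log log m) or O(dn log log m) algorithm of [12, 13]; both function names are hypothetical.

```python
def edit_distance(a, b):
    """Unit-cost Levenshtein distance between two sequences (two-row DP)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            cur.append(min(prev[j - 1] + cost, prev[j] + 1, cur[j - 1] + 1))
        prev = cur
    return prev[-1]

def transposition_invariant_distance(a, b):
    """Return (min over t of ed(a + t, b), a best t) for integer sequences.
    A transposition with no match at all cannot beat max(len(a), len(b)),
    so only differences b_j - a_i are tried."""
    best, best_t = max(len(a), len(b)), 0
    for t in sorted({y - x for x in a for y in b}):
        d = edit_distance([x + t for x in a], b)
        if d < best:
            best, best_t = d, t
    return best, best_t
```

For example, the pitch sequence [60, 62, 64] matches [65, 67, 69] exactly once it is shifted by t = 5, so the transposition invariant distance between them is 0.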

24 | Multiple approximate string matching by counting. - Navarro - 1997 |
Citation Context ...hnique for restricting the computation with transposition invariant edit distance. The first building block is the following Lemma that is essentially similar to the idea of so-called counting filter [7, 14]. Lemma 2. Let A and B be two strings and D be a corresponding dynamic programming table that has been filled as described in section 2. The condition D[i, j] ≤ k can hold only if the substring B_{j−h+1...

11 | Practical algorithms for transposition-invariant string-matching. - Lemstrom, Navarro, et al. - 2005 |
Citation Context ...s we propose, are concentrated on how to compute each distance ed(A+t, B) efficiently [12, 13], and on building heuristics on how to quickly discard possible transpositions from further consideration [9]. We will assume for the remaining part of the paper that the notion of edit distance refers to the Levenshtein distance (e.g. ed(A, B) = ed_L(A, B)). [Section 2: Dynamic programming] The classic O(mn) solution [...