## Efficient algorithms for (δ, γ, α)-matching

### BibTeX

@MISC{Fredriksson_efficientalgorithms,
    author = {Kimmo Fredriksson and Szymon Grabowski},
    title = {Efficient algorithms for (δ, γ, α)-matching},
    year = {}
}

### Abstract

We propose new algorithms for (δ, γ, α)-matching. In this string matching problem we are given a pattern P = p_0 p_1 ... p_{m−1} and a text T = t_0 t_1 ... t_{n−1} over some integer alphabet Σ = {0 ... σ − 1}. The pattern symbol p_i matches the text symbol t_j iff |p_i − t_j| ≤ δ. The pattern P (δ, γ)-matches some text substring t_j ... t_{j+m−1} iff for all i it holds that |p_i − t_{j+i}| ≤ δ and Σ_i |p_i − t_{j+i}| ≤ γ. Finally, in (δ, γ, α)-matching we also permit gaps (text substrings) of length at most α between consecutive matching text symbols. The only known previous algorithm runs in O(mn) time. We give several algorithms that improve the average case up to O(n) for small α, and the worst case to O(min{mn, |M|α}) or O(mn log(γ)/w), where M = {(i, j) | |p_i − t_j| ≤ δ} and w is the number of bits in a machine word. We conclude with experimental results showing that the algorithms are very efficient in practice.

Key words: approximate string matching, music information retrieval, bit-parallelism, sparse dynamic programming
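To make the problem definition concrete, the following is a minimal brute-force sketch in Python. It is not one of the algorithms proposed in the paper: it is a plain dynamic program that runs in O(mnα) time and serves only as a reference for what counts as a match. The function name `dga_match_ends` and the convention of reporting the text positions where a match ends are assumptions made for illustration, as is the reading that a gap of length at most α means at most α text symbols are skipped between consecutive matched symbols.

```python
import math


def dga_match_ends(text, pattern, delta, gamma, alpha):
    """Return text positions where a (delta, gamma, alpha)-match of pattern ends.

    Brute-force reference sketch; assumes a non-empty pattern of integers.
    """
    n = len(text)
    inf = math.inf
    # prev[j]: smallest error sum of a match of pattern[0..i] whose last
    # symbol is aligned with text[j]; inf if no such match exists.
    prev = [abs(pattern[0] - t) if abs(pattern[0] - t) <= delta else inf
            for t in text]
    for i in range(1, len(pattern)):
        cur = [inf] * n
        for j in range(i, n):
            d = abs(pattern[i] - text[j])
            if d > delta:
                continue  # symbols do not delta-match
            # The previous pattern symbol may sit at most alpha + 1 positions back.
            lo = max(0, j - alpha - 1)
            best = min(prev[lo:j], default=inf)
            if best + d <= gamma:  # error sums only grow, so prune early
                cur[j] = best + d
        prev = cur
    return [j for j, cost in enumerate(prev) if cost <= gamma]


print(dga_match_ends([1, 9, 3], [1, 3], delta=0, gamma=0, alpha=1))
```

With α = 1 the pattern `[1, 3]` matches by skipping the text symbol 9; with α = 0 the same call reports no match.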

### Citations

44 |
A priority queue in which initialization and queue operations take O(log log D) time
- Johnson
- 1982
Citation Context ... time and O(n(δσp/σ+1)) average case time for integer alphabets (see Sec. 5). Having M available, we can avoid the brute force scanning for δ-matches. M can be stored e.g. in Johnson’s data structure =-=[10]-=- which supports a homogeneous sequence of insertions and successor searches in O(log log(mn/|M|)) time. This gives O(|M| log log(mn/|M|)) worst case time, but destroys the good average case because of... |

31 |
Decision trees and random access machines
- Paul, Simon
- 1980
Citation Context ...nd y, we compute x ← vmin(x, (x << (ℓ+1)) | (γ +1)) and repeat that α times, and then perform the final shift x ← x << (ℓ+1), which gives the desired result. The minimization can be done in O(1) time =-=[14]-=-, see Alg. 3. The total time for computing M(x) is then O(α). This can be easily improved to O(log α). Without loss of generality assume that α is a power of two. Instead of shifting one counter posit... |
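The excerpt above relies on a constant-time, field-wise minimum (`vmin`) over counters packed into one machine word. The paper's Alg. 3 is not reproduced in this excerpt, so the following Python sketch shows a generic version of the technique under stated assumptions: each field holds an ℓ-bit non-negative value plus one spare guard bit that is zero on input, and Python integers stand in for machine words. The helper names `pack` and `unpack` are hypothetical, and the paper's biased counter representation is not modelled.

```python
def pack(fields, ell):
    """Pack small non-negative ints into one integer, ell + 1 bits per field."""
    word = 0
    for k, value in enumerate(fields):
        assert 0 <= value < (1 << ell)  # guard bit must stay clear
        word |= value << (k * (ell + 1))
    return word


def unpack(word, ell, count):
    """Extract the ell-bit value of each of the count fields."""
    mask = (1 << ell) - 1
    return [(word >> (k * (ell + 1))) & mask for k in range(count)]


def vmin(x, y, ell, count):
    """Field-wise minimum of two packed words using O(1) word operations."""
    # Mask with only the guard (top) bit of every field set.
    high = sum(1 << (k * (ell + 1) + ell) for k in range(count))
    # After the subtraction a guard bit survives iff x_field >= y_field;
    # the guard bits also stop borrows from crossing field boundaries.
    ge = ((x | high) - y) & high
    # Expand each surviving guard bit into a mask over the ell value bits.
    take_y = ge - (ge >> ell)
    return (y & take_y) | (x & ~take_y)


x = pack([3, 7, 0, 5], ell=3)
y = pack([4, 2, 0, 6], ell=3)
print(unpack(vmin(x, y, ell=3, count=4), ell=3, count=4))
```

The mask `high` depends only on ℓ and the number of fields, so in a fixed-width setting it would be precomputed once, leaving a handful of word operations per minimum.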

24 | Fast and simple character classes and bounded gaps pattern matching, with applications to protein searching
- Navarro, Raffinot
Citation Context ...e reasonable in most practical cases. Previous work. There are many algorithms that solve some restricted variant of (δ,γ,α)-matching, such as δ-matching [3], (δ,γ)-matching [5, 6] and (δ,α)-matching =-=[13, 1, 2, 8]-=-. There are also algorithms that allow transpositions and insertions and deletions of symbols simultaneously with (δ,γ) or (δ,α)-matching [11, 12]. However, none of these algorithms can handle (δ,γ,α)... |

22 |
Transposition invariant string matching
- Mäkinen, Navarro, et al.
- 2005
Citation Context ...ng [3], (δ,γ)-matching [5, 6] and (δ,α)-matching [13, 1, 2, 8]. There are also algorithms that allow transpositions and insertions and deletions of symbols simultaneously with (δ,γ) or (δ,α)-matching =-=[11, 12]-=-. However, none of these algorithms can handle (δ,γ,α)-matching. We are aware of only one algorithm for (δ,γ,α)-matching problem [4]. This is based on dynamic programming, and runs in O(nm) time. ⋆ Su... |

10 | Approximate string matching with gaps
- Crochemore, Iliopoulos, et al.
Citation Context ...string matching literature, usually motivated by some real problems. One of seemingly underexplored problems with applications in music information retrieval and molecular biology is (δ,γ,α)-matching =-=[4]-=- and its variations. In this problem, the pattern p0p1 ...pm−1 is allowed to match a substring of the text t0t1 ...tn−1 with α-limited gaps, and the respective pairs of matching characters’ numerical v... |

9 |
Parameterized approximate string matching and local-similarity-based point-pattern matching
- Mäkinen
- 2003
Citation Context ...ng [3], (δ,γ)-matching [5, 6] and (δ,α)-matching [13, 1, 2, 8]. There are also algorithms that allow transpositions and insertions and deletions of symbols simultaneously with (δ,γ) or (δ,α)-matching =-=[11, 12]-=-. However, none of these algorithms can handle (δ,γ,α)-matching. We are aware of only one algorithm for (δ,γ,α)-matching problem [4]. This is based on dynamic programming, and runs in O(nm) time. ⋆ Su... |

8 |
Bit-parallel (δ,γ)-matching suffix automata
- Crochemore, Iliopoulos, et al.
Citation Context ...thm. Each matrix cell is represented with ℓ = ⌈log₂(2γ + 1)⌉ bits (Eq. 5), and number zero is represented (using ℓ bits) as 2^(ℓ−1) − (γ+1). This representation has been used before e.g. for (δ,γ)-matching =-=[5]-=-. We still need an additional bit per cell, and hence each machine word packs C = ⌊w/(ℓ + 1)⌋ cells, or counters (Eq. 6). This representation solves three problems we are going to face shortly: (i) counte... |
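As a quick worked check of the two formulas in the excerpt above, the following Python sketch computes the counter width ℓ and the number of counters C per word. The function name `counter_layout` is hypothetical, and the 64-bit word size in the example is only an assumption.

```python
def counter_layout(gamma, word_bits):
    """Return (ell, cells) for the packed-counter layout.

    ell   = ceil(log2(2 * gamma + 1)) value bits per counter
    cells = floor(word_bits / (ell + 1)) counters per word,
            since each counter carries one additional bit.
    """
    # For k >= 1, ceil(log2(k)) equals (k - 1).bit_length(),
    # which avoids floating-point rounding.
    ell = (2 * gamma).bit_length()
    cells = word_bits // (ell + 1)
    return ell, cells


print(counter_layout(gamma=3, word_bits=64))
```

For γ = 3 and w = 64 this gives ℓ = 3 and C = 16, so one word covers 16 matrix cells at once.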

7 | Efficient algorithms for pattern matching with general gaps, character classes, and transposition invariance
- Fredriksson, Grabowski
Citation Context ...eing that we need m queues, since we are computing column-wise (as opposed to row-wise in [4]). 4 Simple algorithm In this section we will develop a variant of the Simple algorithm for (δ,α)-matching =-=[7]-=-. This performs very well on small (δ,γ,α). (Proceedings of the Prague Stringology Conference ’06, p. 31) Alg. 2 Simple(T,n,P,m,δ,γ,α). 1 h ← 0 2 for i ← 0 to n − 1 do 3 M[i] ← γ + 1 4 d ← |T[i] − P[0]| 5 i... |

6 |
An efficient algorithm for δ-approximate matching with α-bounded gaps in musical sequences
- Cantone, Cristofaro, et al.
- 2005
Citation Context ...e reasonable in most practical cases. Previous work. There are many algorithms that solve some restricted variant of (δ,γ,α)-matching, such as δ-matching [3], (δ,γ)-matching [5, 6] and (δ,α)-matching =-=[13, 1, 2, 8]-=-. There are also algorithms that allow transpositions and insertions and deletions of symbols simultaneously with (δ,γ) or (δ,α)-matching [11, 12]. However, none of these algorithms can handle (δ,γ,α)... |

5 |
On tuning the (δ,α)-sequential-sampling algorithm for δ-approximate matching with α-bounded gaps in musical sequences
- Cantone, Cristofaro, et al.
- 2005
Citation Context ...e reasonable in most practical cases. Previous work. There are many algorithms that solve some restricted variant of (δ,γ,α)-matching, such as δ-matching [3], (δ,γ)-matching [5, 6] and (δ,α)-matching =-=[13, 1, 2, 8]-=-. There are also algorithms that allow transpositions and insertions and deletions of symbols simultaneously with (δ,γ) or (δ,α)-matching [11, 12]. However, none of these algorithms can handle (δ,γ,α)... |

4 | Flexible music retrieval in sublinear time
- Fredriksson, Mäkinen, et al.
Citation Context ...ithout note durations), are reasonable in most practical cases. Previous work. There are many algorithms that solve some restricted variant of (δ,γ,α)-matching, such as δ-matching [3], (δ,γ)-matching =-=[5, 6]-=- and (δ,α)-matching [13, 1, 2, 8]. There are also algorithms that allow transpositions and insertions and deletions of symbols simultaneously with (δ,γ) or (δ,α)-matching [11, 12]. However, none of th... |

3 |
Efficient bit-parallel algorithms for (δ,α)-matching
- Grabowski |

2 |
Occurrence and substring heuristics for δ-matching
- Crochemore, Iliopoulos, et al.
Citation Context ...pitch values only (without note durations), are reasonable in most practical cases. Previous work. There are many algorithms that solve some restricted variant of (δ,γ,α)-matching, such as δ-matching =-=[3]-=-, (δ,γ)-matching [5, 6] and (δ,α)-matching [13, 1, 2, 8]. There are also algorithms that allow transpositions and insertions and deletions of symbols simultaneously with (δ,γ) or (δ,α)-matching [11, 1... |

2 | Parameterized approximate string matching and local-similarity-based point-pattern matching - Mäkinen - 2003 |

1 | Efficient algorithms for pattern matching with general gaps and character classes - Fredriksson, Grabowski - 2006 |

1 |
Deques with heap order
- Gajewska, Tarjan
- 1986
Citation Context ... O(n). For column-wise computation the space complexity is O(αm) as up to α + 1 columns have to be stored. As shown in [4] the time complexity can be improved to O(mn) using min-queue data structures =-=[9]-=-. However, in practical MIR applications α is usually so small that the simple brute-force evaluation is faster than using sophisticated data structures that have large (constant) overhead. Instead, w... |
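The min-queue idea mentioned in the excerpt above can be illustrated with a sliding-window minimum, which removes the factor α from scanning the previous α + 1 cells. The Python sketch below uses a standard monotonic deque; it is an assumption that this is close in spirit to the structure of [9], and it is not taken from the paper. The function name `sliding_min` is hypothetical.

```python
from collections import deque


def sliding_min(values, width):
    """Minimum of every window of `width` consecutive values, in O(n) total."""
    window = deque()  # indices whose values are strictly increasing
    out = []
    for j, v in enumerate(values):
        # Entries that are not smaller than v can never be a minimum again.
        while window and values[window[-1]] >= v:
            window.pop()
        window.append(j)
        # Drop the front index once it falls out of the current window.
        if window[0] <= j - width:
            window.popleft()
        if j >= width - 1:
            out.append(values[window[0]])
    return out


print(sliding_min([4, 2, 5, 1, 3], width=2))
```

Each index is pushed and popped at most once, so the amortized cost per cell is constant. As the excerpt notes, for the small α typical of music retrieval a plain scan may still be faster in practice.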