## Speeding-up Hirschberg and Hunt-Szymanski LCS Algorithms (2003)

Citations: | 5 - 0 self |

### BibTeX

@MISC{Crochemore03speeding-uphirschberg,

author = {Maxime Crochemore and Costas S. Iliopoulos and Yoan J. Pinzon},

title = {Speeding-up Hirschberg and Hunt-Szymanski LCS Algorithms},

year = {2003}

}

### Abstract

Two algorithms are presented that solve the problem of recovering the longest common subsequence of two strings. The first algorithm is an improvement of Hirschberg’s divide-and-conquer algorithm. The second algorithm is an improvement of the Hunt-Szymanski algorithm based on an efficient computation of all dominant match points. These two algorithms use bit-vector operations and are shown to work very efficiently in practice.

### Citations

1212 | Binary codes capable of correcting deletions, insertions and reversals - Levenshtein - 1966 |
Citation Context: ...at w is a subsequence of both x and y of maximum possible length. The LCS problem is related to two well known metrics for measuring the similarity (distance) of two strings: the Levenshtein distance [10] and the edit distance [18]. The LCS problem can be solved in O(nm) time and space by a dynamic programming approach [16, 18]. The asymptotically fastest algorithm is due to Masek and Pa...
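
The O(nm) dynamic programming approach mentioned in this excerpt can be sketched as follows. This is a minimal textbook version, not code from the paper; the function name `lcs_dp` is an assumption chosen for illustration.

```python
def lcs_dp(x: str, y: str) -> str:
    """Return one longest common subsequence of x and y (O(nm) time and space)."""
    m, n = len(x), len(y)
    # table[i][j] holds the LCS length of the prefixes x[:i] and y[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the bottom-right corner to recover one subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))
```

The full table is what costs O(nm) space; the linear-space and bit-vector variants discussed in the other excerpts avoid storing it.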

658 | The string-to-string correction problem - Wagner, Fischer - 1974 |

318 | Fast text search allowing errors - Manber, Wu - 1992 |

274 | A linear space algorithm for computing maximal common subsequences - Hirschberg - 1975 |
Citation Context: ...es the "four Russians" trick and takes O(n²/log n) time. Most other algorithms use either divide-and-conquer or dominant-match-point paradigms. The divide-and-conquer solution is due to Hirschberg [7], who presented a variation of the dynamic programming algorithm using O(n²) time but only O(n) space. The dominant-match-point algorithms have complexity that depends on output parameters such as r, t...
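
The divide-and-conquer idea described in this excerpt (quadratic time, linear space) can be illustrated with the sketch below. It is the plain textbook scheme without any bit-vector speed-up, so it is not the improved algorithm the paper presents; `_last_row` and `hirschberg_lcs` are hypothetical names.

```python
def _last_row(x: str, y: str) -> list:
    """LCS lengths of x against every prefix of y, keeping one table row at a time."""
    prev = [0] * (len(y) + 1)
    for cx in x:
        cur = [0]
        for j, cy in enumerate(y, 1):
            if cx == cy:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev


def hirschberg_lcs(x: str, y: str) -> str:
    """Return one LCS of x and y using only O(len(y)) extra working rows."""
    m, n = len(x), len(y)
    if m == 0 or n == 0:
        return ""
    if m == 1:
        return x if x in y else ""
    i = m // 2
    # Forward pass on the top half, backward pass on the reversed bottom half.
    top = _last_row(x[:i], y)
    bottom = _last_row(x[i:][::-1], y[::-1])
    # Choose the split point k of y that maximises the combined LCS length.
    k = max(range(n + 1), key=lambda s: top[s] + bottom[n - s])
    return hirschberg_lcs(x[:i], y[:k]) + hirschberg_lcs(x[i:], y[k:])
```

Each level of recursion only needs the two length rows, which is where the O(n) space bound comes from.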

224 | A new approach to text searching - Baeza-Yates, Gonnet - 1992 |
Citation Context: ...in a computer word to speed up algorithms has been used extensively in the last few years. One of the simplest and best known algorithms is the Shift-And algorithm, originally by Baeza-Yates and Gonnet [3] and subsequently modified by Wu and Manber [19], that solves the exact pattern matching problem in O(nm/w), where n and m are the lengths of the two input strings and w the number of bits in a machi...
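
A minimal sketch of the Shift-And technique named in this excerpt is given below. It relies on Python's arbitrary-precision integers in place of a fixed machine word, so the w-bit word blocking behind the O(nm/w) bound is not modelled; the function name `shift_and` is an assumption.

```python
def shift_and(pattern: str, text: str) -> list:
    """Return the start positions of all exact occurrences of pattern in text."""
    m = len(pattern)
    if m == 0:
        return []
    # masks[c] has bit i set when pattern[i] == c.
    masks = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << i)

    accept = 1 << (m - 1)  # bit that signals a full match
    state = 0              # bit i set: pattern[:i + 1] matches a suffix of the text read so far
    hits = []
    for pos, c in enumerate(text):
        # Extend every partial match by one character and start a new one.
        state = ((state << 1) | 1) & masks.get(c, 0)
        if state & accept:
            hits.append(pos - m + 1)
    return hits
```

For example, `shift_and("aba", "ababa")` reports the overlapping occurrences at positions 0 and 2.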

177 | Algorithms for the longest common subsequence problem - Hirschberg - 1977 |
Citation Context: ... time but only O(n) space. The dominant-match-point algorithms have complexity that depends on output parameters such as r, the total number of matching pairs, and p, the length of the LCS. Hirschberg [8] presented an O(pn) algorithm and, in the same year, Hunt and Szymanski [9] gave an O(r log n) algorithm. It is important to note that these two algorithms are efficient, particularly for cases when ...

169 | A faster algorithm for computing string edit distances - Masek, Paterson - 1980 |
Citation Context: ...est algorithm is due to Masek and Paterson [11] that uses the "four Russians" trick and takes O(n²/log n) time. Most other algorithms use either divide-and-conquer or dominant-match-point paradigms. The divide-and-conquer solution is due to Hir...
Author footnotes: * Partially supported by a Marie Curie fellowship, NATO, Wellcome and Royal Society grants. † Partially supported by an ORS studentship and EPSRC GR/L92150.

167 | A fast algorithm for computing longest common subsequences - Hunt, Szymanski - 1977 |
Citation Context: ...y that depends on output parameters such as r, the total number of matching pairs, and p, the length of the LCS. Hirschberg [8] presented an O(pn) algorithm and, in the same year, Hunt and Szymanski [9] gave an O(r log n) algorithm. It is important to note that these two algorithms are efficient, particularly for cases when r and p are small. In the worst case p = n and r = n², thus, O(pn) becomes...
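
To make the dependence on r (the number of matching pairs) concrete, here is a sketch of the classic match-list strategy usually associated with Hunt and Szymanski. It computes only the LCS length and is the unimproved textbook form, not the bit-vector variant this paper proposes; the names `hunt_szymanski_length` and `thresh` are assumptions.

```python
from bisect import bisect_left
from collections import defaultdict


def hunt_szymanski_length(x: str, y: str) -> int:
    """Return the LCS length of x and y in O((r + n) log n) time, r = matching pairs."""
    # positions[c] lists the indices j with y[j] == c, in increasing order.
    positions = defaultdict(list)
    for j, c in enumerate(y):
        positions[c].append(j)

    # thresh[k] is the smallest index in y at which a common subsequence
    # of length k + 1 can end, given the prefix of x processed so far.
    thresh = []
    for c in x:
        # Visit matches right to left so one symbol of x is used at most once.
        for j in reversed(positions.get(c, [])):
            k = bisect_left(thresh, j)
            if k == len(thresh):
                thresh.append(j)
            else:
                thresh[k] = j
    return len(thresh)
```

The inner loop runs once per matching pair, so the work grows with r; when r approaches n² the binary searches make this slower than the plain dynamic programming table, as the excerpt notes.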

139 | A fast bit-vector algorithm for approximate string matching based on dynamic programming - Myers - 1999 |
Citation Context: ...rings and w the number of bits in a machine word (normally 32 or 64). Navarro and Raffinot [14] obtained a fast exact matching algorithm combining bit-parallelism and suffix automata. Recently, Myers [12] developed a competitive algorithm that computes the edit distance of two strings in O(nm/w) time. Crochemore et al. [4, 5] gave an O(n²/w) algorithm to compute the length of the LCS using O(n/w...

55 | The longest common subsequence problem revisited - Apostolico, Guerra - 1987 |
Citation Context: ...if we can efficiently compute all the dominant match points, then it would be possible to speed up the HS algorithm by substituting MATCHLIST for the list of dominant match points KMATCHLIST. Apostolico [1, 2] exploited this notion and gave an O((q + n) log n) variant of the O((r + n) log n) HS algorithm. However, the gain is obtained at the expense of complicated data structures such as balanced binary search ...

48 | A subquadratic algorithm for approximate limited expression matching - Wu, Manber, et al. - 1996 |

43 | Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison - Sankoff, Kruskal (Eds.) - 1983 |

40 | A bit-parallel approach to suffix automata: Fast extended string matching - Navarro, Raffinot - 1998 |
Citation Context: ...that solves the exact pattern matching problem in O(nm/w), where n and m are the lengths of the two input strings and w the number of bits in a machine word (normally 32 or 64). Navarro and Raffinot [14] obtained a fast exact matching algorithm combining bit-parallelism and suffix automata. Recently, Myers [12] developed a competitive algorithm that computes the edit distance of two strings in O(nm...

29 | Longest common subsequences - Paterson, Dančík - 1994 |

26 | A fast and practical bit-vector algorithm for the Longest Common Subsequence problem - Crochemore, Iliopoulos, et al. |
Citation Context: ...ing algorithm combining bit-parallelism and suffix automata. Recently, Myers [12] developed a competitive algorithm that computes the edit distance of two strings in O(nm/w) time. Crochemore et al. [4, 5] gave an O(n²/w) algorithm to compute the length of the LCS using O(n/w) space. In this paper we extend this algorithm to obtain faster divide-and-conquer Hirschberg algorithm and Hunt-Szymanski a...

26 | A longest common subsequence algorithm suitable for similar text strings - Nakatsu, Kambayashi, et al. - 1982 |
Citation Context: ... r and p are small. In the worst case p = n and r = n², thus, O(pn) becomes O(n²) and O(r log n) becomes O(n² log n), which is even worse than the dynamic programming algorithm. As pointed out in [13], we have to select one of the algorithms a priori depending on the kind of sequences we wish to compare. This might prove to be difficult because of insufficient knowledge of the nature of sequences t...

19 | Expected Length of Longest Common Subsequences - Dančík - 1994 |

12 | Improving the worst-case performance of the Hunt-Szymanski strategy for the longest common subsequence of two strings - Apostolico |
Citation Context:

    ...
    do S ← (S + (S & M[y_j])) | (S & M′[y_j])
       if S_{m+1} = 1
          then L[j] ← L[j − 1] + 1
          else L[j] ← L[j − 1]

    H(x, y, m, n, C, p)
    if n = 0
       then p ← 0
       else if m = 1
          then if ∃ j with x_i = y_j
             then p ← 1
                  C[1] ← x_1
             else p ← 0
          else i ← ⌈m/2⌉
             FINDROW(x, y, i, n, L)
             FINDROW(x^R, y^R, m − i, n, L^R)
             k ← 0
             max ← 0
             for ℓ ← 0 to n
                do if L[ℓ] + L^R[n − ℓ] > max
                   then max ← L[ℓ] + L^R[n − ℓ]
                        k ← ℓ
             H(...
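
The bit-vector update that appears in this excerpt, S ← (S + (S & M[y_j])) | (S & M′[y_j]), can be tried out with the short sketch below. It assumes that M′ denotes the bitwise complement of M and that the LCS length is read off as the number of zero bits in the final state, which is how this style of update is commonly described; it does not reproduce the FINDROW routine itself. The name `bitvector_lcs_length` is hypothetical, and a Python integer stands in for the ⌈m/w⌉ machine words.

```python
def bitvector_lcs_length(x: str, y: str) -> int:
    """Return the LCS length of x and y using one bit per symbol of x."""
    m = len(x)
    full = (1 << m) - 1  # m low bits set; also used to drop the carry-out
    # masks[c] has bit i set when x[i] == c (the table M in the excerpt).
    masks = {}
    for i, c in enumerate(x):
        masks[c] = masks.get(c, 0) | (1 << i)

    state = full  # all ones corresponds to an LCS length of zero
    for c in y:
        match = masks.get(c, 0)
        # Assumed reading of the excerpt: M'[c] is the complement of M[c].
        state = ((state + (state & match)) | (state & ~match)) & full
    # Each zero bit in the final state accounts for one unit of LCS length.
    return m - bin(state).count("1")
```

With the classic pair `"ABCBDAB"` and `"BDCABA"` this returns 4, matching the dynamic programming result while touching only one bit vector per symbol of y.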

4 | diversions and maximal chains in partially ordered sets - Sankoff, Sellers, et al. - 1973 |

2 | Expected length of longest common subsequences - Dančík - 1994 |
Citation Context: ...h consisting of symbols drawn from a small alphabet in a more or less uniform manner, the length of the LCS can be expected to lie in the range between n/3 and 2n/3, depending on the alphabet size [6, 15, 16]. The bit-vector algorithms presented in this paper have the advantage of not being input- or output-sensitive, i.e. they are independent of any parameter other than the length of the sequences. The idea ...

1 | A fast bit-vector algorithm for the longest common subsequence problem - Crochemore, Iliopoulos, et al. - 2001 |
Citation Context: ...ing algorithm combining bit-parallelism and suffix automata. Recently, Myers [12] developed a competitive algorithm that computes the edit distance of two strings in O(nm/w) time. Crochemore et al. [4, 5] gave an O(n²/w) algorithm to compute the length of the LCS using O(n/w) space. In this paper we extend this algorithm to obtain faster divide-and-conquer Hirschberg algorithm and Hunt-Szymanski a...

1 | A bit-parallel approach to suffix automata: fast extended string matching - Navarro, Raffinot - 1998 |