## New Algorithms for the Longest Common Subsequence Problem (1994)

Citations: 7 (0 self)

### BibTeX

@TECHREPORT{Rick94newalgorithms,
    author = {Claus Rick},
    title = {New Algorithms for the Longest Common Subsequence Problem},
    institution = {},
    year = {1994}
}

### Abstract

Given two sequences $A = a_1 a_2 \ldots a_m$ and $B = b_1 b_2 \ldots b_n$, $m \le n$, over some alphabet $\Sigma$, a common subsequence $C = c_1 c_2 \ldots c_l$ of $A$ and $B$ is a sequence that can be obtained from both $A$ and $B$ by deleting zero or more (not necessarily adjacent) symbols. Finding a common subsequence of maximal length is called the Longest Common Subsequence (LCS) Problem. Two new algorithms based on the well-known paradigm of computing minimal matches are presented. One runs in time $O(ns + \min\{ds, pm\})$ and the other runs in time $O(ns + \min\{p(n - p), pm\})$, where $s = |\Sigma|$ is the alphabet size, $p$ is the length of a longest common subsequence and $d$ is the number of minimal matches. The $ns$ term is due to a standard preprocessing phase. When $m \approx n$, both algorithms are fast in situations when an LCS is expected to be short as well as in situations when an LCS is expected to be long. Further, they show a much smaller degeneration in intermediate situations, especially the second al...
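The definition of a common subsequence in the abstract can be made concrete with a small check. The following is a minimal sketch, not code from the report; the function names `is_subsequence` and `is_common_subsequence` are hypothetical, and the sample strings are the ones used in the report's own example (quoted in the citation contexts further down).

```python
def is_subsequence(c: str, s: str) -> bool:
    """Return True if c can be obtained from s by deleting zero or more symbols."""
    it = iter(s)
    # `sym in it` consumes the iterator up to the first occurrence of sym,
    # so the symbols of c must appear in s in the same relative order.
    return all(sym in it for sym in c)


def is_common_subsequence(c: str, a: str, b: str) -> bool:
    """Return True if c is a subsequence of both a and b (hypothetical helper)."""
    return is_subsequence(c, a) and is_subsequence(c, b)


if __name__ == "__main__":
    a, b = "abcdbb", "cbacbaaba"
    print(is_common_subsequence("acbb", a, b))   # a common subsequence of length 4
    print(is_common_subsequence("abcbb", a, b))  # a subsequence of a, but not of b
```

This only verifies a candidate; it does not search for a longest one, which is the actual LCS Problem.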

### Citations

193 | Algorithms for the longest common subsequence problem
- Hirschberg
- 1977
Citation Context ...tches. This approach will be reviewed in Section 2, since it is also the basis for the new algorithms presented in this paper. The first algorithms using this approach were invented by Hirschberg [11] and Hunt/Szymanski [13], with processing times $O(pn)$ and $O(m + r \log p)$ respectively. An additional $O(n \log s)$ term has to be added for both methods for a standard preprocessing phase. Later, both algori... |

182 | A Faster Algorithm Computing String Edit Distances
- Masek, Paterson
- 1980
Citation Context ... and let $A$ = abcdbb and $B$ = cbacbaaba be two input sequences over $\Sigma$. The above code computes the L-Matrix shown in Figure 1. The asymptotically fastest general solution takes time $O(n^2 / \log n)$ [15] and uses the "Four Russians" trick. Many algorithms have been developed that, although not improving the general $O(mn)$ time bound of the dynamic programming approach, exhibit a much better perfor... |

125 | The String to String Correction Problem
- Wagner, Fischer
- 1974
Citation Context ...th of a LCS between $A$ and $B$. 1.2 Previous results. The first algorithm to solve the LCS Problem was a dynamic programming approach that was discovered independently by several scientists [19, 21]. They observed that the following recursion holds for $L_{i,j}$ (a detailed proof can be found in [7]): $L_{i,j} = 0$ if $i = 0$ or $j = 0$; $L_{i,j} = L_{i-1,j-1} + 1$ if $a_i = b_j$; $L_{i,j} = \max\{L_{i-1,j}, L_{i,\ldots}$ |

71 | Introduction to Algorithms
- Cormen, Leiserson, Rivest
- 1990
Citation Context ... a dynamic programming approach that was discovered independently by several scientists [19, 21]. They observed that the following recursion holds for $L_{i,j}$ (a detailed proof can be found in [7]): $L_{i,j} = 0$ if $i = 0$ or $j = 0$; $L_{i,j} = L_{i-1,j-1} + 1$ if $a_i = b_j$; $L_{i,j} = \max\{L_{i-1,j}, L_{i,j-1}\}$ if $a_i \neq b_j$. By employing an array $L[0..m, 0..n]$ initially filled with zeros and exec... |
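The recursion for $L_{i,j}$ quoted in the citation context above translates directly into a table-filling loop. The following is a minimal sketch of that textbook dynamic programming approach, not the report's own code; the function name `lcs_length_table` is hypothetical, and Python strings are 0-indexed, so `a[i - 1]` plays the role of $a_i$.

```python
def lcs_length_table(a: str, b: str) -> list[list[int]]:
    """Fill the array L[0..m][0..n], where L[i][j] is the LCS length of a[:i] and b[:j]."""
    m, n = len(a), len(b)
    # Row 0 and column 0 stay zero: the case i = 0 or j = 0 of the recursion.
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                # a_i = b_j: extend an LCS of the two shorter prefixes.
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                # a_i != b_j: drop the last symbol of one of the prefixes.
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L


if __name__ == "__main__":
    table = lcs_length_table("abcdbb", "cbacbaaba")
    print(table[-1][-1])  # length of an LCS of the two sample sequences
```

The sketch takes $O(mn)$ time and space, which is the general bound that the algorithms surveyed in these contexts try to beat for particular classes of input.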

56 | The longest common subsequence problem revisited
- Apostolico, Guerra
- 1987
Citation Context ...spectively. An additional $O(n \log s)$ term has to be added for both methods for a standard preprocessing phase. Later, both algorithms were refined by Hsu and Du [12] and by Apostolico and Guerra [3] to take time $O(pm \log(n/p) + pm)$ and $O(pm \log(\min\{s, m, 2n/m\}))$ respectively. They perform well when an LCS is expected to be short. There is also an algorithm from Nakatsu et al. [17] that has runnin... |

47 | Sparse dynamic programming I: Linear cost functions - Eppstein, Galil, et al. - 1992 |

44 | Time Warps, String Edits and Macromolecules: The Theory and Practice of Sequence Comparison
- Sankoff, Kruskal (eds.)
- 1983
Citation Context ...th of a LCS between $A$ and $B$. 1.2 Previous results. The first algorithm to solve the LCS Problem was a dynamic programming approach that was discovered independently by several scientists [19, 21]. They observed that the following recursion holds for $L_{i,j}$ (a detailed proof can be found in [7]): $L_{i,j} = 0$ if $i = 0$ or $j = 0$; $L_{i,j} = L_{i-1,j-1} + 1$ if $a_i = b_j$; $L_{i,j} = \max\{L_{i-1,j}, L_{i,\ldots}$ |

32 | Bounds for the String Editing Problem
- Wong, Chandra
- 1976
Citation Context ...Algorithms that use linear space are discussed in [9, 4, 14]. Table 1 gives a chronological survey of algorithms for the LCS Problem. Lower bounds on the complexity of the LCS Problem can be found in [1, 10, 22]. For a fixed alphabet of size $s$ there is still a big gap between the linear lower bound $\Omega(ns)$ and the worst case upper bounds of the various algorithms. 1.3 Mission statement. As pointed out... |

31 | Longest common subsequences - Paterson, Dančík - 1994 |

29 | A longest common subsequence algorithm suitable for similar text strings
- Nakatsu, Kambayashi, et al.
- 1982
Citation Context ...olico and Guerra [3] to take time $O(pm \log(n/p) + pm)$ and $O(pm \log(\min\{s, m, 2n/m\}))$ respectively. They perform well when an LCS is expected to be short. There is also an algorithm from Nakatsu et al. [17] that has running time $O(n(m - p))$ and that can be used if longest common subsequences of great length are expected. Other algorithms are from Chin and Poon [5] ($O(ns + \min\{pm, ds\})$) and Apostoli... |

18 | Fast linear-space computations of longest common subsequences - Apostolico, Browne, et al. - 1992 |

14 | A Linear Space Algorithm for Computing
- Hirschberg
- 1975
Citation Context ...ruct an LCS we can save a pointer with each cell $L[i, j]$ that indicates which of the terms $L[i-1, j-1] + 1$, $L[i-1, j]$, $L[i, j-1]$ was used to define $L[i, j]$. It was shown in [9] that linear space suffices to compute both the length of an LCS and an LCS while maintaining the time bound. Example: Let $\Sigma = \{a, b, c, d\}$ be an alphabet and let $A$ = abcdbb and $B$ = cbacbaaba be tw... |
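The pointer idea described in the citation context above can be sketched as follows. This is an illustrative quadratic-space version under stated assumptions, not the linear-space method of [9] and not the report's code; the function name `lcs_with_pointers` and the pointer labels `"diag"`, `"up"`, `"left"` are hypothetical. Ties between the two non-matching terms are broken towards `"up"`, an arbitrary choice that affects which LCS is returned but not its length.

```python
def lcs_with_pointers(a: str, b: str) -> str:
    """Return one LCS of a and b by storing, per cell, which term defined L[i][j]."""
    m, n = len(a), len(b)
    L = [[0] * (n + 1) for _ in range(m + 1)]
    # ptr[i][j] is "diag" for L[i-1][j-1] + 1, "up" for L[i-1][j], "left" for L[i][j-1].
    ptr = [[""] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
                ptr[i][j] = "diag"
            elif L[i - 1][j] >= L[i][j - 1]:
                L[i][j] = L[i - 1][j]
                ptr[i][j] = "up"
            else:
                L[i][j] = L[i][j - 1]
                ptr[i][j] = "left"

    # Follow the saved pointers back from the bottom-right cell.
    symbols = []
    i, j = m, n
    while i > 0 and j > 0:
        if ptr[i][j] == "diag":
            symbols.append(a[i - 1])  # a matched symbol belongs to the LCS
            i -= 1
            j -= 1
        elif ptr[i][j] == "up":
            i -= 1
        else:
            j -= 1
    return "".join(reversed(symbols))


if __name__ == "__main__":
    print(lcs_with_pointers("abcdbb", "cbacbaaba"))  # one LCS of the sample sequences
```

Several different longest common subsequences may exist for the same input, so a different tie-breaking rule can return a different string of the same length.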

12 | Improving the worst-case performance of the Hunt-Szymanski strategy for the longest common subsequence of two strings - Apostolico |

12 | A fast algorithm for computing longest common subsequences of small alphabet size
- Chin, Poon
- 1990
Citation Context ... an algorithm from Nakatsu et al. [17] that has running time $O(n(m - p))$ and that can be used if longest common subsequences of great length are expected. Other algorithms are from Chin and Poon [5] ($O(ns + \min\{pm, ds\})$) and Apostolico and Guerra [3] ($O(m \log n + d \log(2mn/d))$). Here $d$ denotes the number of minimal matches (see Section 2 for a definition). Algorithms that use linear space are ... |

7 | An information-theoretic lower bound for the longest common subsequence problem
- Hirschberg
- 1978
Citation Context ...Algorithms that use linear space are discussed in [9, 4, 14]. Table 1 gives a chronological survey of algorithms for the LCS Problem. Lower bounds on the complexity of the LCS Problem can be found in [1, 10, 22]. For a fixed alphabet of size $s$ there is still a big gap between the linear lower bound $\Omega(ns)$ and the worst case upper bounds of the various algorithms. 1.3 Mission statement. As pointed out... |

5 | Performance Analysis of Some Simple Heuristics for Computing the Longest Common Subsequence, Algorithmica - Chin, Poon - 1994 |

5 | Algorithms for Approximate String
- Ukkonen
- 1985
Citation Context ...computed by the dynamic programming algorithm. Matches are encircled and regions with identical $L_{i,j}$ value are separated by contours. was first suggested independently by Myers [16] and Ukkonen [20]. It takes time $O(n(n - p))$ and was later improved by Wu et al. [23] to $O(n(m - p))$. Thus, these algorithms suit a class of sequences where the length $p$ of an LCS is expected to be long.... |

3 | A Fast Algorithm for Computing
- Hunt, Szymanski
- 1977
Citation Context ...l be reviewed in Section 2, since it is also the basis for the new algorithms presented in this paper. The first algorithms using this approach were invented by Hirschberg [11] and Hunt/Szymanski [13], with processing times $O(pn)$ and $O(m + r \log p)$ respectively. An additional $O(n \log s)$ term has to be added for both methods for a standard preprocessing phase. Later, both algorithms were refined b... |

1 | New Algorithms for the LCS
- Hsu, Du
- 1984
Citation Context ...ing time $O(pn)$ and $O(m + r \log p)$ respectively. An additional $O(n \log s)$ term has to be added for both methods for a standard preprocessing phase. Later, both algorithms were refined by Hsu and Du [12] and by Apostolico and Guerra [3] to take time $O(pm \log(n/p) + pm)$ and $O(pm \log(\min\{s, m, 2n/m\}))$ respectively. They perform well when an LCS is expected to be short. There is also an algorithm from Na... |

1 | A Linear Space Algorithm for the LCS - Kumar, Rangan - 1987 |

1 | An O(ND) Difference Algorithm and Its Variations, Algorithmica
- Myers
- 1986
Citation Context ...gure 1: L-Matrix computed by the dynamic programming algorithm. Matches are encircled and regions with identical $L_{i,j}$ value are separated by contours. was first suggested independently by Myers [16] and Ukkonen [20]. It takes time $O(n(n - p))$ and was later improved by Wu et al. [23] to $O(n(m - p))$. Thus, these algorithms suit a class of sequences where the length $p$ of an LCS is exp... |