An O(ND) Difference Algorithm and Its Variations (1986) [100 citations — 3 self]

by Eugene W. Myers
Algorithmica

Abstract:

The problems of finding a longest common subsequence of two sequences A and B and a shortest edit script for transforming A into B have long been known to be dual problems. In this paper, they are shown to be equivalent to finding a shortest/longest path in an edit graph. Using this perspective, a simple O(ND) time and space algorithm is developed, where N is the sum of the lengths of A and B and D is the size of the minimum edit script for A and B. The algorithm performs well when differences are small (sequences are similar) and is consequently fast in typical applications. The algorithm is shown to have O(N + D^2) expected-time performance under a basic stochastic model. A refinement of the algorithm requires only O(N) space, and the use of suffix trees leads to an O(N lg N + D^2) time variation.
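The greedy search summarized in the abstract can be sketched as follows. This is a minimal illustration, not the paper's own listing: it computes only D, the size of a shortest edit script counting single-symbol insertions and deletions, and does not recover the script itself. The function name `shortest_edit_distance` and the dictionary `v` are assumptions made for this example.

```python
def shortest_edit_distance(a, b):
    """Return D, the minimum number of insertions plus deletions turning a into b.

    Sketch of the greedy O(ND) search over the edit graph. Diagonal k is the
    set of points with x - y == k; v[k] holds the furthest x reached on
    diagonal k using the current number of edits.
    """
    n, m = len(a), len(b)
    max_d = n + m          # worst case: delete all of a, insert all of b
    v = {1: 0}             # sentinel so that d == 0 starts from x == 0
    for d in range(max_d + 1):
        for k in range(-d, d + 1, 2):
            # Step down (insertion) from diagonal k + 1, or step right
            # (deletion) from diagonal k - 1, whichever reaches further.
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]
            else:
                x = v[k - 1] + 1
            y = x - k
            # Follow the "snake": free diagonal moves over matching symbols.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return max_d


if __name__ == "__main__":
    print(shortest_edit_distance("ABCABBA", "CBABAC"))
```

Because the longest common subsequence and the shortest edit script are dual, the LCS length L follows from D as L = (N - D) / 2, where N is the sum of the two lengths. The outer loop runs D + 1 times and each pass touches at most D + 1 diagonals, which is where the dependence on D rather than on the product of the lengths comes from.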

Citations

598 Data Structures and Algorithms – Aho, Hopcroft, et al. - 1987
437 The string-to-string correction problem – Wagner, Fisher - 1974
429 A space-economical suffix tree construction algorithm – McCreight - 1976
298 Time Warps, String Edits and Macromolecules: the Theory and Practice of Sequence Comparisons – Sankoff, Kruskal - 1983
290 The Art of Computer Programming, Vol.3: Sorting and Searching – Knuth - 1973
244 The Source Code Control System – Rochkind - 1975
228 Fast Algorithms for Finding Nearest Common Ancestors – Harel, Tarjan - 1984
177 A linear space algorithm for computing maximal common subsequences – Hirschberg - 1975
121 A faster algorithm computing string edit distances – Masek, Paterson - 1980
109 Algorithms for the longest common subsequence problem – Hirschberg - 1977
107 A fast algorithm for computing longest common subsequences – Hunt, Szymanski
91 note on two – Dijkstra - 1959
66 The string-to-string correction problem with block moves – Tichy - 1984
65 An algorithm for differential file comparison – Hunt, McIlroy
43 Bounds on the complexity of the longest common subsequence problem – Aho, Hirschberg, et al. - 1976
18 A longest common subsequence algorithm suitable for similar text strings – Nakatsu, Kambayashi, et al. - 1982
7 A redisplay algorithm – Gosling - 1981
6 An information-theoretic lower bound for the longest common subsequence problem – Hirschberg - 1978