Results 1 - 3 of 3
Combinatorial algorithms for DNA sequence assembly
Algorithmica, 1993
Abstract

Cited by 42 (3 self)
The trend towards very large DNA sequencing projects, such as those being undertaken as part of the human genome initiative, necessitates the development of efficient and precise algorithms for assembling a long DNA sequence from the fragments obtained by shotgun sequencing or other methods. The sequence reconstruction problem that we take as our formulation of DNA sequence assembly is a variation of the shortest common superstring problem, complicated by the presence of sequencing errors and reverse complements of fragments. Since the simpler superstring problem is NP-hard, any efficient reconstruction procedure must resort to heuristics. In this paper, however, a four-phase approach based on rigorous design criteria is presented, and has been found to be very accurate in practice. Our method is robust in the sense that it can accommodate high sequencing error rates and list a series of alternate solutions in the event that several appear equally good. Moreover, it uses a limited form ...
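To illustrate the formulation in the abstract above, the following minimal sketch checks whether a candidate reconstruction contains every fragment either as given or as its reverse complement. It is not the paper's four-phase method: the function names `reverse_complement` and `covers` are hypothetical, and the sketch assumes error-free fragments over the alphabet A, C, G, T, so the sequencing errors the abstract mentions are ignored.

```python
# Translation table for complementing DNA bases (A<->T, C<->G).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(fragment: str) -> str:
    """Return the reverse complement of a DNA fragment."""
    return fragment.translate(_COMPLEMENT)[::-1]


def covers(candidate: str, fragments: list[str]) -> bool:
    """True if every fragment occurs in candidate in either orientation.

    Assumption: exact matching only; sequencing errors are not modelled.
    """
    return all(
        f in candidate or reverse_complement(f) in candidate
        for f in fragments
    )
```

For example, `covers("ATGCGT", ["ATGC", "ACGC"])` holds because `ACGC` appears only through its reverse complement `GCGT`.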
Approximation Algorithm for the Shortest Approximate Common Superstring Problem
International Journal of Computer Systems Science and Engineering
Abstract
The Shortest Approximate Common Superstring (SACS) problem is: given a set of strings f = {w_1, w_2, …, w_n}, where no w_i is an approximate substring of w_j for i ≠ j, find a shortest string S_a such that every string of f is an approximate substring of S_a. When the number of strings n > 2, the SACS problem becomes NP-complete. In this paper, we present a greedy approximation SACS algorithm. Our algorithm is a 1/2-approximation for the SACS problem. It is of complexity O(n^2 (l^2 + log n)) in computing time, where n is the number of strings and l is the length of a string. Our SACS algorithm is based on computation of the Length of the Approximate Longest Overlap (LALO). Keywords: shortest approximate common superstring, approximation algorithms, string overlaps, complexities.
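The greedy idea named in this abstract can be sketched for the simpler exact case: repeatedly merge the pair of strings with the longest overlap. This is the classic exact-overlap greedy heuristic, not the paper's SACS algorithm. Approximate overlaps (LALO) are not reproduced here, the names `overlap` and `greedy_superstring` are hypothetical, and the input is assumed to contain no string that is a substring of another.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(strings: list[str]) -> str:
    """Merge the pair with the longest exact overlap until one string remains.

    Assumption: no input string is a substring of another input string.
    """
    pool = list(strings)
    while len(pool) > 1:
        best = (-1, 0, 1)  # (overlap length, index of left, index of right)
        for i, a in enumerate(pool):
            for j, b in enumerate(pool):
                if i != j:
                    k = overlap(a, b)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        merged = pool[i] + pool[j][k:]
        # Replace the two merged strings with their merge.
        pool = [s for idx, s in enumerate(pool) if idx not in (i, j)]
        pool.append(merged)
    return pool[0] if pool else ""
```

On the input `["ABCD", "CDEF", "EFGH"]` the sketch merges the two overlapping pairs in turn and returns a string containing all three inputs.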
Suffix Trees
Abstract
ee was given by Weiner [?], although he called his tree a position tree. A different, more space-efficient algorithm to build a suffix tree in linear time was given a few years later by McCreight [?]. The algorithms are quite practical and allow very efficient solutions to many complex string problems. Basic definitions: In this discussion we will first show how to create the suffix tree for a single string S and later extend the discussion to sets of strings (as in the dictionary problem). The suffix of S starting in position i will be denoted Suff_i. So, for example, Suff_1 = S. Definition: A suffix tree T for an m-character string S is a rooted directed tree with exactly m leaves numbered 1 to m. We use r to designate the root. Ea
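The definitions in this excerpt (Suff_i, and a tree with exactly m leaves numbered 1 to m) can be illustrated with a naive, uncompressed suffix trie. This is only a quadratic-time sketch, not the linear-time constructions of Weiner or McCreight. The function names, the nested-dict representation, and the `$` terminator (used so that every suffix ends at its own leaf) are assumptions not taken from the excerpt.

```python
def suffixes(s: str) -> dict[int, str]:
    """Map each 1-based position i to Suff_i, the suffix of s starting at i."""
    return {i: s[i - 1:] for i in range(1, len(s) + 1)}


def build_suffix_trie(s: str, terminator: str = "$") -> dict:
    """Insert every suffix of s + terminator into a nested-dict trie.

    Assumption: the terminator does not occur in s. Each leaf stores its
    suffix number under the (multi-character) key "leaf".
    """
    if terminator in s:
        raise ValueError("terminator must not occur in the string")
    text = s + terminator
    root: dict = {}
    for i in range(1, len(s) + 1):
        node = root
        for ch in text[i - 1:]:
            node = node.setdefault(ch, {})
        node["leaf"] = i
    return root


def leaf_labels(node: dict) -> list[int]:
    """Collect the suffix numbers stored at the leaves, in sorted order."""
    labels = []
    for key, child in node.items():
        if key == "leaf":
            labels.append(child)
        else:
            labels.extend(leaf_labels(child))
    return sorted(labels)
```

For an m-character string, `leaf_labels(build_suffix_trie(s))` yields the numbers 1 to m, matching the leaf count in the definition, and `suffixes(s)[1]` equals `s`.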