Results 1 - 4 of 4
Incremental String Comparison
SIAM Journal on Computing, 1995
Cited by 38 (3 self)
The problem of comparing two sequences A and B to determine their LCS or the edit distance between them has been much studied. In this paper we consider the following incremental version of these problems: given an appropriate encoding of a comparison between A and B, can one incrementally compute the answer for A and bB, and the answer for A and Bb, with equal efficiency, where b is an additional symbol? Our main result is a theorem exposing a surprising relationship between the dynamic programming solutions for two such "adjacent" problems. Given a threshold k on the number of differences permitted in an alignment, the theorem leads directly to an O(k) algorithm for incrementally computing a new solution from an old one, in contrast to the O(k²) time required to compute a solution from scratch. We further show with a series of applications that this algorithm is indeed more powerful than its nonincremental counterpart by solving the applications with greater asymptotic ef...
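As a rough illustration of what "adjacent" problems means here, the sketch below uses the textbook edit-distance recurrence to update a stored dynamic programming column when one symbol b is appended to B. It is not the paper's O(k) algorithm: it costs O(|A|) per appended symbol, ignores the threshold k, and handles only the easy direction (A versus Bb); prepending (A versus bB) is the case the abstract singles out as needing the new theorem. The function names are hypothetical.

```python
def first_column(a):
    """Column for comparing every prefix of a against the empty string."""
    return list(range(len(a) + 1))


def append_symbol(a, col, b):
    """Given col[i] = edit distance between a[:i] and B,
    return the column for a[:i] versus B + b."""
    new = [col[0] + 1]  # empty prefix of a: b must be inserted
    for i in range(1, len(a) + 1):
        cost = 0 if a[i - 1] == b else 1
        new.append(min(
            col[i] + 1,          # b is left unmatched
            new[i - 1] + 1,      # a[i-1] is left unmatched
            col[i - 1] + cost,   # a[i-1] is aligned with b
        ))
    return new


def edit_distance(a, b_string):
    """Edit distance computed by appending the symbols of b_string one at a time."""
    col = first_column(a)
    for b in b_string:
        col = append_symbol(a, col, b)
    return col[-1]
```

Calling `append_symbol` once more on a saved column extends an existing comparison without recomputing the earlier columns, which is the sense in which the answer for A and Bb is cheap to derive from the answer for A and B.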
Toward Simplifying and Accurately Formulating Fragment Assembly
Journal of Computational Biology, 1995
Cited by 37 (1 self)
The fragment assembly problem is that of reconstructing a DNA sequence from a collection of randomly sampled fragments. Traditionally the objective of this problem has been to produce the shortest string that contains all the fragments as substrings, but in the case of repetitive target sequences this objective produces answers that are overcompressed. In this paper, the problem is reformulated as one of finding a maximum-likelihood reconstruction with respect to the two-sided Kolmogorov-Smirnov statistic, and it is argued that this is a better formulation of the problem. Next, the fragment assembly problem is recast in graph-theoretic terms as one of finding a noncyclic subgraph with certain properties, and the objectives of being shortest or maximally likely are also recast in this framework. Finally, a series of graph reduction transformations is given that dramatically reduces the size of the graph to be explored in practical instances of the problem. This reduction is ...
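The overcompression mentioned in this abstract can be seen on a toy input. The sketch below is a simple greedy overlap-merge heuristic for the shortest-superstring objective, not the paper's maximum-likelihood formulation or its graph reductions; the function names and the sample target are hypothetical, and a non-empty fragment list is assumed. Greedy merging is not guaranteed to find the shortest superstring in general.

```python
def overlap(x, y):
    """Length of the longest suffix of x that is also a prefix of y."""
    for n in range(min(len(x), len(y)), 0, -1):
        if x.endswith(y[:n]):
            return n
    return 0


def greedy_superstring(fragments):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    # Drop fragments that are already substrings of another fragment.
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best = (-1, 0, 1)
        for i, x in enumerate(frags):
            for j, y in enumerate(frags):
                if i != j:
                    n = overlap(x, y)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        merged = frags[i] + frags[j][n:]
        frags = [f for k, f in enumerate(frags) if k not in (i, j)] + [merged]
    return frags[0]


# A repetitive target: four copies of the same 4-letter unit.
target = "ACGT" * 4
fragments = [target[i:i + 8] for i in (0, 4, 8)]
result = greedy_superstring(fragments)
```

Every fragment sampled from this target is the same string, so the superstring that contains them all is half the length of the sequence they came from. The shortest-string objective has no way to tell how many copies of the repeat were present.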
Trie-Based Data Structures for Sequence Assembly (Extended Abstract)
The Eighth Symposium on Combinatorial Pattern Matching, 1997
Cited by 9 (2 self)
Ting Chen, Steven S. Skiena. Department of Computer Science, State University of New York, Stony Brook, NY 11794-4400. {tichen,skiena}@cs.sunysb.edu. January 27, 1997.
How Good is Genome-Level Fragment Assembly? (Extended Abstract)
1997
Cited by 1 (1 self)
Ting Chen, Steven S. Skiena. Department of Computer Science, State University of New York, Stony Brook, NY 11794-4400. {tichen|skiena}@cs.sunysb.edu. October 17, 1997.
1 Introduction
In late Summer 1997, groups at Brookhaven National Laboratory (BNL) and the Institute for Genome Research (TIGR) independently completed sequencing the genome of Borrelia burgdorferi, the bacterium which causes Lyme disease. As part of the Brookhaven team, led by Dr. William Studier, we have developed a new fragment assembler, STROLL, which is capable of assembling megabase genome sequencing projects. Why did we develop yet another fragment assembler? When we began this project (January 1996), the Brookhaven group did not have access to an adequate assembler for data produced by their primer walking strategy [6]. Indeed, historically, fragment assemblers did not prove very portable across different sequencing projects. Each large sequencing team developed its own sequencing strategy...