Results 1-4 of 4
Sequencing from compomers: Using mass spectrometry for DNA de-novo sequencing of 200+ nt
 J. Comput. Biol
Cited by 20 (6 self)
Abstract. One of the main endeavors in today’s Life Science remains the efficient sequencing of long DNA molecules. Today, most de-novo sequencing of DNA is still performed using electrophoresis-based Sanger Sequencing, based on the Sanger concept of 1977. Methods using mass spectrometry to acquire the Sanger Sequencing data are limited by short sequencing lengths of 15–25 nt. We propose a new method for DNA sequencing using base-specific cleavage and mass spectrometry that appears to be a promising alternative to classical DNA sequencing approaches. A single-stranded DNA or RNA molecule is cleaved by a base-specific (bio)chemical reaction using, for example, RNases. The cleavage reaction is modified such that not all, but only a certain percentage of those bases are cleaved. The resulting mixture of fragments is then analyzed using MALDI-TOF mass spectrometry, whereby we acquire the molecular masses of fragments. For every peak in the mass spectrum, we calculate those base compositions that will potentially create a peak of the observed mass and, repeating the cleavage reaction for all four bases, finally try to uniquely reconstruct the underlying sequence from these observed spectra. This leads us to the combinatorial problem of Sequencing From Compomers and, finally, to the graph-theoretical problem of finding a walk in a subgraph of the de Bruijn graph. Application of this method to simulated data indicates that it might be capable of sequencing DNA molecules with 200+ nt.
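The step "for every peak, calculate those base compositions that will potentially create a peak of the observed mass" can be illustrated with a small sketch. This is not the procedure from the paper, only a plain bounded enumeration under stated assumptions: the function name `compomers_for_mass`, the tolerance parameter, and the mass table are hypothetical, and the listed masses are approximate average residue masses that ignore terminal groups and adducts.

```python
# Approximate average masses (Da) of DNA nucleotide residues.
# Illustrative assumption only: terminal groups and adducts are ignored.
APPROX_DNA_MASSES = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def compomers_for_mass(target, masses, tolerance):
    """Enumerate base compositions (compomers) whose total mass lies
    within `tolerance` of `target`. All masses must be strictly positive."""
    bases = sorted(masses)
    results = []

    def extend(idx, remaining, counts):
        # All bases assigned: accept if the leftover mass is within tolerance.
        if idx == len(bases):
            if abs(remaining) <= tolerance:
                results.append(dict(zip(bases, counts)))
            return
        m = masses[bases[idx]]
        c = 0
        # Try every multiplicity of this base that does not overshoot the target.
        while c * m <= remaining + tolerance:
            extend(idx + 1, remaining - c * m, counts + [c])
            c += 1

    extend(0, target, [])
    return results


# Example call: candidate compomers for an observed peak near 1240 Da.
candidates = compomers_for_mass(1240.0, APPROX_DNA_MASSES, tolerance=1.0)
```

A single peak usually admits several candidate compomers once the tolerance is realistic, which is why the abstract speaks of compositions that "potentially" explain a peak and why the reconstruction becomes a combinatorial problem.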
Algorithmic complexity of protein identification: Combinatorics of weighted strings
 DISCRETE APPLIED MATHEMATICS, SPECIAL ISSUE ON COMBINATORICS OF SEARCHING, SORTING, AND CODING. (2002)
, 2004
Cited by 4 (1 self)
We investigate a problem from computational biology: Given a constant-size alphabet A with a weight function μ : A → ℝ⁺, find an efficient data structure and query algorithm solving the following problem: For a weight M ∈ ℝ⁺ and a string σ over A, decide whether σ contains a substring with weight M (ONE STRING MASS FINDING PROBLEM). If the answer is yes, then we may in addition require a witness, i.e. indices i ≤ j such that the substring beginning at position i and ending at position j has weight M. We allow preprocessing of the string, and measure efficiency in two parameters: storage space required for the preprocessed data, and running time of the query algorithm for given M. We are interested in data structures and algorithms requiring subquadratic storage space and sublinear query time, where we measure the input size as the length n of the input string. We present two efficient algorithms: LOOKUP solves the problem with O(n) space and O((n / log n) · log log n) time; INTERVAL solves the problem for binary alphabets with O(n) space in O(log n) time. We sketch a third algorithm, CLUSTER, which can be adjusted for a space-time tradeoff but for which we do not yet have a resource analysis. We introduce a function on weighted strings which is closely related to the analysis of algorithms for the ONE STRING MASS FINDING PROBLEM: the number of different submasses of a weighted string. We present several properties of this function, including upper and lower bounds. Finally, we introduce two more general variants of the problem and sketch how algorithms may be extended for these variants.
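To make the problem statement concrete, here is a minimal sketch of the naive baselines that such data structures are meant to improve on. It is not LOOKUP, INTERVAL, or CLUSTER: the first function is an ordinary linear-time sliding window with no preprocessing, and the second counts submasses by brute force in quadratic time. The function names are hypothetical, and integer weights are assumed so that equality tests are exact.

```python
def find_substring_with_weight(sigma, mu, M):
    """Return a witness (i, j), 0-based and inclusive, such that
    sigma[i..j] has weight M, or None if no such substring exists.
    Assumes all weights in `mu` are strictly positive integers."""
    i = 0
    window = 0  # weight of the current window sigma[i..j]
    for j, ch in enumerate(sigma):
        window += mu[ch]
        # Weights are positive, so shrinking from the left is safe.
        while window > M and i <= j:
            window -= mu[sigma[i]]
            i += 1
        if window == M and i <= j:
            return (i, j)
    return None


def count_submasses(sigma, mu):
    """Number of different submasses of sigma, i.e. the number of
    distinct weights taken by its non-empty substrings (brute force)."""
    seen = set()
    for i in range(len(sigma)):
        total = 0
        for j in range(i, len(sigma)):
            total += mu[sigma[j]]
            seen.add(total)
    return len(seen)


weights = {"a": 1, "b": 2}
witness = find_substring_with_weight("abba", weights, 4)  # the substring "bb"
```

The sliding window answers one query in O(n) time with no extra storage, while storing every submass makes queries fast but can need quadratic space; the abstract's goal of subquadratic space with sublinear query time sits between these two extremes.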
Fast and Scalable Parallel Algorithms for Knapsack-Like Problems
 Journal of Parallel and Distributed Computing
, 1996
Cited by 3 (0 self)
We present two new algorithms for searching in sorted X + Y + R + S, one based on heaps and the other on sampling. Each of the algorithms runs in time O(n² log n) (n being the size of the sorted arrays X, Y, R and S). Hence in each case, by constructing arrays of size n = O(2^(s/4)), we obtain a new algorithm for solving certain NP-complete problems such as Knapsack on s data items in time equal (up to a constant factor) to the best algorithm currently known. Each of the algorithms is capable of being efficiently implemented in parallel and so solving large instances of these NP-complete problems fast on coarse-grained distributed memory parallel computers. The parallel version of the heap-based algorithm is communication-efficient and exhibits optimal speedup for a number of processors less than n using O(n) space in each one; the sampling-based algorithm exhibits optimal speedup for any number of processors up to n using O(n) space in total provided that the architecture is capable of...
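A sequential sketch may help show what "searching in sorted X + Y + R + S" with heaps can look like. It follows the classical two-heap idea (enumerate X + Y in ascending order and R + S in descending order, then meet in the middle) and is not claimed to be the algorithm of this paper; the function name `find_four_sum` is hypothetical. Each heap holds at most n entries, so the sketch uses O(n) space and O(n² log n) time.

```python
import heapq


def find_four_sum(X, Y, R, S, target):
    """Given four lists sorted in ascending order, return (x, y, r, s)
    with x + y + r + s == target, or None if no such tuple exists."""
    if not (X and Y and R and S):
        return None
    # Min-heap over X + Y: one entry per x, paired with its current y index.
    low = [(x + Y[0], i, 0) for i, x in enumerate(X)]
    heapq.heapify(low)
    # Max-heap over R + S (stored negated): one entry per r, paired with
    # its current s index, starting from the largest s.
    high = [(-(r + S[-1]), k, len(S) - 1) for k, r in enumerate(R)]
    heapq.heapify(high)
    while low and high:
        lo_sum, i, j = low[0]
        neg_hi, k, l = high[0]
        total = lo_sum - neg_hi
        if total == target:
            return (X[i], Y[j], R[k], S[l])
        if total < target:
            # Smallest remaining X + Y sum is too small: advance it.
            heapq.heappop(low)
            if j + 1 < len(Y):
                heapq.heappush(low, (X[i] + Y[j + 1], i, j + 1))
        else:
            # Largest remaining R + S sum is too large: step it down.
            heapq.heappop(high)
            if l > 0:
                heapq.heappush(high, (-(R[k] + S[l - 1]), k, l - 1))
    return None


hit = find_four_sum([1, 4], [2, 7], [3, 5], [10, 20], 34)
```

For a Knapsack-style instance on s items, one would split the items into four groups, let X, Y, R and S be the sorted subset sums of each group (about 2^(s/4) values each), and search for the target sum, which is where the array size n = O(2^(s/4)) in the abstract comes from.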
ISSN 0946-7831. SEQUENCING FROM COMPOMERS: USING MASS SPECTROMETRY FOR DNA DE-NOVO SEQUENCING OF 200+ NT
Sequencing from compomers: Using mass spectrometry for DNA de-novo sequencing of 200+ nt