Results 1  10
of
49
Identifying Syntactic Differences Between Two Programs
 Software  Practice and Experience
, 1991
"... this paper is organized into five sections, as follows. The internal form of a program, which is a variant of a parse tree, is discussed in the next section. Then the treematching algorithm and the synchronous prettyprinting technique are described. Experience with the comparator for the C languag ..."
Abstract

Cited by 115 (0 self)
 Add to MetaCart
this paper is organized into five sections, as follows. The internal form of a program, which is a variant of a parse tree, is discussed in the next section. Then the treematching algorithm and the synchronous prettyprinting technique are described. Experience with the comparator for the C language and some performance measurements are also presented. The last section discusses related work and concludes this paper
A Subquadratic Sequence Alignment Algorithm for Unrestricted Cost Matrices
, 2002
"... The classical algorithm for computing the similarity between two sequences [36, 39] uses a dynamic programming matrix, and compares two strings of size n in O(n 2 ) time. We address the challenge of computing the similarity of two strings in subquadratic time, for metrics which use a scoring ..."
Abstract

Cited by 74 (4 self)
 Add to MetaCart
(Show Context)
The classical algorithm for computing the similarity between two sequences [36, 39] uses a dynamic programming matrix, and compares two strings of size n in O(n 2 ) time. We address the challenge of computing the similarity of two strings in subquadratic time, for metrics which use a scoring matrix of unrestricted weights. Our algorithm applies to both local and global alignment computations. The speedup is achieved by dividing the dynamic programming matrix into variable sized blocks, as induced by LempelZiv parsing of both strings, and utilizing the inherent periodic nature of both strings. This leads to an O(n 2 = log n) algorithm for an input of constant alphabet size. For most texts, the time complexity is actually O(hn 2 = log n) where h 1 is the entropy of the text. Institut GaspardMonge, Universite de MarnelaVallee, Cite Descartes, ChampssurMarne, 77454 MarnelaVallee Cedex 2, France, email: mac@univmlv.fr. y Department of Computer Science, Haifa University, Haifa 31905, Israel, phone: (9724) 8240103, FAX: (9724) 8249331; Department of Computer and Information Science, Polytechnic University, Six MetroTech Center, Brooklyn, NY 112013840; email: landau@poly.edu; partially supported by NSF grant CCR0104307, by NATO Science Programme grant PST.CLG.977017, by the Israel Science Foundation (grants 173/98 and 282/01), by the FIRST Foundation of the Israel Academy of Science and Humanities, and by IBM Faculty Partnership Award. z Department of Computer Science, Haifa University, Haifa 31905, Israel; On Education Leave from the IBM T.J.W. Research Center; email: michal@cs.haifa.il; partially supported by by the Israel Science Foundation (grants 173/98 and 282/01), and by the FIRST Foundation of the Israel Academy of Science ...
Reconstructing a History of Recombinations From a Set of Sequences
 Discrete Appl. Math
, 1998
"... One of the classic problems in computational biology is the reconstruction of evolutionary history. A recent trend in the area is to increase the explanatory power of the models that are considered by incorporating higherorder evolutionary events that more accurately reflect the mechanisms of mutat ..."
Abstract

Cited by 39 (6 self)
 Add to MetaCart
(Show Context)
One of the classic problems in computational biology is the reconstruction of evolutionary history. A recent trend in the area is to increase the explanatory power of the models that are considered by incorporating higherorder evolutionary events that more accurately reflect the mechanisms of mutation at the level of the chromosome. We take a step in this direction by considering the problem of reconstructing an evolutionary history for a set of genetic sequences that have evolved by recombination. Recombination is a nontreelike event that produces a child sequence by crossing two parent sequences. We present polynomialtime algorithms for reconstructing a parsimonious history of such events for several models of recombination when all sequences, including those of ancestors, are present in the input. We also show that these models appear to be near the limit of what can be solved in polynomial time, in that several natural generalizations are NPcomplete. Keywords Computational bio...
Sequence Comparison with Mixed Convex and Concave Costs
, 1989
"... Recently a number of algorithms have been developed for solving the minimumweight edit sequence problem with nonlinear costs for multiple insertions and deletions. We extend these algorithms to cost functions that are neither convex nor concave, but a mixture of both. We also apply this techniqu ..."
Abstract

Cited by 26 (1 self)
 Add to MetaCart
Recently a number of algorithms have been developed for solving the minimumweight edit sequence problem with nonlinear costs for multiple insertions and deletions. We extend these algorithms to cost functions that are neither convex nor concave, but a mixture of both. We also apply this technique
On the Common Substring Alignment Problem
"... The Common Substring Alignment Problem is defined as follows: Given a set of one or more strings and a target string. is a common substring of all strings, that is. The goal is to compute the similarity of all strings with, without computing the part of again and again. Using the classical dynamic p ..."
Abstract

Cited by 25 (2 self)
 Add to MetaCart
The Common Substring Alignment Problem is defined as follows: Given a set of one or more strings and a target string. is a common substring of all strings, that is. The goal is to compute the similarity of all strings with, without computing the part of again and again. Using the classical dynamic programming tables, each appearance of in a source string would require the computation of all the values in a dynamic programming table of size where is the size of. Here we describe an algorithm which is composed of an encoding stage and an alignment stage. During the first stage, a data structure is constructed which encodes the comparison of with. Then, during the alignment stage, for each comparison of a source with, the precompiled data structure is used to speed up the part of. We show how to reduce the alignment work, for each appearance of the common substring in a source string, to at the cost of encoding work, which is executed only once.
Cacheoblivious dynamic programming
 In Proc. of the Seventeenth Annual ACMSIAM Symposium on Discrete Algorithms, SODA ’06
, 2006
"... We present efficient cacheoblivious algorithms for several fundamental dynamic programs. These include new algorithms with improved cache performance for longest common subsequence (LCS), edit distance, gap (i.e., edit distance with gaps), and least weight subsequence. We present a new cacheoblivi ..."
Abstract

Cited by 22 (5 self)
 Add to MetaCart
(Show Context)
We present efficient cacheoblivious algorithms for several fundamental dynamic programs. These include new algorithms with improved cache performance for longest common subsequence (LCS), edit distance, gap (i.e., edit distance with gaps), and least weight subsequence. We present a new cacheoblivious framework called the Gaussian Elimination Paradigm (GEP) for Gaussian elimination without pivoting that also gives cacheoblivious algorithms for FloydWarshall allpairs shortest paths in graphs and ‘simple DP’, among other problems. 1
Speeding up Dynamic Programming
 In Proc. 29th Symp. Foundations of Computer Science
, 1988
"... this paper we consider the problem of computing two similar recurrences: the onedimensional case ..."
Abstract

Cited by 22 (0 self)
 Add to MetaCart
this paper we consider the problem of computing two similar recurrences: the onedimensional case
Parallel Dynamic Programming
, 1992
"... We study the parallel computation of dynamic programming. We consider four important dynamic programming problems which have wide application, and that have been studied extensively in sequential computation: (1) the 1D problem, (2) the gap problem, (3) the parenthesis problem, and (4) the RNA probl ..."
Abstract

Cited by 19 (1 self)
 Add to MetaCart
(Show Context)
We study the parallel computation of dynamic programming. We consider four important dynamic programming problems which have wide application, and that have been studied extensively in sequential computation: (1) the 1D problem, (2) the gap problem, (3) the parenthesis problem, and (4) the RNA problem. The parenthesis problem has fast parallel algorithms; almost no work has been done for parallelizing the other three. We present a unifying framework for the parallel computation of dynamic programming. We use two wellknown methods, the closure method and the matrix product method, as general paradigms for developing parallel algorithms. Combined with various techniques, they lead to a number of new results. Our main results are optimal sublineartime algorithms for the 1D, parenthesis, and RNA problems.
Aligning Alignments
 In 9th Annual Symp. Combinatorial Pattern Matching, Volume 1448 of LNCS
, 1998
"... While the area of sequence comparison has a rich collection of results on the alignment of two sequences, and even the alignment of multiple sequences, there is little known about the alignment of two alignments. The problem becomes interesting when the alignment objective function counts gaps, as i ..."
Abstract

Cited by 18 (2 self)
 Add to MetaCart
While the area of sequence comparison has a rich collection of results on the alignment of two sequences, and even the alignment of multiple sequences, there is little known about the alignment of two alignments. The problem becomes interesting when the alignment objective function counts gaps, as is common when aligning biological sequences, and has the form of the sumofpairs objective. We begin a thorough investigation of aligning two alignments under the sumofpairs objective with general linear gap costs when either of the two alignments are given in the form of a sequence (a degenerate alignment containing a single sequence) , a multiple alignment (containing two or more sequences), or a profile (a representation of a multiple alignment often used in computational biology). This leads to five problem variations, some of which arise in widelyused heuristics for multiple sequence alignment, and in assessing the relatedness of a sequence to a sequence family. For variations in w...
A Study of Accessible Motifs and RNA Folding Complexity
"... Abstract. mRNA molecules are folded in the cells and therefore many of their substrings may actually be inaccessible to protein and microRNA binding. The need to apply an accessability criterion to the task of genomewide mRNA motif discovery raises the challenge of overcoming the core O(n 3) factor ..."
Abstract

Cited by 18 (1 self)
 Add to MetaCart
(Show Context)
Abstract. mRNA molecules are folded in the cells and therefore many of their substrings may actually be inaccessible to protein and microRNA binding. The need to apply an accessability criterion to the task of genomewide mRNA motif discovery raises the challenge of overcoming the core O(n 3) factor imposed by the time complexity of the currently best known algorithms for RNA secondary structure prediction [24, 25, 43]. We speed up the dynamic programming algorithms that are standard for RNA folding prediction. Our new approach significantly reduces the computations without sacrificing the optimality of the results, yielding an expected time complexity of O(n 2 ψ(n)), whereψ(n) is shown to be constant on average under standard polymer folding models. Benchmark analysis confirms that in practice the runtime ratio between the previous approach and the new algorithm indeed grows linearly with increasing sequence size. The fast new RNA folding algorithm is utilized for genomewide discovery of accessible cisregulatory motifs in data sets of ribosomal densities and decay rates of S. cerevisiae genes and to the mining of exposed binding sites of tissuespecific microRNAs in A. Thaliana. Further details, including additional figures and proofs to all lemmas, can be found at: