An Algorithm for Approximate Tandem Repeats
 In Proceedings of the 4th Annual Symposium on Combinatorial Pattern Matching (CPM), volume 684 of Lecture Notes in Computer Science
, 1993
Cited by 75 (2 self)
A perfect single tandem repeat is defined as a nonempty string that can be divided into two identical substrings, e.g. abcabc. An approximate single tandem repeat is one in which the substrings are similar, but not identical, e.g. abcdaacd.
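The two definitions in this abstract can be illustrated with a minimal sketch. The abstract does not say how similarity between the two halves is measured, so unit-cost edit distance is used here purely as an assumption; the function names are hypothetical and this is not the paper's algorithm.

```python
def is_perfect_tandem_repeat(s: str) -> bool:
    """True if s is nonempty and splits into two identical halves."""
    half, rem = divmod(len(s), 2)
    return len(s) > 0 and rem == 0 and s[:half] == s[half:]


def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance (an assumed similarity measure)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]


def best_split(s: str) -> tuple[int, int]:
    """Brute force: (distance, split index) of the most similar two-part split.

    Requires len(s) >= 2.
    """
    return min((edit_distance(s[:i], s[i:]), i) for i in range(1, len(s)))
```

On the abstract's own examples, `abcabc` is a perfect repeat, while `abcdaacd` splits into `abcd` and `aacd`, which differ by one substitution.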
A Subquadratic Sequence Alignment Algorithm for Unrestricted Cost Matrices
, 2002
Cited by 56 (4 self)
The classical algorithm for computing the similarity between two sequences [36, 39] uses a dynamic programming matrix, and compares two strings of size n in O(n²) time. We address the challenge of computing the similarity of two strings in subquadratic time, for metrics which use a scoring matrix of unrestricted weights. Our algorithm applies to both local and global alignment computations. The speedup is achieved by dividing the dynamic programming matrix into variable sized blocks, as induced by Lempel-Ziv parsing of both strings, and utilizing the inherent periodic nature of both strings. This leads to an O(n² / log n) algorithm for an input of constant alphabet size. For most texts, the time complexity is actually O(hn² / log n) where h ≤ 1 is the entropy of the text.
Institut Gaspard-Monge, Université de Marne-la-Vallée, Cité Descartes, Champs-sur-Marne, 77454 Marne-la-Vallée Cedex 2, France, email: mac@univmlv.fr. Department of Computer Science, Haifa University, Haifa 31905, Israel, phone: (9724) 8240103, FAX: (9724) 8249331; Department of Computer and Information Science, Polytechnic University, Six MetroTech Center, Brooklyn, NY 112013840; email: landau@poly.edu; partially supported by NSF grant CCR0104307, by NATO Science Programme grant PST.CLG.977017, by the Israel Science Foundation (grants 173/98 and 282/01), by the FIRST Foundation of the Israel Academy of Science and Humanities, and by IBM Faculty Partnership Award. Department of Computer Science, Haifa University, Haifa 31905, Israel; On Education Leave from the IBM T.J.W. Research Center; email: michal@cs.haifa.il; partially supported by the Israel Science Foundation (grants 173/98 and 282/01), and by the FIRST Foundation of the Israel Academy of Science ...
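To make the "variable sized blocks" of the abstract above concrete, here is a rough sketch of a Lempel-Ziv factorization. The abstract does not say which variant is used, so the LZ78-style rule below (each phrase is the shortest prefix of the remaining text not seen as an earlier phrase) is an assumption, and `lz78_parse` is a hypothetical name. It shows only the parsing step, not the alignment algorithm.

```python
def lz78_parse(s: str) -> list[str]:
    """Split s into LZ78-style phrases (assumed variant).

    Each phrase extends a previously seen phrase by one character,
    so repetitive texts are cut into fewer, longer phrases.
    """
    phrases = []
    seen = {""}
    cur = ""
    for ch in s:
        cur += ch
        if cur not in seen:
            seen.add(cur)
            phrases.append(cur)
            cur = ""
    if cur:
        # The text ended in the middle of a phrase that was already seen.
        phrases.append(cur)
    return phrases
```

Parsing both input strings this way induces a grid of blocks over the dynamic programming matrix, one block per pair of phrases.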
Multiple source shortest paths in a genus g graph
 Proc. 18th Ann. ACM-SIAM Symp. Discrete Algorithms
Cited by 27 (10 self)
We give an O(g²n log n) algorithm to represent the shortest path tree from all the vertices on a single specified face f in a genus g graph. From this representation, any query distance from a vertex in f can be obtained in O(log n) time. The algorithm uses a kinetic data structure, where the source of the tree iteratively moves across edges in f. In addition, we give applications using these shortest path trees in order to compute the shortest noncontractible cycle and the shortest nonseparating cycle embedded on an orientable 2-manifold in O(g³n log n) time.
On the Common Substring Alignment Problem
Cited by 23 (2 self)
The Common Substring Alignment Problem is defined as follows: Given a set of one or more source strings and a target string, where a given string is a common substring of all the source strings, the goal is to compute the similarity of every source string with the target, without computing the part involving the common substring again and again. Using the classical dynamic programming tables, each appearance of the common substring in a source string would require the computation of all the values in a dynamic programming table whose size grows with the size of the common substring. Here we describe an algorithm which is composed of an encoding stage and an alignment stage. During the first stage, a data structure is constructed which encodes the comparison of the common substring with the target. Then, during the alignment stage, for each comparison of a source string with the target, the precompiled data structure is used to speed up the part involving the common substring. We show how to reduce the alignment work for each appearance of the common substring in a source string, at the cost of encoding work which is executed only once.
Sequence Alignment with Tandem Duplication
 J. Comp. Biol
, 1997
Cited by 19 (2 self)
Algorithm development for comparing and aligning biological sequences has, until recently, been based on the SI model of mutational events which assumes that modification of sequences proceeds through any of the operations of substitution, insertion or deletion (the latter two collectively termed indels).
All semi-local longest common subsequences in subquadratic time
 In Proceedings of CSR
, 2006
Multiple-source shortest paths in embedded graphs
, 2012
Cited by 7 (5 self)
Let G be a directed graph with n vertices and nonnegative weights in its directed edges, embedded on a surface of genus g, and let f be an arbitrary face of G. We describe an algorithm to preprocess the graph in O(gn log n) time, so that the shortest-path distance from any vertex on the boundary of f to any other vertex in G can be retrieved in O(log n) time. Our result directly generalizes the O(n log n)-time algorithm of Klein [Multiple-source shortest paths in planar graphs. In Proc. 16th Ann. ACM-SIAM Symp. Discrete Algorithms, 2005] for multiple-source shortest paths in planar graphs. Intuitively, our preprocessing algorithm maintains a shortest-path tree as its source point moves continuously around the boundary of f. As an application of our algorithm, we describe algorithms to compute a shortest noncontractible or nonseparating cycle in embedded, undirected graphs in O(g²n log n) time.
Approximate Periods of Strings
 In Proc. Tenth Combinatorial Pattern Matching Conference, Lecture Notes in Computer Science 1645
Cited by 6 (2 self)
The study of approximately periodic strings is relevant to diverse applications such as molecular biology, data compression, and computer-assisted music analysis. Here we study different forms of approximate periodicity under a variety of distance functions. We consider three related problems, for two of which we derive polynomial-time algorithms; we then show that the third problem is NP-complete. Key words: periodicity, approximate periods, repetitions, distance function
On the Distribution of K-Tuple Matches for Sequence Homology: A Constant Time Exact Calculation of the Variance
 J. Comp. Bio
, 1997
Cited by 6 (1 self)
We study the distribution of a statistic useful in calculating the significance of the number of k-tuple matches detected in biological sequence homology algorithms. The statistic is R_{n,k}, the total number of heads in head runs of length k or more in a sequence of iid Bernoulli trials of length n. Calculation of the mean is straightforward. Poisson approximation formulas have been used for the variance because they are simple and powerful. Unfortunately, when p = P(Head) is large, the Poisson approximation no longer works well. In our application, p is large, say 0.75, and we have turned instead to direct calculation of the variance. Surprisingly, we are able to show that the variance, which is based on the interactions of O(n²) random variables, can be computed in constant time, independent of the length of the sequence and probability p. This result can be used to calculate the mean and variance of a number of other head run statistics in constant time. Additionally, we show how to extend the result to sequences generated by a stationary Markov process where the variance can be calculated in O(n) time.
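As a small illustration of the statistic defined in this abstract, the sketch below computes R_{n,k} for one 0/1 sequence and, for tiny n only, obtains its exact mean and variance by enumerating all 2^n outcomes. The enumeration is exponential in n and is meant only as a sanity check; it is not the constant-time formula the paper derives, and the function names are hypothetical.

```python
from itertools import groupby, product


def head_run_total(seq, k: int) -> int:
    """R_{n,k}: number of heads (1s) lying in runs of at least k heads."""
    total = 0
    for is_head, run in groupby(seq):
        length = sum(1 for _ in run)
        if is_head and length >= k:
            total += length
    return total


def exact_mean_variance(n: int, k: int, p: float) -> tuple[float, float]:
    """Brute-force mean and variance of R_{n,k} for iid Bernoulli(p) trials."""
    mean = second_moment = 0.0
    for seq in product((0, 1), repeat=n):
        heads = sum(seq)
        prob = p ** heads * (1 - p) ** (n - heads)
        r = head_run_total(seq, k)
        mean += prob * r
        second_moment += prob * r * r
    return mean, second_moment - mean ** 2
```

For k = 1 the statistic is simply the number of heads, so the result should agree with the binomial mean np and variance np(1 - p), which gives a quick consistency check.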
Re-Use Dynamic Programming for Sequence Alignment: An Algorithmic Toolkit
 STRING ALGORITHMICS, UNITED KINGDOM
, 2005