Results 1 - 10 of 19
A Sub-quadratic Sequence Alignment Algorithm for Unrestricted Cost Matrices
, 2002
Cited by 46 (3 self)
The classical algorithm for computing the similarity between two sequences [36, 39] uses a dynamic programming matrix, and compares two strings of size n in O(n^2) time. We address the challenge of computing the similarity of two strings in sub-quadratic time, for metrics which use a scoring matrix of unrestricted weights. Our algorithm applies to both local and global alignment computations. The speed-up is achieved by dividing the dynamic programming matrix into variable sized blocks, as induced by Lempel-Ziv parsing of both strings, and utilizing the inherent periodic nature of both strings. This leads to an O(n^2 / log n) algorithm for an input of constant alphabet size. For most texts, the time complexity is actually O(hn^2 / log n), where h ≤ 1 is the entropy of the text.
Institut Gaspard-Monge, Universite de Marne-la-Vallee, Cite Descartes, Champs-sur-Marne, 77454 Marne-la-Vallee Cedex 2, France, email: mac@univ-mlv.fr.
† Department of Computer Science, Haifa University, Haifa 31905, Israel, phone: (972-4) 824-0103, FAX: (972-4) 824-9331; Department of Computer and Information Science, Polytechnic University, Six MetroTech Center, Brooklyn, NY 11201-3840; email: landau@poly.edu; partially supported by NSF grant CCR-0104307, by NATO Science Programme grant PST.CLG.977017, by the Israel Science Foundation (grants 173/98 and 282/01), by the FIRST Foundation of the Israel Academy of Science and Humanities, and by IBM Faculty Partnership Award.
‡ Department of Computer Science, Haifa University, Haifa 31905, Israel; On Education Leave from the IBM T.J.W. Research Center; email: michal@cs.haifa.il; partially supported by the Israel Science Foundation (grants 173/98 and 282/01), and by the FIRST Foundation of the Israel Academy of Science ...
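For orientation, the classical quadratic dynamic program that this abstract takes as its baseline can be sketched as follows. This is only the textbook global-alignment recurrence, not the sub-quadratic block algorithm the paper describes; the function name, the `score` callback standing in for the scoring matrix, and the linear `gap` penalty are all assumptions made for illustration.

```python
def global_alignment_score(a, b, score, gap):
    """Classical global alignment score in O(len(a) * len(b)) time.

    `score(x, y)` stands in for an arbitrary scoring matrix and `gap` is a
    linear gap penalty; both are hypothetical parameters for this sketch.
    """
    # prev[j] holds the best score of aligning the processed prefix of a
    # with b[:j]; initially the prefix of a is empty, so only gaps apply.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cur[j] = max(
                prev[j - 1] + score(a[i - 1], b[j - 1]),  # align a[i-1] with b[j-1]
                prev[j] + gap,                            # a[i-1] against a gap
                cur[j - 1] + gap,                         # b[j-1] against a gap
            )
        prev = cur
    return prev[len(b)]


def simple_score(x, y):
    """Toy scoring function: +1 for a match, -1 for a mismatch."""
    return 1 if x == y else -1
```

Every one of the (len(a) + 1) × (len(b) + 1) cells is filled exactly once, which is where the O(n^2) bound quoted above comes from.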
Perspectives of Monge Properties in Optimization
, 1995
Cited by 40 (1 self)
An m × n matrix C is called a Monge matrix if c_ij + c_rs ≤ c_is + c_rj for all 1 ≤ i < r ≤ m, 1 ≤ j < s ≤ n. In this paper we present a survey on Monge matrices and related Monge properties and their role in combinatorial optimization. Specifically, we deal with the following three main topics: (i) fundamental combinatorial properties of Monge structures, (ii) applications of Monge properties to optimization problems and (iii) recognition of Monge properties.
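The Monge condition in this abstract can be checked directly. The sketch below is illustrative only (the function name `is_monge` is an assumption, and it is not taken from the survey). It relies on the standard fact that the inequality for all i < r, j < s follows from the inequality on adjacent rows and columns, since the adjacent inequalities sum (telescope) to the general one.

```python
def is_monge(c):
    """Return True if the matrix c (a list of equal-length rows) is Monge.

    Checks c[i][j] + c[i+1][j+1] <= c[i][j+1] + c[i+1][j] for every pair of
    adjacent rows and adjacent columns, which implies the inequality for
    all i < r and j < s.
    """
    rows = len(c)
    cols = len(c[0]) if rows else 0
    return all(
        c[i][j] + c[i + 1][j + 1] <= c[i][j + 1] + c[i + 1][j]
        for i in range(rows - 1)
        for j in range(cols - 1)
    )
```

For example, the matrix with entries (i - j)^2 passes the check, while [[1, 0], [0, 1]] does not, because 1 + 1 > 0 + 0.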
Speeding up Dynamic Programming
- In Proc. 29th Symp. Foundations of Computer Science
, 1988
Cited by 19 (0 self)
In this paper we consider the problem of computing two similar recurrences: the one-dimensional case ...
On the Common Substring Alignment Problem
Cited by 19 (1 self)
The Common Substring Alignment Problem is defined as follows: Given a set of one or more strings and a target string. is a common substring of all strings, that is. The goal is to compute the similarity of all strings with, without computing the part of again and again. Using the classical dynamic programming tables, each appearance of in a source string would require the computation of all the values in a dynamic programming table of size where is the size of. Here we describe an algorithm which is composed of an encoding stage and an alignment stage. During the first stage, a data structure is constructed which encodes the comparison of with. Then, during the alignment stage, for each comparison of a source with, the pre-compiled data structure is used to speed up the part of. We show how to reduce the alignment work, for each appearance of the common substring in a source string, to- at the cost of encoding work, which is executed only once.
Parallel Dynamic Programming
, 1992
Cited by 17 (1 self)
We study the parallel computation of dynamic programming. We consider four important dynamic programming problems which have wide application, and that have been studied extensively in sequential computation: (1) the 1D problem, (2) the gap problem, (3) the parenthesis problem, and (4) the RNA problem. The parenthesis problem has fast parallel algorithms; almost no work has been done for parallelizing the other three. We present a unifying framework for the parallel computation of dynamic programming. We use two well-known methods, the closure method and the matrix product method, as general paradigms for developing parallel algorithms. Combined with various techniques, they lead to a number of new results. Our main results are optimal sublinear-time algorithms for the 1D, parenthesis, and RNA problems.
Cache-oblivious dynamic programming
- In Proc. of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’06
, 2006
Cited by 17 (5 self)
We present efficient cache-oblivious algorithms for several fundamental dynamic programs. These include new algorithms with improved cache performance for longest common subsequence (LCS), edit distance, gap (i.e., edit distance with gaps), and least weight subsequence. We present a new cache-oblivious framework called the Gaussian Elimination Paradigm (GEP) for Gaussian elimination without pivoting that also gives cache-oblivious algorithms for Floyd-Warshall all-pairs shortest paths in graphs and ‘simple DP’, among other problems.
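As a point of reference for the Floyd-Warshall problem named in this abstract, the plain iterative triple loop looks roughly like this. It is the textbook version only, with no attempt at the cache-oblivious recursion the paper proposes; the function name and the use of `math.inf` for missing edges are assumptions of this sketch.

```python
import math


def floyd_warshall(dist):
    """All-pairs shortest path lengths from an n x n distance matrix.

    dist[i][j] is the edge length from i to j (math.inf if there is no
    edge, 0 on the diagonal). The input is not modified.
    """
    n = len(dist)
    d = [row[:] for row in dist]
    for k in range(n):          # allow k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d
```

The three nested loops over k, i, j are the access pattern that a cache-oblivious formulation would reorganise; the update rule itself stays the same.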
Linear and O(n log n) Time Minimum-Cost Matching Algorithms for Quasi-convex Tours (Extended Abstract)
Cited by 16 (3 self)
Samuel R. Buss, Peter N. Yianilos
Let G be a complete, weighted, undirected, bipartite graph with n red nodes, n' blue nodes, and symmetric cost function c(x, y). A maximum matching for G consists of min{n, n'} edges from distinct red nodes to distinct blue nodes. Our objective is to find a minimum-cost maximum matching, i.e. one for which the sum of the edge costs has minimal value. This is the weighted bipartite matching problem; or as it is sometimes called, the assignment problem.
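To make the problem statement concrete, here is a brute-force sketch of a minimum-cost maximum matching. It enumerates every way to match the smaller side into the larger one, so it runs in exponential time and is only meant for tiny inputs; it is not the linear or O(n log n) algorithm of the paper. The function name and the tuple return format are assumptions of this sketch.

```python
from itertools import permutations


def min_cost_maximum_matching(red, blue, cost):
    """Return (total_cost, pairs) for a minimum-cost maximum matching.

    `pairs` is a list of (red_node, blue_node) tuples of length
    min(len(red), len(blue)); `cost(x, y)` is the symmetric edge cost.
    """
    swap = len(red) > len(blue)
    small, large = (blue, red) if swap else (red, blue)
    best = None
    # Each ordered selection from the larger side is one maximum matching.
    for chosen in permutations(large, len(small)):
        pairs = list(zip(chosen, small)) if swap else list(zip(small, chosen))
        total = sum(cost(r, b) for r, b in pairs)
        if best is None or total < best[0]:
            best = (total, pairs)
    return best
```

With red nodes at positions 0 and 10, blue nodes at 1, 9 and 20, and cost |x - y|, the matching pairs 0 with 1 and 10 with 9, for a total cost of 2.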
Evaluating Techniques for Generating Metric-Based Classification Trees
- In Journal of Systems and Software
, 1990
Cited by 12 (1 self)
Metric-based classification trees provide an approach for identifying user-specified classes of high-risk software components throughout the software lifecycle. Based on measurable attributes of software components and processes, this empirically guided approach derives models of problematic software components. These models, which are represented as classification trees, are used on future systems to identify components likely to share the same high-risk properties. Example high-risk component properties include being fault-prone, change-prone, or effort-prone, or containing certain types of faults. Identifying these components allows developers to focus the application of specialized techniques and tools for analyzing, testing, and constructing software. A validation study using metric data from 16 NASA systems showed that the trees had an average classification accuracy of 79.3 percent for fault-prone and effort-prone components in that environment. One fundamental feature of the ...
A Generic Program for Sequential Decision Processes
- Programming Languages: Implementations, Logics, and Programs
, 1995
Cited by 12 (2 self)
This paper is an attempt to persuade you of my viewpoint by presenting a novel generic program for a certain class of optimisation problems, named sequential decision processes. This class was originally identified by Richard Bellman in his pioneering work on dynamic programming [4]. It is a perfect example of a class of problems which are very much alike, but which has until now escaped solution by a single program. Those readers who have followed some of the work that Richard Bird and I have been doing over the last five years [6, 7] will recognise many individual examples: all of these have now been unified. The point of this observation is that even when you are on the lookout for generic programs, it can take a rather long time to discover them. The presentation below will follow that earlier work, by referring to the calculus of relations and the relational theory of data types. I shall however attempt to be light on the formalism, as I do not regard it as essential to the main thesis of this paper. Undoubtedly there are other (perhaps more convenient) notations in which the same ideas could be developed. This paper does assume some degree of familiarity with a lazy functional programming language such as Haskell, Hope, Miranda
Constructing Huffman Trees in Parallel
- SIAM Journal on Computing
, 1995
Cited by 9 (0 self)
We present a parallel algorithm for the Huffman Coding problem. We reduce the Huffman Coding problem to the Concave Least Weight Subsequence problem and give a parallel algorithm that solves the latter problem in O(√n log n) time with n processors on a CREW PRAM. This leads to the first sublinear time o(n^2)-total work parallel algorithm for Huffman Coding. This reduction of the Huffman Coding problem to the Concave Least Weight Subsequence problem also yields an alternative O(n log n)-time (or linear time -- for a sorted input sequence) algorithm for Huffman Coding.
This research was supported by NSF grant CCR9112067. † Part of this work was done while the author was visiting the University of California, Riverside.
1 Introduction
Throughout this paper, a tree is a regular binary tree (i.e. a binary tree in which each internal node has two children). The level of a node in a tree is its distance from the root. The problem of constructing a Huffman tree is, given a seque...
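For comparison with the bounds quoted in this abstract, the standard sequential greedy construction runs in O(n log n) time with a binary heap. The sketch below computes only the cost of the Huffman tree (the weighted path length), not the tree itself, and it is the familiar heap-based method rather than the paper's reduction to Concave Least Weight Subsequence; the function name is an assumption.

```python
import heapq


def huffman_cost(weights):
    """Return the total weighted path length of a Huffman tree.

    Repeatedly merges the two smallest weights; each merge adds the merged
    weight to the total, which equals the sum of weight * depth over leaves.
    """
    heap = list(weights)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, a + b)
    return total
```

For the weights 1, 1, 2, 3, 5 the merges produce 2, 4, 7 and 12, so the cost is 25.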

