Computing similarity between RNA strings
, 1996
Cited by 38 (4 self)
Ribonucleic acid (RNA) strings are strings over the four-letter alphabet {A, C, G, U} with a secondary structure of base pairing between A-U and C-G pairs in the string. Edges are drawn between two bases that are paired in the secondary structure, and these edges have traditionally been assumed to be noncrossing. The noncrossing base pairing naturally leads to a tree-like representation of the secondary structure of RNA strings. In this paper, we address several notions of similarity between two RNA strings that take into account both the primary sequence and secondary base-pairing structure of the strings. We present efficient algorithms for exact matching and approximate matching between two RNA strings. We define a notion of alignment between two RNA strings and devise algorithms based on dynamic programming. We then present a method for optimally aligning a given RNA string with unknown secondary structure to one with known sequence and structure, thus attacking the structure prediction problem in the case when the structure of a closely related sequence is known. The techniques employed to prove our results include reductions to well-known string matching problems allowing wild cards and ranges, and speeding up dynamic programming by using the tree structures implicit in the secondary structure of RNA strings.
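The step from noncrossing base pairs to a tree can be illustrated with a minimal sketch. It is not the paper's algorithm; the names `Node` and `build_tree`, the 0-based index pairs, and the choice to store unpaired bases on the enclosing node are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical helper: the A-U and C-G pairings named in the abstract.
ALLOWED = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}

@dataclass
class Node:
    """One node per base pair; the root (pair=None) encloses the whole string."""
    pair: Optional[Tuple[int, int]]
    children: List["Node"] = field(default_factory=list)
    unpaired: List[int] = field(default_factory=list)

def build_tree(seq: str, pairs: List[Tuple[int, int]]) -> Node:
    """Turn noncrossing base pairs (0-based, i < j) into a nesting tree."""
    partner = {}
    for i, j in pairs:
        if not 0 <= i < j < len(seq):
            raise ValueError(f"bad pair {(i, j)}")
        if (seq[i], seq[j]) not in ALLOWED:
            raise ValueError(f"bases {seq[i]}-{seq[j]} cannot pair")
        if i in partner or j in partner:
            raise ValueError("a base may belong to at most one pair")
        partner[i], partner[j] = j, i

    root = Node(None)
    stack = [root]
    for pos in range(len(seq)):
        if pos not in partner:
            stack[-1].unpaired.append(pos)      # unpaired base inside current pair
        elif partner[pos] > pos:                # left end of a pair: descend
            node = Node((pos, partner[pos]))
            stack[-1].children.append(node)
            stack.append(node)
        else:                                   # right end of a pair: ascend
            if stack[-1].pair != (partner[pos], pos):
                raise ValueError("base pairs cross, so no tree exists")
            stack.pop()
    return root

tree = build_tree("GGAAACC", [(0, 6), (1, 5)])
```

Here the pair (1, 5) nests inside (0, 6), so it becomes a child node, and the three unpaired A bases hang off the inner pair. Crossing pairs such as (0, 2) and (1, 3) are rejected, which is exactly the condition that makes the tree view possible.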
A Generic Program for Sequential Decision Processes
 Programming Languages: Implementations, Logics, and Programs
, 1995
Cited by 13 (2 self)
This paper is an attempt to persuade you of my viewpoint by presenting a novel generic program for a certain class of optimisation problems, named sequential decision processes. This class was originally identified by Richard Bellman in his pioneering work on dynamic programming [4]. It is a perfect example of a class of problems which are very much alike, but which has until now escaped solution by a single program. Those readers who have followed some of the work that Richard Bird and I have been doing over the last five years [6, 7] will recognise many individual examples: all of these have now been unified. The point of this observation is that even when you are on the lookout for generic programs, it can take a rather long time to discover them. The presentation below will follow that earlier work, by referring to the calculus of relations and the relational theory of data types. I shall however attempt to be light on the formalism, as I do not regard it as essential to the main thesis of this paper. Undoubtedly there are other (perhaps more convenient) notations in which the same ideas could be developed. This paper does assume some degree of familiarity with a lazy functional programming language such as Haskell, Hope, Miranda
Linear-Space Algorithms that Build Local Alignments from Fragments
 Algorithmica
, 1995
Cited by 12 (7 self)
Abstract. This paper presents practical algorithms for building an alignment of two long sequences from a collection of "alignment fragments," such as all occurrences of identical 5-tuples in each of two DNA sequences. We first combine a time-efficient algorithm developed by Galil and coworkers with a space-saving approach of Hirschberg to obtain a local alignment algorithm that uses O((M + N + F log N) log M) time and O(M + N) space to align sequences of lengths M and N from a pool of F alignment fragments. Ideas of Huang and Miller are then employed to develop a time- and space-efficient algorithm that computes n best nonintersecting alignments for any n > 1. An example illustrates the utility of these methods.
Approximate Regular Expression Pattern Matching with Concave Gap Penalties
 ALGORITHMICA
, 1992
Cited by 9 (0 self)
Given a sequence A of length M and a regular expression R of length P, an approximate regular expression pattern matching algorithm computes the score of the optimal alignment between A and one of the sequences B exactly matched by R. An alignment between sequences $A = a_1 a_2 \ldots a_M$ and $B = b_1 b_2 \ldots b_N$ is a list of ordered pairs, $\langle (i_1, j_1), (i_2, j_2), \ldots, (i_t, j_t) \rangle$, such that $i_k < i_{k+1}$ and $j_k < j_{k+1}$. In this case, the alignment aligns symbols $a_{i_k}$ and $b_{j_k}$, and leaves blocks of unaligned symbols, or gaps, between them. A scoring scheme S associates costs for each aligned symbol pair and each gap. The alignment's score is the sum of the associated costs, and an optimal alignment is one of minimal score. There are a variety of schemes for scoring alignments. In a concave gap-penalty scoring scheme $S = \{\delta, w\}$, a function $\delta(a, b)$ gives the score of each aligned pair of symbols a and b, and a concave function $w(k)$ gives the score of a gap of lengt...
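The scoring definition above can be made concrete with a small sketch that scores one given alignment; it does not search for the optimal one. The function name `alignment_score`, the example `delta` and `w`, and the decision to charge each maximal run of unaligned symbols in A and in B as a separate gap (including runs before the first and after the last aligned pair) are assumptions, since the truncated abstract does not spell these details out.

```python
import math

def alignment_score(a, b, pairs, delta, w):
    """Score an alignment given as 1-based pairs (i, j), as in the abstract."""
    def gap_cost(k):
        return w(k) if k > 0 else 0.0       # no gap, no charge

    total = 0.0
    prev_i, prev_j = 0, 0                   # position 0 = before the first symbol
    for i, j in pairs:
        if not (prev_i < i <= len(a) and prev_j < j <= len(b)):
            raise ValueError("pairs must be in range and strictly increasing")
        # unaligned blocks skipped in a and in b since the previous pair
        total += gap_cost(i - prev_i - 1) + gap_cost(j - prev_j - 1)
        total += delta(a[i - 1], b[j - 1])  # cost of the aligned symbol pair
        prev_i, prev_j = i, j
    # trailing unaligned blocks after the last pair
    total += gap_cost(len(a) - prev_i) + gap_cost(len(b) - prev_j)
    return total

# Illustrative (assumed) scheme: mismatch costs 1, and w(k) = 1 + sqrt(k) is concave.
delta = lambda x, y: 0.0 if x == y else 1.0
w = lambda k: 1.0 + math.sqrt(k)

score = alignment_score("ACGT", "AGT", [(1, 1), (3, 2), (4, 3)], delta, w)
```

In this example the only gap is the single unaligned C in the first sequence, so the score is w(1). Because w is concave, one long gap costs less than several short gaps of the same total length, which is the property the abstract's scoring scheme is built around.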
Longest common subsequence from fragments via sparse dynamic programming
 Proc. 6th Eur. Symp. Algorithm
, 1998
Review of automatic document formatting
 In Proceedings of the 9th ACM symposium on Document engineering
, 2009
Cited by 8 (2 self)
We review the literature on automatic document formatting with an emphasis on recent work in the field. One common way to frame document formatting is as a constrained optimization problem where decision variables encode element placement, constraints enforce required geometric relationships, and the objective function measures layout quality. We present existing research using this framework, describing the kind of optimization problem being solved and the basic optimization techniques used to solve it. Our review focuses on the formatting of primarily textual documents, including both micro- and macro-typographic concerns. We also cover techniques for automatic table layout. Related problems such as widget and diagram layout, as well as temporal layout issues that arise in multimedia documents, are outside the scope of this review.
Approximation of Staircases By Staircases
, 1992
Cited by 5 (3 self)
The simplest nontrivial monotone functions are "staircases." The problem arises: what is the best approximation of some monotone function f(x) by a staircase with M jumps? In particular: what if f(x) is itself a staircase with N, N > M, steps? This paper considers algorithms for solving, and theorems relating to, this problem. All of the algorithms we propose are space-optimal up to a constant factor and also runtime-optimal except for at most a logarithmic factor. One application of our results is to "data compression" of probability distributions. We find yet another remarkable property of Monge's inequality, called the "concave cost as a function of zigzag number theorem." This property leads to new ways to get speedups in certain 1-dimensional dynamic programming problems satisfying this inequality. Keywords: histograms, data compression, cumulative distribution functions, approximation, monotone functions, dynamic programming, Monge's quadrangle inequality, concave cost...
Challenges in the Compilation of a Domain Specific Language for Dynamic Programming
Cited by 4 (3 self)
Many combinatorial optimization problems in biosequence analysis are solved via dynamic programming. To increase programming productivity and program reliability, a domain specific language embedded in Haskell has been suggested. We point out several shortcomings of this approach, and report on some challenges in the (ongoing) project of migrating this domain specific language from its host language to a directly compiled implementation. Most of these challenges are domain specific optimizations, which not only improve significant constant factors of runtime and space requirements, but also affect asymptotic efficiency. We report on our solutions to some of these problems, and point out others that are still open.
Dynamic Programming as a Software Component
 Proceedings of CSCC
Cited by 4 (0 self)
Abstract: Dynamic programming is usually regarded as a design technique, where each application is designed as an individual program. This contrasts with other techniques such as linear programming, where there exists a single generic program that solves all instances. From a software engineering perspective, the lack of a generic solution to dynamic programming is somewhat unsatisfactory. It would be much preferable if dynamic programming could be understood as a software component, where the ideas common to all its applications are explicit in shared code. In this paper, we argue that such a component does indeed exist, at least for a large class of applications in which the decision process is a sequential scan of the input sequence. We also assess the suitability of C++ for expressing this type of generic program, and argue that the simplicity offered by lazy functional programming is preferable. In particular, functional programs can be manipulated as algebraic expressions. The paper does not present any novel results: it is an introduction to recent work on the formalisation of algorithmic paradigms in software engineering. Keywords: dynamic programming; sequential decision process; software component; functional programming; algebra of programming; program derivation.
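The idea of dynamic programming as a reusable component over a sequential scan can be sketched roughly as follows. This is an illustration only, written in imperative Python rather than the lazy functional style the abstract favours, and it is not the paper's program: the names `sequential_dp` and `knapsack`, and the parameters `extend`, `state`, and `value`, are hypothetical.

```python
def sequential_dp(items, start, extend, state, value):
    """Scan the input once, keeping only the best candidate for each state.

    extend(candidate, item) yields the candidates reachable by one decision,
    state(candidate) names the summary that future decisions depend on,
    value(candidate) is the quantity to maximise.
    """
    best = {state(start): start}
    for item in items:
        nxt = {}
        for cand in best.values():
            for new in extend(cand, item):
                s = state(new)
                # Candidates sharing a state are interchangeable from here on,
                # so only the most valuable one needs to be kept.
                if s not in nxt or value(new) > value(nxt[s]):
                    nxt[s] = new
        best = nxt
    return max(best.values(), key=value)

def knapsack(items, capacity):
    """0/1 knapsack as one instance; items are (weight, value) tuples."""
    def extend(cand, item):
        weight, val, chosen = cand
        yield cand                                   # decision: skip the item
        if weight + item[0] <= capacity:             # decision: take the item
            yield (weight + item[0], val + item[1], chosen + (item,))

    _, val, chosen = sequential_dp(
        items,
        start=(0, 0, ()),
        extend=extend,
        state=lambda c: c[0],                        # total weight so far
        value=lambda c: c[1],
    )
    return val, chosen

best_value, best_items = knapsack([(2, 3), (3, 4), (4, 5), (5, 6)], capacity=5)
```

Only `extend`, `state`, and `value` are specific to the knapsack problem; the scanning and pruning loop is shared code, which is the sense in which the component is generic.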
The Construction of Huffman Codes is a Submodular (`Convex') Optimization Problem over a Lattice of Binary Trees
, 1996
Cited by 4 (1 self)
We show that the space of all binary Huffman codes for a finite alphabet defines a lattice, ordered by the imbalance of the code trees. Representing code trees as path-length sequences, we show that the imbalance ordering is closely related to a majorization ordering on real-valued sequences that correspond to discrete probability density functions. Furthermore, this tree imbalance is a partial ordering that is consistent with the total orderings given by either the external path length (sum of tree path lengths), or the entropy determined by the tree structure. On the imbalance lattice, we show the weighted path-length of a tree (the usual objective function for Huffman coding) is a submodular function, as is the corresponding function on the majorization lattice. Submodular functions are discrete analogues of convex functions. These results give perspective on Huffman coding, and suggest new approaches to coding as optimization over a lattice.
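For readers unfamiliar with the objective function mentioned above, here is a minimal sketch of the standard greedy Huffman construction, returning a code tree as a path-length sequence together with its weighted path length. It illustrates the classical algorithm only, not the lattice-based view the paper develops; the function names and the example weights are assumptions.

```python
import heapq
from itertools import count

def huffman_code_lengths(weights):
    """Return the leaf depth (code length) of each symbol in a Huffman tree."""
    depths = [0] * len(weights)
    tiebreak = count()                      # keeps heap entries comparable on ties
    heap = [(w, next(tiebreak), [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, leaves1 = heapq.heappop(heap)    # two lightest subtrees
        w2, _, leaves2 = heapq.heappop(heap)
        for i in leaves1 + leaves2:
            depths[i] += 1                  # merging pushes every leaf one level down
        heapq.heappush(heap, (w1 + w2, next(tiebreak), leaves1 + leaves2))
    return depths

def weighted_path_length(weights, depths):
    """Sum of weight times depth over all leaves: the Huffman objective."""
    return sum(w * d for w, d in zip(weights, depths))

weights = [5, 9, 12, 13, 16, 45]
depths = huffman_code_lengths(weights)          # path-length sequence of the tree
cost = weighted_path_length(weights, depths)
```

For these weights the construction gives the path-length sequence [4, 4, 3, 3, 3, 1] and a weighted path length of 224. Any other binary code tree over the same six leaves has a weighted path length at least as large, which is the optimization the abstract recasts over a lattice of trees.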