Results 1-10 of 41
A Memory-Efficient Dynamic Programming Algorithm for Optimal Alignment of a Sequence to an RNA Secondary Structure
, 2002
Cited by 104 (11 self)
Background: Covariance models (CMs) are probabilistic models of RNA secondary structure, analogous to profile hidden Markov models of linear sequence. The dynamic programming algorithm for aligning a CM to an RNA sequence of length N is O(N³) in memory. This is only practical for small RNAs. Results:...
A General Edit Distance between RNA Structures
, 2001
Cited by 90 (0 self)
Arc-annotated sequences are useful in representing the structural information of RNA sequences.
Backofen R: MARNA: multiple alignment and consensus structure prediction of RNAs based on sequence structure comparisons
 Bioinformatics
, 2005
Algorithmic aspects of protein structure similarity
 In 40th Annual Symposium on Foundations of Computer Science
, 1999
Cited by 60 (3 self)
We show that calculating contact map overlap (a measure of similarity of protein structures) is NP-hard, but can be solved in polynomial time for several interesting and relevant special cases. We identify an important special case of this problem corresponding to self-avoiding walks, and prove a decomposition theorem and a corollary approximation result for this special case. These are the first approximation algorithms with guaranteed error bounds, and NP-completeness results in the literature in the area of protein structure alignment/fold recognition for measures of structure similarity of practical interest. A
Algorithms and Complexity for Annotated Sequence Analysis
, 1999
Cited by 52 (1 self)
Molecular biologists use algorithms that compare and otherwise analyze sequences that represent genetic and protein molecules. Most of these algorithms, however, operate on the basic sequence and do not incorporate the additional information that is often known about the molecule and its pieces. This research describes schemes to combinatorially annotate this information onto sequences so that it can be analyzed in tandem with the sequence; the overall result would thus reflect both types of information about the sequence. These annotation schemes include adding colours and arcs to the sequence. Colouring a sequence would produce a same-length sequence of colours or other symbols that highlight or label parts of the sequence. Arcs can be used to link sequence symbols (or coloured substrings) to indicate molecular bonds or other relationships. Adding these annotations to sequence analysis problems such as sequence alignment or finding the longest common subsequence can make the problem more complex, often depending on the complexity of the annotation scheme. This research examines the different annotation schemes and the corresponding problems of verifying annotations, creating annotations, and finding the longest common subsequence of pairs of sequences with annotations. This work involves both the conventional complexity framework and parameterized complexity, and includes algorithms and hardness results for both frameworks. Automata and transducers are created for some annotation verification and creation problems. Different restrictions on layered substring and arc annotation are considered to determine what properties an annotation scheme must have to make its incorporation feasible. Extensions to the algorithms that use weighting schemes are explored. Examin...
The Longest Common Subsequence Problem for Arc-Annotated Sequences
 In Proc. of 11th CPM, number 1848 in LNCS
, 2000
Cited by 41 (1 self)
Arc-annotated sequences are useful in representing the structural information of RNA and protein sequences. Recently, the longest arc-preserving common subsequence problem has been introduced in [6, 7] as a framework for studying the similarity of arc-annotated sequences. In this paper, we consider arc-annotated sequences with various arc structures and present some new algorithmic and complexity results on the longest arc-preserving common subsequence problem. Some of our results answer an open question in [6, 7] and some others improve the hardness results in [6, 7]. Keywords: sequence annotation, longest common subsequence, approximation algorithm, maximum independent set, MAX SNP-hard, dynamic programming.
Accurate Multiple Sequence-Structure Alignment of RNA Sequences Using Combinatorial Optimization
, 2007
Cited by 40 (2 self)
Background: The discovery of functional noncoding RNA sequences has led to an increasing interest in algorithms related to RNA analysis. Traditional sequence alignment algorithms, however, fail at computing reliable alignments of low-homology RNA sequences. The spatial conformation of RNA sequences largely determines their function, and therefore RNA alignment algorithms have to take structural information into account. Results: We present a graph-based representation for sequence-structure alignments, which we model as an integer linear program (ILP). We sketch how we compute an optimal or near-optimal solution to the ILP using methods from combinatorial optimization, and present results on a recently published benchmark set for RNA alignments. Conclusions: The implementation of our algorithm yields better alignments in terms of two published scores than the other programs that we tested: This is especially the case with an increasing number of input
Bafna V: Searching Genomes for Noncoding RNA Using FastR
 IEEE/ACM Trans. on Comput. Biol. and Bioinformatics
, 2005
Cited by 26 (6 self)
Abstract: The discovery of novel noncoding RNAs has been among the most exciting recent developments in biology. It has been hypothesized that there is, in fact, an abundance of functional noncoding RNAs (ncRNAs) with various catalytic and regulatory functions. However, the inherent signal for ncRNA is weaker than the signal for protein coding genes, making these harder to identify. We consider the following problem: Given an RNA sequence with a known secondary structure, efficiently detect all structural homologs in a genomic database by computing the sequence and structure similarity to the query. Our approach, based on structural filters that eliminate a large portion of the database while retaining the true homologs, allows us to search a typical bacterial genome in minutes on a standard PC. The results are two orders of magnitude better than the currently available software for the problem. We applied FastR to the discovery of novel riboswitches, which are a class of RNA domains found in the untranslated regions. They are of interest because they regulate metabolite synthesis by directly binding metabolites. We searched all available eubacterial and archaeal genomes for riboswitches from purine, lysine, thiamin, and riboflavin subfamilies. Our results point to a number of novel candidates for each of these subfamilies and include genomes that were not known to contain riboswitches. Index Terms: Noncoding RNA, database search, filtration, riboswitch, bacterial genome.
A polyhedral approach to sequence alignment problems
 DISCRETE APPL. MATH
, 2000
Cited by 24 (2 self)
We study two new problems in sequence alignment both from a practical and a theoretical view, using tools from combinatorial optimization to develop branch-and-cut algorithms. The Generalized Maximum Trace formulation captures several forms of multiple sequence alignment problems in a common framework, among them the original formulation of Maximum Trace. The RNA Sequence Alignment Problem captures the comparison of RNA molecules on the basis of their primary sequence and their secondary structure. Both problems have a characterization in terms of graphs which we reformulate in terms of integer linear programming. We then study the polytopes (or convex hulls of all feasible solutions) associated with the integer linear program for both problems. For each polytope we derive several classes of facet-defining inequalities and show that for some of these classes the corresponding separation problem can be solved in polynomial time. This leads to a polynomial time algorithm for pairwise sequence alignment that is not based on dynamic programming. Moreover, for multiple sequences the branch-and-cut algorithms for both sequence alignment problems are able to solve to optimality instances that are beyond the range of present dynamic programming approaches.
The ABACUS System for Branch-and-Cut-and-Price Algorithms in Integer Programming and Combinatorial Optimization
, 1998
Cited by 20 (0 self)
The development of new mathematical theory and its application in software systems for the solution of hard optimization problems have a long tradition in mathematical programming. In this tradition we implemented ABACUS, an object-oriented software framework for branch-and-cut-and-price algorithms for the solution of mixed integer and combinatorial optimization problems. This paper discusses some difficulties in the implementation of branch-and-cut-and-price algorithms for combinatorial optimization problems and shows how they are managed by ABACUS.