Computing the Similarity of Two Sequences with Nested Arc Annotations
- Theoretical Computer Science
, 2003
Cited by 12 (2 self)
We present exact algorithms for the NP-complete Longest Common Subsequence problem for sequences with nested arc annotations, a problem occurring in structure comparison of RNA. Given two sequences of length at most n and nested arc structure, one of our algorithms determines (if existent) in $O(3.31^{k_1+k_2} n)$ time an arc-preserving subsequence of both sequences, which can be obtained by deleting (together with corresponding arcs) $k_1$ letters from the first and $k_2$ letters from the second sequence. A second algorithm shows that (in case of a four letter alphabet) we can find a length l arc-annotated subsequence in O(12 n) time. This means that the problem is fixed-parameter tractable when parameterized by the number of deletions as well as when parameterized by the subsequence length. Our findings complement known approximation results which give a quadratic time factor-2-approximation for the general and polynomial time approximation schemes for restricted versions of the problem. In addition, we obtain further fixed-parameter tractability results for these restricted versions.
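To make the deletion parameters concrete, here is a minimal sketch for plain sequences without arcs. It is not the algorithm from the abstract: it is the textbook quadratic dynamic program for the ordinary Longest Common Subsequence problem, which is polynomial-time solvable, whereas the arc-annotated variant is NP-complete. The function names `lcs_length` and `deletion_counts` are illustrative assumptions.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence of a and b (no arcs considered)."""
    prev = [0] * (len(b) + 1)  # DP row for the prefix of a processed so far
    for ch in a:
        cur = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                # Matching letters extend the best solution of both shorter prefixes.
                cur.append(prev[j - 1] + 1)
            else:
                # Otherwise drop the last letter of one of the two prefixes.
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def deletion_counts(a: str, b: str) -> tuple[int, int]:
    """Letters to delete from a and from b to reach a longest common subsequence."""
    length = lcs_length(a, b)
    return len(a) - length, len(b) - length


if __name__ == "__main__":
    print(deletion_counts("AGGUAC", "GUUAC"))
```

In this arc-free setting, the two returned values play the role of the deletion counts that parameterize the problem in the abstract.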
Towards Optimally Solving the Longest Common Subsequence Problem for Sequences with Nested Arc Annotations in Linear Time
- In Proc. of the 13th Symposium on Combinatorial Pattern Matching (CPM02), volume 2373 of LNCS
, 2002
Cited by 8 (1 self)
We present exact algorithms for the NP-complete Longest Common Subsequence problem for sequences with nested arc annotations, a problem occurring in structure comparison of RNA. Given two sequences of length at most n and nested arc structure, our algorithm determines (if existent) in time $O(3.31^{k_1+k_2} n)$ an arc-preserving subsequence of both sequences, which can be obtained by deleting (together with corresponding arcs) $k_1$ letters from the first and $k_2$ letters from the second sequence. Thus, the problem is fixed-parameter tractable when parameterized by the number of deletions. This complements known approximation results which give a quadratic time factor-2-approximation for the general and polynomial time approximation schemes for restricted versions of the problem. In addition, we obtain further fixed-parameter tractability results for these restricted versions.
Exact Algorithms for the Longest Common Subsequence Problem for Arc-Annotated Sequences
- Diploma thesis, Universität Tübingen, Fed. Rep. of
, 2002
Cited by 5 (2 self)
Contents
1 Introduction
2 Biological Motivation
2.1 Some Molecular Biology
2.2 Biological Motivation
3 Some Basic Definitions
3.1 LCS and Some Problems from Graph Theory
3.2 Parameterized Complexity
3.3 Arc Annotation
4 Previous Results
4.1 Classical Complexity
4.2 Parameterized Complexity
4.3 Complexity of Arc-Preserving Subsequence Problem
4.4 Overview of This Work
5 c-fragment, c-diagonal LAPCS
5.1 c-fragment ...
Parameterized Complexity and Biopolymer Sequence Comparison
, 2007
Cited by 3 (0 self)
The paper surveys parameterized algorithms and complexities for computational tasks on biopolymer sequences, including the problems of longest common subsequence, shortest common supersequence, pairwise sequence alignment, multiple sequence alignment, structure–sequence alignment and structure–structure alignment. Algorithmic techniques, built on the structural-unit level as well as on the residue level, are discussed.
Inferring an Original Sequence from Erroneous Copies: a Bayesian Approach
- Asia-Pacific BioTech News
, 2003
Cited by 2 (1 self)
This paper considers the problem of inferring an original sequence from a number of erroneous copies. The problem arises in DNA sequencing, particularly in the context of emerging technologies that provide high throughput or other advantages, but at the cost of introducing many errors. We develop a Bayesian probabilistic model of the introduction of errors, and search for a sequence that has maximum posterior probability with respect to the model. We present results of extensive tests in which error-prone sequencing of real DNA was simulated. The results obtained using the new approach are compared to results obtained by deriving a consensus sequence from a multiple sequence alignment. We find that a significant improvement in accuracy is obtained using the new approach. The implication is that high error levels need not be a barrier to the adoption of sequencing technologies that are in other respects promising, because most errors can be detected and corrected using a small number of reads.
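As a point of reference for the comparison baseline mentioned in the abstract, the following is a minimal sketch of deriving a consensus sequence from reads that are already aligned. It is not the paper's Bayesian method. It assumes the reads are equal-length strings with `-` as the gap symbol, and the function name `consensus` is hypothetical.

```python
from collections import Counter


def consensus(aligned_reads: list[str]) -> str:
    """Majority-vote consensus over the columns of pre-aligned reads."""
    if not aligned_reads:
        return ""
    width = len(aligned_reads[0])
    if any(len(read) != width for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    result = []
    for column in zip(*aligned_reads):
        # Pick the most frequent symbol in this column; ties go to the
        # symbol seen first, which is an arbitrary choice in this sketch.
        symbol, _count = Counter(column).most_common(1)[0]
        if symbol != "-":
            # A column where gaps win contributes no letter to the consensus.
            result.append(symbol)
    return "".join(result)


if __name__ == "__main__":
    print(consensus(["ACGTTA", "ACGT-A", "ACCT-A"]))
```

A column-wise vote like this treats every read and every position as equally reliable. An explicit error model, as the abstract describes, replaces that assumption with a probabilistic one.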
Algorithms on Constrained Sequence Alignment
, 2004
One of the fundamental issues that arises in computational biology is Multiple Sequence Alignment (MSA), which needs to be addressed in many applications of Bioinformatics (e.g. study of the SARS Coronavirus and the Human Genome Project). Many algorithms have been proposed to solve the MSA problem, but they often cannot incorporate users' (biologists') knowledge of the functionalities or structures of these sequences into their solutions. This kind of information is very useful for an accurate and biologically meaningful alignment. Constrained Multiple Sequence Alignment (CMSA) was proposed by Tang et al. (2002) to rectify this shortcoming of MSA by introducing a constrained sequence that represents the more important residues in the sequences. Every character of the constrained sequence has to appear in an entire column in the alignment of the multiple sequences, and in the same order as in the constrained sequence.
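The constraint condition in the last sentence can be sketched as a simple check on a finished alignment. This is a verifier only, not an alignment algorithm from the thesis. It assumes the alignment is given as equal-length rows with `-` for gaps, and the name `satisfies_constraint` is hypothetical.

```python
def satisfies_constraint(rows: list[str], constraint: str) -> bool:
    """Check that each constraint character fills an entire alignment column,
    with those columns occurring in the same order as in the constraint."""
    if not rows:
        return constraint == ""
    if any(len(row) != len(rows[0]) for row in rows):
        raise ValueError("all alignment rows must have the same length")

    matched = 0  # number of constraint characters matched so far
    for column in zip(*rows):
        if matched == len(constraint):
            break
        # A column counts only if every row shows the next constraint character.
        if all(symbol == constraint[matched] for symbol in column):
            matched += 1
    return matched == len(constraint)


if __name__ == "__main__":
    alignment = ["AC-GT", "ACTGT", "A-CGT"]
    print(satisfies_constraint(alignment, "AGT"))
    print(satisfies_constraint(alignment, "AC"))
```

Scanning columns left to right and greedily taking the first column that matches the next constraint character is enough here, because the check is a subsequence test over the columns.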

