Combinatorial pattern discovery in biological sequences: the TEIRESIAS algorithm
 BIOINFORMATICS
, 1998
Cited by 228 (14 self)
Motivation: The discovery of motifs in biological sequences is an important problem. Results: This paper presents a new algorithm for the discovery of rigid patterns (motifs) in biological sequences. Our method is combinatorial in nature and able to produce all patterns that appear in at least a (user-defined) minimum number of sequences, yet it manages to be very efficient by avoiding the enumeration of the entire pattern space. Furthermore, the reported patterns are maximal: any reported pattern cannot be made more specific and still keep on appearing at the exact same positions within the input sequences. The effectiveness of the proposed approach is showcased on a number of test cases which aim to: (i) validate the approach through the discovery of previously reported patterns; (ii) demonstrate the capability to identify automatically highly selective patterns particular to the sequences under consideration. Finally, experimental analysis indicates that the algorithm is output sensitive, i.e. its running time is quasi-linear to the size of the generated output.
Dynamic programming algorithms for RNA secondary structure prediction with pseudoknots
 Discrete Applied Mathematics
, 2000
Improving the Practical Space and Time Efficiency of the Shortest-Paths Approach to Sum-of-Pairs Multiple Sequence Alignment
, 1996
Cited by 76 (4 self)
The MSA program, written and distributed in 1989, is one of the few existing programs that attempts to find optimal alignments of multiple protein or DNA sequences. The MSA program implements a branch-and-bound technique together with a variant of Dijkstra's shortest paths algorithm to prune the basic dynamic programming graph. We have made substantial improvements in the time and space usage of MSA. The improvements make feasible a variety of problem instances that were not feasible previously. On some runs we achieve an order of magnitude reduction in space usage and a significant multiplicative factor speedup in running time. To explain how these improvements work, we give a much more detailed description of MSA than has been previously available. In practice, MSA rarely produces a provably optimal alignment and we explain why.
On the Complexity of Loop Fusion
 Parallel Computing
, 1999
Cited by 58 (2 self)
Loop fusion is a program transformation that combines several loops into one. It is used in parallelizing compilers mainly for increasing the granularity of loops and for improving data reuse. The goal of this paper is to study, from a theoretical point of view, several variants of the loop fusion problem, identifying polynomially solvable cases and NP-complete cases, and to make the link between these problems and some scheduling problems that arise from completely different areas. We study, among others, the fusion of loops of different types, and the fusion of loops when combined with loop shifting. Key words: Parallelization, loop fusion, loop distribution, complexity. 1 Introduction Loop fusion is a program transformation that collapses several loops into one. The resulting program compaction and the corresponding increase in the size of the loop body have several well-known impacts on the performance of a program [18]. It was first used to reduce the cost of loop bound testing....
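As a minimal illustration of the transformation this abstract describes (not taken from the paper), the Python sketch below combines two loops over the same index range into one. The function names and the arithmetic are hypothetical and were chosen only to keep the example small.

```python
def unfused(a):
    """Two separate loops over the same index range."""
    n = len(a)
    b = [0] * n
    c = [0] * n
    for i in range(n):      # first loop: produces b
        b[i] = 2 * a[i]
    for i in range(n):      # second loop: consumes b
        c[i] = b[i] + 1
    return b, c


def fused(a):
    """The same computation after fusing the two loops into one."""
    n = len(a)
    b = [0] * n
    c = [0] * n
    for i in range(n):      # single loop: b[i] is read right after it is written
        b[i] = 2 * a[i]
        c[i] = b[i] + 1
    return b, c
```

Fusion preserves the result in this sketch because iteration i of the second loop reads only b[i], which the fused body has already written in the same iteration. The loop bound is also tested n times instead of 2n times.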
Algorithms and Complexity for Annotated Sequence Analysis
, 1999
Cited by 52 (1 self)
Molecular biologists use algorithms that compare and otherwise analyze sequences that represent genetic and protein molecules. Most of these algorithms, however, operate on the basic sequence and do not incorporate the additional information that is often known about the molecule and its pieces. This research describes schemes to combinatorially annotate this information onto sequences so that it can be analyzed in tandem with the sequence; the overall result would thus reflect both types of information about the sequence. These annotation schemes include adding colours and arcs to the sequence. Colouring a sequence would produce a same-length sequence of colours or other symbols that highlight or label parts of the sequence. Arcs can be used to link sequence symbols (or coloured substrings) to indicate molecular bonds or other relationships. Adding these annotations to sequence analysis problems such as sequence alignment or finding the longest common subsequence can make the problem more complex, often depending on the complexity of the annotation scheme. This research examines the different annotation schemes and the corresponding problems of verifying annotations, creating annotations, and finding the longest common subsequence of pairs of sequences with annotations. This work involves both the conventional complexity framework and parameterized complexity, and includes algorithms and hardness results for both frameworks. Automata and transducers are created for some annotation verification and creation problems. Different restrictions on layered substring and arc annotation are considered to determine what properties an annotation scheme must have to make its incorporation feasible. Extensions to the algorithms that use weighting schemes are explored. Examin...
The Parameterized Complexity of Sequence Alignment and Consensus
, 1994
Cited by 47 (12 self)
The Longest Common Subsequence problem is examined from the point of view of parameterized computational complexity. There are several different ways in which parameters enter the problem, such as the number of sequences to be analyzed, the length of the common subsequence, and the size of the alphabet. Lower bounds on the complexity of this basic problem imply lower bounds on a number of other sequence alignment and consensus problems. At issue in the theory of parameterized complexity is whether a problem which takes input $(x, k)$ can be solved in time $f(k) \cdot n^{\alpha}$, where $\alpha$ is independent of $k$ (termed fixed-parameter tractability). It can be argued that this is the appropriate asymptotic model of feasible computability for problems for which a small range of parameter values covers important applications, a situation which certainly holds for many problems in biological sequence analysis. Our main results show that: (1) The Longest Common Subsequence (LCS) parameterized by t...
On the Parameterized Complexity of the fixed Alphabet Shortest Common Supersequence and Longest Common Subsequence Problems
, 2003
Cited by 45 (0 self)
INTRODUCTION The Shortest Common Supersequence (SCS) and the Longest Common Subsequence (LCS) are classical problems in computer science. Shortest Common Supersequence (SCS) integer . is a supersequence Longest Common Subsequence (LCS) integer . is a subsequence The LCS and (not so much) the SCS problems have been extensively studied over the last 30 years (see [7] and references). They are both known to be NP-complete [8, 9]. In particular the case where the number of sequences is 2 has been studied in detail (see [7] and references). A string a is a supersequence of a string b if we can delete some characters in a such that the remaining string is equal to b, e.g. "1234" is a supersequence of "13". A string a is a subsequence of a string b if b is a supersequence of a, e.g. "13" is a subsequence of "1234". 1.1. Sequence Comparison in Bioinformatics With the recent availability of large amounts of molecular sequence data, the LCS and related problems received
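The subsequence and supersequence definitions quoted in this abstract can be checked mechanically. The following Python sketch is an illustration only and does not come from the paper; the function names `is_subsequence` and `is_supersequence` are hypothetical.

```python
def is_subsequence(a: str, b: str) -> bool:
    """Return True if a can be obtained from b by deleting characters."""
    remaining = iter(b)
    # `ch in remaining` consumes the iterator up to the first match,
    # so later characters of a must be found further right in b.
    return all(ch in remaining for ch in a)


def is_supersequence(a: str, b: str) -> bool:
    """a is a supersequence of b exactly when b is a subsequence of a."""
    return is_subsequence(b, a)
```

With the abstract's own example, `is_supersequence("1234", "13")` and `is_subsequence("13", "1234")` both hold, while `is_subsequence("31", "1234")` does not because the order of characters must be preserved.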
Longest Common Subsequences
 In Proc. of 19th MFCS, number 841 in LNCS
, 1994
Cited by 38 (1 self)
The length of a longest common subsequence (LLCS) of two or more strings is a useful measure of their similarity. The LLCS of a pair of strings is related to the 'edit distance', or number of mutations/errors/editing steps required in passing from one string to the other. In this talk, we explore some of the combinatorial properties of the sub- and supersequence relations, survey various algorithms for computing the LLCS, and introduce some results on the expected LLCS for pairs of random strings. 1 Introduction The set $\Sigma^*$ of finite strings over an unordered finite alphabet $\Sigma$ admits of several natural partial orders. Some, such as the substring, prefix, and suffix relations, depend on contiguity and lead to many interesting combinatorial questions with practical applications to string-matching. An excellent survey is given by Aho in [1]. In this talk however we will focus on the 'subsequence' partial order. We say that $u = u_1 \cdots u_m$ is a subsequence of ...
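To make the link between the LLCS and edit distance concrete, here is a minimal Python sketch of the textbook dynamic program for two strings. It is not one of the algorithms surveyed in the talk, and the function names are hypothetical. The distance shown assumes that only insertions and deletions are allowed (no substitutions), in which case it equals len(u) + len(v) - 2 * LLCS(u, v).

```python
def llcs(u: str, v: str) -> int:
    """Length of a longest common subsequence of u and v, in O(len(u) * len(v)) time."""
    prev = [0] * (len(v) + 1)           # row for the empty prefix of u
    for x in u:
        curr = [0]                      # column for the empty prefix of v
        for j, y in enumerate(v, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)               # extend a common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))     # drop x or drop y
        prev = curr
    return prev[-1]


def indel_distance(u: str, v: str) -> int:
    """Edit distance with insertions and deletions only (an assumed variant)."""
    return len(u) + len(v) - 2 * llcs(u, v)
```

Only two rows of the table are kept, so the sketch uses space proportional to the length of v instead of the full quadratic table.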
The emergence of pattern discovery techniques in computational biology
 Metabolic Engineering
, 2000
Cited by 36 (5 self)
In the past few years, pattern discovery has been emerging as a generic tool of choice for tackling problems from the computational biology domain. In this presentation, and after defining the problem in its generality, we review some of the algorithms that have appeared in the literature and describe several applications of pattern discovery to problems from computational biology. © 2000 Academic Press.
Merging Separately Generated Plans with Restricted Interactions
, 1992
Cited by 33 (8 self)
Generating action sequences to achieve a set of goals is a computationally difficult task. When multiple goals are present, the problem is even worse. Although many solutions to this problem have been discussed in the literature, practical solutions focus on the use of restricted mechanisms for planning or the application of domain-dependent heuristics for providing rapid solutions (i.e. domain-dependent planning). One previously proposed technique for handling multiple goals efficiently is to design a planner or even a set of planners (usually domain-dependent) that can be used to generate separate plans for each goal. The outputs are typically either restricted to be independent and then concatenated into a single global plan, or else they are merged together using complex heuristic techniques. In this paper we explore a set of limitations, less restrictive than the assumption of independence, that still allow for the efficient merging of separate plans using straightforward algorith...