Results 1-10 of 13
Multiple sequence alignment
 Protein Structure Prediction — Methods and Protocols
, 2000
Cited by 7 (1 self)
Multiple sequence alignment is a central problem in Bioinformatics and a challenging one for optimisation algorithms. An established integer programming approach is to apply branch-and-cut to a graph-theoretical model. The models are exponentially large but are represented intensionally, and violated constraints can be located in polynomial time. This report describes a new integer program formulation that generates polynomial-sized models small enough to be passed to generic solvers. It is a hybrid formulation relating the sparse alignment graph with a compact encoding of the alignment matrix via channelling constraints. Alignments obtained with a pseudo-Boolean local search algorithm are competitive with those of state-of-the-art algorithms. Execution times are much longer, but in future work we aim to develop a more efficient specialised algorithm.
A SAT-Based Approach to Multiple Sequence Alignment
 Poster, Ninth International Conference on Principles and Practice of Constraint Programming
, 2003
Cited by 5 (3 self)
Abstract. Multiple sequence alignment is a central problem in Bioinformatics. A known integer programming approach is to apply branch-and-cut to exponentially large graph-theoretic models. This paper describes a new integer program formulation that generates models small enough to be passed to generic solvers. The formulation is a hybrid relating the sparse alignment graph with a compact encoding of the alignment matrix via channelling constraints. Alignments obtained with a SAT-based local search algorithm are competitive with those of state-of-the-art algorithms, though execution times are much longer.
A Lagrangian Relaxation Approach for the Multiple Sequence Alignment Problem
 in "Combinatorial Optimization and Applications, First International Conference, COCOA 2007, Xi’an Chine
Cited by 2 (0 self)
Abstract. We present a branch-and-bound (bb) algorithm for the multiple sequence alignment problem (MSA), one of the most important problems in computational biology. The upper bound at each bb node is based on a Lagrangian relaxation of an integer linear programming formulation for MSA. Dualizing certain inequalities, the Lagrangian subproblem becomes a pairwise alignment problem, which can be solved efficiently by a dynamic programming approach. Due to a reformulation w.r.t. additionally introduced variables prior to relaxation we improve the convergence rate dramatically while at the same time being able to solve the Lagrangian problem efficiently. Our experiments show that our implementation, although preliminary, outperforms all exact algorithms for the multiple sequence alignment problem.
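Note: the abstract above reduces its Lagrangian subproblem to pairwise alignment solved by dynamic programming. The sketch below shows only that generic building block, a textbook global alignment (Needleman-Wunsch) with a linear gap penalty. It is not the authors' implementation: the function name and the scoring values (match, mismatch, gap) are assumptions chosen for illustration, and the Lagrangian multipliers that would adjust the scores in their method are omitted.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global pairwise alignment; returns (score, aligned_a, aligned_b).

    Scoring values are illustrative assumptions, not taken from the paper.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # a[i-1] against a gap
                score[i][j - 1] + gap,      # b[j-1] against a gap
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

The table has (n+1)(m+1) cells and each is filled in constant time, which is why the pairwise subproblem is cheap compared with the multiple alignment problem itself.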
Algorithm engineering for optimal alignment of protein
, 2011
Ariel S Schwartz Alignment Metric Accuracy
, 2008
We propose a metric for the space of multiple sequence alignments that can be used to compare two alignments to each other. In the case where one of the alignments is a reference alignment, the resulting accuracy measure improves upon previous approaches, and provides a balanced assessment of the fidelity of both matches and gaps. Furthermore, in the case where a reference alignment is not available, we provide empirical evidence that the distance from an alignment produced by one program to predicted alignments from other programs can be used as a control for multiple alignment experiments. In particular, we show that low accuracy alignments can be effectively identified and discarded. We also show that in the case of pairwise sequence alignment, it is possible to find an alignment that maximizes the expected value of our accuracy measure. Unlike previous approaches based on expected accuracy alignment that tend to maximize sensitivity at the expense of specificity, our method is able to identify unalignable sequence, thereby increasing overall accuracy. In addition, the algorithm allows for control of the sensitivity/specificity tradeoff via the adjustment of a single parameter. These results are confirmed with simulation studies that show that unalignable regions can be distinguished from homologous, conserved sequences. Finally, we propose an extension of the pairwise alignment method to multiple alignment. Our method, which we call AMAP, outperforms existing protein sequence multiple alignment programs on benchmark datasets. A webserver and software downloads are available at
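Note: the abstract above describes comparing two alignments in a way that credits both matches and gaps. The sketch below illustrates that general idea for the pairwise case only. It is a simplified assumption, not the authors' exact definition: each residue is counted as agreeing if it has the same partner (a residue index in the other sequence, or a gap) in both alignments. The function names `partner_map` and `agreement` are hypothetical.

```python
def partner_map(row_a, row_b):
    """For each residue, record the index of its aligned partner, or None for a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    partners_a, partners_b = [], []
    i = j = 0  # residue counters for the two sequences
    for ca, cb in zip(row_a, row_b):
        if ca != "-" and cb != "-":
            partners_a.append(j)
            partners_b.append(i)
            i += 1
            j += 1
        elif ca != "-":
            partners_a.append(None)  # residue of A aligned to a gap
            i += 1
        elif cb != "-":
            partners_b.append(None)  # residue of B aligned to a gap
            j += 1
        # columns containing two gaps carry no information and are skipped
    return partners_a, partners_b


def agreement(aln1, aln2):
    """Fraction of residues whose partner (residue or gap) is the same in both alignments."""
    a1, b1 = partner_map(*aln1)
    a2, b2 = partner_map(*aln2)
    if len(a1) != len(a2) or len(b1) != len(b2):
        raise ValueError("alignments must be over the same two sequences")
    total = len(a1) + len(b1)
    if total == 0:
        return 1.0
    same = sum(x == y for x, y in zip(a1, a2)) + sum(x == y for x, y in zip(b1, b2))
    return same / total


if __name__ == "__main__":
    reference = ("ACGT", "A-GT")
    predicted = ("ACGT", "AG-T")
    print(agreement(reference, predicted))
```

Because a residue placed against a gap in both alignments counts as agreement, this score rewards correctly leaving a region unaligned, which a match-only score would ignore.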
Multiple Structural RNA Alignment with Lagrangian Relaxation
Abstract. In contrast to proteins, many classes of functionally related RNA molecules show a rather weak sequence conservation but instead a fairly well conserved secondary structure. Hence it is clear that any method that relates RNA sequences in the form of (multiple) alignments should take structural features into account. Since multiple alignments are of great importance for subsequent data analysis, research in improving the speed and accuracy of such alignments benefits many other analysis problems. We present a formulation for computing provably optimal, structure-based, multiple RNA alignments and give an algorithm that finds such an optimal solution or at least a very good approximation of it. Our formulation is based on the structural trace formulation of Reinert et al. and uses a recently proposed weighting function of Hofacker et al. that makes use of McCaskill’s approach to compute base pair probability functions. To solve the resulting computational problem we propose an algorithm based on Lagrangian relaxation, which has already proved useful in the two-sequence case. We compare our implementation, mLARA, to two recent programs (MARNA and pmmulti) and demonstrate that we can compute multiple alignments with consensus structures that have a significantly lower minimum free energy term than those computed by the other programs. Our prototypical experiments indicate that our new algorithm is the first approach to successfully compute provably optimal multiple structural alignments in reasonable computation time. Further advantages are its applicability to long sequences where standard dynamic programming approaches must fail and its ability to deal with pseudoknot structures.
Probabilistic Comparative String Analysis
, 2005
Comparative string data has proven to be a valuable resource for improving the accuracy of computational methods for string analysis. In this report we describe the characteristics of comparative string data, focusing on biological sequences and natural language text. We then describe a general probabilistic framework for analyzing pairs of strings, show how posterior-based methods can be used to improve accuracy, and discuss ways to extend the framework to multiple sequences. We apply the posterior-based probabilistic framework to sequence alignment, which is one of the most fundamental problems in sequence analysis. While this is a well-studied problem, we show that posterior-based methods can produce biological sequence alignments that are more accurate than any of the current state-of-the-art methods. Although comparative gene finding is a more complex problem than sequence alignment, it can be modeled using a similar probabilistic model. We describe a comparative gene finding algorithm that uses posterior probabilities to integrate comparative data from