A polyhedral approach to sequence alignment problems
 DISCRETE APPL. MATH
, 2000
Cited by 21 (1 self)
We study two new problems in sequence alignment from both a practical and a theoretical point of view, using tools from combinatorial optimization to develop branch-and-cut algorithms. The Generalized Maximum Trace formulation captures several forms of multiple sequence alignment problems in a common framework, among them the original formulation of Maximum Trace. The RNA Sequence Alignment Problem captures the comparison of RNA molecules on the basis of their primary sequence and their secondary structure. Both problems have a characterization in terms of graphs which we reformulate in terms of integer linear programming. We then study the polytopes (or convex hulls of all feasible solutions) associated with the integer linear program for both problems. For each polytope we derive several classes of facet-defining inequalities and show that for some of these classes the corresponding separation problem can be solved in polynomial time. This leads to a polynomial-time algorithm for pairwise sequence alignment that is not based on dynamic programming. Moreover, for multiple sequences the branch-and-cut algorithms for both sequence alignment problems are able to solve to optimality instances that are beyond the range of present dynamic programming approaches.
Memory-bounded A* graph search
 In Proc. 15th International FLAIRS Conference
, 2002
Cited by 19 (4 self)
We describe a framework for reducing the space complexity of graph search algorithms such as A* that use Open and Closed lists to keep track of the frontier and interior nodes of the search space. We propose a sparse representation of the Closed list in which only a fraction of already expanded nodes need to be stored to perform the two functions of the Closed list: preventing duplicate search effort and allowing solution extraction. Our proposal is related to earlier work on search algorithms that do not use a Closed list at all [Korf and Zhang, 2000]. However, the approach we describe has several advantages that make it effective for a wider variety of problems.
Sweep A*: Space-efficient heuristic search in partially ordered graphs
 In Proceedings of the 15th IEEE International Conference on Tools with Artificial Intelligence
, 2003
Cited by 19 (5 self)
We describe a novel heuristic search algorithm, called Sweep A*, that exploits the regular structure of partially ordered graphs to substantially reduce the memory requirements of search. We show that it outperforms previous search algorithms in optimally aligning multiple protein or DNA sequences, an important problem in bioinformatics. Sweep A* also promises to be effective for other search problems with similar structure.
Memory-efficient A* heuristics for multiple sequence alignment
 In National Conference on Artificial Intelligence (AAAI-02)
, 2002
Cited by 17 (1 self)
The time and space needs of an A* search are strongly influenced by the quality of the heuristic evaluation function. Usually there is a tradeoff, since better heuristics may require more time and/or space to evaluate. Multiple sequence alignment is an important application for single-agent search. The traditional heuristic uses multiple pairwise alignments that require relatively little space. Three-way alignments produce better heuristics, but they are not used in practice due to the large space requirements. This paper presents a memory-efficient way to represent three-way heuristics as an octree. The required portions of the octree are computed on demand. The octree-supported three-way heuristics result in such a substantial reduction to the size of the A* open list that they offset the additional space and time requirements for the three-way alignments. The resulting multiple sequence alignments are computed faster and with less memory than with A* using traditional pairwise heuristics.
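For orientation, the traditional pairwise heuristic mentioned in this abstract can be sketched as follows. This is a minimal illustration and not the paper's octree method: it assumes a sum-of-pairs cost with a linear gap penalty, and the cost constants `GAP` and `MISMATCH` as well as the function names are hypothetical choices made for the example. The heuristic for a search state is the sum, over all sequence pairs, of the optimal cost of aligning the remaining suffixes of that pair.

```python
import heapq
from itertools import combinations, product

GAP, MISMATCH = 2, 1  # hypothetical costs, chosen only for illustration


def pair_cost(x, y):
    """Cost of one pair of symbols inside an alignment column."""
    if x == "-" and y == "-":
        return 0
    if x == "-" or y == "-":
        return GAP
    return 0 if x == y else MISMATCH


def suffix_table(a, b):
    """t[i][j] = optimal cost of aligning the suffixes a[i:] and b[j:]."""
    n, m = len(a), len(b)
    t = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n and j == m:
                continue  # both suffixes empty: cost 0
            options = []
            if i < n and j < m:
                options.append(t[i + 1][j + 1] + pair_cost(a[i], b[j]))
            if i < n:
                options.append(t[i + 1][j] + GAP)
            if j < m:
                options.append(t[i][j + 1] + GAP)
            t[i][j] = min(options)
    return t


def astar_msa_cost(seqs):
    """Optimal sum-of-pairs alignment cost, found by A* on the position lattice."""
    k = len(seqs)
    pairs = list(combinations(range(k), 2))
    tables = {(p, q): suffix_table(seqs[p], seqs[q]) for p, q in pairs}

    def h(pos):
        # Pairwise heuristic: sum of optimal pairwise suffix alignment costs.
        return sum(tables[p, q][pos[p]][pos[q]] for p, q in pairs)

    start = (0,) * k
    goal = tuple(len(s) for s in seqs)
    best = {start: 0}
    open_list = [(h(start), 0, start)]
    while open_list:
        _, g, pos = heapq.heappop(open_list)
        if pos == goal:
            return g
        if g > best.get(pos, float("inf")):
            continue  # stale queue entry
        # Each move consumes one symbol from a non-empty subset of sequences.
        for step in product((0, 1), repeat=k):
            if not any(step):
                continue
            nxt = tuple(p + d for p, d in zip(pos, step))
            if any(n > len(s) for n, s in zip(nxt, seqs)):
                continue
            column = [s[p] if d else "-" for s, p, d in zip(seqs, pos, step)]
            g2 = g + sum(pair_cost(column[p], column[q]) for p, q in pairs)
            if g2 < best.get(nxt, float("inf")):
                best[nxt] = g2
                heapq.heappush(open_list, (g2 + h(nxt), g2, nxt))
    return None
```

Calling `astar_msa_cost(["ACGT", "AGT", "ACT"])` returns the optimal cost under these assumed penalties. The pairwise tables need only quadratic space per pair, which is the "relatively little space" the abstract refers to; a three-way table would need cubic space per triple, which is what motivates a sparser representation.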
Space-efficient memory-based heuristics
 In National Conference on Artificial Intelligence (AAAI-04)
, 2004
Cited by 16 (1 self)
A memory-based heuristic is a heuristic function that is stored in a lookup table. Very accurate heuristics have been created by building very large lookup tables, sometimes called pattern databases. Most previous work assumes that a memory-based heuristic is computed for the entire state space, and the cost of computing it is amortized over many problem instances. But in some cases, it may be useful to compute a memory-based heuristic for a single problem instance. If the start and goal states of the problem instance are used to restrict the region of the state space for which the heuristic is needed, the time and space used to compute the heuristic may be substantially reduced. In this paper, we review recent work that uses this idea to compute space-efficient heuristics for the multiple sequence alignment problem. We then describe a novel development of this idea that is simpler and more general. Our approach leads to improved performance in solving the multiple sequence alignment problem, and is general enough to apply to other domains.
Multiple sequence alignment with arbitrary gap costs: Computing an optimal solution using polyhedral combinatorics
, 2002
A Branch-and-Cut Algorithm for Multiple Sequence Alignment
Cited by 3 (1 self)
Abstract. We consider a branch-and-cut approach for solving the multiple sequence alignment problem, which is a central problem in computational biology. We propose a general model for this problem in which arbitrary gap costs are allowed. An interesting aspect of our approach is that the three (exponentially large) classes of natural valid inequalities that we consider turn out to be both facet-defining for the convex hull of integer solutions and separable in polynomial time. Both the proofs that these classes of valid inequalities are facet-defining and the description of the separation algorithms are far from trivial. Experimental results on several benchmark instances show that our method outperforms the best tools developed so far, in that it produces alignments that are better from a biological point of view. A noteworthy outcome of the results is the effectiveness of using branch-and-cut with only a carefully selected subset of the variables as a heuristic.
Protein Multiple Sequence Alignment
Cited by 3 (1 self)
Protein sequence alignment is the task of identifying evolutionarily or structurally related positions in a collection of amino acid sequences. Although the protein alignment problem has been studied for several decades, many recent studies have demonstrated considerable progress in improving the accuracy or scalability of multiple and pairwise alignment tools, or in expanding the scope of tasks handled by an alignment program. In this chapter, we review state-of-the-art protein sequence alignment and provide practical advice for users of alignment tools.
A Lagrangian Relaxation Approach for the Multiple Sequence Alignment Problem
 In Combinatorial Optimization and Applications, First International Conference, COCOA 2007, Xi’an, China
Cited by 2 (0 self)
Abstract. We present a branch-and-bound (bb) algorithm for the multiple sequence alignment problem (MSA), one of the most important problems in computational biology. The upper bound at each bb node is based on a Lagrangian relaxation of an integer linear programming formulation for MSA. Dualizing certain inequalities, the Lagrangian subproblem becomes a pairwise alignment problem, which can be solved efficiently by a dynamic programming approach. Due to a reformulation w.r.t. additionally introduced variables prior to relaxation we improve the convergence rate dramatically while at the same time being able to solve the Lagrangian problem efficiently. Our experiments show that our implementation, although preliminary, outperforms all exact algorithms for the multiple sequence alignment problem.
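As a reminder of what "solved efficiently by a dynamic programming approach" means for a pairwise alignment problem, here is a minimal sketch of the standard global alignment recurrence (Needleman-Wunsch style) in its score-maximizing form. It is not the paper's Lagrangian subproblem, whose scores would depend on the multipliers; the function name and the `match`, `mismatch`, and `gap` values are assumptions made for the example.

```python
def pairwise_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Maximum global alignment score of a and b with a linear gap penalty.

    Runs in O(len(a) * len(b)) time and keeps only two rows of the table.
    """
    # Row 0: the empty prefix of a aligned against each prefix of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # prefix a[:i] aligned against the empty prefix of b
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # a[i-1] against a gap
                curr[j - 1] + gap,  # b[j-1] against a gap
            ))
        prev = curr
    return prev[-1]
```

In a Lagrangian scheme of the kind the abstract describes, a routine like this would be called repeatedly with updated scores, which is why the quadratic running time of the subproblem matters.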