A Lagrangian Relaxation Approach for the Multiple Sequence Alignment Problem
in Combinatorial Optimization and Applications, First International Conference, COCOA 2007, Xi’an, China
Cited by 4 (0 self)
Abstract. We present a branch-and-bound (bb) algorithm for the multiple sequence alignment problem (MSA), one of the most important problems in computational biology. The upper bound at each bb node is based on a Lagrangian relaxation of an integer linear programming formulation for MSA. Dualizing certain inequalities, the Lagrangian subproblem becomes a pairwise alignment problem, which can be solved efficiently by a dynamic programming approach. Due to a reformulation w.r.t. additionally introduced variables prior to relaxation, we improve the convergence rate dramatically while at the same time being able to solve the Lagrangian problem efficiently. Our experiments show that our implementation, although preliminary, outperforms all exact algorithms for the multiple sequence alignment problem.
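The abstract above notes that a pairwise alignment problem can be solved efficiently by dynamic programming. As a rough illustration only, here is a minimal sketch of the textbook global alignment recurrence (Needleman-Wunsch) in maximization form. The function name and the scoring parameters (match, mismatch, gap) are assumptions chosen for the example; this is not the paper's Lagrangian subproblem, which may carry additional terms.

```python
def pairwise_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of strings a and b.

    Textbook Needleman-Wunsch with a linear gap penalty.
    All scoring values are hypothetical defaults.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # b[:j] aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # a[i-1] against a gap
                score[i][j - 1] + gap,      # b[j-1] against a gap
            )
    return score[n][m]
```

The table has (len(a)+1) x (len(b)+1) cells and each cell is filled in constant time, which is what makes the pairwise case tractable compared with the multiple-sequence case.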
K-group A* for multiple sequence alignment with quasi-natural gap costs
In Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI04), 2004
Cited by 1 (1 self)
Alignment of multiple protein or DNA sequences is an important problem in Bioinformatics. Previous work has shown that the A* search algorithm can find optimal alignments for up to several sequences, and that a K-group generalization of A* can find approximate alignments for much larger numbers of sequences [6]. In this paper, we describe the first implementation of K-group A* that uses quasi-natural gap costs, the cost model used in practice by biologists. We also introduce a new method for computing gap-opening costs in profile alignment. Our results show that K-group A* can efficiently find optimal or close-to-optimal alignments for small groups of sequences, and, for large numbers of sequences, it can find higher-quality alignments than the widely used CLUSTAL family of approximate alignment tools. This demonstrates the benefits of A* in aligning large numbers of sequences, as typically compared by biologists, and suggests that K-group A* could become a practical tool for multiple sequence alignment.
Faster algorithms for optimal multiple sequence alignment based on pairwise comparisons
in IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), 2006
Cited by 1 (0 self)
Abstract—Multiple Sequence Alignment (MSA) is one of the most fundamental problems in computational molecular biology. The running time of the best known scheme for finding an optimal alignment, based on dynamic programming, increases exponentially with the number of input sequences. Hence, many heuristics were suggested for the problem. We consider a version of the MSA problem where the goal is to find an optimal alignment in which matches are restricted to positions in predefined matching segments. We present several techniques for making the dynamic programming algorithm more efficient, while still finding an optimal solution under these restrictions. We prove that it suffices to find an optimal alignment of the predefined sequence segments, rather than single letters, thereby reducing the input size and thus improving the running time. We also identify “shortcuts” that expedite the dynamic programming scheme. Empirical study shows that, taken together, these observations lead to an improved running time over the basic dynamic programming algorithm by 4 to 12 orders of magnitude, while still obtaining an optimal solution. Under the additional assumption that matches between segments are transitive, we further improve the running time for finding the optimal solution by restricting the search space of the dynamic programming algorithm. Index Terms—Multiple Sequence Alignment, algorithms, dynamic programming, shortest path.
Optimal Sum-of-Pairs Multiple Sequence Alignment using Incremental Carrillo-and-Lipman Bounds
Cited by 1 (0 self)
Accepted for publication in Journal of Computational Biology. Alignment of sequences is an important routine in various areas of science, notably molecular biology. Multiple sequence alignment is a computationally hard optimization problem which involves the consideration of different possible alignments in order to find an optimal one, given a measure of goodness of alignments. Dynamic programming algorithms are generally well suited for the search of optimal alignments, but are constrained by unwieldy space requirements for large numbers of sequences. Carrillo and Lipman devised a method that helps to reduce the search space for an optimal alignment under a sum-of-pairs measure using bounds on the scores of its pairwise projections. In this paper we generalize Carrillo and Lipman bounds and demonstrate a novel approach for finding optimal sum-of-pairs multiple alignments that allows incremental pruning of the optimal alignment search space. This approach can result in a drastic pruning of the final search space polytope (where we search for the optimal alignment) when compared to Carrillo and Lipman’s approach and hence allows many runs that are not feasible with the original method.
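To make the sum-of-pairs measure and the original Carrillo-Lipman bound concrete, the following is a small sketch in cost-minimization form. It assumes a linear gap cost and illustrative cost values; the function names are hypothetical, and it shows only the classic bound, not the incremental generalization the abstract describes. The bound states that, if U is the cost of any feasible multiple alignment and d(k,l) are the optimal pairwise costs, then the projection of an optimal multiple alignment onto the pair (i,j) costs at most d(i,j) + U - sum of all d(k,l).

```python
from itertools import combinations


def pair_cost(x, y, mismatch=1, gap=2):
    """Cost of one aligned character pair; '-' is a gap (hypothetical costs)."""
    if x == "-" and y == "-":
        return 0  # gap-gap columns vanish in the pairwise projection
    if x == "-" or y == "-":
        return gap
    return 0 if x == y else mismatch


def sum_of_pairs_cost(rows):
    """Sum-of-pairs cost of a multiple alignment given as equal-length rows."""
    total = 0
    for r, s in combinations(rows, 2):
        total += sum(pair_cost(x, y) for x, y in zip(r, s))
    return total


def carrillo_lipman_bounds(pairwise_opt, upper_bound):
    """Upper bound on each pairwise projection cost of an optimal alignment.

    pairwise_opt: dict mapping (i, j) to the optimal pairwise cost d(i, j).
    upper_bound:  cost U of any feasible multiple alignment.
    """
    lower = sum(pairwise_opt.values())  # sum of pairwise optima
    slack = upper_bound - lower         # U - L
    return {pair: d + slack for pair, d in pairwise_opt.items()}
```

For example, the rows "AC-T", "A-GT", "ACGT" give a feasible alignment whose cost can serve as U; cells of the dynamic programming lattice whose pairwise projections would exceed the returned bounds can be pruned.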
External Memory Best-First Search for Multiple Sequence Alignment
mhatem and ruml at cs.unh.edu. Multiple sequence alignment (MSA) is a central problem in computational biology. It is well known that MSA can be formulated as a shortest path problem and solved using heuristic search, but the memory requirement of A* makes it impractical for all but the smallest problems. Partial Expansion A* (PEA*) reduces the memory requirement of A* by generating only the most promising successor nodes. However, even PEA* exhausts available memory on many problems. Another alternative is Iterative Deepening Dynamic Programming, which uses an uninformed search order but stores only the nodes along the search frontier. However, it too cannot scale to the largest problems. In this paper, we propose storing nodes on cheap and plentiful secondary storage. We present a new general-purpose algorithm, Parallel External PEA* (PE2A*), that combines PEA* with Delayed Duplicate Detection to take advantage of external memory and multiple processors to solve large MSA problems. In our experiments, PE2A* is the first algorithm capable of solving the entire Reference Set 1 of the standard BAliBASE benchmark using a biologically accurate cost function. This work suggests that external best-first search can effectively use heuristic information to surpass methods that rely on uninformed search orders.
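The shortest-path formulation mentioned above can be sketched as plain in-memory A* over the alignment lattice. This is only an illustration of the baseline idea, not PEA* or PE2A*: it keeps every node in memory, uses a simple linear gap cost rather than a biologically accurate one, and all names and cost values are assumptions. A node is a tuple of positions, one per sequence; an edge consumes a character from some non-empty subset of the sequences; the heuristic is the sum of optimal pairwise costs of the remaining suffixes, which is a standard admissible bound for sum-of-pairs cost.

```python
import heapq
from itertools import combinations, product

MISMATCH, GAP = 1, 2  # hypothetical costs for illustration


def pair_cost(x, y):
    """Cost of a character pair in one column; None stands for a gap."""
    if x is None and y is None:
        return 0
    if x is None or y is None:
        return GAP
    return 0 if x == y else MISMATCH


def suffix_table(a, b):
    """t[i][j] = optimal pairwise cost of aligning a[i:] with b[j:]."""
    n, m = len(a), len(b)
    t = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        t[i][m] = (n - i) * GAP
    for j in range(m - 1, -1, -1):
        t[n][j] = (m - j) * GAP
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            t[i][j] = min(t[i + 1][j + 1] + pair_cost(a[i], b[j]),
                          t[i + 1][j] + GAP,
                          t[i][j + 1] + GAP)
    return t


def astar_msa_cost(seqs):
    """Optimal sum-of-pairs alignment cost of seqs, found with A*."""
    k = len(seqs)
    pairs = list(combinations(range(k), 2))
    tables = {(p, q): suffix_table(seqs[p], seqs[q]) for p, q in pairs}

    def h(pos):
        # Admissible heuristic: sum of optimal pairwise suffix costs.
        return sum(tables[p, q][pos[p]][pos[q]] for p, q in pairs)

    start = (0,) * k
    goal = tuple(len(s) for s in seqs)
    best = {start: 0}                # cheapest known cost to each node
    heap = [(h(start), 0, start)]    # entries are (f, g, node)
    while heap:
        _, g, pos = heapq.heappop(heap)
        if pos == goal:
            return g
        if g > best.get(pos, float("inf")):
            continue  # stale heap entry
        for step in product((0, 1), repeat=k):
            if not any(step):
                continue  # a column must consume at least one character
            nxt = tuple(p + d for p, d in zip(pos, step))
            if any(nxt[i] > len(seqs[i]) for i in range(k)):
                continue
            col = [seqs[i][pos[i]] if step[i] else None for i in range(k)]
            ng = g + sum(pair_cost(col[p], col[q]) for p, q in pairs)
            if ng < best.get(nxt, float("inf")):
                best[nxt] = ng
                heapq.heappush(heap, (ng + h(nxt), ng, nxt))
    return None
```

The `best` dictionary and the heap are the structures that grow with the size of the lattice, which is the memory pressure the abstract refers to when motivating partial expansion and external storage.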
Sorting with Line Storage Systems
We consider a problem that arises in car production. In particular, we focus on the sorting of car bodies with respect to their designated enamel colors for a paint shop. Our objective is to sort a given car body sequence so that the number of color changes within the sorted sequence is minimized. Current technology is to perform the realignment of a sequence by the use of a line storage system.
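The objective described above, the number of color changes in a car body sequence, can be written down in a few lines. This sketch only evaluates the objective for a given sequence; it does not model the line storage system or its constraints, and the function name and color labels are assumptions.

```python
def color_changes(sequence):
    """Count adjacent positions in the sequence whose colors differ."""
    return sum(1 for prev, cur in zip(sequence, sequence[1:]) if prev != cur)
```

A sequence grouped by color has one fewer change than it has distinct colors, which is the best any reordering can achieve when no storage restrictions apply.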
unknown title
2004
A memory-based heuristic is a heuristic function that is stored in a lookup table. Very accurate heuristics have been created by building very large lookup tables, sometimes called pattern databases. Most previous work assumes that a memory-based heuristic is computed for the entire state space, and the cost of computing it is amortized over many problem instances. But in some cases, it may be useful to compute a memory-based heuristic for a single problem instance. If the start and goal states of the problem instance are used to restrict the region of the state space for which the heuristic is needed, the time and space used to compute the heuristic may be substantially reduced. In this paper, we review recent work that uses this idea to compute space-efficient heuristics for the multiple sequence alignment problem. We then describe a novel development of this idea that is simpler and more general. Our approach leads to improved performance in solving the multiple sequence alignment problem, and is general enough to apply to other domains.