## The Practical Use of the A* Algorithm for Exact Multiple Sequence Alignment (1997)

Venue: Journal of Computational Biology

Citations: 18 (3 self)

### BibTeX

@ARTICLE{Lermen97thepractical,
    author = {Martin Lermen and Knut Reinert},
    title = {The Practical Use of the A* Algorithm for Exact Multiple Sequence Alignment},
    journal = {Journal of Computational Biology},
    year = {1997},
    volume = {7},
    pages = {2000}
}

### Abstract

Multiple alignment is an important problem in computational biology. It is well known that it can be solved exactly by a dynamic programming algorithm, which in turn can be interpreted as a shortest path computation in a directed acyclic graph. The A* algorithm (or goal-directed unidirectional search) is a technique that speeds up the computation of a shortest path by transforming the edge lengths without losing the optimality of the shortest path. We implemented the A* algorithm in a computer program similar to MSA [GKS95] and FMA [SI97b]. We incorporated in this program new bounding strategies for both lower and upper bounds and show that the A* algorithm, together with our improvements, can speed up computations considerably. Additionally, we show that the A* algorithm together with a standard bounding technique is superior to the well-known Carillo-Lipman bounding since it excludes more nodes from consideration.
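The shortest-path view in the abstract can be illustrated on the simplest case, two sequences. The following is a minimal sketch, not the authors' program: it runs A* on the pairwise alignment grid with unit mismatch and gap costs, and uses the difference in remaining lengths as a lower bound on the cost to the sink. The function name, the cost parameters, and this particular lower bound are assumptions chosen for illustration; the paper's bounds for multiple sequences are more elaborate.

```python
import heapq


def astar_alignment_cost(a, b, mismatch=1, gap=1):
    """Return (optimal cost, number of expanded nodes) for aligning a and b.

    Nodes are grid cells (i, j); edges are diagonal (match/mismatch) and
    horizontal/vertical (gap) moves. Costs are assumed non-negative.
    """
    n, m = len(a), len(b)

    def lower_bound(i, j):
        # Hypothetical bound: the suffixes differ in length by this many
        # characters, so at least that many gaps are still required.
        return gap * abs((n - i) - (m - j))

    best = {(0, 0): 0}
    heap = [(lower_bound(0, 0), 0, 0, 0)]  # (priority, cost so far, i, j)
    expanded = 0
    while heap:
        _, cost, i, j = heapq.heappop(heap)
        if cost > best.get((i, j), float("inf")):
            continue  # stale queue entry, a cheaper path was found later
        expanded += 1
        if (i, j) == (n, m):
            return cost, expanded
        moves = []
        if i < n and j < m:
            moves.append((i + 1, j + 1, 0 if a[i] == b[j] else mismatch))
        if i < n:
            moves.append((i + 1, j, gap))
        if j < m:
            moves.append((i, j + 1, gap))
        for ni, nj, weight in moves:
            new_cost = cost + weight
            if new_cost < best.get((ni, nj), float("inf")):
                best[(ni, nj)] = new_cost
                priority = new_cost + lower_bound(ni, nj)
                heapq.heappush(heap, (priority, new_cost, ni, nj))
    raise ValueError("sink not reachable")
```

Because the bound never overestimates the remaining cost and changes by at most one gap per move, the first time the sink is taken from the queue its cost is optimal. For identical inputs the search only expands the main diagonal, which is the kind of pruning the abstract refers to.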

### Citations

1410 |
A general method applicable to the search for similarities in the amino acid sequence of two proteins
- Needleman, Wunsch
- 1970
Citation Context ...e sequence alignment. It is used for extracting and representing biologically important commonalities from a set of sequences. It is easy to generalize the standard algorithm of Needleman and Wunsch ([NW70]) to more than two sequences. However, the time and space complexity grows exponentially in the number of sequences. Solving the problem to optimality is therefore only tractable for small problem inst...
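For reference, the two-sequence dynamic program that this excerpt generalizes can be sketched as follows. This is an illustrative cost-minimizing variant with unit mismatch and gap costs (the original Needleman-Wunsch formulation maximizes a similarity score); the function name and parameters are assumptions, not taken from the paper.

```python
def pairwise_alignment_cost(a, b, mismatch=1, gap=1):
    """Minimum alignment cost of a and b via row-by-row dynamic programming."""
    n, m = len(a), len(b)
    # prev[j] holds the cost of aligning the current prefix of a with b[:j].
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [i * gap] + [0] * m
        for j in range(1, m + 1):
            substitute = prev[j - 1] + (0 if a[i - 1] == b[j - 1] else mismatch)
            cur[j] = min(substitute, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev[m]
```

Each cell has three predecessors here. With K sequences of length n the table has (n + 1)^K cells and each cell has up to 2^K - 1 predecessors, which is the exponential growth the excerpt mentions.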

418 |
Combinatorial Algorithms for Integrated Circuit Layout
- Lengauer
Citation Context ...$) = L(q \to t)$. The redefinition of the edge weights directs the search in the grid more towards the sink node t. Therefore this technique is also called Goal Directed Unidirectional Search (GDUS) (cf. [Len90]). We now want to apply the simple bounding procedure described above. With the new edge weights, a shortest path from s to q has the length $c'(s \to q) = c(s \to q) + L(q \to t) - L(s \to t)$. Since the ...
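The path-length identity in this excerpt follows from a telescoping sum. The excerpt is cut off before the per-edge definition, so the derivation below assumes the usual reduced-cost form of the redefined weights:

$$
c'(u \to v) = c(u \to v) + L(v \to t) - L(u \to t)
$$

For any path $s = v_0 \to v_1 \to \dots \to v_k = q$, the lower-bound terms of consecutive edges cancel:

$$
\sum_{i=0}^{k-1} c'(v_i \to v_{i+1}) = \sum_{i=0}^{k-1} c(v_i \to v_{i+1}) + L(q \to t) - L(s \to t)
$$

The correction depends only on the endpoints s and q, so every s-q path is shifted by the same constant and shortest paths are preserved. Under this assumed form, the new weights are non-negative exactly when $L(u \to t) \le c(u \to v) + L(v \to t)$ holds on every edge.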

396 |
Algorithms on strings, trees, and sequences: Computer science and computational biology
- Gusfield
- 1997
Citation Context ...in from citing further seminal papers concerning pairwise and multiple alignment, because by now a general methodology has been established and the three quite recently published monographs (Gusfield [Gus97], Setubal and Meidanis [SM97], Waterman [Wat95]) give an excellent overview and motivation for the problem. In this paper we show that the application of the so-called A* algorithm, together with new s...

155 |
The Multiple Sequence Alignment Problem in Biology
- Carillo, Lipman
- 1988
Citation Context ...-Lipman bounding. Carillo and Lipman employ a different idea to reduce the number of vertices in the dynamic programming graph. The following property holds for any optimal multiple alignment A (cf. [CL88]): Theorem 2.1 (Carillo, Lipman) Let $A$ be an optimal alignment of the $K$ strings $S_1, \dots, S_K$, $L := L(s \to t)$ be the lower bound defined in Equation 1 and $U = c(A^{\mathrm{heur}})$ be an upper bound for c(A...

153 |
Introduction to computational biology: maps, sequences and genomes, chapters 13,14
- Waterman
- 1995
Citation Context ...g pairwise and multiple alignment, because by now a general methodology has been established and the three quite recently published monographs (Gusfield [Gus97], Setubal and Meidanis [SM97], Waterman [Wat95]) give an excellent overview and motivation for the problem. In this paper we show that the application of the so-called A* algorithm, together with new strategies for computing better lower and upper ...

62 |
Improving the practical space and time efficiency of the shortest-paths approach to sum-of-pairs multiple sequence alignment
- Gupta, Kececioglu, et al.
- 1995
Citation Context ...t speeds up the computation of a shortest path by transforming the edge lengths without losing the optimality of the shortest path. We implemented the A* algorithm in a computer program similar to MSA [GKS95] and FMA [SI97b]. We incorporated in this program new bounding strategies for both lower and upper bounds and show that the A* algorithm, together with our improvements, can speed up computations cons...

58 |
Comparative analysis of multiple protein-sequence alignment methods
- McClure, Vasi, et al.
- 1994

55 |
A platform for combinatorial and geometric computing
- LEDA
- 1999
Citation Context ...the other method which does not consider face-invalid nodes. 4 Computational results. We implemented the described algorithm in C++ using the library of efficient data types and algorithms LEDA (cf. [MN95]). Although this imposes a time and space overhead by a factor of 2 to 3 compared to ad hoc implementations, it makes the software easy to read, to maintain, and to extend. Based on our implementation ...

29 |
DCA: an efficient implementation of the divide-and-conquer approach to simultaneous multiple sequence alignment
- Stoye, Moulton, et al.
- 1997
Citation Context ... problem instances. Nevertheless, exact algorithms are important, because they can be used as a last step of algorithms that use motif-search or divide-and-conquer approaches. For example, Stoye et al. ([SMD97]) try in their approach to divide the sequences at appropriate "slicing" locations which are determined through a branch-and-bound procedure. The resulting subproblems are solved recursively. The recur...

4 |
Introduction to Computational Molecular Biology
- Setubal, Meidanis
- 1997
Citation Context ... papers concerning pairwise and multiple alignment, because by now a general methodology has been established and the three quite recently published monographs (Gusfield [Gus97], Setubal and Meidanis [SM97], Waterman [Wat95]) give an excellent overview and motivation for the problem. In this paper we show that the application of the so-called A* algorithm, together with new strategies for computing bette...

3 |
New flexible approaches for multiple sequence alignment
- Shibuya, Imai
- 1997
Citation Context ...computation of a shortest path by transforming the edge lengths without losing the optimality of the shortest path. We implemented the A* algorithm in a computer program similar to MSA [GKS95] and FMA [SI97b]. We incorporated in this program new bounding strategies for both lower and upper bounds and show that the A* algorithm, together with our improvements, can speed up computations considerably. Additi...

2 |
Strings, algorithms, and machine learning applications for computational biology
- Horton
- 1997
Citation Context ...d. The above proof was first published in the master's thesis of the first author ([Ler97]); however, the authors acknowledge that the above theorem has been shown independently by Horton and Lawler in [Hor97]. The U-bounding reduces the number of relevant nodes substantially. However, we still explore enough of the grid to guarantee that the computed alignment is optimal. Now we give up that guarantee in ...

1 |
MSA 2.1: A program for computing multiple alignments. Source code (http://www.ibc.wustl.edu/ibc/msa.html)
- Kececioglu, Altschul, et al.
- 1994
Citation Context ...which is feasible only for very small problem instances. While the SOP alignment problem is NP-complete, Kececioglu et al. presented in [GKS95] a branch-and-bound algorithm whose implementation (cf. [KAL94]), called MSA in the sequel, can optimally align some examples of six sequences of length 250 in a few minutes. Larger examples, however, require excessive space. In their approach, a heuristic al...

1 |
Multiple sequence alignment
- Lermen
- 1997
Citation Context ...$\to t) \le c(s \to q) + L(q \to t) =: \mathrm{Prio}(q)$. If $CL_{i,j}(q) > U$, then $\mathrm{Prio}(q) > U$, so q is always U-invalid if it is CL-invalid. The above proof was first published in the master's thesis of the first author ([Ler97]); however, the authors acknowledge that the above theorem has been shown independently by Horton and Lawler in [Hor97]. The U-bounding reduces the number of relevant nodes substantially. However, we ...

1 |
Flexible multiple alignment program - version 0.34 alpha, suboptimal and parametrix analysis. Obtained by shibuya@is.s.u-tokyo.ac.jp
- Shibuya, Ikeda
- 1997
Citation Context ...implementation with two other packages for optimal sequence alignment, namely the widely known program MSA in its latest version 2.1 [KAL94], and FMA, a very recent implementation by Shibuya and Imai [SI97a], which also uses an A* strategy. In order not to have an advantage over FMA or MSA, we use their two default cost matrices, which are called dayhoff (MSA) and PAM250 (FMA). All examples were run on ...