## A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by . . . (2003)

Citations: | 849 - 14 self |

### BibTeX

@MISC{Guindon03asimple,,

author = {Stephane Guindon and Olivier Gascuel},

title = {A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by . . .},

year = {2003}

}

### Years of Citing Articles

### OpenURL

### Abstract

The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximumlikelihood principle, which clearly satisfies these requirements. The core of this method is a simple hill-climbing algorithm that adjusts tree topology and branch lengths simultaneously. This algorithm starts from an initial tree built by a fast distance-based method and modifies this tree to improve its likelihood at each iteration. Due to this simultaneous adjustment of the topology and branch lengths, only a few iterations are sufficient to reach an optimum. We used extensive and realistic computer simulations to show that the topological accuracy of this new method is at least as high as that of the existing maximum-likelihood programs and much higher than the performance of distance-based and parsimony approaches. The reduction of computing time is dramatic in comparison with other maximum-likelihood packages, while the likelihood maximization ability tends to be higher. For example, only 12 min were required on a standard personal computer to analyze a data set consisting of 500 rbcL sequences with 1,428 base pairs from plant plastids, thus reaching a speed of the same order as some popular distance-based and parsimony algorithms. This new method is implemented in the PHYML program, which is freely available on our web page: http://www.lirmm.fr/w3ifa/MAAS/. [Algorithm; computer simulations; maximum likelihood; phylogeny; rbcL; RDPII project.] The size of homologous sequence data sets has increased dramatically in recent years, and many of these data sets now involve several hundreds of taxa. Moreover, current probabilist...