Fast and Accurate Phylogeny Reconstruction Algorithms Based on the Minimum-Evolution Principle
 JOURNAL OF COMPUTATIONAL BIOLOGY
, 2002
Cited by 99 (8 self)
The Minimum Evolution (ME) approach to phylogeny estimation has been shown to be statistically consistent when it is used in conjunction with ordinary least-squares (OLS) fitting of a metric to a tree structure. The traditional approach to using ME has been to start with the Neighbor Joining (NJ) topology for a given matrix and then do a topological search from that starting point. The first stage requires O(n³) time, where n is the number of taxa, while the current implementations of the second are in O(p n³) or more, where p is the number of swaps performed by the program. In this paper, we examine a greedy approach to minimum evolution which produces a starting topology in O(n²) time. Moreover, we provide an algorithm that searches for the best topology using nearest neighbor interchanges (NNIs), where the cost of doing p NNIs is O(n² + p n), i.e., O(n²) in practice because p is always much smaller than n. The Greedy Minimum Evolution (GME) algorithm, when used in combination with NNIs, produces trees which are fairly close to NJ trees in terms of topological accuracy. We also examine ME under a balanced weighting scheme, where sibling subtrees have equal weight, as opposed to the standard “unweighted” OLS, where
Fast Neighbor Joining
 In Proc. of the 32nd International Colloquium on Automata, Languages and Programming (ICALP’05)
, 2005
Cited by 23 (2 self)
Reconstructing the evolutionary history of a set of species is a fundamental problem in biology, and methods for solving this problem are gauged based on two characteristics: accuracy and efficiency. Neighbor Joining (NJ) is a so-called distance-based method that, thanks to its good accuracy and speed, has been embraced by the phylogeny community.
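As a rough illustration of what "distance-based" means here, the sketch below shows only the pair-selection step of textbook Neighbor Joining, not the Fast Neighbor Joining algorithm of the paper above. It assumes the standard criterion Q(i, j) = (n - 2) d(i, j) - r(i) - r(j), where r(k) is the sum of row k of the distance matrix; the function name `nj_pick_pair` and the example matrix `D` are made up for this example.

```python
from itertools import combinations


def nj_pick_pair(d):
    """Return (i, j, q) for the pair minimising the standard NJ criterion.

    d is a symmetric distance matrix given as a list of lists.
    Q(i, j) = (n - 2) * d[i][j] - r[i] - r[j], with r[k] the k-th row sum.
    """
    n = len(d)
    if n < 3:
        raise ValueError("need at least three taxa")
    r = [sum(row) for row in d]  # row sums, computed once in O(n^2)
    best = None
    for i, j in combinations(range(n), 2):
        q = (n - 2) * d[i][j] - r[i] - r[j]
        if best is None or q < best[2]:  # strict '<' keeps the first minimum
            best = (i, j, q)
    return best


# Hypothetical 5-taxon additive distance matrix (illustrative data only).
D = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

print(nj_pick_pair(D))  # (0, 1, -50)
```

Scanning all pairs costs O(n²) per join, and a full run performs n - 3 joins, which is where the O(n³) bound usually quoted for NJ comes from.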
On the complexity of distance-based evolutionary tree reconstruction
 In SODA: ACM-SIAM Symposium on Discrete Algorithms
, 2003
Fast and reliable reconstruction of phylogenetic trees with very short edges
 In SODA: ACM-SIAM Symposium on Discrete Algorithms
, 2008
Cited by 18 (4 self)
Phylogenetic reconstruction is the problem of reconstructing an evolutionary tree from sequences corresponding to leaves of that tree. A central goal in phylogenetic reconstruction is to be able to reconstruct the tree as accurately as possible from as short as possible input sequences. The sequence length required for correct topological reconstruction depends on certain properties of the tree, such as its depth and minimal edge weight. Fast converging reconstruction algorithms are considered state-of-the-art in this sense, as they require asymptotically minimal sequence length in order to guarantee (with high probability) correct topological reconstruction of the entire tree. However, when the original phylogenetic tree contains very short edges, this minimal sequence length is still too long for practical purposes. Short
Phylogenies without branch bounds: Contracting the short, pruning the deep
, 2009
Cited by 17 (5 self)
We introduce a new phylogenetic reconstruction algorithm which, unlike most previous rigorous inference techniques, does not rely on assumptions regarding the branch lengths or the depth of the tree. The algorithm returns a forest which is guaranteed to contain all edges that are: 1) sufficiently long and 2) sufficiently close to the leaves. How much of the true tree is recovered depends on the sequence length provided. The algorithm is distance-based and runs in polynomial time.
Alignment-Free Phylogenetic Reconstruction
, 2009
Cited by 15 (1 self)
We introduce the first polynomial-time phylogenetic reconstruction algorithm under a model of sequence evolution allowing insertions and deletions, or indels. Given appropriate assumptions, our algorithm requires sequence lengths growing polynomially in the number of leaf taxa. Our techniques are distance-based and largely bypass the problem of multiple alignment.
The accuracy of fast phylogenetic methods for large datasets
 IN PROC. 7TH PACIFIC SYMP. ON BIOCOMPUTING (PSB02)
, 2002
Cited by 9 (3 self)
Whole-genome phylogenetic studies require various sources of phylogenetic signals to produce an accurate picture of the evolutionary history of a group of genomes. In particular, sequence-based reconstruction will play an important role, especially in resolving more recent events. But using sequences at the level of whole genomes means working with very large amounts of data (large numbers of sequences) as well as large phylogenetic distances, so that reconstruction methods must be both fast and robust as well as accurate. We study the accuracy, convergence rate, and speed of several fast reconstruction methods: neighbor-joining, Weighbor (a weighted version of neighbor-joining), greedy parsimony, and a new phylogenetic reconstruction method based on disk-covering and parsimony search (DCM-NJ+MP). Our study uses extensive simulations based on random birth-death trees, with controlled deviations from ultrametricity. We find that Weighbor, thanks to its sophisticated handling of probabilities, outperforms other methods for short sequences, while our new method is the best choice for sequence lengths above 100. For very large sequence lengths, all four methods have similar accuracy, so that the speed of neighbor-joining and greedy parsimony makes them the two methods of choice.
Large-scale multiple sequence alignment and phylogeny estimation
, 2013
Cited by 6 (0 self)
With the advent of next-generation sequencing technologies, alignment and phylogeny estimation of datasets with thousands of sequences is being attempted. To address these challenges, new algorithmic approaches have been developed that have been able to provide substantial improvements over standard methods. This paper focuses on new approaches for ultra-large tree estimation, including methods for co-estimation of alignments and trees, estimating trees without needing a full sequence alignment, and phylogenetic placement. While the main focus is on methods with empirical performance advantages, we also discuss the theoretical guarantees of methods under Markov models of evolution. Finally, we include a discussion of the future of large-scale phylogenetic analysis.
Designing fast converging phylogenetic methods
 IN PROC. 9TH INT’L CONF. ON INTELLIGENT SYSTEMS FOR MOLECULAR BIOLOGY (ISMB’01), VOLUME 17 OF BIOINFORMATICS
, 2001
Cited by 6 (1 self)
Absolute fast converging phylogenetic reconstruction methods are provably guaranteed to recover the true tree with high probability from sequences that grow only polynomially in the number of leaves, once the edge lengths are bounded arbitrarily from above and below. Only a few methods have been determined to be absolute fast converging; these have all been developed in just the last few years, and most are polynomial time. In this paper, we compare preexisting fast converging methods as well as some new polynomial time methods that we have developed. Our study, based upon simulating evolution under a wide range of model conditions, establishes that our new methods outperform both neighbor joining and the previous fast converging methods, returning very accurate large trees, when these other methods do poorly.
Sequence-Length Requirements for Phylogenetic Methods
, 2002
Cited by 6 (0 self)
We study the sequence lengths required by neighbor-joining, greedy parsimony, and a phylogenetic reconstruction method (DCM-NJ+MP) based on disk-covering and the maximum parsimony criterion. We use extensive simulations based on random birth-death trees, with controlled deviations from ultrametricity, to collect data on the scaling of sequence-length requirements for each of the three methods as a function of the number of taxa, the rate of evolution on the tree, and the deviation from ultrametricity. Our experiments show that DCM-NJ+MP has consistently lower sequence-length requirements than the other two methods when trees of high topological accuracy are desired, although all methods require much longer sequences as the deviation from ultrametricity or the height of the tree grows. Our study has significant implications for large-scale phylogenetic reconstruction (where sequence-length requirements are a crucial factor), but also for future performance analyses in phylogenetics (since deviations from ultrametricity are proving pivotal).
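For readers unfamiliar with ultrametricity, the sketch below checks the three-point condition: a distance matrix is ultrametric exactly when, in every triple of taxa, the two largest pairwise distances are equal. The gap returned by `ultrametric_violation` is a hypothetical illustration only and is not claimed to be the deviation measure used in the study above; the function names and example matrices are assumptions made for this example.

```python
from itertools import combinations


def ultrametric_violation(d):
    """Largest gap between the two biggest distances over all triples.

    Returns 0.0 for an ultrametric matrix (three-point condition holds).
    This gap is an illustrative quantity, not the study's own measure.
    """
    worst = 0.0
    for i, j, k in combinations(range(len(d)), 3):
        _, mid, top = sorted((d[i][j], d[i][k], d[j][k]))
        worst = max(worst, top - mid)
    return worst


def is_ultrametric(d, tol=1e-9):
    """True if every triple satisfies the three-point condition within tol."""
    return ultrametric_violation(d) <= tol


# Clock-like example: taxa 0 and 1 are siblings, taxon 2 is equidistant from both.
clock_like = [[0, 2, 6], [2, 0, 6], [6, 6, 0]]
# Same topology, but the branch leading to taxon 0 is longer (no molecular clock).
non_clock = [[0, 2, 7], [2, 0, 6], [7, 6, 0]]

print(is_ultrametric(clock_like), is_ultrametric(non_clock))
```

The check enumerates all triples, so it takes O(n³) time; that is adequate for a sanity check on small matrices but not intended for large datasets.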