A few logs suffice to build (almost) all trees (I)
 II. THEORETICAL COMPUTER SCIENCE
, 1999
Cited by 101 (24 self)
A phylogenetic tree (also called an "evolutionary tree") is a leaf-labelled tree which represents the evolutionary history for a set of species, and the construction of such trees is a fundamental problem in biology. Here we address the issue of how many sequence sites are required in order to recover the tree with high probability when the sites evolve under standard Markov-style i.i.d. mutation models. We provide analytic upper and lower bounds for the required sequence length, by developing a new (and polynomial-time) algorithm. In particular, we show that when the mutation probabilities are bounded, the required sequence length can grow surprisingly slowly (a power of log n) in the number n of sequences, for almost all trees.
An Empirical Comparison of Phylogenetic Methods on Chloroplast Gene Order Data in Campanulaceae
, 2000
Cited by 50 (20 self)
The first heuristic for reconstructing phylogenetic trees from gene order data was introduced by Blanchette et al. It sought to reconstruct the breakpoint phylogeny and was applied to a variety of datasets. We present a new heuristic for estimating the breakpoint phylogeny which, although not polynomial-time, is much faster in practice than BPAnalysis. We use this heuristic to conduct a phylogenetic analysis of chloroplast genomes in the flowering plant family Campanulaceae. We also present and discuss the results of experimentation on this real dataset with three methods: our new method, BPAnalysis, and the neighbor-joining method, using breakpoint distances, inversion distances, and inversion plus transposition distances.
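As an illustration of the breakpoint distances mentioned in this abstract, the following is a minimal sketch and not the authors' implementation. It assumes genomes are given as signed gene orders (lists of nonzero integers over the same gene set) and treats them as circular by default, as chloroplast genomes usually are; the function names `adjacencies` and `breakpoint_distance` are hypothetical.

```python
def adjacencies(genome, circular=True):
    """Return the set of signed adjacencies of a gene order.

    Assumption: `genome` is a list of nonzero integers, where the sign
    encodes the strand (orientation) of each gene.
    """
    pairs = list(zip(genome, genome[1:]))
    if circular and len(genome) > 1:
        # Close the circle: the last gene is adjacent to the first.
        pairs.append((genome[-1], genome[0]))
    # (a, b) and (-b, -a) are the same adjacency read from opposite
    # strands, so store one canonical representative of each.
    return {min((a, b), (-b, -a)) for a, b in pairs}


def breakpoint_distance(g1, g2, circular=True):
    """Count adjacencies of g1 that do not occur in g2."""
    return len(adjacencies(g1, circular) - adjacencies(g2, circular))


if __name__ == "__main__":
    reference = [1, 2, 3, 4, 5]
    inverted = [1, -3, -2, 4, 5]  # genes 2..3 inverted as a block
    print(breakpoint_distance(reference, inverted))
```

Inverting a single block breaks the two adjacencies at its ends and preserves the ones inside it, so the example counts two breakpoints. A matrix of such pairwise distances is the kind of input a distance method such as neighbor-joining takes.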
A New Fast Heuristic for Computing the Breakpoint Phylogeny and Experimental Phylogenetic Analyses of Real and Synthetic Data
, 2000
Cited by 28 (17 self)
The breakpoint phylogeny is an optimization problem proposed by Blanchette et al. for reconstructing evolutionary trees from gene order data. These same authors also developed and implemented BPAnalysis [3], a heuristic method (based upon solving many instances of the travelling salesman problem) for estimating the breakpoint phylogeny. We present a new heuristic for this purpose; although not polynomial-time, our heuristic is much faster in practice than BPAnalysis. We present and discuss the results of experimentation on synthetic datasets and on the flowering plant family Campanulaceae with three methods: our new method, BPAnalysis, and the neighbor-joining method [25] using several distance estimation techniques. Our preliminary results indicate that, on datasets with slow evolutionary rates and large numbers of genes in comparison with the number of taxa (genomes), all methods recover quite accurate reconstructions of the true evolutionary history (although BPAnal...
Learning Nonsingular Phylogenies and Hidden Markov Models
 Proceedings of the Thirty-Seventh Annual ACM Symposium on Theory of Computing, Baltimore (STOC'05)
, 2005
Cited by 26 (6 self)
In this paper, we study the problem of learning phylogenies and hidden Markov models. We call the Markov model nonsingular if all transition matrices have determinants bounded away from 0 (and 1). We highlight the role of the nonsingularity condition for the learning problem. Learning hidden Markov models without the nonsingularity condition is at least as hard as learning parity with noise. On the other hand, we give a polynomial-time algorithm for learning nonsingular phylogenies and hidden Markov models.
Fast Recovery of Evolutionary Trees with Thousands of Nodes
 RECOMB
, 2001
Cited by 19 (0 self)
We present a novel distance-based algorithm for evolutionary tree reconstruction. Our algorithm reconstructs the topology of a tree with n leaves in O(n^2) time using O(n) working space. In the general Markov model of evolution, the algorithm recovers the topology successfully with (1 - o(1)) probability from sequences with polynomial length in n. Moreover, for almost all trees, our algorithm achieves the same success probability on polylogarithmic sample sizes. The theoretical results are supported by simulation experiments involving trees with 500, 1895, and 3135 leaves. The topologies of the trees are recovered with high success from 2000 bp DNA sequences.
Hybrid Tree Reconstruction Methods
, 1998
Cited by 14 (8 self)
A major computational problem in biology is the reconstruction of evolutionary trees for species sets, and accuracy is measured by comparing the topologies of the reconstructed tree and the model tree. One of the major debates in the field is whether large evolutionary trees can be even approximately accurately reconstructed from biomolecular sequences of realistically bounded lengths (up to about 2000 nucleotides) using standard techniques (polynomial-time distance methods, and heuristics for NP-hard optimization problems). Using both analytical and experimental techniques, we show that on large trees, the two most popular methods in systematic biology, neighbor-joining and maximum parsimony heuristics, as well as two promising methods introduced by theoretical computer scientists, are all likely to have significant errors in the topology reconstruction of the model tree. We also present a new general technique for combining outputs of different methods (thus producing hybri...
The Short Quartet Method
 International Congress on Automata, Languages and Programming
, 1998
Cited by 5 (2 self)
Reconstructing phylogenetic (evolutionary) trees is a major research problem in biology, but unfortunately the current methods are either inconsistent somewhere in the parameter space (and hence do not reconstruct the tree even given unboundedly long sequences), have poor statistical power (and hence require extremely long sequences on large or highly divergent trees), or have computational requirements that are excessive. We describe in this paper a new method, which we call the Short Quartet Method, for inferring evolutionary trees. The Short Quartet Method has great statistical power, is provably consistent throughout the parameter space, and uses only polynomial time. We present the results of experimental studies based upon simulations of sequence evolution that demonstrate its greater statistical power than neighbor-joining [33], perhaps the most popular method for phylogenetic tree inference among molecular biologists. 1 Introduction The study of evolution is a fundamental p...
Fast Recovery of Evolutionary Trees With Thousands of
, 2002
We present a novel distance-based algorithm for evolutionary tree reconstruction. Our algorithm reconstructs the topology of a tree with n leaves in O(n^2) time using O(n) working space. In the general Markov model of evolution, the algorithm recovers the topology successfully with (1 - o(1)) probability from sequences with polynomial length in n.