Disk-covering, a fast-converging method for phylogenetic tree reconstruction
 Journal of Computational Biology
, 1999
Cited by 95 (10 self)
The evolutionary history of a set of species is represented by a phylogenetic tree, which is a rooted, leaf-labeled tree, where internal nodes represent ancestral species and the leaves represent modern-day species. Accurate (or even boundedly inaccurate) topology reconstructions of large and divergent trees from realistic-length sequences have long been considered one of the major challenges in systematic biology. In this paper, we present a simple method, the Disk-Covering Method (DCM), which boosts the performance of base phylogenetic methods under various Markov models of evolution. We analyze the performance of DCM-boosted distance methods under the Jukes–Cantor Markov model of biomolecular sequence evolution, and prove that for almost all trees, polylogarithmic-length sequences suffice for complete accuracy with high probability, while polynomial-length sequences always suffice. We also provide an experimental study based upon simulating sequence evolution on model trees. This study confirms substantial reductions in error rates at realistic sequence lengths.
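As background for the distance methods mentioned in this abstract, the following is a minimal sketch of the standard Jukes-Cantor distance correction, d = -(3/4) ln(1 - 4p/3), where p is the fraction of mismatched sites between two aligned sequences. It is not the Disk-Covering Method itself, and the function name `jukes_cantor_distance` is a hypothetical choice for illustration, not something taken from the paper.

```python
import math


def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor corrected distance between two aligned DNA sequences.

    Hypothetical helper for illustration only; assumes the sequences are
    already aligned and contain no gaps.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")

    # p is the observed fraction of sites at which the sequences differ.
    mismatches = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
    p = mismatches / len(seq_a)

    # At p >= 3/4 the correction is undefined (the sequences are saturated).
    if p >= 0.75:
        return math.inf

    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

The corrected distance is always at least the raw mismatch fraction, because the correction accounts for sites that may have changed more than once. Short sequences make p noisy, which is one reason sequence length matters for the accuracy of distance methods.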
Inferring Evolutionary Trees with Strong Combinatorial Evidence
 THEORETICAL COMPUTER SCIENCE
, 1997
Cited by 78 (14 self)
We consider the problem of inferring the evolutionary tree of a set of n species. We propose a quartet reconstruction method which specifically produces trees whose edges have strong combinatorial evidence. Let Q be a set of resolved quartets defined on the studied species; the method computes the unique maximum subset Q* of Q which is equivalent to a tree and outputs the corresponding tree as an estimate of the species' phylogeny. We use a characterization of the subset Q* due to [6] to provide an O(n^4) incremental algorithm for this variant of the NP-hard quartet consistency problem. Moreover, when choosing the resolution of the quartets by the Four-Point Method (FPM) and considering the Cavender-Farris model of evolution, we show that the convergence rate of the Q* method is at worst polynomial when the maximum evolutionary distance between two species is bounded. We complete these theoretical results by an experimental study on real and simulated data sets. The results ...
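The quartet-resolution step named in this abstract can be sketched as follows, assuming the usual formulation of the Four-Point Method: among the three ways of pairing four taxa, pick the pairing whose two within-pair distances have the smallest sum. The function name, the `dist` dictionary layout, and the example distances are all hypothetical. The sketch covers only quartet resolution, not the incremental tree-building algorithm described in the paper, and it does not treat ties (which would leave a quartet unresolved).

```python
def four_point_method(dist, taxa):
    """Return the pairing ((a, b), (c, d)) with the smallest pair-distance sum.

    dist maps frozenset({x, y}) to the estimated distance between x and y.
    Hypothetical helper for illustration; ties go to the first candidate.
    """
    a, b, c, d = taxa

    def pair_distance(x, y):
        return dist[frozenset((x, y))]

    # The three possible resolved topologies on four taxa.
    candidates = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    return min(
        candidates,
        key=lambda split: pair_distance(*split[0]) + pair_distance(*split[1]),
    )


# Distances from a tree ab|cd with all five edges of length 1.
example = {
    frozenset(("a", "b")): 2.0,
    frozenset(("c", "d")): 2.0,
    frozenset(("a", "c")): 3.0,
    frozenset(("a", "d")): 3.0,
    frozenset(("b", "c")): 3.0,
    frozenset(("b", "d")): 3.0,
}
print(four_point_method(example, ("a", "b", "c", "d")))
# prints (('a', 'b'), ('c', 'd'))
```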
Efficient Algorithms for Inverting Evolution
 Proceedings of the ACM Symposium on the Foundations of Computer Science
, 1999
Cited by 44 (4 self)
Evolution can be mathematically modelled by a stochastic process that operates on the DNA of species. Such models are based on the established theory that the DNA sequences, or genomes, of all extant species have been derived from the genome of the common ancestor of all species by a process of random mutation and natural selection. A stochastic model...
Learning Nonsingular Phylogenies and Hidden Markov Models
 Proceedings of the Thirty-Seventh Annual ACM Symposium on Theory of Computing, Baltimore (STOC '05)
, 2005
Cited by 42 (7 self)
In this paper, we study the problem of learning phylogenies and hidden Markov models. We call the Markov model nonsingular if all transition matrices have determinants bounded away from 0 (and 1). We highlight the role of the nonsingularity condition for the learning problem. Learning hidden Markov models without the nonsingularity condition is at least as hard as learning parity with noise. On the other hand, we give a polynomial-time algorithm for learning nonsingular phylogenies and hidden Markov models.
Evolutionary Trees can be Learned in Polynomial Time in the Two-State General Markov Model
 SIAM Journal on Computing
, 1998
Cited by 35 (2 self)
The j-State General Markov Model of evolution (due to Steel) is a stochastic model concerned with the evolution of strings over an alphabet of size j. In particular, the Two-State General Markov Model of evolution generalises the well-known Cavender-Farris-Neyman model of evolution by removing the symmetry restriction (which requires that the probability that a `0' turns into a `1' along an edge is the same as the probability that a `1' turns into a `0' along the edge). Farach and Kannan showed how to PAC-learn Markov Evolutionary Trees in the Cavender-Farris-Neyman model provided that the target tree satisfies the additional restriction that all pairs of leaves have a sufficiently high probability of being the same. We show how to remove both restrictions and thereby obtain the first polynomial-time PAC-learning algorithm (in the sense of Kearns et al.) for the general class of Two-State Markov Evolutionary Trees. Research Report RR347, Department of Computer Science, University of Wa...
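The symmetry restriction described in this abstract can be made concrete with a small sketch of two-state edge transition matrices. It assumes each edge is described by two flip probabilities, one for 0 to 1 and one for 1 to 0; the Cavender-Farris-Neyman case is the one where they are equal. All function names here are hypothetical, and the sketch illustrates only the model, not the PAC-learning algorithm from the paper.

```python
def edge_matrix(p01: float, p10: float):
    """2x2 transition matrix for one edge: row = parent state, column = child state."""
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


def is_symmetric_edge(matrix, tol: float = 1e-12) -> bool:
    """True when the 0->1 and 1->0 flip probabilities agree (the CFN restriction)."""
    return abs(matrix[0][1] - matrix[1][0]) <= tol


def compose(first, second):
    """Transition matrix for a path that follows `first` and then `second`."""
    return [
        [sum(first[i][k] * second[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


cfn_edge = edge_matrix(0.1, 0.1)      # symmetric: allowed in the CFN model
general_edge = edge_matrix(0.1, 0.3)  # asymmetric: needs the general two-state model
two_edge_path = compose(cfn_edge, cfn_edge)
```

Composing two symmetric edges gives another symmetric matrix, so a path made of CFN edges still behaves like a single CFN edge. An asymmetric edge such as `general_edge` breaks that property.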
Obtaining Highly Accurate Topology Estimates of Evolutionary Trees From Very Short Sequences
 Proceedings of RECOMB'99
Cited by 18 (0 self)
The evolutionary history of a set of species is represented by a phylogenetic tree, in other words, by a rooted, leaf-labelled tree, where internal nodes represent ancestral species and the leaves represent modern-day species. Accurate (or even boundedly inaccurate) topology reconstructions of large and divergent trees have long been considered one of the major challenges in systematic biology. None of the polynomial-time methods developed by the theoretical computer science community has been shown to outperform the popular Neighbor-Joining method used by systematic biologists, with respect to topology estimation. (However, preliminary experiments indicate that two new variants of Neighbor-Joining, BioNJ and Weighbor, do exhibit improved performance.) In this paper, we present a simple polynomial-time method, the Disk-Covering Method (DCM), which boosts the performance of base phylogenetic methods. We analyze the performance of DCM-boosted distance methods under the general Markov mo...
Phylogenies without branch bounds: Contracting the short, pruning the deep
, 2009
Cited by 17 (5 self)
We introduce a new phylogenetic reconstruction algorithm which, unlike most previous rigorous inference techniques, does not rely on assumptions regarding the branch lengths or the depth of the tree. The algorithm returns a forest which is guaranteed to contain all edges that are: 1) sufficiently long and 2) sufficiently close to the leaves. How much of the true tree is recovered depends on the sequence length provided. The algorithm is distance-based and runs in polynomial time.
Inverting random functions
 Annals of Combinatorics
, 1999
Cited by 16 (9 self)
In this paper we study how to invert random functions under different criteria. The motivation for this study is phylogeny reconstruction, since the evolution of biomolecular sequences may be considered as a random function from the set of possible phylogenetic trees to the set of collections of biomolecular sequences of observed species. Our results may affect how we think about maximum likelihood estimation (MLE) in phylogeny. MLE is optimal for inverting random functions under a first criterion, although it is not optimal under a second criterion that is at least equally natural but more conservative. It turns out that MLE has to be used differently from how it is used in the phylogeny literature if we have a prior distribution on trees and mutation mechanisms and want to keep MLE optimal under the same first criterion. Some of the results of this paper have been known in the setting of statistical decision theory, but have never been discussed in the context of phylogeny. ∗Michael A. Steel was supported by the New Zealand Marsden Fund. László A. Székely
Better Methods for Solving Parsimony and Compatibility
 Journal of Computational Biology
, 1998
Cited by 14 (3 self)
Evolutionary tree reconstruction is a challenging problem with important applications in Biology and Linguistics. In Biology, one of the most promising approaches to tree reconstruction is to find the "maximum parsimony" tree, while in Linguistics, the use of the "maximum