Results 1  10
of
213
A few logs suffice to build (almost) all trees (I)
 II. THEORETICAL COMPUTER SCIENCE
, 1999
"... A phylogenetic tree (also called an "evolutionary tree") is a leaflabelled tree which represents the evolutionary history for a set of species, and the construction of such trees is a fundamental problem in biology. Here we address the issue of how many sequence sites are required in order to recov ..."
Abstract

Cited by 101 (24 self)
 Add to MetaCart
A phylogenetic tree (also called an "evolutionary tree") is a leaflabelled tree which represents the evolutionary history for a set of species, and the construction of such trees is a fundamental problem in biology. Here we address the issue of how many sequence sites are required in order to recover the tree with high probability when the sites evolve under standard Markovstyle i.i.d. mutation models. We provide analytic upper and lower bounds for the required sequence length, by developing a new (and polynomial time) algorithm. In particular we show that when the mutation probabilities are bounded the required sequence length can grow surprisingly slowly (a power of log n) in the number n of sequences, for almost all trees.
Gascuel O. Approximate LikelihoodRatio Test for Branches: A
 Fast, Accurate, and Powerful Alternative. Systematic Biology
"... Abstract.—We revisit statistical tests for branches of evolutionary trees reconstructed upon molecular data. A new, fast, approximate likelihoodratio test (aLRT) for branches is presented here as a competitive alternative to nonparametric bootstrap and Bayesian estimation of branch support. The aLR ..."
Abstract

Cited by 86 (4 self)
 Add to MetaCart
Abstract.—We revisit statistical tests for branches of evolutionary trees reconstructed upon molecular data. A new, fast, approximate likelihoodratio test (aLRT) for branches is presented here as a competitive alternative to nonparametric bootstrap and Bayesian estimation of branch support. The aLRT is based on the idea of the conventional LRT, with the null hypothesis corresponding to the assumption that the inferred branch has length 0. We show that the LRT statistic is asymptotically distributed as a maximum of three random variables drawn from the 1 2 1 2 χ 2 0 + χ
Full reconstruction of Markov models on evolutionary trees: identifiability and consistency
 Math. Biosci
, 1996
"... A Markov model of evolution of characters on a phylogenetic tree consists of a tree topology together with a speci cation of probability transition matrices on the edges of the tree. Previous work has shown that under mild conditions, the tree topology may be reconstructed, in the sense that the top ..."
Abstract

Cited by 74 (0 self)
 Add to MetaCart
A Markov model of evolution of characters on a phylogenetic tree consists of a tree topology together with a speci cation of probability transition matrices on the edges of the tree. Previous work has shown that under mild conditions, the tree topology may be reconstructed, in the sense that the topology is identi able from knowledge of the joint distribution of character states at pairs of terminal nodes of the tree. Also, the method of maximum likelihood is statistically consistent for inferring the tree topology. In this paper we answer the analogous questions for reconstruction of the full model, including the edge transition matrices: under mild conditions, such full reconstruction is achievable, not by using pairs of terminal nodes, but rather by using triples of terminal nodes. The identi ability result generalizes previous results that were restricted either to characters having two states or to transition matrices having special structure. The proof develops matrix relationships that may be exploited to identify the model. We also use the identi ability result to prove that the method of maximum likelihood is consistent for reconstructing the full model. 1 markov models on evolutionary trees 2 1
Phylogenetic Tree Construction Using Markov Chain Monte Carlo
, 1999
"... We describe a Bayesian method based on Markov chain simulation to study the phylogenetic relationship in a group of DNA sequences. Under simple models of mutational events, our method produces a Markov chain whose stationary distribution is the conditional distribution of the phylogeny given the obs ..."
Abstract

Cited by 58 (0 self)
 Add to MetaCart
We describe a Bayesian method based on Markov chain simulation to study the phylogenetic relationship in a group of DNA sequences. Under simple models of mutational events, our method produces a Markov chain whose stationary distribution is the conditional distribution of the phylogeny given the observed sequences. Our algorithm strikes a reasonable balance between the desire to move globally through the space of phylogenies and the need to make computationally feasible moves in areas of high probability. Since phylogenetic information is described by a tree, we have created new diagnostics to handle this type of data structure. An important byproduct of the Markov chain Monte Carlo phylogeny building technique is that it provides estimates and corresponding measures of variability for any aspect of the phylogeny under study.
Protein phylogenies and signature sequences: evolutionary relationships within prokaryotes and between prokaryotes and eukaryotes. Antonie Leeuwenhoek 72:49–61
, 1997
"... Updated information and services can be found at: ..."
Efficient Algorithms for Inverting Evolution
 Proceedings of the ACM Symposium on the Foundations of Computer Science
, 1999
"... Evolution can be mathematically modelled by a stochastic process that operates on the DNA of species. Such models are based on the established theory that the DNA sequences, or genomes, of all extant species have been derived from the genome of the common ancestor of all species by a process of rand ..."
Abstract

Cited by 47 (3 self)
 Add to MetaCart
Evolution can be mathematically modelled by a stochastic process that operates on the DNA of species. Such models are based on the established theory that the DNA sequences, or genomes, of all extant species have been derived from the genome of the common ancestor of all species by a process of random mutation and natural selection. A stochastic model...
Discordance of species trees with their most likely gene trees. PLoS Genetics 2(5)e68
 Journal of Bioinformatics and Computational Biology
, 2006
"... Because of the stochastic way in which lineages sort during speciation, gene trees may differ in topology from each other and from species trees. Surprisingly, assuming that genetic lineages follow a coalescent model of withinspecies evolution, we find that for any species tree topology with five o ..."
Abstract

Cited by 43 (12 self)
 Add to MetaCart
Because of the stochastic way in which lineages sort during speciation, gene trees may differ in topology from each other and from species trees. Surprisingly, assuming that genetic lineages follow a coalescent model of withinspecies evolution, we find that for any species tree topology with five or more species, there exist branch lengths for which gene tree discordance is so common that the most likely gene tree topology to evolve along the branches of a species tree differs from the species phylogeny. This counterintuitive result implies that in combining data on multiple loci, the straightforward procedure of using the most frequently observed gene tree topology as an estimate of the species tree topology can be asymptotically guaranteed to produce an incorrect estimate. We conclude with suggestions that can aid in overcoming this new obstacle to accurate genomic inference of species phylogenies. Citation: Degnan JH, Rosenberg NA (2006) Discordance of species trees with their most likely gene trees. PLoS Genet 2(5): e68. DOI: 10.1371/journal.pgen.0020068
Constructing Big Trees from Short Sequences
 Proceedings of the 24th International Colloquium on Automata, Languages, and Programming
, 1997
"... . The construction of evolutionary trees is a fundamental problem in biology, and yet methods for reconstructing evolutionary trees are not reliable when it comes to inferring accurate topologies of large divergent evolutionary trees from realistic length sequences. We address this problem and prese ..."
Abstract

Cited by 34 (7 self)
 Add to MetaCart
. The construction of evolutionary trees is a fundamental problem in biology, and yet methods for reconstructing evolutionary trees are not reliable when it comes to inferring accurate topologies of large divergent evolutionary trees from realistic length sequences. We address this problem and present a new polynomial time algorithm for reconstructing evolutionary trees called the Short Quartets Method which is consistent and which has greater statistical power than other polynomial time methods, such as NeighborJoining and the 3approximation algorithm by Agarwala et al. (and the "Double Pivot" variant of the Agarwala et al. algorithm by Cohen and Farach) for the L1nearest tree problem. Our study indicates that our method will produce the correct topology from shorter sequences than can be guaranteed using these other methods. 1 Introduction Evolutionary trees indicate how species evolved from a common ancestor and are of fundamental concern to biologists. There are many methods f...
General TimeReversible Distances with Unequal Rates across Sites: Mixing Γ and Inverse Gaussian Distributions with Invariant Sites
, 1997
"... This paper aims to explain to biologists the assumptions of these distances and to clarify some earlier misconceptions. Importantly, nearly all of the currently used distance estimates (including those of Tamura, 1992; Tamura and Nei, 1994) are special cases (restrictions) of the general timerevers ..."
Abstract

Cited by 27 (4 self)
 Add to MetaCart
This paper aims to explain to biologists the assumptions of these distances and to clarify some earlier misconceptions. Importantly, nearly all of the currently used distance estimates (including those of Tamura, 1992; Tamura and Nei, 1994) are special cases (restrictions) of the general timereversible distance (see Zharkikh, 1994; Swofford et al., 1996)
Learning Nonsingular Phylogenies and Hidden Markov Models
 Proceedings of the thirtyseventh annual ACM Symposium on Theory of computing, Baltimore (STOC05
, 2005
"... In this paper, we study the problem of learning phylogenies and hidden Markov models. We call the Markov model nonsingular if all transtion matrices have determinants bounded away from 0 (and 1). We highlight the role of the nonsingularity condition for the learning problem. Learning hidden Markov m ..."
Abstract

Cited by 26 (6 self)
 Add to MetaCart
In this paper, we study the problem of learning phylogenies and hidden Markov models. We call the Markov model nonsingular if all transtion matrices have determinants bounded away from 0 (and 1). We highlight the role of the nonsingularity condition for the learning problem. Learning hidden Markov models without the nonsingularity condition is at least as hard as learning parity with noise. On the other hand, we give a polynomialtime algorithm for learning nonsingular phylogenies and hidden Markov models.