Results 1–10 of 82
NeighborNet: An agglomerative method for the construction of planar phylogenetic networks
Abstract

Cited by 126 (6 self)
We introduce NeighborNet, a network construction and data representation method that combines aspects of the neighbor-joining (NJ) method and SplitsTree. Like NJ, NeighborNet uses agglomeration: taxa are combined into progressively larger and larger overlapping clusters. Like SplitsTree, NeighborNet constructs networks rather than trees, and so can be used to represent multiple phylogenetic hypotheses simultaneously, or to detect complex evolutionary processes like recombination, lateral transfer and hybridization. NeighborNet tends to produce networks that are substantially more resolved than those made with SplitsTree. The method is efficient (O(n³) time) and is well suited for the preliminary analyses of complex phylogenetic data. We report results of three case studies: one based on mitochondrial gene order data from early branching eukaryotes, another based on nuclear sequence data from New Zealand alpine buttercups (Ranunculi), and a third on poorly corrected synthetic data.
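The agglomeration step this abstract refers to builds on the neighbor-joining selection criterion. As a rough illustration, here is plain NJ pair selection, not the NeighborNet algorithm itself; the function name and the distance matrix are invented for the example:

```python
# A minimal sketch of the neighbor-joining (NJ) selection criterion that
# NeighborNet's agglomeration generalizes. It picks the pair (i, j)
# minimizing Q(i, j) = (n - 2) * d(i, j) - r_i - r_j, where r_i is the
# row sum (net divergence) of taxon i.

def nj_pick_pair(d):
    """Return (pair, q_value) minimizing the NJ Q-criterion on matrix d."""
    n = len(d)
    r = [sum(row) for row in d]          # net divergence of each taxon
    best, best_q = None, float("inf")
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * d[i][j] - r[i] - r[j]
            if q < best_q:
                best, best_q = (i, j), q
    return best, best_q

# Additive distances for a 4-taxon tree with cherries (A,B) and (C,D):
d = [[0, 3, 5, 6],
     [3, 0, 6, 7],
     [5, 6, 0, 3],
     [6, 7, 3, 0]]
pair, q = nj_pick_pair(d)   # picks the (A, B) cherry, Q = -24
```

On this matrix the criterion correctly selects a cherry of the underlying tree; NeighborNet departs from NJ by keeping agglomerated groups overlapping rather than merging them outright.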
Performance study of phylogenetic methods: (unweighted) quartet methods and neighbor-joining
, 2003
Evolutionary Trees can be Learned in Polynomial Time in the Two-State General Markov Model
 SIAM Journal on Computing
, 1998
Abstract

Cited by 32 (2 self)
The j-State General Markov Model of evolution (due to Steel) is a stochastic model concerned with the evolution of strings over an alphabet of size j. In particular, the Two-State General Markov Model of evolution generalises the well-known Cavender-Farris-Neyman model of evolution by removing the symmetry restriction (which requires that the probability that a `0' turns into a `1' along an edge is the same as the probability that a `1' turns into a `0' along the edge). Farach and Kannan showed how to PAC-learn Markov Evolutionary Trees in the Cavender-Farris-Neyman model provided that the target tree satisfies the additional restriction that all pairs of leaves have a sufficiently high probability of being the same. We show how to remove both restrictions and thereby obtain the first polynomial-time PAC-learning algorithm (in the sense of Kearns et al.) for the general class of Two-State Markov Evolutionary Trees. Research Report RR347, Department of Computer Science, University of Wa...
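The symmetry restriction the abstract removes can be made concrete with a small simulation sketch (the function name and parameter names are illustrative, not from the paper):

```python
import random

def evolve_edge(state, p01, p10, rng):
    """One edge of the Two-State General Markov Model: a 0 becomes 1 with
    probability p01 and a 1 becomes 0 with probability p10. The symmetric
    Cavender-Farris-Neyman model is the special case p01 == p10."""
    if state == 0:
        return 1 if rng.random() < p01 else 0
    return 0 if rng.random() < p10 else 1

rng = random.Random(0)
# With p01 = 0 a 0 never mutates; with p10 = 1 a 1 always does,
# an asymmetry the CFN model cannot express.
assert evolve_edge(0, 0.0, 1.0, rng) == 0
assert evolve_edge(1, 0.0, 1.0, rng) == 0
```

A full tree model applies one such (possibly different) transition matrix per edge, evolving a root state down to the leaves.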
Learning Nonsingular Phylogenies and Hidden Markov Models
 Proceedings of the thirty-seventh annual ACM Symposium on Theory of Computing, Baltimore (STOC'05)
, 2005
Abstract

Cited by 26 (6 self)
In this paper, we study the problem of learning phylogenies and hidden Markov models. We call the Markov model nonsingular if all transition matrices have determinants bounded away from 0 (and 1). We highlight the role of the nonsingularity condition for the learning problem. Learning hidden Markov models without the nonsingularity condition is at least as hard as learning parity with noise. On the other hand, we give a polynomial-time algorithm for learning nonsingular phylogenies and hidden Markov models.
Phase transitions in phylogeny
 Trans. Amer. Math. Soc
, 2003
Abstract

Cited by 23 (8 self)
We apply the theory of Markov random fields on trees to derive a phase transition in the number of samples needed in order to reconstruct phylogenies. We consider the Cavender-Farris-Neyman model of evolution on trees, where all the inner nodes have degree at least 3, and the net transition on each edge is bounded by ε. Motivated by a conjecture by M. Steel, we show that if 2(1 − 2ε)² > 1, then for balanced trees, the topology of the underlying tree, having n leaves, can be reconstructed from O(log n) samples (characters) at the leaves. On the other hand, we show that if 2(1 − 2ε)² < 1, then there exist topologies which require at least n^Ω(1) samples for reconstruction. Our results are the first rigorous results to establish the role of phase transitions for Markov random fields on trees, as studied in probability, statistical physics and information theory, for the study of phylogenies in mathematical biology.
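The threshold 2(1 − 2ε)² = 1 separating the two regimes can be checked numerically; in this sketch the function name is mine, and solving the equality for ε gives the critical mutation bound:

```python
import math

def steel_condition(eps):
    """Check the reconstruction condition 2(1 - 2*eps)^2 > 1 from the
    abstract above, where eps bounds the net transition on each edge."""
    return 2 * (1 - 2 * eps) ** 2 > 1

# Solving 2(1 - 2*eps)^2 = 1 for eps gives the critical value:
eps_crit = (1 - 1 / math.sqrt(2)) / 2   # approximately 0.1464
assert steel_condition(0.10)      # below threshold: O(log n) characters suffice
assert not steel_condition(0.20)  # above threshold: n^Omega(1) characters needed
```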
A Phase Transition for a Random Cluster Model on Phylogenetic Trees
, 2004
Abstract

Cited by 22 (14 self)
We investigate a simple model that generates random partitions of the leaf set of a tree. Of particular interest is the reconstruction question: what number k of independent samples (partitions) is required to correctly reconstruct the underlying tree (with high probability)? We demonstrate a phase transition for k as a function of the mutation rate, from logarithmic to polynomial dependence on the size of the tree. We also describe a simple polynomial-time tree reconstruction algorithm that applies in the logarithmic region. This model and the associated reconstruction questions are motivated by a Markov model for genomic evolution in molecular biology.
Inverting Random Functions II: Explicit Bounds for Discrete Maximum Likelihood Estimation, with Applications
 SIAM J. Discr. Math
, 2002
Abstract

Cited by 21 (13 self)
In this paper we study inverting random functions under the maximum-likelihood estimation (MLE) criterion in the discrete setting. In particular, we consider how many independent evaluations of the random function at a particular element of the domain are needed for reliable reconstruction of that element. We provide explicit upper and lower bounds for MLE, both in the nonparametric and parametric setting, and give applications to coin-tossing and phylogenetic tree reconstruction.
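The coin-tossing application can be illustrated with a minimal discrete-MLE sketch; the candidate biases and the sample data here are invented for the example:

```python
import math

def mle_coin(tosses, candidates):
    """Pick the candidate bias maximizing the likelihood of the observed
    tosses (1 = heads, 0 = tails). This is discrete MLE inversion
    specialized to coin-tossing; candidates must lie strictly in (0, 1)."""
    heads = sum(tosses)
    tails = len(tosses) - heads
    def loglik(p):
        return heads * math.log(p) + tails * math.log(1 - p)
    return max(candidates, key=loglik)

# 7 heads in 10 tosses: MLE prefers the 0.7-biased coin over the 0.3 one.
assert mle_coin([1] * 7 + [0] * 3, [0.3, 0.7]) == 0.7
```

The bounds in the paper quantify how many such independent evaluations are needed before the MLE choice is reliable with high probability.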
On the impossibility of reconstructing ancestral data and phylogenies
 J. Comput. Biol
, 2003
Abstract

Cited by 21 (9 self)
We prove that it is impossible to reconstruct ancestral data at the root of “deep” phylogenetic trees with high mutation rates. Moreover, we prove that it is impossible to reconstruct the topology of “deep” trees with high mutation rates from a number of characters smaller than a low-degree polynomial in the number of leaves. Our impossibility results hold for all reconstruction methods. The proofs apply tools from information theory and percolation theory. Key words: phylogeny, phase transitions, trees, ancestral data.
Fast Recovery of Evolutionary Trees with Thousands of Nodes
 RECOMB
, 2001
Abstract

Cited by 19 (0 self)
We present a novel distance-based algorithm for evolutionary tree reconstruction. Our algorithm reconstructs the topology of a tree with n leaves in O(n²) time using O(n) working space. In the general Markov model of evolution, the algorithm recovers the topology successfully with (1 − o(1)) probability from sequences with polynomial length in n. Moreover, for almost all trees, our algorithm achieves the same success probability on polylogarithmic sample sizes. The theoretical results are supported by simulation experiments involving trees with 500, 1895, and 3135 leaves. The topologies of the trees are recovered with high success from 2000 bp DNA sequences.
Learning Latent Tree Graphical Models
 J. of Machine Learning Research
, 2011
Abstract

Cited by 19 (6 self)
We study the problem of learning a latent tree graphical model where samples are available only from a subset of variables. We propose two consistent and computationally efficient algorithms for learning minimal latent trees, that is, trees without any redundant hidden nodes. Unlike many existing methods, the observed nodes (or variables) are not constrained to be leaf nodes. Our algorithms can be applied to both discrete and Gaussian random variables and our learned models are such that all the observed and latent variables have the same domain (state space). Our first algorithm, recursive grouping, builds the latent tree recursively by identifying sibling groups using so-called information distances. One of the main contributions of this work is our second algorithm, which we refer to as CLGrouping. CLGrouping starts with a preprocessing procedure in which a tree over the observed variables is constructed. This global step groups the observed nodes that are likely to be close to each other in the true latent tree, thereby guiding subsequent recursive grouping (or equivalent procedures such as neighbor-joining) on much smaller subsets of variables. This results in more accurate and efficient learning of latent trees. We also present regularized versions of our algorithms that learn latent tree approximations of arbitrary distributions. We compare ...
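For Gaussian variables, information distances of the kind the abstract mentions are commonly defined as d = −log|ρ|, which makes them additive along paths in the tree; this is a minimal sketch under that assumed definition, with illustrative names and correlation values:

```python
import math

def info_distance(rho):
    """Information distance between two Gaussian variables with
    correlation rho: d = -log|rho|. Additive along tree paths, which is
    what lets sibling groups be identified from pairwise distances."""
    return -math.log(abs(rho))

# For a Markov chain X - Y - Z, rho_xz = rho_xy * rho_yz, so the
# distances add: d(X, Z) = d(X, Y) + d(Y, Z).
d_xy, d_yz = info_distance(0.8), info_distance(0.5)
d_xz = info_distance(0.8 * 0.5)
assert abs(d_xz - (d_xy + d_yz)) < 1e-12
```

Recursive grouping exploits exactly this additivity: tests on sums and differences of pairwise distances reveal whether two observed nodes are siblings, a parent-child pair, or separated by a hidden node.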