Results 1-10 of 30
Learning Nonsingular Phylogenies and Hidden Markov Models
 Proceedings of the thirty-seventh annual ACM Symposium on Theory of Computing, Baltimore (STOC'05)
, 2005
Cited by 42 (7 self)
In this paper, we study the problem of learning phylogenies and hidden Markov models. We call the Markov model nonsingular if all transition matrices have determinants bounded away from 0 (and 1). We highlight the role of the nonsingularity condition for the learning problem. Learning hidden Markov models without the nonsingularity condition is at least as hard as learning parity with noise. On the other hand, we give a polynomial-time algorithm for learning nonsingular phylogenies and hidden Markov models.
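The nonsingularity condition in this abstract can be illustrated with a minimal sketch. It assumes the condition is read as requiring the absolute value of each determinant to lie in [margin, 1 - margin]; the `margin` parameter and both function names are hypothetical and do not come from the paper.

```python
def determinant(matrix):
    """Determinant of a square matrix via Gaussian elimination with partial pivoting."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def is_nonsingular(transition_matrices, margin):
    """True if every |det| lies in [margin, 1 - margin] (assumed reading of the condition)."""
    return all(
        margin <= abs(determinant(m)) <= 1.0 - margin
        for m in transition_matrices
    )
```

For example, the matrix [[0.9, 0.1], [0.2, 0.8]] has determinant 0.7 and passes with `margin=0.1`, while a matrix with two identical rows has determinant 0 and fails.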
A genome phylogeny for mitochondria among α-proteobacteria and a predominantly eubacterial ancestry of yeast nuclear genes
, 2004
Phase transitions in phylogeny
 Trans. Amer. Math. Soc
, 2003
Cited by 33 (8 self)
Abstract. We apply the theory of Markov random fields on trees to derive a phase transition in the number of samples needed in order to reconstruct phylogenies. We consider the Cavender-Farris-Neyman model of evolution on trees, where all the inner nodes have degree at least 3, and the net transition on each edge is bounded by $\epsilon$. Motivated by a conjecture by M. Steel, we show that if $2(1 - 2\epsilon)^2 > 1$, then for balanced trees, the topology of the underlying tree, having n leaves, can be reconstructed from O(log n) samples (characters) at the leaves. On the other hand, we show that if $2(1 - 2\epsilon)^2 < 1$, then there exist topologies which require at least $n^{\Omega(1)}$ samples for reconstruction. Our results are the first rigorous results to establish the role of phase transitions for Markov random fields on trees, as studied in probability, statistical physics and information theory, for the study of phylogenies in mathematical biology.
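The threshold in this abstract can be evaluated numerically. The sketch below only computes the quantity 2(1 − 2ε)² and compares it with 1; the function names are hypothetical, and restricting ε to [0, 1/2] is an assumption made here, not something the abstract states.

```python
import math


def critical_epsilon():
    """Value of epsilon at which 2 * (1 - 2 * epsilon) ** 2 equals 1."""
    return (1.0 - 1.0 / math.sqrt(2.0)) / 2.0


def sample_regime(eps):
    """Classify eps by which side of the threshold 2 * (1 - 2 * eps) ** 2 = 1 it falls on.

    "logarithmic": O(log n) samples suffice for balanced trees (per the abstract).
    "polynomial":  some topologies need n^Omega(1) samples (per the abstract).
    """
    if not 0.0 <= eps <= 0.5:
        # Assumed valid range for the edge transition bound.
        raise ValueError("eps is assumed to lie in [0, 0.5]")
    value = 2.0 * (1.0 - 2.0 * eps) ** 2
    if value > 1.0:
        return "logarithmic"
    if value < 1.0:
        return "polynomial"
    return "critical"
```

With these definitions, `sample_regime(0.1)` falls in the logarithmic regime (2 · 0.8² = 1.28) and `sample_regime(0.2)` in the polynomial regime (2 · 0.6² = 0.72); the crossover sits near ε ≈ 0.1464.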
Distorted metrics on trees and phylogenetic forests
 IEEE/ACM Transactions on Computational Biology and Bioinformatics
Cited by 29 (11 self)
We study distorted metrics on binary trees in the context of phylogenetic reconstruction. Given a binary tree T on n leaves with a path metric d, consider the pairwise distances {d(u, v)} between leaves. It is well known that these determine the tree and the d-length of all edges. Here we consider distortions $\hat{d}$ of d such that for all leaves u and v it holds that $|d(u, v) - \hat{d}(u, v)| < f/2$ if either $d(u, v) < M$ or $\hat{d}(u, v) < M$, where d satisfies $f \le d(e) \le g$ for all edges e. Given such distortions we show how to reconstruct in polynomial time a forest $T_1, \ldots, T_\alpha$ such that the true tree T may be obtained from that forest by adding $\alpha - 1$ edges and $\alpha - 1 \le 2^{-\Omega(M/g)} n$. Metric distortions arise naturally in phylogeny, where d(u, v) is defined by the log-det of a covariance matrix associated with u and v. When u and v are “far”, the entries of the covariance matrix are small and therefore $\hat{d}(u, v)$, which is defined by the log-det of an associated empirical-correlation matrix, may be a bad estimate of d(u, v) even if the correlation matrix is “close” to the covariance matrix. Our metric results are used in order to show how to reconstruct phylogenetic forests with a small number of trees from sequences of length logarithmic in the size of the tree. Our method also yields an independent proof that phylogenetic trees can be reconstructed in polynomial time from sequences of polynomial length under the standard assumptions in phylogeny. Both the metric result and its applications to phylogeny are almost tight.
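The distortion condition in this abstract can be checked directly for a given pair of metrics. The sketch below assumes the condition reads |d(u, v) − d̂(u, v)| < f/2 whenever d(u, v) < M or d̂(u, v) < M; the dictionary representation and the function name are hypothetical, and this is only a validity check, not the forest reconstruction algorithm of the paper.

```python
def is_valid_distortion(d, d_hat, f, M):
    """Check the (assumed) distortion condition on every leaf pair.

    d, d_hat: dicts mapping a leaf pair (u, v) to the true and estimated distance.
    f:        lower bound on edge lengths; the allowed error on short pairs is f / 2.
    M:        pairs count as "short" when either distance is below M.
    """
    for pair, true_dist in d.items():
        estimate = d_hat[pair]
        is_short = true_dist < M or estimate < M
        # Only short pairs are constrained; long pairs may be arbitrarily off.
        if is_short and abs(true_dist - estimate) >= f / 2:
            return False
    return True
```

Note that the estimate for a long pair is unconstrained: with `M = 3`, a true distance of 5.0 estimated as 9.0 does not violate the condition, whereas a true distance of 1.0 estimated as 1.4 does when `f = 0.5`.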
Robust Reconstruction on Trees is Determined By the Second Eigenvalue
, 2002
Cited by 27 (8 self)
Consider information propagation from the root of an infinite B-ary tree, where each edge of the tree acts as an independent copy of some channel M. The reconstruction problem is solvable if the n'th level of the tree contains a nonvanishing amount of information on the root of the tree, as n → ∞.
A Phase Transition for a Random Cluster Model on Phylogenetic Trees
, 2004
Cited by 27 (13 self)
We investigate a simple model that generates random partitions of the leaf set of a tree. Of particular interest is the reconstruction question: what number k of independent samples (partitions) is required to correctly reconstruct the underlying tree (with high probability)? We demonstrate a phase transition for k as a function of the mutation rate, from logarithmic to polynomial dependence on the size of the tree. We also describe a simple polynomial-time tree reconstruction algorithm that applies in the logarithmic region. This model and the associated reconstruction questions are motivated by a Markov model for genomic evolution in molecular biology.
Inferring ancestral sequences in taxon-rich phylogenies
 Math. Biosci
, 2010
* Corresponding author.
, 2002
Cited by 8 (1 self)
We describe a Bayesian estimation and inference procedure for fMRI time series based on the use of General Linear Models with Autoregressive (AR) error processes. We make use of the Variational Bayesian (VB) framework which approximates the true posterior density with a factorised density. The fidelity of this approximation is verified via Gibbs sampling. The VB approach provides a natural extension to previous Bayesian analyses which have used Empirical Bayes. VB has the advantage of taking into account the variability of hyperparameter estimates with little additional computational effort. Further, VB allows for automatic selection of the order of the AR process. Results are shown on simulated data and on data from an event-related fMRI experiment.
FAST PHYLOGENY RECONSTRUCTION THROUGH LEARNING OF ANCESTRAL SEQUENCES
, 812
Cited by 5 (1 self)
Abstract. Given natural limitations on the length of DNA sequences, designing phylogenetic reconstruction methods which are reliable under limited information is a crucial endeavor. There have been two approaches to this problem: reconstructing partial but reliable information about the tree ([18, 7, 5, 13]), and reaching “deeper” in the tree through reconstruction of ancestral sequences. In the latter category, [6] settled an important conjecture of M. Steel, showing that, under the CFN model of evolution, all trees on n leaves with edge lengths bounded by the Ising model phase transition can be recovered with high probability from genomes of length O(log n) with a polynomial-time algorithm. Their methods had a running time of $O(n^{10})$. Here we enhance our methods from [5] with the learning of ancestral sequences and provide an algorithm for reconstructing a subforest of the tree which is reliable given available data, without requiring a priori known bounds on the edge lengths of the tree. Our methods are based on an intuitive minimum spanning tree approach and run in $O(n^3)$ time. For the case of full reconstruction of trees with edges under the phase transition, we maintain the same sequence length requirements as [6], despite the considerably faster running time. Key words and phrases. Phylogenetic reconstruction, Ising model, phase transitions, phylogenetic forests, information flow, ancestral sequence reconstruction.
Predicting the Ancestral Character Changes in a Tree is Typically Easier than Predicting the Root State
, 2013
Cited by 4 (1 self)
Abstract. Predicting the ancestral sequences of a group of homologous sequences related by a phylogenetic tree has been the subject of many studies, and numerous methods have been proposed for this purpose. Theoretical results are available that show that when the substitution rates become too large, reconstructing the ancestral state at the tree root is no longer feasible. Here, we also study the reconstruction of the ancestral changes that occurred along the tree edges. We show that, depending on the tree and branch length distribution, reconstructing these changes (i.e., reconstructing the ancestral state of all internal nodes in the tree) may be easier or harder than reconstructing the ancestral root state. However, results from information theory indicate that for the standard Yule tree, the task of reconstructing internal node states remains feasible, even for very high substitution rates. Moreover, computer simulations demonstrate that for more complex trees and scenarios, this result still holds. For a large variety of counting, parsimony and likelihood-based methods, the predictive accuracy of a randomly selected internal node in the tree is indeed much higher than the accuracy of the same method when applied to the tree root. Moreover, parsimony and likelihood-based methods appear to be remarkably robust to sampling bias and model misspecification. [Ancestral state prediction; character evolution; majority rule; Markov model; maximum likelihood; parsimony; phylogenetic tree.]