Results 1-10 of 84
Full reconstruction of Markov models on evolutionary trees: identifiability and consistency
 Math. Biosci
, 1996
Cited by 112 (0 self)
A Markov model of evolution of characters on a phylogenetic tree consists of a tree topology together with a specification of probability transition matrices on the edges of the tree. Previous work has shown that under mild conditions, the tree topology may be reconstructed, in the sense that the topology is identifiable from knowledge of the joint distribution of character states at pairs of terminal nodes of the tree. Also, the method of maximum likelihood is statistically consistent for inferring the tree topology. In this paper we answer the analogous questions for reconstruction of the full model, including the edge transition matrices: under mild conditions, such full reconstruction is achievable, not by using pairs of terminal nodes, but rather by using triples of terminal nodes. The identifiability result generalizes previous results that were restricted either to characters having two states or to transition matrices having special structure. The proof develops matrix relationships that may be exploited to identify the model. We also use the identifiability result to prove that the method of maximum likelihood is consistent for reconstructing the full model.
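The role of triples of terminal nodes can be illustrated with a small numerical sketch. It is not taken from the paper: the two-state setting, the star tree with leaves a, b, c, and every name and number below (pi, A, B, C, recover_column) are assumptions chosen for illustration. If J is the joint distribution of (a, b) and J_k is the same table restricted to c = k, then inv(J) * J_k is similar to a diagonal matrix whose entries are column k of the root-to-c transition matrix, so its eigenvalues expose that column up to a relabeling of the hidden root states.

```python
import math

# Hypothetical two-state model: one root, three leaves a, b, c.
# Rows of each matrix index the root state, columns the leaf state.
pi = [0.6, 0.4]                      # assumed root distribution
A = [[0.9, 0.1], [0.2, 0.8]]         # root -> leaf a
B = [[0.85, 0.15], [0.3, 0.7]]       # root -> leaf b
C = [[0.95, 0.05], [0.25, 0.75]]     # root -> leaf c

def joint_ab(k=None):
    """Return P(a=i, b=j), or P(a=i, b=j, c=k) when k is given."""
    J = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            for r in range(2):
                w = pi[r] * A[r][i] * B[r][j]
                J[i][j] += w if k is None else w * C[r][k]
    return J

def inv2(M):
    """Inverse of a 2x2 matrix (assumes a nonzero determinant)."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def mul2(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][t] * Y[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]

def eig2(M):
    """Real eigenvalues of a 2x2 matrix, in ascending order."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return [(tr - disc) / 2.0, (tr + disc) / 2.0]

def recover_column(k):
    """Eigenvalues of inv(J_ab) * J_ab^{c=k}: the entries of column k of C."""
    return eig2(mul2(inv2(joint_ab()), joint_ab(k)))

print(recover_column(0), recover_column(1))
```

The sketch uses only observable leaf distributions inside recover_column; the matrices A, B, C appear solely to generate those distributions. It relies on A and B being invertible and on the root distribution being strictly positive, which stand in here for the "mild conditions" mentioned in the abstract.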
Toric ideals of phylogenetic invariants
 JOURNAL OF COMPUTATIONAL BIOLOGY
, 2005
Cited by 83 (15 self)
Statistical models of evolution are algebraic varieties in the space of joint probability distributions on the leaf colorations of a phylogenetic tree. The phylogenetic invariants of a model are the polynomials which vanish on the variety. Several widely used models for biological sequences have transition matrices that can be diagonalized by means of the Fourier transform of an abelian group. Their phylogenetic invariants form a toric ideal in the Fourier coordinates. We determine minimal generators and Gröbner bases for these toric ideals. For the Jukes-Cantor and Kimura models on a binary tree, our Gröbner basis consists of quadrics, cubics and quartics.
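The diagonalization claim can be checked numerically in a few lines. The sketch below is an illustration under stated assumptions, not the paper's construction: the four nucleotide states are identified with the group Z2 x Z2 encoded as the integers 0..3 (so the group operation is bitwise XOR), and the function names and parameter values are hypothetical. The character table of this group is the 4x4 Hadamard matrix, and conjugating a group-based transition matrix by it yields a diagonal matrix.

```python
def hadamard4():
    """Character table of Z2 x Z2 with elements encoded as 0..3."""
    return [[(-1) ** bin(g & h).count("1") for h in range(4)]
            for g in range(4)]

def group_based_matrix(f):
    """Transition matrix M[g][h] = f(g - h); in Z2 x Z2 this is f[g XOR h]."""
    return [[f[g ^ h] for h in range(4)] for g in range(4)]

def matmul(X, Y):
    """Product of two 4x4 matrices."""
    return [[sum(X[i][t] * Y[t][j] for t in range(4)) for j in range(4)]
            for i in range(4)]

def fourier_diagonalize(M):
    """Return H^{-1} M H, using H^{-1} = H / 4 for the 4x4 Hadamard matrix."""
    H = hadamard4()
    HMH = matmul(matmul(H, M), H)
    return [[x / 4.0 for x in row] for row in HMH]

# Jukes-Cantor: every substitution has the same probability (0.1 is assumed).
jukes_cantor = group_based_matrix([0.7, 0.1, 0.1, 0.1])
# Kimura 3-parameter: three distinct substitution probabilities (assumed).
kimura3 = group_based_matrix([0.7, 0.15, 0.1, 0.05])

for name, M in (("Jukes-Cantor", jukes_cantor), ("Kimura 3-parameter", kimura3)):
    D = fourier_diagonalize(M)
    print(name, [round(D[i][i], 6) for i in range(4)])
```

The diagonal entries are the Fourier coefficients of the function f, which is why products of transition matrices along a path become products of scalars in these coordinates.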
Recovering a tree from the leaf colourations it generates under a Markov model
, 1994
Cited by 67 (8 self)
We describe a simple transformation that allows for the fast recovery of a tree from the probabilities such a tree induces on the colourations of its leaves under a simple Markov process (with unknown parameters). This generalizes earlier results by not requiring the transition matrices associated with the edges of the tree to be of a particular form, or to be related by some fixed rate matrix, and by not insisting on a particular distribution of colours at the root of the tree. Applications to taxonomy are outlined briefly in three corollaries.
Spectral Analysis of Phylogenetic Data
 JOURNAL OF CLASSIFICATION 10:5-24 (1993)
, 1993
Cited by 54 (7 self)
The spectral analysis of sequence and distance data is a new approach to phylogenetic analysis. For two-state character sequences, the character values at a given site split the set of taxa into two subsets, a bipartition of the taxa set. The vector which counts the relative numbers of each of these bipartitions over all sites is called a sequence spectrum. Applying a transformation called a Hadamard conjugation, the sequence spectrum is transformed to the conjugate spectrum. This conjugation corrects for unobserved changes in the data, independently from the choice of phylogenetic tree. For any given phylogenetic tree with edge weights (probabilities of state change), we define a corresponding tree spectrum. The selection of a weighted phylogenetic tree from the given sequence data is made by matching the conjugate spectrum with a tree spectrum. We develop an optimality selection procedure using a least squares best fit, to find the phylogenetic tree whose tree spectrum most closely matches the conjugate spectrum. An inferred sequence spectrum can be derived from the selected tree spectrum using the inverse Hadamard conjugation to allow a comparison with the original sequence spectrum.
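A round trip through the conjugation can be sketched as follows. This is a minimal illustration under assumptions that the abstract does not spell out: the commonly quoted form s = H^{-1} exp(H q) with its inverse gamma = H^{-1} ln(H s), a symmetric two-state model, edge values q given as expected numbers of changes per edge (rather than probabilities of change), bipartitions indexed by bitmasks over all taxa except the last, and q[0] set to minus the sum of the other entries. The function names are hypothetical.

```python
import math

def hadamard(m):
    """Sylvester-type Hadamard matrix of order 2**m."""
    size = 1 << m
    return [[(-1) ** bin(i & j).count("1") for j in range(size)]
            for i in range(size)]

def apply(H, v):
    """Matrix-vector product H v."""
    return [sum(h * x for h, x in zip(row, v)) for row in H]

def tree_to_sequence_spectrum(q):
    """Inverse conjugation: s = H^{-1} exp(H q), with H^{-1} = H / len(q)."""
    n = len(q)
    H = hadamard(n.bit_length() - 1)
    rho = [math.exp(x) for x in apply(H, q)]
    return [x / n for x in apply(H, rho)]

def conjugate_spectrum(s):
    """Hadamard conjugation: gamma = H^{-1} ln(H s); needs H s > 0 entrywise."""
    n = len(s)
    H = hadamard(n.bit_length() - 1)
    r = [math.log(x) for x in apply(H, s)]
    return [x / n for x in apply(H, r)]

# Three taxa joined at one internal node (assumed edge values).
# Index 1 = {taxon 1}, index 2 = {taxon 2}, index 3 = {taxa 1, 2} = edge of taxon 3.
edges = [0.1, 0.2, 0.3]
q = [-sum(edges)] + edges

s = tree_to_sequence_spectrum(q)      # expected bipartition frequencies
gamma = conjugate_spectrum(s)         # recovers q from the frequencies
print([round(x, 6) for x in s])
print([round(x, 6) for x in gamma])
```

With observed data, s would be replaced by the measured bipartition frequencies, and entries of gamma that do not correspond to edges of any single tree indicate noise or model misfit; the least squares selection described in the abstract operates on that vector.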
Tropical geometry of statistical models
 Proceedings of the National Academy of Sciences, 101:16132–16137
, 2004
Cited by 50 (5 self)
This paper presents a unified mathematical framework for inference in graphical models, building on the observation that graphical models are algebraic varieties. From this geometric viewpoint, observations generated from a model are coordinates of a point in the variety, and the sum-product algorithm is an efficient tool for evaluating specific coordinates. The question addressed here is how the solutions to various inference problems depend on the model parameters. The proposed answer is expressed in terms of tropical algebraic geometry. A key role is played by the Newton polytope of a statistical model. Our results are applied to the hidden Markov model and to the general Markov model on a binary tree. 1 Algebraic Statistics, Tropical Geometry, and Inference This paper presents a unified mathematical framework for probabilistic inference with statistical models, such as graphical models. Our approach is summarized as follows: (a) Statistical models are algebraic varieties. (b) Every algebraic variety can be tropicalized. (c) Tropicalized statistical models are fundamental for parametric inference. By a statistical model we mean a family of joint probability distributions for a collection of discrete
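The link between the sum-product algorithm and its tropical counterpart can be made concrete with a small sketch. It is an illustration only: the two-state hidden Markov model, its parameter values, and all function names are assumptions. The same forward recursion is run twice, once over the ordinary semiring (+, x) to obtain the probability of an observation sequence, and once over the tropical semiring (min, +) applied to negative log probabilities to obtain the cost of the single most likely hidden path.

```python
import math
from functools import reduce
from itertools import product

# Hypothetical two-state hidden Markov model with binary observations.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]

def forward(obs, add, mul, lift):
    """Forward recursion over a semiring (add, mul); lift embeds a probability."""
    alpha = [mul(lift(init[s]), lift(emit[s][obs[0]])) for s in range(2)]
    for o in obs[1:]:
        alpha = [
            mul(reduce(add, (mul(alpha[r], lift(trans[r][s])) for r in range(2))),
                lift(emit[s][o]))
            for s in range(2)
        ]
    return reduce(add, alpha)

def marginal(obs):
    """Sum-product: probability of obs, summed over all hidden paths."""
    return forward(obs, lambda x, y: x + y, lambda x, y: x * y, lambda p: p)

def tropical(obs):
    """Tropical (min, +) version: minimum of -log P(path, obs) over paths."""
    return forward(obs, min, lambda x, y: x + y, lambda p: -math.log(p))

def path_probabilities(obs):
    """Brute-force P(path, obs) for every hidden path, for comparison."""
    probs = []
    for path in product(range(2), repeat=len(obs)):
        p = init[path[0]] * emit[path[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= trans[path[t - 1]][path[t]] * emit[path[t]][obs[t]]
        probs.append(p)
    return probs

obs = [0, 1, 1, 0]
print(marginal(obs), sum(path_probabilities(obs)))
print(tropical(obs), -math.log(max(path_probabilities(obs))))
```

Swapping the semiring leaves the recursion untouched, which is the sense in which the tropicalized model answers the maximum a posteriori question that the ordinary model answers for marginal probabilities.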
The identifiability of tree topology for phylogenetic models, including covarion and mixture models
, 2005
Cited by 45 (13 self)
For a model of molecular evolution to be useful for phylogenetic inference, the topology of evolutionary trees must be identifiable. That is, from a joint distribution the model predicts, it must be possible to recover the tree parameter. We establish tree identifiability for a number of phylogenetic models, including a covarion model and a variety of mixture models with a limited number of classes. The proof is based on the introduction of a more general model, allowing more states at internal nodes of the tree than at leaves, and the study of the algebraic variety formed by the joint distributions to which it gives rise. Tree identifiability is first established for this general model through the use of certain phylogenetic invariants.
Phylogenomics and the reconstruction of the tree of life
 Nat Rev Genet
, 2005
Cited by 44 (2 self)
As more complete genomes are sequenced, phylogenetic analysis is entering a new era, that of phylogenomics. One branch of this expanding field aims to reconstruct the evolutionary history of organisms based on the analysis of their genomes. Recent studies have demonstrated the power of this approach, which has the potential to provide answers to a number of fundamental evolutionary questions. However, challenges for the future have also been revealed. The very nature of the evolutionary history of organisms and the limitations of current phylogenetic reconstruction methods mean that part of the tree of life may prove difficult, if not impossible, to resolve with confidence. Introductory paragraph: Understanding phylogenetic relationships between organisms is a prerequisite of almost any evolutionary study, as contemporary species all share a common history through their ancestry. The notion of phylogeny follows directly from the theory of evolution presented by Charles Darwin in “The Origin of Species” 1: the only illustration in his famous book is the first representation of evolutionary relationships among species, in the form of a
Geometry and the complexity of matrix multiplication
, 2007
Cited by 35 (5 self)
We survey results in algebraic complexity theory, focusing on matrix multiplication. Our goals are (i) to show how open questions in algebraic complexity theory are naturally posed as questions in geometry and representation theory, (ii) to motivate researchers to work on these questions, and (iii) to point out relations with more general problems in geometry. The key geometric objects for our study are the secant varieties of Segre varieties. We explain how these varieties are also useful for algebraic statistics, the study of phylogenetic invariants, and quantum computing.