Results 1-10 of 38
Full reconstruction of Markov models on evolutionary trees: identifiability and consistency
 Math. Biosci
, 1996
Cited by 74 (0 self)
A Markov model of evolution of characters on a phylogenetic tree consists of a tree topology together with a specification of probability transition matrices on the edges of the tree. Previous work has shown that under mild conditions, the tree topology may be reconstructed, in the sense that the topology is identifiable from knowledge of the joint distribution of character states at pairs of terminal nodes of the tree. Also, the method of maximum likelihood is statistically consistent for inferring the tree topology. In this paper we answer the analogous questions for reconstruction of the full model, including the edge transition matrices: under mild conditions, such full reconstruction is achievable, not by using pairs of terminal nodes, but rather by using triples of terminal nodes. The identifiability result generalizes previous results that were restricted either to characters having two states or to transition matrices having special structure. The proof develops matrix relationships that may be exploited to identify the model. We also use the identifiability result to prove that the method of maximum likelihood is consistent for reconstructing the full model.
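As a minimal sketch of the objects this abstract describes, the following computes the joint distribution of character states at a triple of terminal nodes in the simplest case, a star tree with three leaves joined at the root. The two-state alphabet, the root distribution `pi`, and every numeric entry of the edge matrices are illustrative assumptions, not values from the paper.

```python
from itertools import product


def triple_joint(pi, m1, m2, m3):
    """Joint distribution P(i, j, k) at the three leaves of a star tree.

    pi     -- root distribution (hypothetical values in the example below)
    m1..m3 -- row-stochastic transition matrices on the three edges
    """
    n = len(pi)
    joint = {}
    for i, j, k in product(range(n), repeat=3):
        # Sum over the unobserved root state r.
        joint[(i, j, k)] = sum(
            pi[r] * m1[r][i] * m2[r][j] * m3[r][k] for r in range(n)
        )
    return joint


# Hypothetical two-state model.
pi = [0.6, 0.4]
m1 = [[0.9, 0.1], [0.2, 0.8]]
m2 = [[0.7, 0.3], [0.1, 0.9]]
m3 = [[0.8, 0.2], [0.3, 0.7]]
joint = triple_joint(pi, m1, m2, m3)
```

The code only runs the forward direction, from parameters to the joint distribution. The identifiability question in the abstract is the reverse one: whether `pi` and the edge matrices can be recovered from such triple distributions.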
Recovering a tree from the leaf colourations it generates under a Markov model
, 1994
Cited by 59 (14 self)
We describe a simple transformation that allows for the fast recovery of a tree from the probabilities such a tree induces on the colourations of its leaves under a simple Markov process (with unknown parameters). This generalizes earlier results by not requiring the transition matrices associated with the edges of the tree to be of a particular form, or to be related by some fixed rate matrix, and by not insisting on a particular distribution of colours at the root of the tree. Applications to taxonomy are outlined briefly in three corollaries.
Toric ideals of phylogenetic invariants
 JOURNAL OF COMPUTATIONAL BIOLOGY
, 2005
Cited by 52 (13 self)
Statistical models of evolution are algebraic varieties in the space of joint probability distributions on the leaf colorations of a phylogenetic tree. The phylogenetic invariants of a model are the polynomials which vanish on the variety. Several widely used models for biological sequences have transition matrices that can be diagonalized by means of the Fourier transform of an abelian group. Their phylogenetic invariants form a toric ideal in the Fourier coordinates. We determine minimal generators and Gröbner bases for these toric ideals. For the Jukes-Cantor and Kimura models on a binary tree, our Gröbner basis consists of quadrics, cubics and quartics.
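The diagonalization mentioned in this abstract can be checked numerically on a small case. The sketch below assumes the group is Z2 x Z2, whose character table is a 4x4 Hadamard matrix, and uses a Jukes-Cantor transition matrix with a hypothetical off-diagonal probability `a = 0.1`; the helper names are made up for illustration.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Character table of Z2 x Z2 (symmetric, and H * H = 4 * I).
H = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def jukes_cantor(a):
    """4x4 matrix with off-diagonal entries a and diagonal entries 1 - 3a."""
    return [[1 - 3 * a if i == j else a for j in range(4)] for i in range(4)]


M = jukes_cantor(0.1)
# Conjugate by the Fourier transform: D = H M H^{-1}, with H^{-1} = H / 4.
D = [[x / 4 for x in row] for row in matmul(matmul(H, M), H)]
```

Under these assumptions `D` is diagonal, with entries 1 and 1 - 4a, which is the sense in which the model parameters become simpler in Fourier coordinates.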
Tropical geometry of statistical models
 Proceedings of the National Academy of Sciences, 101:16132–16137
, 2004
Cited by 31 (4 self)
This paper presents a unified mathematical framework for inference in graphical models, building on the observation that graphical models are algebraic varieties. From this geometric viewpoint, observations generated from a model are coordinates of a point in the variety, and the sum-product algorithm is an efficient tool for evaluating specific coordinates. The question addressed here is how the solutions to various inference problems depend on the model parameters. The proposed answer is expressed in terms of tropical algebraic geometry. A key role is played by the Newton polytope of a statistical model. Our results are applied to the hidden Markov model and to the general Markov model on a binary tree.
1 Algebraic Statistics, Tropical Geometry, and Inference
This paper presents a unified mathematical framework for probabilistic inference with statistical models, such as graphical models. Our approach is summarized as follows: (a) Statistical models are algebraic varieties. (b) Every algebraic variety can be tropicalized. (c) Tropicalized statistical models are fundamental for parametric inference. By a statistical model we mean a family of joint probability distributions for a collection of discrete
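One common reading of "tropicalizing" the sum-product algorithm, assumed here rather than taken from the text above, is to run the same recursion with (+, x) replaced by (min, +) on negative log parameters. The sketch below does this for a small hidden Markov model; all parameter values and function names are hypothetical.

```python
import math
from functools import reduce
from itertools import product


def evaluate(init, trans, emit, obs, add, mul):
    """Run the HMM forward recursion over the semiring given by (add, mul)."""
    n = len(init)
    vals = [mul(init[s], emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        vals = [
            mul(reduce(add, (mul(vals[s], trans[s][t]) for s in range(n))), emit[t][o])
            for t in range(n)
        ]
    return reduce(add, vals)


def path_prob(path, init, trans, emit, obs):
    """Probability of one hidden path together with the observations."""
    q = init[path[0]] * emit[path[0]][obs[0]]
    for a, b, o in zip(path, path[1:], obs[1:]):
        q *= trans[a][b] * emit[b][o]
    return q


def neglog(rows):
    return [[-math.log(x) for x in row] for row in rows]


# Hypothetical two-state model and observation sequence.
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.2, 0.8]]
emit = [[0.7, 0.3], [0.4, 0.6]]
obs = [0, 1, 1]

# Ordinary semiring: marginal probability of the observations.
prob = evaluate(init, trans, emit, obs, lambda x, y: x + y, lambda x, y: x * y)

# Tropical semiring on negative logs: cost of the most likely hidden path.
trop = evaluate(
    [-math.log(x) for x in init], neglog(trans), neglog(emit), obs,
    min, lambda x, y: x + y,
)

# Brute-force enumeration of all hidden paths, for comparison.
all_probs = [
    path_prob(p, init, trans, emit, obs) for p in product(range(2), repeat=len(obs))
]
```

Here `prob` agrees with the sum of `all_probs`, and `trop` agrees with the negative log of their maximum, so a single recursion serves both inference problems.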
The identifiability of tree topology for phylogenetic models, including covarion and mixture models
 arXiv q-bio.PE/0511009
Cited by 23 (7 self)
Abstract. For a model of molecular evolution to be useful for phylogenetic inference, the topology of evolutionary trees must be identifiable. That is, from a joint distribution the model predicts, it must be possible to recover the tree parameter. We establish tree identifiability for a number of phylogenetic models, including a covarion model and a variety of mixture models with a limited number of classes. The proof is based on the introduction of a more general model, allowing more states at internal nodes of the tree than at leaves, and the study of the algebraic variety formed by the joint distributions to which it gives rise. Tree identifiability is first established for this general model through the use of certain phylogenetic invariants.
On The Computational Complexity of Inferring Evolutionary Trees
, 1993
Cited by 19 (5 self)
The process of reconstructing evolutionary trees can be viewed formally as an optimization problem. Recently, decision problems associated with the most commonly used approaches to reconstructing such trees have been shown to be NP-complete [Day87, DJS86, DS86, DS87, GF82, Kri88, KM86]. In this thesis, a framework is established that incorporates all such problems studied to date. Within this framework, the NP-completeness results for decision problems are extended by applying theorems from [CT91, Gas86, GKR92, JVV86, KST89, Kre88, Sel91] to derive bounds on the computational complexity of several functions associated with each of these problems, namely: evaluation functions, which return the cost of the optimal tree(s); solution functions, which return an optimal tree; spanning functions, which return the number of optimal trees; enumeration functions, which systematically enumerate all optimal trees; and random-selection functions, which return a random...
Probability Models for Genome Rearrangement and Linear Invariants for Phylogenetic Inference
 In Proc. of COCOON
, 1999
Cited by 18 (3 self)
We review the combinatorial optimization problems in calculating edit distances between genomes and phylogenetic inference based on minimizing gene order changes. With a view to avoiding the computational cost and the "long branches attract" artifact of some tree-building methods, we explore the probabilization of genome rearrangement models prior to developing a methodology based on branch-length invariants. We characterize probabilistically the evolution of the structure of the gene adjacency set for inversions on unsigned circular genomes and, using a nontrivial recurrence relation, inversions on signed genomes. Concepts from the theory of invariants developed for the phylogenetics of homologous gene sequences can be used to derive a complete set of linear invariants for unsigned inversions, as well as for a mixed rearrangement model for signed genomes, though not for pure transposition nor pure signed inversion models. The invariants are based on an extended Jukes-Cantor semigroup....
Geometry and the complexity of matrix multiplication
, 2007
Cited by 15 (1 self)
Abstract. We survey results in algebraic complexity theory, focusing on matrix multiplication. Our goals are (i) to show how open questions in algebraic complexity theory are naturally posed as questions in geometry and representation theory, (ii) to motivate researchers to work on these questions, and (iii) to point out relations with more general problems in geometry. The key geometric objects for our study are the secant varieties of Segre varieties. We explain how these varieties are also useful for algebraic statistics, the study of phylogenetic invariants, and quantum computing.