The Identifiability of Covarion Models in phylogenetics
, 2008
Cited by 8 (5 self)
Covarion models of character evolution describe inhomogeneities in substitution processes through time. In phylogenetics, such models are used to describe changing functional constraints or selection regimes during the evolution of biological sequences. In this work the identifiability of such models for generic parameters on a known phylogenetic tree is established, provided the number of covarion classes does not exceed the size of the observable state space. Combined with earlier results, this implies both the tree and generic numerical parameters are identifiable if the number of classes is strictly smaller than the number of observable states.
Identifiability of 2-tree mixtures for group-based models
, 2009
Cited by 5 (2 self)
Phylogenetic data arising on two possibly different tree topologies might be mixed through several biological mechanisms, including incomplete lineage sorting or horizontal gene transfer in the case of different topologies, or simply different substitution processes on characters in the case of the same topology. Recent work on a 2-state symmetric model of character change showed such a mixture model has nonidentifiable parameters, and thus it is theoretically impossible to determine the two tree topologies from any amount of data under such circumstances. Here the question of identifiability is investigated for 2-tree mixtures of the 4-state group-based models, which are more relevant to DNA sequence data. Using algebraic techniques, we show that the tree parameters are identifiable for the JC and K2P models. We also prove that generic substitution parameters for the JC mixture models are identifiable, and for the K2P and K3P models obtain generic identifiability results for mixtures on the same tree. This indicates that the full phylogenetic signal remains in such mixtures, and that the 2-state symmetric result is thus a misleading guide to the behavior of other models.
A basic limitation on inferring phylogenies by pairwise sequence comparisons
- J THEOR
Cited by 4 (0 self)
Distance-based approaches in phylogenetics such as Neighbor-Joining are a fast and popular approach for building trees. These methods take pairs of sequences and from them construct a value that, in expectation, is additive under a stochastic model of site substitution. Most models assume a distribution of rates across sites, often based on a gamma distribution. Provided the (shape) parameter of this distribution is known, the method can correctly reconstruct the tree. However, if the shape parameter is not known then we show that topologically different trees, with different shape parameters and associated positive branch lengths, can lead to exactly matching distributions on pairwise site patterns between all pairs of taxa. Thus, one could not distinguish between the two trees using pairs of sequences without some prior knowledge of the shape parameter. More surprisingly, this can happen for any choice of distinct shape parameters on the two trees, and thus the result is not peculiar to a particular or contrived selection of the shape parameters. On a positive note, we point out known conditions where identifiability can be restored (namely, when the branch lengths are clocklike, or if methods such as maximum likelihood are used).
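As a rough illustration of why the shape parameter matters for pairwise distances, the sketch below computes the standard Jukes-Cantor distance correction with gamma-distributed rates across sites. It is not taken from the paper: the choice of the Jukes-Cantor model and the function names `jc_distance` and `jc_gamma_distance` are assumptions made for this example. The same observed proportion of differing sites maps to different distance estimates depending on the assumed shape parameter.

```python
import math


def jc_distance(p):
    """Jukes-Cantor distance from the proportion p of differing sites,
    assuming equal rates at all sites."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def jc_gamma_distance(p, alpha):
    """Jukes-Cantor distance when site rates follow a gamma distribution
    with mean 1 and shape parameter alpha (illustrative helper)."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return 0.75 * alpha * ((1 - 4 * p / 3) ** (-1 / alpha) - 1)


if __name__ == "__main__":
    p = 0.3  # hypothetical observed proportion of differing sites
    for alpha in (0.5, 1.0, 2.0):
        print(alpha, jc_gamma_distance(p, alpha))
    print("equal rates", jc_distance(p))
```

Smaller shape values (stronger rate heterogeneity) give larger corrected distances for the same observed difference, so a distance matrix built from pairwise comparisons is only additive on the true tree when the assumed shape matches the one that generated the data.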
First-Order Correct Bootstrap Support Adjustments for Splits that Allow Hypothesis Testing When Using Maximum Likelihood Estimation
Cited by 1 (0 self)
The most frequent measure of phylogenetic uncertainty for splits is bootstrap support. Although large bootstrap support intuitively suggests that a split in a tree is well supported, it has not been clear how large bootstrap support needs to be to conclude that there is significant evidence that a hypothesized split is present. Indeed, recent work has shown that bootstrap support is not first-order correct and thus cannot be directly used for hypothesis testing. We present methods that adjust bootstrap support values in a maximum likelihood (ML) setting so that they have an interpretation corresponding to P values in conventional hypothesis testing; for instance, adjusted bootstrap support larger than 95% occurs only 5% of the time if the split is not present. Through examples and simulation settings, it is found that adjustments always increase the level of support. We also find that the nature of the adjustment is fairly constant across parameter settings. Finally, we consider adjustments that take into account the data-dependent nature of many hypotheses about splits: the hypothesis that they are present is being tested because they are in the tree estimated through ML. Here, in contrast, we find that bootstrap probability often needs to be adjusted downwards. Key words: maximum likelihood, topology test, bootstrap support, splits.
Identifiability of 3-class Jukes-Cantor mixtures
, 2014
Cited by 1 (0 self)
We prove identifiability of the tree parameters of the 3-class Jukes-Cantor mixture model. The proof uses ideas from algebraic statistics, in particular: finding phylogenetic invariants that separate the varieties associated to different triples of trees; computing dimensions of the resulting phylogenetic varieties; and using the disentangling number to reduce to trees with a small number of leaves. Symbolic computation also plays a key role in handling the many different cases and finding relevant phylogenetic invariants.
PHYLOGENETIC MIXTURES: CONCENTRATION OF MEASURE IN THE LARGE-TREE LIMIT
, 2012
Identifiability and inference of non-parametric rates-across-sites models on large-scale phylogenies
, 2013