Approximate likelihood ratio test for branches: a fast, accurate and powerful alternative
- SYSTEMATIC BIOLOGY
, 2006
Cited by 275 (9 self)
We revisit statistical tests for branches of evolutionary trees reconstructed upon molecular data. A new, fast, approximate likelihood-ratio test (aLRT) for branches is presented here as a competitive alternative to nonparametric bootstrap and Bayesian estimation of branch support. The aLRT is based on the idea of the conventional LRT, with the null hypothesis corresponding to the assumption that the inferred branch has length 0. We show that the LRT statistic is asymptotically distributed as a maximum of three random variables drawn from the $\frac{1}{2}\chi^2_0 + \frac{1}{2}\chi^2_1$ distribution.
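As a rough illustration of the null distribution named in this abstract, the sketch below computes tail probabilities for the half-and-half mixture of a point mass at 0 and a chi-square with 1 degree of freedom. The abstract does not say how the three variables are related, so the p-value here is only a union (Bonferroni-style) upper bound, which holds whatever their dependence. The function names are hypothetical and this is not presented as the authors' implementation.

```python
import math

def chi2_1_sf(x):
    """P(chi-square with 1 d.f. > x), computed from the standard normal tail."""
    return math.erfc(math.sqrt(x / 2.0)) if x > 0 else 1.0

def mixture_sf(x):
    """P(X > x) for X ~ 1/2 chi2_0 + 1/2 chi2_1, where chi2_0 is a point mass at 0."""
    if x < 0:
        return 1.0
    return 0.5 * chi2_1_sf(x)

def max_of_three_pvalue_bound(stat):
    """Union bound on P(max of three mixture draws > stat); assumption-free w.r.t. dependence."""
    return min(1.0, 3.0 * mixture_sf(stat))

if __name__ == "__main__":
    stat = 3.841458820694124  # 95% quantile of chi-square with 1 d.f.
    print(mixture_sf(stat), max_of_three_pvalue_bound(stat))
```

Because half of the mixture's mass sits at 0, its tail probability at any positive value is half that of the ordinary chi-square with 1 degree of freedom.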
Frequentist properties of Bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models
- SYST. BIOL
, 2004
Cited by 101 (7 self)
What does the posterior probability of a phylogenetic tree mean? This simulation study shows that Bayesian posterior probabilities have the meaning that is typically ascribed to them; the posterior probability of a tree is the probability that the tree is correct, assuming that the model is correct. At the same time, the Bayesian method can be sensitive to model misspecification, and the sensitivity of the Bayesian method appears to be greater than the sensitivity of the nonparametric bootstrap method (using maximum likelihood to estimate trees). Although the estimates of phylogeny obtained by use of the method of maximum likelihood or the Bayesian method are likely to be similar, the assessment of the uncertainty of inferred trees via either bootstrapping (for maximum likelihood estimates) or posterior probabilities (for Bayesian estimates) is not likely to be the same. We suggest that the Bayesian method be implemented with the most complex models of those currently available, as this should reduce the chance that the method will concentrate too much probability on too few trees. [Bayesian estimation; Markov chain Monte Carlo; posterior probability; prior probability.] Quantifying the uncertainty of a phylogenetic estimate is at least as important a goal as obtaining the phylogenetic estimate itself. Measures of phylogenetic reliability not only point out what parts of a tree can be trusted when interpreting the evolution of a group, but can guide
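The claim that a posterior probability "is the probability that the tree is correct" can be read as a calibration statement: among simulated analyses reporting a posterior near p, roughly a fraction p should have recovered the true tree. The sketch below shows one simple way to tabulate that. The function name, the equal-width binning, and the toy inputs are assumptions for illustration and are not taken from the study.

```python
def calibration_table(posteriors, correct, n_bins=5):
    """Bin (posterior, was_correct) pairs into equal-width bins over [0, 1].

    Returns one (mean posterior, fraction correct, count) tuple per non-empty bin.
    """
    bins = [[0, 0.0, 0] for _ in range(n_bins)]  # count, sum of posteriors, number correct
    for p, ok in zip(posteriors, correct):
        i = min(int(p * n_bins), n_bins - 1)  # a posterior of exactly 1.0 goes in the last bin
        bins[i][0] += 1
        bins[i][1] += p
        bins[i][2] += int(ok)
    return [(s / n, k / n, n) for n, s, k in bins if n > 0]

if __name__ == "__main__":
    # Hypothetical toy values, not results from any study.
    post = [0.1, 0.15, 0.9, 0.95, 0.95, 1.0]
    hit = [False, False, True, True, False, True]
    for mean_p, frac_ok, n in calibration_table(post, hit):
        print(f"mean posterior {mean_p:.3f}  fraction correct {frac_ok:.3f}  n={n}")
```

In a well-calibrated method the first two columns track each other; a method that concentrates too much probability on too few trees would show mean posteriors above the fraction correct.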
Polytomies and Bayesian phylogenetic inference
- Syst Biol 54
, 2005
Cited by 60 (0 self)
Bayesian phylogenetic analyses are now very popular in systematics and molecular evolution because they allow the use of much more realistic models than currently possible with maximum likelihood methods. There is, however, a growing number of examples in which large Bayesian posterior clade probabilities are associated with very short edge lengths and low values for non-Bayesian measures of support such as non-parametric bootstrapping. For the four-taxon case when the true tree is the star phylogeny, Bayesian analyses become increasingly unpredictable in their preference for one of the three possible
The Importance of Proper Model Assumption in Bayesian Phylogenetics
, 2004
Cited by 50 (4 self)
We studied the importance of proper model assumption in the context of Bayesian phylogenetics by examining >5,000 Bayesian analyses and six nested models of nucleotide substitution. Model misspecification can strongly bias bipartition posterior probability estimates. These biases were most pronounced when rate heterogeneity was ignored. The type of bias seen at a particular bipartition appeared to be strongly influenced by the lengths of the branches surrounding that bipartition. In the Felsenstein zone, posterior probability estimates of bipartitions were biased when the assumed model was underparameterized but were unbiased when the assumed model was overparameterized. For the inverse Felsenstein zone, however, both underparameterization and overparameterization led to biased bipartition posterior probabilities, although the bias caused by overparameterization was less pronounced and disappeared with increased sequence length. Model parameter estimates were also affected by model misspecification. Underparameterization caused a bias in some parameter estimates, such as branch lengths and the gamma shape parameter, whereas overparameterization caused a decrease in the precision of some parameter estimates. We caution researchers to assure that the most appropriate model is assumed by employing both a priori model choice methods and a posteriori model adequacy tests. [Bayesian phylogenetic inference; convergence; Markov chain Monte Carlo; maximum likelihood; model choice; posterior probability.] Model choice is becoming a critical issue as the number of available models of nucleotide evolution increases rapidly. Recent studies have shown that adequate
Profiling Phylogenetic Informativeness
, 2007
Cited by 36 (2 self)
The resolution of four controversial topics in phylogenetic experimental design hinges upon the informativeness of characters about the historical relationships among taxa. These controversies regard the power of different classes of phylogenetic character, the relative utility of increased taxonomic versus character sampling, the differentiation between lack of phylogenetic signal and a historical rapid radiation, and the design of taxonomically broad phylogenetic studies optimized by taxonomically sparse genome-scale data. Quantification of the informativeness of characters for resolution of phylogenetic hypotheses during specified historical epochs is key to the resolution of these controversies. Here, such a measure of phylogenetic informativeness is formulated. The optimal rate of evolution of a character to resolve a dated four-taxon polytomy is derived. By scaling the asymptotic informativeness of a character evolving at a nonoptimal rate by the derived asymptotic optimum, and by normalizing so that net phylogenetic informativeness is equivalent for all rates when integrated across all of history, an informativeness profile across history is derived. Calculation of the informativeness per base pair allows estimation of the cost-effectiveness of character sampling. Calculation of the informativeness per million years allows comparison across historical radiations of the utility of a gene for the inference of rapid adaptive radiation. The theory is applied to profile the phylogenetic informativeness of the genes BRCA1, RAG1, GHR, and c-myc from a muroid rodent sequence data set. Bounded integrations of the phylogenetic profile of these genes over four epochs comprising the diversifications of the muroid rodents, the mammals, the lobe-limbed vertebrates, and the early metazoans demonstrate the
Data partitions and complex models in Bayesian analysis: the phylogeny of gymnophthalmid lizards
- Syst. Biol
, 2004
Cited by 29 (2 self)
Abstract.—Phylogenetic studies incorporating multiple loci, and multiple genomes, are becoming increasingly common. Coincident with this trend in genetic sampling, model-based likelihood techniques including Bayesian phylogenetic methods continue to gain popularity. Few studies, however, have examined model fit and sensitivity to such potentially heterogeneous data partitions within combined data analyses using empirical data. Here we investigate the relative model fit and sensitivity of Bayesian phylogenetic methods when alternative site-specific partitions of among-site rate variation (with and without autocorrelated rates) are considered. Our primary goal in choosing a best-fit model was to employ the simplest model that was a good fit to the data while optimizing topology and/or Bayesian posterior probabilities. Thus, we were not interested in complex models that did not practically affect our interpretation of the topology under study. We applied these alternative models to a four-gene data set including one protein-coding nuclear gene (c-mos), one protein-coding mitochondrial gene (ND4), and two mitochondrial rRNA genes (12S and 16S) for the diverse yet poorly known lizard family Gymnophthalmidae. Our results suggest that the best-fit model partitioned among-site rate variation separately among the c-mos, ND4, and 12S + 16S gene regions. We found this model yielded identical topologies to those from analyses based on the GTR+I+G model, but significantly changed posterior probability estimates of clade support. This partitioned model also produced more precise (less variable) estimates of posterior probabilities across generations of long Bayesian runs, compared to runs employing a GTR+I+G model estimated for the combined data. We use this three-way gamma partitioning in Bayesian analyses to
Analysis and visualization of tree space
- Systematic Biology
, 2005
Cited by 26 (2 self)
Abstract.—We explored the use of multidimensional scaling (MDS) of tree-to-tree pairwise distances to visualize the relationships among sets of phylogenetic trees. We found the technique to be useful for exploring “tree islands” (sets of topologically related trees among larger sets of near-optimal trees), for comparing sets of trees obtained from bootstrapping and Bayesian sampling, for comparing trees obtained from the analysis of several different genes, and for comparing multiple Bayesian analyses. The technique was also useful as a teaching aid for illustrating the progress of a Bayesian analysis and as an exploratory tool for examining large sets of phylogenetic trees. We also identified some limitations to the method, including distortions of the multidimensional tree space into two dimensions through the MDS technique, and the definition of the MDS-defined space based on a limited sample of trees. Nonetheless, the technique is a useful approach for the analysis of large sets of phylogenetic trees. [Bayesian analysis; multidimensional scaling; phylogenetic analysis; tree space; visualization.] Systematists are often faced with the need to analyze a large collection of phylogenetic trees. These trees may represent a collection of equally parsimonious solutions to a phylogenetic problem, or a set of trees of similar likelihood, or a sampled set of trees from a Markov
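MDS needs a matrix of tree-to-tree distances as its input. The abstract above does not name the distance used, so the sketch below assumes the Robinson-Foulds distance (the number of bipartitions found in one tree but not the other) purely for illustration. Trees are written as nested tuples of taxon labels, all trees are assumed to share the same taxa, and every function name is hypothetical.

```python
def clades(tree):
    """Return (leaf set, list of leaf sets of all internal nodes) for a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return frozenset([tree]), []
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_found = clades(child)
        leaves |= child_leaves
        found.extend(child_found)
    found.append(leaves)
    return leaves, found

def splits(tree):
    """Non-trivial bipartitions, each stored as the side that lacks a reference taxon."""
    taxa, found = clades(tree)
    ref = min(taxa)  # fixed reference taxon makes the representation rooting-independent
    out = set()
    for clade in found:
        side = taxa - clade if ref in clade else clade
        if 1 < len(side) < len(taxa) - 1:  # skip single-leaf and whole-tree splits
            out.add(side)
    return out

def distance_matrix(trees):
    """Symmetric matrix of Robinson-Foulds distances between all pairs of trees."""
    split_sets = [splits(t) for t in trees]
    n = len(trees)
    return [[len(split_sets[i] ^ split_sets[j]) for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    trees = [
        (("A", "B"), ("C", "D")),
        (("A", "C"), ("B", "D")),
        ("A", ("B", ("C", "D"))),  # same unrooted topology as the first tree
    ]
    for row in distance_matrix(trees):
        print(row)
```

The resulting matrix could then be handed to any MDS routine to obtain two-dimensional coordinates, with the caveat the abstract itself raises about distortion when tree space is squeezed into two dimensions.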
The Effect of Ambiguous Data on Phylogenetic Estimates Obtained by Maximum Likelihood and Bayesian Inference
, 2009
Cited by 26 (2 self)
Abstract.—Although an increasing number of phylogenetic data sets are incomplete, the effect of ambiguous data on phylogenetic accuracy is not well understood. We use 4-taxon simulations to study the effects of ambiguous data (i.e., missing characters or gaps) in maximum likelihood (ML) and Bayesian frameworks. By introducing ambiguous data in a way that removes confounding factors, we provide the first clear understanding of 1 mechanism by which ambiguous data can mislead phylogenetic analyses. We find that in both ML and Bayesian frameworks, among-site rate variation can interact with ambiguous data to produce misleading estimates of topology and branch lengths. Furthermore, within a Bayesian framework, priors on branch lengths and rate heterogeneity parameters can exacerbate the effects of ambiguous data, resulting in strongly misleading bipartition posterior probabilities. The magnitude and direction of the ambiguous data bias are a function of the number and taxonomic distribution of ambiguous characters, the strength of topological support, and whether or not the model is correctly specified. The results of this study have major implications for all analyses that rely on accurate estimates of topology or branch lengths, including divergence time estimation, ancestral state reconstruction, tree-dependent comparative methods, rate variation analysis, phylogenetic hypothesis testing, and phylogeographic
Resolving arthropod phylogeny: exploring phylogenetic signal within 41 kb of protein-coding nuclear gene sequence
- Systematic Biology
, 2008
Cited by 19 (2 self)
Bayesian mixed models and the phylogeny of pitvipers (Viperidae: Serpentes)
- Mol. Phylogenet. Evol.
, 2006
Cited by 18 (2 self)
The subfamily Crotalinae (pitvipers) contains over 190 species of venomous snakes distributed in both the Old and New World. We incorporated an extensive sampling of taxa (including 28 of 29 genera), and sequences of four mitochondrial gene fragments (2.3 kb) per individual, to estimate the phylogeny of pitvipers based on maximum parsimony and Bayesian phylogenetic methods. Our Bayesian analyses incorporated complex mixed models of nucleotide evolution that allocated independent models to various partitions of the dataset within combined analyses. We compared results of unpartitioned versus partitioned Bayesian analyses to investigate how much unpartitioned (versus partitioned) models were forced to compromise estimates of model parameters, and whether complex models substantially alter phylogenetic conclusions to the extent that they appear to extract more phylogenetic signal than simple models. Our results indicate that complex models do extract more phylogenetic signal from the data. We also address how differences in phylogenetic results (e.g., bipartition posterior probabilities) obtained from simple versus complex models may be interpreted in terms of relative credibility. Our estimates of pitviper phylogeny suggest that nearly all recently proposed generic reallocations appear valid, although certain Old and New World genera (Ovophis, Trimeresurus, and Bothrops) remain poly- or paraphyletic and require further taxonomic revision. While a majority of nodes were resolved, we could not confidently estimate the basal relationships among New World genera and which lineage of Old World species is most closely related to this New World group.