S: PhyloBayes 3. A Bayesian software package for phylogenetic reconstruction and molecular dating
 Bioinformatics
Cited by 187 (8 self)
Motivation: A variety of probabilistic models describing the evolution of DNA or protein sequences have been proposed for phylogenetic reconstruction or for molecular dating. However, there still lacks a common implementation allowing one to freely combine these independent features, so as to test their ability to jointly improve phylogenetic and dating accuracy. Results: We propose a software package, PhyloBayes 3, which can be used for conducting Bayesian phylogenetic reconstruction and molecular dating analyses, using a large variety of amino acid replacement and nucleotide substitution models, including empirical mixtures or nonparametric models, as well as alternative clock relaxation processes. Availability: PhyloBayes is freely available from our web site
Bayesian estimation of ancestral character states on phylogenies
 Syst. Biol
, 2004
Cited by 170 (4 self)
Abstract. Biologists frequently attempt to infer the character states at ancestral nodes of a phylogeny from the distribution of traits observed in contemporary organisms. Because phylogenies are normally inferences from data, it is desirable to account for the uncertainty in estimates of the tree and its branch lengths when making inferences about ancestral states or other comparative parameters. Here we present a general Bayesian approach for testing comparative hypotheses across statistically justified samples of phylogenies, focusing on the specific issue of reconstructing ancestral states. The method uses Markov chain Monte Carlo techniques for sampling phylogenetic trees and for investigating the parameters of a statistical model of trait evolution. We describe how to combine information about the uncertainty of the phylogeny with uncertainty in the estimate of the ancestral state. Our approach does not constrain the sample of trees only to those that contain the ancestral node or nodes of interest, and we show how to reconstruct ancestral states of uncertain nodes using a most-recent-common-ancestor approach. We illustrate the methods with data on ribonuclease evolution in the Artiodactyla. Software implementing the methods (BayesMultiState) is available from the authors. [Ancestral states; comparative methods; maximum likelihood; MCMC; phylogeny.] Given a collection of species, information on their attributes, and a phylogeny that describes their shared hierarchy of descent, the prospect is raised of inferring the
H: Computing Bayes factors using thermodynamic integration
 Syst Biol
Cited by 112 (7 self)
Abstract. In the Bayesian paradigm, a common method for comparing two models is to compute the Bayes factor, defined as the ratio of their respective marginal likelihoods. In recent phylogenetic works, the numerical evaluation of marginal likelihoods has often been performed using the harmonic mean estimation procedure. In the present article, we propose to employ another method, based on an analogy with statistical physics, called thermodynamic integration. We describe the method, propose an implementation, and show on two analytical examples that this numerical method yields reliable estimates. In contrast, the harmonic mean estimator leads to a strong overestimation of the marginal likelihood, which is all the more pronounced as the model is higher dimensional. As a result, the harmonic mean estimator systematically favors more parameter-rich models, an artefact that might explain some recent puzzling observations, based on harmonic mean estimates, suggesting that Bayes factors tend to overscore complex models. Finally, we apply our method to the comparison of several alternative models of amino-acid replacement. We confirm our previous observations, indicating that modeling pattern heterogeneity across sites tends to yield better models than standard empirical matrices. [Bayes factor; harmonic mean; mixture model; path sampling; phylogeny; thermodynamic integration.] Bayesian methods have become popular in molecular phylogenetics over the recent years. The simple and intuitive interpretation of the concept of probabilities
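The thermodynamic integration idea in the abstract above can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes a conjugate model (data y_i ~ Normal(mu, 1), prior mu ~ Normal(0, 1)) chosen only because its marginal likelihood has a closed form and its power posteriors can be sampled exactly, so no MCMC is needed. The function names, the grid of 40 temperature steps, and the choice to space the steps as k/K to the fourth power are all assumptions made for illustration. The estimate relies on the identity log m(y) = integral from 0 to 1 of E_beta[log L] d(beta), where the expectation is taken under the power posterior proportional to L^beta times the prior.

```python
import numpy as np


def log_likelihood(mu, y):
    """Log-likelihood of y_i ~ Normal(mu, 1), evaluated for each value in mu."""
    mu = np.atleast_1d(mu)
    resid = y[None, :] - mu[:, None]
    return -0.5 * y.size * np.log(2 * np.pi) - 0.5 * np.sum(resid**2, axis=1)


def exact_log_marginal(y):
    """Closed-form log marginal likelihood under the prior mu ~ Normal(0, 1)."""
    n, ybar = y.size, y.mean()
    ss = np.sum((y - ybar) ** 2)
    return (-0.5 * n * np.log(2 * np.pi)
            - 0.5 * (ss + np.log(1 + n) + n * ybar**2 / (n + 1)))


def sample_power_posterior(beta, y, size, rng):
    """Draw mu from the density proportional to L(mu)^beta * prior(mu).

    For this toy model the power posterior is Normal, so sampling is exact.
    """
    precision = beta * y.size + 1.0
    mean = beta * y.size * y.mean() / precision
    return rng.normal(mean, np.sqrt(1.0 / precision), size=size)


def thermodynamic_integration(y, n_steps=40, n_samples=5000, seed=0):
    """Estimate log m(y) by integrating E_beta[log L] over beta in [0, 1]."""
    rng = np.random.default_rng(seed)
    # Assumed grid: points are concentrated near beta = 0, where the
    # integrand changes fastest.
    betas = np.linspace(0.0, 1.0, n_steps + 1) ** 4
    expected_loglik = np.array([
        log_likelihood(sample_power_posterior(b, y, n_samples, rng), y).mean()
        for b in betas
    ])
    # Trapezoidal rule on the non-uniform grid.
    widths = np.diff(betas)
    return np.sum(widths * 0.5 * (expected_loglik[1:] + expected_loglik[:-1]))


data_rng = np.random.default_rng(1)
y = data_rng.normal(0.5, 1.0, size=20)  # simulated data, illustration only

ti_estimate = thermodynamic_integration(y)
exact = exact_log_marginal(y)
print(f"thermodynamic integration: {ti_estimate:.3f}")
print(f"closed form:               {exact:.3f}")
```

In a real phylogenetic analysis the power posteriors would have to be sampled by MCMC rather than drawn exactly, which is where most of the computational cost arises.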
M: Predicting functional gene links from phylogenetic-statistical analyses of whole genomes
 PLoS Comput Biol
Cited by 57 (4 self)
An important element of the developing field of proteomics is to understand protein-protein interactions and other functional links amongst genes. Across-species correlation methods for detecting functional links work on the premise that functionally linked proteins will tend to show a common pattern of presence and absence across a range of genomes. We describe a maximum likelihood statistical model for predicting functional gene linkages. The method detects independent instances of the correlated gain or loss of pairs of proteins on phylogenetic trees, reducing the high rates of false positives observed in conventional across-species methods that do not explicitly incorporate a phylogeny. We show, in a dataset of 10,551 protein pairs, that the phylogenetic method improves by up to 35% on across-species analyses at identifying known functionally linked proteins. The method shows that protein pairs with at least two to three correlated events of gain or loss are almost certainly functionally linked. Contingent evolution, in which one gene’s presence or absence depends upon the presence of another, can also be detected phylogenetically, and may identify genes whose functional significance depends upon their interaction with other genes. Incorporating phylogenetic information improves the prediction of functional linkages. The improvement derives from having a lower rate of false positives and from detecting trends that across-species analyses miss. Phylogenetic methods can easily be incorporated into the screening of large-scale bioinformatics datasets to identify sets of protein links and to characterise gene networks. Citation: Barker D, Pagel M (2005) Predicting functional gene links from phylogenetic-statistical analyses of whole genomes. PLoS Comp Biol 1(1): e3.
Phylogenomics and the reconstruction of the tree of life
 Nat Rev Genet
, 2005
Cited by 54 (2 self)
As more complete genomes are sequenced, phylogenetic analysis is entering a new era: that of phylogenomics. One branch of this expanding field aims to reconstruct the evolutionary history of organisms based on the analysis of their genomes. Recent studies have demonstrated the power of this approach, which has the potential to provide answers to a number of fundamental evolutionary questions. However, challenges for the future have also been revealed. The very nature of the evolutionary history of organisms and the limitations of current phylogenetic reconstruction methods mean that part of the tree of life may prove difficult, if not impossible, to resolve with confidence. Understanding phylogenetic relationships between organisms is a prerequisite of almost any evolutionary study, as contemporary species all share a common history through their ancestry. The notion of phylogeny follows directly from the theory of evolution presented by Charles Darwin in “The Origin of Species”: the only illustration in his famous book is the first representation of evolutionary relationships among species, in the form of a
Heterotachy and long-branch attraction in phylogenetics.
 BMC Evol. Biol.
, 2005
The identifiability of tree topology for phylogenetic models, including covarion and mixture models
, 2005
Cited by 42 (12 self)
For a model of molecular evolution to be useful for phylogenetic inference, the topology of evolutionary trees must be identifiable. That is, from a joint distribution the model predicts, it must be possible to recover the tree parameter. We establish tree identifiability for a number of phylogenetic models, including a covarion model and a variety of mixture models with a limited number of classes. The proof is based on the introduction of a more general model, allowing more states at internal nodes of the tree than at leaves, and the study of the algebraic variety formed by the joint distributions to which it gives rise. Tree identifiability is first established for this general model through the use of certain phylogenetic invariants.
Site interdependence attributed to tertiary structure in amino acid sequence evolution.
 Gene
, 2005
Cited by 41 (11 self)
Abstract. Standard likelihood-based frameworks in phylogenetics consider the process of evolution of a sequence site by site. Assuming that sites evolve independently greatly simplifies the required calculations. However, this simplification is known to be incorrect in many cases. Here, a computational method that allows for general dependence between sites of a sequence is investigated. Using this method, measures acting as sequence fitness proxies can be considered over a phylogenetic tree. In this work, a set of statistically derived amino acid pairwise potentials, developed in the context of protein threading, is used to account for what we call the structural fitness of a sequence. We describe a model combining statistical potentials with an empirical amino acid substitution matrix. We propose such a combination as a useful way of capturing the complexity of protein evolution. Finally, we outline features of the model using three datasets and show the approach's sensitivity to different tree topologies.
Improving marginal likelihood estimation for Bayesian phylogenetic model selection.
 Syst Biol
, 2011
Cited by 40 (1 self)
Abstract. The marginal likelihood is commonly used for comparing different evolutionary models in Bayesian phylogenetics and is the central quantity used in computing Bayes Factors for comparing model fit. A popular method for estimating marginal likelihoods, the harmonic mean (HM) method, can be easily computed from the output of a Markov chain Monte Carlo analysis but often greatly overestimates the marginal likelihood. The thermodynamic integration (TI) method is much more accurate than the HM method but requires more computation. In this paper, we introduce a new method, stepping-stone sampling (SS), which uses importance sampling to estimate each ratio in a series (the “stepping stones”) bridging the posterior and prior distributions. We compare the performance of the SS approach to the TI and HM methods in simulation and using real data. We conclude that the greatly increased accuracy of the SS and TI methods argues for their use instead of the HM method, despite the extra computation needed. [Bayes factor; harmonic mean; phylogenetics, marginal likelihood;
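The stepping-stone construction described in the abstract above can be sketched on a toy problem. This is an illustrative sketch, not the authors' code: it assumes a conjugate model (data y_i ~ Normal(mu, 1), prior mu ~ Normal(0, 1)) so that each power posterior can be sampled exactly and the true answer is known in closed form. The function names, the 32 steps, and the grid exponent 1/0.3 (which skews the temperatures toward the prior) are assumptions made for illustration. Each ratio in the series is estimated by importance sampling from the distribution one step closer to the prior, and the log ratios are summed. A harmonic mean estimate is printed alongside for comparison only; the sketch makes no claim about its value.

```python
import numpy as np


def log_likelihood(mu, y):
    """Log-likelihood of y_i ~ Normal(mu, 1), evaluated for each value in mu."""
    mu = np.atleast_1d(mu)
    resid = y[None, :] - mu[:, None]
    return -0.5 * y.size * np.log(2 * np.pi) - 0.5 * np.sum(resid**2, axis=1)


def exact_log_marginal(y):
    """Closed-form log marginal likelihood under the prior mu ~ Normal(0, 1)."""
    n, ybar = y.size, y.mean()
    ss = np.sum((y - ybar) ** 2)
    return (-0.5 * n * np.log(2 * np.pi)
            - 0.5 * (ss + np.log(1 + n) + n * ybar**2 / (n + 1)))


def sample_power_posterior(beta, y, size, rng):
    """Exact draw from the density proportional to L(mu)^beta * prior(mu)."""
    precision = beta * y.size + 1.0
    mean = beta * y.size * y.mean() / precision
    return rng.normal(mean, np.sqrt(1.0 / precision), size=size)


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    shift = values.max()
    return shift + np.log(np.mean(np.exp(values - shift)))


def stepping_stone(y, n_steps=32, n_samples=5000, seed=0):
    """Sum the log ratios Z(beta_k) / Z(beta_{k-1}) from prior to posterior."""
    rng = np.random.default_rng(seed)
    # Assumed grid: temperatures skewed toward beta = 0 (the prior).
    betas = np.linspace(0.0, 1.0, n_steps + 1) ** (1.0 / 0.3)
    log_ml = 0.0
    for b_prev, b_next in zip(betas[:-1], betas[1:]):
        # Importance sampling: draw at b_prev, weight by L^(b_next - b_prev).
        loglik = log_likelihood(
            sample_power_posterior(b_prev, y, n_samples, rng), y)
        log_ml += log_mean_exp((b_next - b_prev) * loglik)
    return log_ml


def harmonic_mean(y, n_samples=5000, seed=0):
    """Harmonic mean estimate of log m(y) from posterior draws (beta = 1)."""
    rng = np.random.default_rng(seed)
    loglik = log_likelihood(sample_power_posterior(1.0, y, n_samples, rng), y)
    return -log_mean_exp(-loglik)


data_rng = np.random.default_rng(1)
y = data_rng.normal(0.5, 1.0, size=20)  # simulated data, illustration only

ss_estimate = stepping_stone(y)
hm_estimate = harmonic_mean(y)
exact = exact_log_marginal(y)
print(f"stepping-stone: {ss_estimate:.3f}")
print(f"harmonic mean:  {hm_estimate:.3f}")
print(f"closed form:    {exact:.3f}")
```

The log-mean-exp helper matters in practice: log-likelihoods of real alignments are large negative numbers, and exponentiating them directly would underflow.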
Accurate branch length estimation in partitioned Bayesian analyses requires accommodation of among-partition rate variation and attention to branch length priors.
 Syst Biol
, 2006
Cited by 33 (0 self)
Molecular phylogenetic studies are making increasing use of partitioned Bayesian analyses via software tools like MrBayes, version 3 (Ronquist and Huelsenbeck, 2003). Data partitioning is important because, as long as the same topology/history underlies all of the partitions, it addresses some of the problems associated with the combination of data sets with heterogeneous rates (Bull et al., 1993) and eliminates the need to argue the validity of tests that have been used to judge data combinability (e.g., Huelsenbeck et al., 1994; Huelsenbeck