An Experimentally Determined Evolutionary Model Dramatically Improves Phylogenetic Fit
Cited by 1 (0 self)
All modern approaches to molecular phylogenetics require a quantitative model for how genes evolve. Unfortunately, existing evolutionary models do not realistically represent the site-heterogeneous selection that governs actual sequence change. Attempts to remedy this problem have involved augmenting these models with a burgeoning number of free parameters. Here, I demonstrate an alternative: Experimental determination of a parameter-free evolutionary model via mutagenesis, functional selection, and deep sequencing. Using this strategy, I create an evolutionary model for influenza nucleoprotein that describes the gene phylogeny far better than existing models with dozens or even hundreds of free parameters. Emerging high-throughput experimental strategies such as the one employed here provide fundamentally new information that has the potential to transform the sensitivity of phylogenetic and genetic analyses.
in Biology By
2013. The thesis of Jacquelyn Saunders is approved: Mary-Patricia Stein, Ph.D.; Rheem D. Medh, Ph.D.; Cindy Malone, Ph.D., Chair.
Article The Effects of Partitioning on Phylogenetic Inference
Partitioning is a commonly used method in phylogenetics that aims to accommodate variation in substitution patterns among sites. Despite its popularity, there have been few systematic studies of its effects on phylogenetic inference, and there have been no studies that compare the effects of different approaches to partitioning across many empirical data sets. In this study, we applied four commonly used approaches to partitioning to each of 34 empirical data sets, and then compared the resulting tree topologies, branch lengths, and bootstrap support estimated using each approach. We find that the choice of partitioning scheme often affects tree topology, particularly when partitioning is omitted. Most notably, we find occasional instances where the use of a suboptimal partitioning scheme produces highly supported but incorrect nodes in the tree. Branch lengths and bootstrap support are also affected by the choice of partitioning scheme, sometimes dramatically so. We discuss the reasons for these effects and make some suggestions for best practice.
Article An Experimentally Informed Evolutionary Model Improves Phylogenetic Fit to Divergent Lactamase Homologs
Phylogenetic analyses of molecular data require a quantitative model for how sequences evolve. Traditionally, the details of the site-specific selection that governs sequence evolution are not known a priori, making it challenging to create evolutionary models that adequately capture the heterogeneity of selection at different sites. However, recent advances in high-throughput experiments have made it possible to quantify the effects of all single mutations on gene function. I have previously shown that such high-throughput experiments can be combined with knowledge of underlying mutation rates to create a parameter-free evolutionary model that describes the phylogeny of influenza nucleoprotein far better than commonly used existing models. Here, I extend this work by showing that published experimental data on TEM-1 beta-lactamase (Firnberg E, Labonte JW, Gray JJ, Ostermeier M. 2014. A comprehensive, high-resolution map of a gene’s fitness landscape. Mol Biol Evol. 31:1581–1592) can be combined with a few mutation rate parameters to create an evolutionary model that describes beta-lactamase phylogenies much better than most common existing models. This experimentally informed evolutionary model is superior even for homologs that are substantially diverged (about 35% divergence at the protein level) from the TEM-1 parent that was the subject of the experimental study. These results suggest that experimental measurements can inform phylogenetic evolutionary models that are applicable to homologs that span a substantial range of sequence divergence. Key words: phylogenetics, deep mutational scanning, lactamase, protein evolution, substitution model.
METHODOLOGY ARTICLE Automatic selection of pa in
[1,2]. Inaccurate tree reconstructions can be the result of phylogenetic frameworks, can help to reduce systematic
A Simulation Approach for Change-Points on Phylogenetic Trees
We observe n sequences at each of m sites, and assume that they have evolved from an ancestral sequence that forms the root of a binary tree of known topology and branch lengths, but the sequence states at internal nodes are unknown. The topology of the tree and branch lengths are the same for all sites, but the parameters of the evolutionary model can vary over sites. We assume a piecewise constant model for these parameters, with an unknown number of change-points and hence a trans-dimensional parameter space over which we seek to perform Bayesian inference. We propose two novel ideas to deal with the computational challenges of such inference. Firstly, we approximate the model based on the time machine principle: the top nodes of the binary tree (near the root) are replaced by an approximation of the true distribution; as more nodes are removed from the top of the tree, the cost of computing the likelihood is reduced linearly in n. The approach introduces a bias, which we investigate empirically. Secondly, we develop a particle marginal Metropolis-Hastings (PMMH) algorithm that employs a sequential Monte Carlo (SMC) sampler and can use the first idea. Our time-machine PMMH algorithm copes well with one of the bottlenecks of standard computational algorithms: the trans-dimensional nature of the posterior distribution. The algorithm is implemented on simulated and real data examples, and we empirically demonstrate its potential to outperform competing methods based on approximate Bayesian computation (ABC) techniques.
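As a rough illustration of the piecewise constant model described in this abstract, the sketch below expands a set of change-points into one parameter value per site. It is not the authors' implementation: the function name `piecewise_constant`, the 0-based site indexing, and the convention that a change-point marks the first site of a new segment are all assumptions made for this example.

```python
from bisect import bisect_right


def piecewise_constant(m, change_points, values):
    """Expand a piecewise constant parameter over m sites (hypothetical helper).

    change_points: strictly increasing 0-based site indices in 1..m-1;
                   each one is the first site of a new segment (assumed convention).
    values:        one parameter value per segment, so len(change_points) + 1 entries.
    """
    if len(values) != len(change_points) + 1:
        raise ValueError("need exactly one value per segment")
    if list(change_points) != sorted(set(change_points)):
        raise ValueError("change-points must be strictly increasing")
    if any(not 0 < c < m for c in change_points):
        raise ValueError("change-points must lie strictly inside the alignment")
    # bisect_right counts how many change-points are <= site,
    # which is the index of the segment that contains the site.
    return [values[bisect_right(change_points, site)] for site in range(m)]


# Two change-points give three segments over six sites.
per_site = piecewise_constant(6, [2, 4], [0.5, 1.0, 2.0])
```

With k change-points the model has k + 1 segment values, so the number of parameters changes with k. That varying dimension is what the abstract refers to as a trans-dimensional parameter space.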
Article OrthoMaM v8: A Database of Orthologous Exons and Coding Sequences for Comparative Genomics in Mammals
Comparative genomic studies extensively rely on alignments of orthologous sequences. Yet, selecting, gathering, and aligning orthologous exons and protein-coding sequences (CDS) that are relevant for a given evolutionary analysis can be a difficult and time-consuming task. In this context, we developed OrthoMaM, a database of ORTHOlogous MAmmalian Markers describing the evolutionary dynamics of orthologous genes in mammalian genomes using a phylogenetic framework. Since its first release in 2007, OrthoMaM has regularly evolved, not only to include newly available genomes but also to incorporate up-to-date software in its analytic pipeline. This eighth release integrates the 40 complete mammalian genomes available in Ensembl v73 and provides alignments, phylogenies, evolutionary descriptor information, and functional annotations for 13,404 single-copy orthologous CDS and 6,953 long exons. The graphical interface makes it easy to explore OrthoMaM to identify markers with specific characteristics (e.g., taxa availability, alignment size, %G+C, evolutionary rate, chromosome location). It hence provides an efficient solution to sample preprocessed markers adapted to user-specific needs. OrthoMaM has proven to be a valuable resource for researchers interested in mammalian phylogenomics and evolutionary genomics, and has served as a source of benchmark empirical data sets in several methodological studies. OrthoMaM is available for browsing, query and complete or filtered downloads at