Bayesian selection of nucleotide substitution models and their site assignments. (2013)

by Wu C-H, Suchard MA, Drummond AJ
Venue: Mol Biol Evol.

Results 1 - 8 of 8

An Experimentally Determined Evolutionary Model Dramatically Improves Phylogenetic Fit

by Jesse D. Bloom
Cited by 1 (0 self)
All modern approaches to molecular phylogenetics require a quantitative model for how genes evolve. Unfortunately, existing evolutionary models do not realistically represent the site-heterogeneous selection that governs actual sequence change. Attempts to remedy this problem have involved augmenting these models with a burgeoning number of free parameters. Here, I demonstrate an alternative: Experimental determination of a parameter-free evolutionary model via mutagenesis, functional selection, and deep sequencing. Using this strategy, I create an evolutionary model for influenza nucleoprotein that describes the gene phylogeny far better than existing models with dozens or even hundreds of free parameters. Emerging high-throughput experimental strategies such as the one employed here provide fundamentally new information that has the potential to transform the sensitivity of phylogenetic and genetic analyses.

in Biology By

by Gene Core Promoter, Jacquelyn Saunders
2013. The thesis of Jacquelyn Saunders is approved by Mary-Patricia Stein, Ph.D.; Rheem D. Medh, Ph.D.; and Cindy Malone, Ph.D. (Chair).

Citation Context

...ubstitution models. Substitution models are used to describe the relative rates of change between character states in a phylogenetic tree for each nucleotide position in a multiple sequence alignment (93). Because genetic sequences change gradually over time, these changes in character state can be used to infer common ancestry between species. Substitution models vary in their approach. For example...

unknown title

by Diego Darriba, David Posada
Abstract not found

Citation Context

...have been proposed that assign each site within a locus a probability of evolving under a given rate (Yang, 1994), substitution pattern (Lartillot and Philippe, 2004; Pagel and Meade, 2004), or both (Wu et al., 2013). In particular, a discrete gamma distribution to consider rate variation among sites (Yang, 1996) is used nowadays in practically any phylogenetic analysis. A different approach to account for the h...

The Effects of Partitioning on Phylogenetic Inference

by David Kainer, Robert Lanfear
Partitioning is a commonly used method in phylogenetics that aims to accommodate variation in substitution patterns among sites. Despite its popularity, there have been few systematic studies of its effects on phylogenetic inference, and there have been no studies that compare the effects of different approaches to partitioning across many empirical data sets. In this study, we applied four commonly used approaches to partitioning to each of 34 empirical data sets, and then compared the resulting tree topologies, branch-lengths, and bootstrap support estimated using each approach. We find that the choice of partitioning scheme often affects tree topology, particularly when partitioning is omitted. Most notably, we find occasional instances where the use of a suboptimal partitioning scheme produces highly supported but incorrect nodes in the tree. Branch-lengths and bootstrap support are also affected by the choice of partitioning scheme, sometimes dramatically so. We discuss the reasons for these effects and make some suggestions for best practice.

Citation Context

...th an algorithmically optimized scheme appear to decline as the number of sites per data block increases, and it may be possible to use such measures to help decide whether algorithmic optimization of partitioning schemes is necessary for any given data set, although such analyses may be more time consuming than the algorithmic optimization itself. Various methods exist that make use of the data to automatically generate data blocks with increased accuracy. Partitioning by automatically grouping sites based on site rates (Kjer et al. 2001; Kjer and Honeycutt 2007) or through Bayesian methods (Wu et al. 2013) are two such approaches that warrant further investigation. We did not apply these methods in our study as many of the data sets we analyzed are too large to be processed with many of these methods. Finally, experimental determination of evolutionary models (Bloom 2014) may eventually obviate the need for partitioning in data in certain very well-characterized cases. Conclusions Various approaches to partitioning have been established to improve the way evolutionary heterogeneity is accounted for in phylogenetic studies. We analyzed the effects of four common partitioning approaches on a large c...

An Experimentally Informed Evolutionary Model Improves Phylogenetic Fit to Divergent Lactamase Homologs

by Jesse D. Bloom
Phylogenetic analyses of molecular data require a quantitative model for how sequences evolve. Traditionally, the details of the site-specific selection that governs sequence evolution are not known a priori, making it challenging to create evolutionary models that adequately capture the heterogeneity of selection at different sites. However, recent advances in high-throughput experiments have made it possible to quantify the effects of all single mutations on gene function. I have previously shown that such high-throughput experiments can be combined with knowledge of underlying mutation rates to create a parameter-free evolutionary model that describes the phylogeny of influenza nucleoprotein far better than commonly used existing models. Here, I extend this work by showing that published experimental data on TEM-1 beta-lactamase (Firnberg E, Labonte JW, Gray JJ, Ostermeier M. 2014. A comprehensive, high-resolution map of a gene’s fitness landscape. Mol Biol Evol. 31:1581–1592) can be combined with a few mutation rate parameters to create an evolutionary model that describes beta-lactamase phylogenies much better than most common existing models. This experimentally informed evolutionary model is superior even for homologs that are substantially diverged (about 35% divergence at the protein level) from the TEM-1 parent that was the subject of the experimental study. These results suggest that experimental measurements can inform phylogenetic evolutionary models that are applicable to homologs that span a substantial range of sequence divergence. Key words: phylogenetics, deep mutational scanning, lactamase, protein evolution, substitution model.

METHODOLOGY ARTICLE Automatic selection of pa in

by unknown authors
[1,2]. Inaccurate tree reconstructions can be the result of phylogenetic frameworks, can help to reduce systematic

Citation Context

... feasibly analyzed by brute force. A related approach is to allow the data to inform the assignment of sites to subsets, and to integrate out the uncertainty in these assignments in a Bayesian framework [35]. Although this method is elegant, it has a high computational burden that renders it impractical for all but modestly sized datasets. The most commonly used method for partitioning alignments, and th...

A Simulation Approach for Change-Points on Phylogenetic Trees

by Adam Persing, Ajay Jasra, Alexandros Beskos, David Balding, Maria De Iorio
We observe n sequences at each of m sites, and assume that they have evolved from an ancestral sequence that forms the root of a binary tree of known topology and branch lengths, but the sequence states at internal nodes are unknown. The topology of the tree and branch lengths are the same for all sites, but the parameters of the evolutionary model can vary over sites. We assume a piecewise constant model for these parameters, with an unknown number of change-points and hence a trans-dimensional parameter space over which we seek to perform Bayesian inference. We propose two novel ideas to deal with the computational challenges of such inference. Firstly, we approximate the model based on the time machine principle: the top nodes of the binary tree (near the root) are replaced by an approximation of the true distribution; as more nodes are removed from the top of the tree, the cost of computing the likelihood is reduced linearly in n. The approach introduces a bias, which we investigate empirically. Secondly, we develop a particle marginal Metropolis-Hastings (PMMH) algorithm, that employs a sequential Monte Carlo (SMC) sampler and can use the first idea. Our time-machine PMMH algorithm copes well with one of the bottle-necks of standard computational algorithms: the trans-dimensional nature of the posterior distribution. The algorithm is implemented on simulated and real data examples, and we empirically demonstrate its potential to outperform competing methods based on approximate Bayesian computation (ABC) techniques.

Citation Context

...In this case, the distribution of each site on the sequence is a mixture of multiple processes, each of which may have its own tree topology, branch lengths and substitution rates (e.g. [16, 3, 17]). [18] extend these ideas to infinite mixtures assuming a Dirichlet process prior. The main focus of this paper is estimating evolutionary rate variation across sites assuming that the tree topology and b...

OrthoMaM v8: A Database of Orthologous Exons and Coding Sequences for Comparative Genomics in Mammals

by Associate Xun Gu
Comparative genomic studies extensively rely on alignments of orthologous sequences. Yet, selecting, gathering, and aligning orthologous exons and protein-coding sequences (CDS) that are relevant for a given evolutionary analysis can be a difficult and time-consuming task. In this context, we developed OrthoMaM, a database of ORTHOlogous MAmmalian Markers describing the evolutionary dynamics of orthologous genes in mammalian genomes using a phylogenetic framework. Since its first release in 2007, OrthoMaM has regularly evolved, not only to include newly available genomes but also to incorporate up-to-date software in its analytic pipeline. This eighth release integrates the 40 complete mammalian genomes available in Ensembl v73 and provides alignments, phylogenies, evolutionary descriptor information, and functional annotations for 13,404 single-copy orthologous CDS and 6,953 long exons. The graphical interface allows to easily explore OrthoMaM to identify markers with specific characteristics (e.g., taxa availability, alignment size, %G+C, evolutionary rate, chromosome location). It hence provides an efficient solution to sample preprocessed markers adapted to user-specific needs. OrthoMaM has proven to be a valuable resource for researchers interested in mammalian phylogenomics, evolutionary genomics, and has served as a source of benchmark empirical data sets in several methodological studies. OrthoMaM is available for browsing, query and complete or filtered downloads at

Citation Context

... high-quality codon alignments have also been utilized as benchmark empirical data sets for testing new analytical methods (Egan et al. 2008; López-Giráldez and Townsend 2011; Li and Drummond 2012; Wu et al. 2013) and for detecting footprints of purifying or positive selection (Jobson et al. 2010; Laguette et al. 2012). Finally, the inferred ML gene trees have served for assessing the performance of supertree...
