Phylogenetic analysis by maximum likelihood (PAML). Version 1.1. (1995)

by Z YANG

Results 1 - 10 of 115

Quartet puzzling: a quartet maximum likelihood method for reconstructing tree topologies.

by Korbinian Strimmer, Arndt Von Haeseler - Mol. Biol. Evol., 1996
Cited by 433 (9 self)
A versatile method, quartet puzzling, is introduced to reconstruct the topology (branching pattern) of a phylogenetic tree based on DNA or amino acid sequence data. This method applies maximum-likelihood tree reconstruction to all possible quartets that can be formed from n sequences. The quartet trees serve as starting points to reconstruct a set of optimal n-taxon trees. The majority rule consensus of these trees defines the quartet puzzling tree and shows groupings that are well supported. Computer simulations show that the performance of quartet puzzling to reconstruct the true tree is always equal to or better than that of neighbor joining. For some cases with high transition/transversion bias quartet puzzling outperforms neighbor joining by a factor of 10. The application of quartet puzzling to mitochondrial RNA and tRNAVal sequences from amniotes demonstrates the power of the approach. A PHYLIP-compatible ANSI C program, PUZZLE, for analyzing nucleotide or amino acid sequence data is available.
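
The phrase "all possible quartets that can be formed from n sequences" can be made concrete with a small enumeration. The following is a minimal Python sketch, not the PUZZLE implementation: the function names are illustrative, and the maximum-likelihood evaluation of each topology is not shown. It only illustrates that n sequences yield C(n, 4) quartets and that each quartet of labeled taxa has three possible unrooted topologies.

```python
from itertools import combinations
from math import comb


def quartet_topologies(a, b, c, d):
    """Return the three unrooted topologies for four taxa as pairs of pairs."""
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]


def all_quartets(taxa):
    """Return every 4-taxon subset of the input sequence labels."""
    return list(combinations(taxa, 4))


# Hypothetical sequence labels, used only for illustration.
taxa = ["A", "B", "C", "D", "E", "F"]
quartets = all_quartets(taxa)

# The number of quartets equals the binomial coefficient C(n, 4).
assert len(quartets) == comb(len(taxa), 4)

# Each quartet would be scored under its three candidate topologies.
candidates = {q: quartet_topologies(*q) for q in quartets}
```

Because the number of quartets grows roughly as n^4, the count rises quickly with the number of sequences.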

Adaptive Molecular Evolution

by Z. Yang - In Balding, D., Bishop, M. and Cannings, C. (eds), Handbook of Statistical Genetics, 2001
Cited by 167 (20 self)
INTRODUCTION While Darwin's theory of evolution by natural selection is accepted by biologists for morphological traits, the importance of selection in molecular evolution has been much debated. The neutral theory (Kimura, 1983) maintains that most observed molecular variation (both diversity within species and divergence between species) is due to random fixation of mutations with fitness effects so small that random drift rather than natural selection dominates their fate. Population geneticists have developed a number of tests of neutrality (see Wayne and Simonsen, 1998, for a review). Those tests often easily reject the strictly neutral model when applied to real data. However, they are often unable to distinguish different forms of natural selection, or to demonstrate molecular adaptation. Up to now, the most convincing evidence of adaptive molecular evolution appears to have come from comparison of synonymous (silent) and non-synonymous (amino acid-changing) substitution rate

Divergence time and evolutionary rate estimation with multilocus data

by Jeffrey L. Thorne, Hirohisa Kishino - Syst. Biol., 2002
Cited by 167 (1 self)
Abstract. Bayesian methods for estimating evolutionary divergence times are extended to multigene data sets, and a technique is described for detecting correlated changes in evolutionary rates among genes. Simulations are employed to explore the effect of multigene data on divergence time estimation, and the methodology is illustrated with a previously published data set representing diverse plant taxa. The fact that evolutionary rates and times are confounded when sequence data are compared is emphasized and the importance of fossil information for disentangling rates and times is stressed. [Markov chain Monte Carlo; Metropolis–Hastings algorithm; molecular clock; phylogeny.] Because of improved technology, molecular sequence data are becoming increasingly easy to collect. As a result, the pattern and process of evolution are being characterized in ever finer detail. In the past, it was typical to infer evolutionary divergence times by selecting a single gene and then sequencing

Widespread paleopolyploidy in model plant species inferred from age distributions of duplicate genes.

by Guillaume Blanc, Kenneth H Wolfe - Plant Cell, 2004
Cited by 138 (3 self)
It is often anticipated that many of today's diploid plant species are in fact paleopolyploids. Given that an ancient large-scale duplication will result in an excess of relatively old duplicated genes with similar ages, we analyzed the timing of duplication of pairs of paralogous genes in 14 model plant species. Using EST contigs (unigenes), we identified pairs of paralogous genes in each species and used the level of synonymous nucleotide substitution to estimate the relative ages of gene duplication. [...] and Arabidopsis thaliana), the age distributions of duplicated genes contain peaks corresponding to short evolutionary periods during which large numbers of duplicated genes were accumulated. Large-scale duplications (polyploidy or aneuploidy) are strongly suspected to be the cause of these temporal peaks of gene duplication. However, the unusual age profile of tandem gene duplications in Arabidopsis indicates that other scenarios, such as variation in the rate at which duplicated genes are deleted, must also be considered.

Citation Context

...found, the pair of sequences was discarded. The nucleotide sequence was then translated using the Genewise program (which can infer frameshift sites; Birney et al., 1996) with the corresponding best match protein as a guide. For each pair of paralogs, the two translation products were then aligned using the Smith-Waterman algorithm (Smith and Waterman, 1981), and the resulting alignment was used as a guide to align the nucleotide sequences. After removing gaps and N-containing codons, the level of synonymous substitution was estimated using the maximum likelihood method implemented in codeml (Yang, 1999) under the F3x4 model (Goldman and Yang, 1994). Dataset Cleaning: We first removed and stored separately all sequences annotated as transposable elements from the Arabidopsis and rice predicted genes. The number of pairs of paralogous sequences found in the full rice gene model dataset was too high for subsequent analyses (131,865 pairs). We therefore discarded 33,411 rice gene models (out of 56,056) annotated as "hypothetical protein" to keep the number of pairs reasonable and focus on the most reliable genes. We then compiled a sequence dataset consisting of the Arabidopsis and rice genes an...
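
The step of using a protein alignment as a guide for the nucleotide alignment, then dropping gapped and N-containing codons, can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the authors' pipeline: the function names and toy sequences are hypothetical, the coding sequences are assumed to be in frame, and each non-gap residue in the aligned protein is assumed to correspond to exactly one codon.

```python
def protein_to_codon_alignment(aligned_protein, cds):
    """Thread an ungapped, in-frame coding sequence onto a gapped protein row."""
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    row, next_codon = [], 0
    for residue in aligned_protein:
        if residue == "-":
            row.append("---")  # a gap in the protein becomes a three-base gap
        else:
            row.append(codons[next_codon])
            next_codon += 1
    return row


def clean_codon_pairs(codons_a, codons_b):
    """Drop alignment columns where either codon has a gap or an ambiguous N."""
    kept = [
        (x, y)
        for x, y in zip(codons_a, codons_b)
        if "-" not in x + y and "N" not in (x + y).upper()
    ]
    return "".join(x for x, _ in kept), "".join(y for _, y in kept)


# Toy paralog pair (hypothetical data): the protein rows are already aligned.
row_a = protein_to_codon_alignment("MK-L", "ATGAAACTG")
row_b = protein_to_codon_alignment("MXRL", "ATGAANCGTCTG")
clean_a, clean_b = clean_codon_pairs(row_a, row_b)
```

Here the second column is dropped because of the N and the third because of the gap, leaving two codons per sequence. The cleaned pair would then be passed to a program that estimates synonymous substitution; that step is not shown.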

Phylogenetic relationships among ascomycetes, evidence from an RNA polymerase II subunit.

by Yajuan J Liu, Sally Whelen, Benjamin D Hall - Molecular Biology and Evolution, 1999
Cited by 70 (1 self)
In an effort to establish a suitable alternative to the widely used 18S rRNA system for molecular systematics of fungi, we examined the nuclear gene RPB2, encoding the second largest subunit of RNA polymerase II. Because RPB2 is a single-copy gene of large size with a modest rate of evolutionary change, it provides good phylogenetic resolution of Ascomycota. While the RPB2 and 18S rDNA phylogenies were highly congruent, the RPB2 phylogeny did result in much higher bootstrap support for all the deeper branches within the orders and for several branches between orders of the Ascomycota. There are several strongly supported phylogenetic conclusions. The Ascomycota is composed of three major lineages: Archiascomycetes, Saccharomycetales, and Euascomycetes. Within the Euascomycetes, plectomycetes, and pyrenomycetes are monophyletic groups, and the Pleosporales and Dothideales are distinct sister groups within the Loculoascomycetes. We confirm the placement of Neolecta within the Archiascomycetes, suggesting that fruiting body formation and forcible discharge of ascospores were characters gained early in the evolution of the Ascomycota. These findings show that a slowly evolving protein-coding gene such as RPB2 is useful for diagnosing phylogenetic relationships among fungi.

Gene finding with a hidden Markov model of genome structure and evolution

by Jacob Skou Pedersen, Jotun Hein, 2003
Cited by 64 (9 self)
Motivation: A growing number of genomes are sequenced. The differences in evolutionary pattern between functional regions can thus be observed genome-wide in a whole set of organisms. The diverse evolutionary pattern of different functional regions can be exploited in the process of genomic annotation. The modelling of evolution by the existing comparative gene finders leaves room for improvement. Results: A probabilistic model of both genome structure and evolution is designed. This type of model is called

Detecting Putative Orthologs

by D. P. Wall, H. B. Fraser, A. E. Hirsh, 2003
Cited by 42 (5 self)
Summary: We developed an algorithm that improves upon the common procedure of taking reciprocal best blast hits (rbh) in the identification of orthologs. The method, the reciprocal smallest distance algorithm (rsd), relies on global sequence alignment and maximum likelihood estimation of evolutionary distances to detect orthologs between two genomes. rsd finds many putative orthologs missed by rbh because it is less likely than rbh to be misled by the presence of a close paralog.

Citation Context

...separately with the original query sequence i. If the alignable region of the two sequences exceeds a threshold fraction of the alignment’s total length (0.8 is our working cutoff), the program PAML (Yang, 2000) is used to obtain a maximum likelihood estimate of the number of amino acid substitutions separating the two protein sequences, given an empirical amino 1710 Bioinformatics 19(13) © Oxford Universit...
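
The idea of a reciprocal smallest distance check with an alignable-fraction cutoff can be sketched in a few lines of Python. This is a simplified illustration, not the published rsd program: the function names, the lookup tables, and the toy values are all hypothetical, every sequence in the other genome stands in for the list of candidate hits, and the distances are supplied directly rather than estimated by maximum likelihood.

```python
def smallest_distance_hit(query, candidates, distance, aligned_fraction, cutoff=0.8):
    """Return the candidate closest to `query` among those passing the cutoff.

    `distance` and `aligned_fraction` are hypothetical lookup tables keyed by
    an unordered pair of sequence names.
    """
    passing = [
        c for c in candidates
        if aligned_fraction[frozenset((query, c))] > cutoff
    ]
    if not passing:
        return None
    return min(passing, key=lambda c: distance[frozenset((query, c))])


def reciprocal_smallest_distance(query, genome_a, genome_b, distance, aligned_fraction):
    """Return the putative ortholog in genome B of `query` from genome A, or None."""
    forward = smallest_distance_hit(query, genome_b, distance, aligned_fraction)
    if forward is None:
        return None
    backward = smallest_distance_hit(forward, genome_a, distance, aligned_fraction)
    return forward if backward == query else None


# Toy data (hypothetical): distances between sequences of two genomes.
genome_a = ["a1", "a2", "a3"]
genome_b = ["b1", "b2"]
pairs = {
    ("a1", "b1"): 0.2, ("a1", "b2"): 0.5,
    ("a2", "b1"): 0.9, ("a2", "b2"): 0.4,
    ("a3", "b1"): 0.3, ("a3", "b2"): 0.8,
}
distance = {frozenset(k): v for k, v in pairs.items()}
aligned_fraction = {k: 0.9 for k in distance}

orthologs = {
    a: reciprocal_smallest_distance(a, genome_a, genome_b, distance, aligned_fraction)
    for a in genome_a
}
```

In this toy example a3 is closest to b1, but b1 is closer to a1, so a3 receives no ortholog call.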

Tempo and mode of nucleotide substitutions in gag and env gene fragments in human immunodeficiency virus type 1 populations with a known transmission history.

by Thomas Leitner, Sudhir Kumar, Jan Albert - J Virol, 1997
Cited by 37 (4 self)
The complex evolutionary process of human immunodeficiency virus type 1 (HIV-1) is marked by a high level of genetic variation. It has been shown that the HIV-1 genome is characterized by variable and more constant regions, unequal nucleotide frequencies, and preference for G-to-A substitutions. However, this knowledge has largely been neglected in phylogenetic analyses of HIV-1 nucleotide sequences, even though these analyses are applied to a number of important biological questions. The purpose of this study was to identify a realistic model of HIV-1 evolution and to statistically test if the application of such a model significantly improves the accuracy of phylogenetic analyses. A unique and recently reported HIV-1 transmission cluster consisting of nine infected individuals, for whom the direction and time for each transmission were exactly known, formed the basis for the analyses which were performed under a general model of nucleotide substitution using population sequences from the env V3 and p17 gag regions of the HIV-1 genome. Examination of seven different substitution models by maximum-likelihood methods revealed that the fit of the general reversible (REV) model was significantly better than that of simpler models, indicating that it is important to account for the asymmetric substitution pattern of HIV-1 and that the nucleotide substitution rate varied significantly across sites. The shape parameter α, which describes the variation across sites by a gamma distribution, was estimated to be 0.38 and 0.25 for env V3 and p17 gag, respectively. In env V3, the estimated average transition/transversion rate ratio was 1.42. Thus, the REV model with variable rates across sites (described by a gamma distribution) provides the best description of HIV-1 evolution, whereas simple models are unrealistic and inaccurate.
It is likely that the accuracy of phylogenetic studies of HIV-1 and many other viruses would improve substantially by the use of more realistic nucleotide substitution models. This is especially true when attempts are made to estimate the age of distant viral ancestors from contemporary viral sequences.
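
To give a feel for what a gamma shape parameter of 0.38 implies for rate variation across sites, the following Python sketch draws relative site rates from a gamma distribution with mean 1. It is an illustration only: the function name, the number of sites, and the seed are assumptions, and this sampling is not how any particular program evaluates the likelihood.

```python
import random
import statistics


def sample_site_rates(alpha, n_sites, seed=0):
    """Draw relative substitution rates from a gamma distribution with mean 1.

    The shape is `alpha` and the scale is 1/alpha, so the expected rate is 1.
    """
    rng = random.Random(seed)
    return [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_sites)]


# alpha = 0.38 is the env V3 estimate quoted in the abstract above.
rates = sample_site_rates(alpha=0.38, n_sites=100_000)
mean_rate = statistics.fmean(rates)
median_rate = statistics.median(rates)
slow_fraction = sum(r < 0.1 for r in rates) / len(rates)
```

With a shape parameter below 1 the distribution is strongly skewed: the median rate falls well below the mean, so many sites change very slowly while a few change quickly.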

Citation Context

...ment (all three codon positions) were also evaluated. Fit of nucleotide substitution models. Several substitution models were investigated by maximum-likelihood calculations by using the program PAML (51). All calculations were done by using the true tree topology of the investigated data set, ((256,(822,159)),((113,9939),(6760,((317,6767),((135,(529,105)), (719,136)))))), described above and shown in...

Biogeography and floral evolution of baobabs (Adansonia, Bombacaceae) as inferred from multiple data sets

by David A. Baum, Randall L. Small, Jonathan F. Wendel - Systematic Biology, 1998
Cited by 27 (1 self)
Abstract. The phylogeny of baobab trees was analyzed using four data sets: chloroplast DNA restriction sites, sequences of the chloroplast rpl16 intron, sequences of the internal transcribed spacer (ITS) region of nuclear ribosomal DNA, and morphology. We sampled each of the eight species of Adansonia plus three outgroup taxa from tribe Adansonieae. These data were analyzed singly and in combination using parsimony. ITS and morphology provided the greatest resolution and were largely concordant. The two chloroplast data sets showed concordance with one another but showed significant conflict with ITS and morphology. A possible explanation for the conflict is genealogical discordance within the Malagasy Longitubae, perhaps due to introgression events. A maximum-likelihood analysis of branching times shows that the dispersal between Africa and Australia occurred well after the fragmentation of Gondwana and therefore involved overwater dispersal. The phylogeny does not permit unambiguous reconstruction of floral evolution but suggests the plausible hypothesis that hawkmoth pollination was ancestral in Adansonia and that there were two parallel switches to pollination by mammals in the genus. [Biogeography, data set conflict, floral evolution, Gondwana, introgression, molecular clock, phylogeny.]

Catarrhine primate divergence dates estimated from complete mitochondrial genomes: concordance with fossil and nuclear DNA evidence.

by Ryan L Raaum, Kirstin N Sterner, Colleen M Noviello, Caro-Beth Stewart, Todd R Disotell - Journal of Human Evolution, 2005
Cited by 26 (0 self)
Abstract Accurate divergence date estimates improve scenarios of primate evolutionary history and aid in interpretation of the natural history of disease-causing agents. While molecule-based estimates of divergence dates of taxa within the superfamily Hominoidea (apes and humans) are common in the literature, few such estimates are available for the Cercopithecoidea (Old World monkeys), the sister taxon of the hominoids in the primate infraorder Catarrhini. To help fill this gap, we have sequenced the entire mitochondrial DNA (mtDNA) genomes from a representative of three cercopithecoid tribes, Cercopithecini (Chlorocebus aethiops), Colobini (Colobus guereza), and Presbytini (Trachypithecus obscurus), and analyzed these new data together with other catarrhine mtDNA genomes available in public databases. Molecular divergence date estimates are dependent on calibration points gleaned from the paleontological record. We defined criteria for the selection of good calibration points and identified three points meeting these criteria: Homo-Pan, 6.0 Ma; Pongo-hominines, 14.0 Ma; hominoid/cercopithecoid, 23.0 Ma. Because a uniform molecular clock does not fit the catarrhine mtDNA data, we estimated divergence dates using a penalized likelihood and a Bayesian method, both of which take into account the effects of rate differences on lineages, phylogenetic tree structure, and multiple calibration points. The penalized likelihood method applied to the coding regions of the mtDNA genome yielded the following divergence date estimates, with approximate 95% confidence intervals: cercopithecine-colobine, 16.2 (14.4-17.9) Ma; colobin-presbytin, 10.9 (9.6-12.3) Ma; cercopithecin-papionin, 11.6 (10.3-12.9) Ma; and Macaca-Papio, 9.8 (8.6-10.9) Ma. Within the hominoids, the following dates were inferred: hylobatid-hominid, 16.8 (15.0-18.5) Ma; Gorilla-Homo + Pan, 8.1 (7.1-9.0) Ma; Pongo pygmaeus pygmaeus-P. p. abelii, 4.1 (3.5-4.7) Ma; and Pan troglodytes-P. paniscus, 2.4 (2.0-2.7) Ma.
These dates were similar to those found using penalized likelihood on other subsets of the data, but slightly younger than several of the Bayesian estimates.

Citation Context

...f these methods allow different rates on different lineages, but assume that rates are constant on individual branches (Thorne et al., 1998). Sanderson (2002) implemented a semi-parametric method wherein each lineage has a separate rate that is limited from varying too much across the phylogeny. We used this method to estimate divergence dates using three datasets (All Genes, HSP, and HS12P), with confidence intervals calculated via a bootstrap procedure. Each alignment was re-sampled 100 times. For each of these 100 replicates, ML branch lengths were calculated using the PAML software package (Yang, 2003). Branch lengths estimated by PAML were employed in the penalized likelihood date estimation method implemented in the r8s computer program (Sanderson, 2003). The three fossil-derived divergence dates described above were used as "point estimates" to calibrate the ML trees. The bootstrap sample was tested for normality (Shapiro-Wilks test for normality; Royston, 1995). Since the sample passed the normality test, we estimated the 95% confidence interval for the uncertainty resulting from statistical error in branch length estimation and stochasticity in the molecular evolutionary process as 2.576...
