Results 1-10 of 583
RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic trees
Bioinformatics, 2005
Cited by 114 (12 self)
Motivation: The computation of large phylogenetic trees with statistical models such as maximum likelihood or Bayesian inference is computationally extremely intensive. It has repeatedly been demonstrated that these models are able to recover the true tree, or a tree which is topologically closer to the true tree, more frequently than less elaborate methods such as parsimony or neighbor joining. Due to the combinatorial and computational complexity, the size of trees which can be computed on a biologist’s PC workstation within reasonable time is limited to trees containing approximately 100 taxa. Results: In this paper we present the latest release of our program RAxML-III for rapid maximum likelihood-based inference of large evolutionary trees, which allows for computation of 1,000-taxon trees in less than 24 hours on a single PC processor. We compare RAxML-III to the currently fastest implementations for maximum likelihood and Bayesian inference: PHYML and MrBayes. Whereas RAxML-III performs worse than PHYML and MrBayes on synthetic data, it clearly outperforms both programs on all real data alignments used in terms of speed and final likelihood values. Availability & Supplementary Information: RAxML-III, including all alignments and final trees mentioned in this paper, is freely available as open source code at
BEAST: Bayesian evolutionary analysis by sampling trees
BMC Evolutionary Biology 2007, 7:214, doi:10.1186/1471-2148-7-214
Cited by 98 (3 self)
Gascuel O. Approximate Likelihood-Ratio Test for Branches: A Fast, Accurate, and Powerful Alternative. Systematic Biology
Cited by 86 (4 self)
Abstract: We revisit statistical tests for branches of evolutionary trees reconstructed upon molecular data. A new, fast, approximate likelihood-ratio test (aLRT) for branches is presented here as a competitive alternative to nonparametric bootstrap and Bayesian estimation of branch support. The aLRT is based on the idea of the conventional LRT, with the null hypothesis corresponding to the assumption that the inferred branch has length 0. We show that the LRT statistic is asymptotically distributed as a maximum of three random variables drawn from the $\frac{1}{2}\chi^2_0 + \frac{1}{2}\chi^2_1$
Phylogenomic inference of protein molecular function: advances and challenges
Bioinformatics, 2004
Cited by 51 (2 self)
Motivation: Protein families evolve a multiplicity of functions through gene duplication, speciation and other processes. As a number of studies have shown, standard methods of protein function prediction produce systematic errors on these data. Phylogenomic analysis (combining phylogenetic tree construction, integration of experimental data and differentiation of orthologs and paralogs) has been proposed to address these errors and improve the accuracy of functional classification. The explicit integration of structure prediction and analysis in this framework, which we call structural phylogenomics, provides additional insights into protein superfamily evolution. Results: Results of protein functional classification using phylogenomic analysis show fewer expected false positives overall than when pairwise methods of functional classification are employed. We present an overview of the motivations and fundamental principles of phylogenomic analysis, new methods developed for the key tasks, benchmark datasets for these tasks (when available) and suggest procedures to increase accuracy. We also discuss some of the methods used in the Celera Genomics high-throughput phylogenomic classification of the human genome. Availability: Software tools from the Berkeley Phylogenomics Group are available at
H: Computing Bayes factors using thermodynamic integration
Syst Biol
Cited by 33 (5 self)
Abstract: In the Bayesian paradigm, a common method for comparing two models is to compute the Bayes factor, defined as the ratio of their respective marginal likelihoods. In recent phylogenetic works, the numerical evaluation of marginal likelihoods has often been performed using the harmonic mean estimation procedure. In the present article, we propose to employ another method, based on an analogy with statistical physics, called thermodynamic integration. We describe the method, propose an implementation, and show on two analytical examples that this numerical method yields reliable estimates. In contrast, the harmonic mean estimator leads to a strong overestimation of the marginal likelihood, which is all the more pronounced as the model is higher dimensional. As a result, the harmonic mean estimator systematically favors more parameter-rich models, an artefact that might explain some recent puzzling observations, based on harmonic mean estimates, suggesting that Bayes factors tend to overscore complex models. Finally, we apply our method to the comparison of several alternative models of amino-acid replacement. We confirm our previous observations, indicating that modeling pattern heterogeneity across sites tends to yield better models than standard empirical matrices. [Bayes factor; harmonic mean; mixture model; path sampling; phylogeny; thermodynamic integration.] Bayesian methods have become popular in molecular phylogenetics over the recent years. The simple and intuitive interpretation of the concept of probabilities
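Illustrative sketch (not the authors' implementation): the thermodynamic integration idea in the entry above can be tried on a toy model whose marginal likelihood is known in closed form. Everything here is an assumption chosen for illustration: a single observation y ~ Normal(theta, sigma2), a prior theta ~ Normal(0, tau2), a uniform grid of inverse temperatures, and the function names. The estimator integrates the expected log-likelihood under the power posterior, prior(theta) * L(theta)^beta, from beta = 0 to beta = 1. A harmonic mean estimate is included only for contrast.

```python
import math
import random


def log_lik(theta, y, sigma2):
    """Log-likelihood of one observation y ~ Normal(theta, sigma2)."""
    return -0.5 * math.log(2 * math.pi * sigma2) - (y - theta) ** 2 / (2 * sigma2)


def sample_power_posterior(beta, y, sigma2, tau2, n, rng):
    """Draw n values from the density proportional to prior(theta) * L(theta)**beta.

    With a Normal(0, tau2) prior this density is itself Normal, so it can be
    sampled exactly; a real phylogenetic model would need MCMC here.
    """
    precision = 1.0 / tau2 + beta / sigma2
    mean = (beta * y / sigma2) / precision
    sd = math.sqrt(1.0 / precision)
    return [rng.gauss(mean, sd) for _ in range(n)]


def thermodynamic_integration(y, sigma2, tau2, n_temps=51, n_samples=4000, seed=0):
    """Estimate the log marginal likelihood as the integral over beta in [0, 1]
    of the expected log-likelihood under the power posterior (trapezoid rule)."""
    rng = random.Random(seed)
    betas = [i / (n_temps - 1) for i in range(n_temps)]
    means = []
    for beta in betas:
        draws = sample_power_posterior(beta, y, sigma2, tau2, n_samples, rng)
        means.append(sum(log_lik(t, y, sigma2) for t in draws) / n_samples)
    return sum(
        0.5 * (means[i] + means[i + 1]) * (betas[i + 1] - betas[i])
        for i in range(n_temps - 1)
    )


def harmonic_mean_estimate(y, sigma2, tau2, n_samples=4000, seed=0):
    """Harmonic mean estimate of the log marginal likelihood, for contrast only."""
    rng = random.Random(seed)
    draws = sample_power_posterior(1.0, y, sigma2, tau2, n_samples, rng)
    neg_ll = [-log_lik(t, y, sigma2) for t in draws]
    m = max(neg_ll)  # log-sum-exp shift for numerical stability
    log_mean_inverse = m + math.log(sum(math.exp(v - m) for v in neg_ll) / n_samples)
    return -log_mean_inverse


def exact_log_marginal(y, sigma2, tau2):
    """Closed form for this toy model: marginally, y ~ Normal(0, sigma2 + tau2)."""
    s2 = sigma2 + tau2
    return -0.5 * math.log(2 * math.pi * s2) - y ** 2 / (2 * s2)


if __name__ == "__main__":
    y, sigma2, tau2 = 1.0, 1.0, 4.0
    print("exact:", exact_log_marginal(y, sigma2, tau2))
    print("thermodynamic integration:", thermodynamic_integration(y, sigma2, tau2))
    print("harmonic mean:", harmonic_mean_estimate(y, sigma2, tau2))
```

In this one-parameter toy case the gap between the two estimators may be small. The abstract's claim is that the harmonic mean bias grows with model dimension, which this sketch does not attempt to reproduce.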
Structured statistical models of inductive reasoning
Cited by 29 (4 self)
Everyday inductive inferences are often guided by rich background knowledge. Formal models of induction should aim to incorporate this knowledge, and should explain how different kinds of knowledge lead to the distinctive patterns of reasoning found in different inductive contexts. We present a Bayesian framework that attempts to meet both goals and describe four applications of the framework: a taxonomic model, a spatial model, a threshold model, and a causal model. Each model makes probabilistic inferences about the extensions of novel properties, but the priors for the four models are defined over different kinds of structures that capture different relationships between the categories in a domain. Our framework therefore shows how statistical inference can operate over structured background knowledge, and we argue that this interaction between structure and statistics is critical for explaining the power and flexibility of human reasoning.
AJ: Bayesian inference of species trees from multilocus data
Mol Biol Evol
Cited by 27 (0 self)
Until recently, it has been common practice for a phylogenetic analysis to use a single gene sequence from a single individual organism as a proxy for an entire species. With technological advances, it is now becoming more common to collect data sets containing multiple gene loci and multiple individuals per species. These data sets often reveal the need to directly model intraspecies polymorphism and incomplete lineage sorting in phylogenetic estimation procedures. For a single species, coalescent theory is widely used in contemporary population genetics to model intraspecific gene trees. Here, we present a Bayesian Markov chain Monte Carlo method for the multispecies coalescent. Our method coestimates multiple gene trees embedded in a shared species tree along with the effective population size of both extant and ancestral species. The inference is made possible by multilocus data from multiple individuals per species. Using a multiindividual data set and a series of simulations of rapid species radiations, we demonstrate the efficacy of our new method. These simulations give some insight into the behavior of the method as a function of sampled individuals, sampled loci, and sequence length. Finally, we compare our new method to both an existing method (BEST 2.2) with similar goals and the supermatrix (concatenation) method. We demonstrate that both BEST and our method have much better estimation accuracy for species tree topology than concatenation, and our method outperforms BEST in divergence time and population size estimation.
Parallel Metropolis coupled Markov chain Monte Carlo for Bayesian phylogenetic inference
Bioinformatics, 2004
Cited by 25 (0 self)
Motivation: Bayesian estimation of phylogeny is based on the posterior probability distribution of trees. Currently, the only numerical method that can effectively approximate posterior probabilities of trees is Markov chain Monte Carlo (MCMC). Standard implementations of MCMC can be prone to entrapment in local optima. Metropolis coupled MCMC [(MC)³], a variant of MCMC, allows multiple peaks in the landscape of trees to be more readily explored, but at the cost of increased execution time. Results: This paper presents a parallel algorithm for (MC)³. The proposed parallel algorithm retains the ability to explore multiple peaks in the posterior distribution of trees while maintaining a fast execution time. The algorithm has been implemented using two popular parallel programming models: message passing and shared memory. Performance results indicate nearly linear speed improvement in both programming models for small and large data sets. Availability: MrBayes v3.0 is available at
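Illustrative sketch (serial, and not the MrBayes code): Metropolis coupling as described in the entry above can be shown on a one-dimensional target with two well-separated peaks standing in for the landscape of trees. The target density, the heating schedule beta_i = 1 / (1 + heat * i), the step size and all names are assumptions made for this example. Each chain samples the target raised to its own power beta_i, and a swap between a random pair of adjacent chains is proposed once per iteration. Only the cold chain (beta = 1) is recorded.

```python
import math
import random


def log_target(x):
    """Log density (up to a constant) of an equal mixture of
    Normal(-3, 0.5^2) and Normal(3, 0.5^2): two well-separated peaks."""
    a = -((x + 3.0) ** 2) / (2 * 0.25)
    b = -((x - 3.0) ** 2) / (2 * 0.25)
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def accept(log_ratio, rng):
    """Metropolis acceptance test for a given log acceptance ratio."""
    return rng.random() < math.exp(min(0.0, log_ratio))


def mc3(n_iter=20000, n_chains=4, heat=3.0, step=1.0, seed=1):
    """Run Metropolis coupled MCMC and return the cold-chain samples."""
    rng = random.Random(seed)
    # Chain 0 is the cold chain (beta = 1); higher indices are hotter (flatter).
    betas = [1.0 / (1.0 + heat * i) for i in range(n_chains)]
    states = [-3.0] * n_chains  # every chain starts in the left peak
    cold_samples = []
    for _ in range(n_iter):
        # Within-chain updates. These are independent of each other between
        # swaps, which is the part a parallel version could distribute.
        for i, beta in enumerate(betas):
            proposal = states[i] + rng.gauss(0.0, step)
            log_ratio = beta * (log_target(proposal) - log_target(states[i]))
            if accept(log_ratio, rng):
                states[i] = proposal
        # Propose exchanging the states of one random pair of adjacent chains.
        i = rng.randrange(n_chains - 1)
        j = i + 1
        log_swap = (betas[i] - betas[j]) * (
            log_target(states[j]) - log_target(states[i])
        )
        if accept(log_swap, rng):
            states[i], states[j] = states[j], states[i]
        cold_samples.append(states[0])
    return cold_samples


if __name__ == "__main__":
    samples = mc3()
    right = sum(1 for x in samples if x > 0) / len(samples)
    print("fraction of cold-chain samples in the right-hand peak:", right)
```

The heated chains cross the low-density region between the peaks far more easily than the cold chain does, and accepted swaps pass those crossings down to the cold chain.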
Bayesian models of cognition
Cited by 23 (1 self)
For over 200 years, philosophers and mathematicians have been using probability theory to describe human cognition. While the theory of probabilities was first developed as a means of analyzing games of chance, it quickly took on a larger and deeper significance as a formal account of how rational agents should reason in situations of uncertainty
Learning domain structures
In Proceedings of the 26th Annual Conference of the Cognitive Science Society, 2004
Cited by 22 (13 self)
How do people acquire and use knowledge about domain structures, such as the tree-structured taxonomy of folk biology? These structures are typically seen either as consequences of innate domain-specific knowledge or as epiphenomena of domain-general associative learning. We present an alternative: a framework for statistical inference that discovers the structural principles that best account for different domains of objects and their properties. Our approach infers that a tree structure is best for a biological dataset, and a linear structure (“left”–“right”) is best for a dataset of people and their political views. We compare our proposal with unstructured associative learning and argue that our structured approach gives the better account of inductive