Results 1 - 10 of 239
BEAST: Bayesian evolutionary analysis by sampling trees BMC Evolutionary Biology 2007, 7:214 doi:10.1186/1471-2148-7-214
, 2007
Cited by 23 (1 self)
PDF corresponds to the article as it appeared upon acceptance. Fully formatted PDF and full text (HTML) versions will be made available soon.
Computing Bayes Factors Using Thermodynamic Integration
, 2005
Cited by 14 (4 self)
In the Bayesian paradigm, a common method for comparing two models is to compute the Bayes factor, defined as the ratio of their respective marginal likelihoods. In recent phylogenetic works, the numerical evaluation of marginal likelihoods has often been performed using the harmonic mean estimation procedure. In the present paper, we propose to employ another method, based on an analogy with statistical physics, called thermodynamic integration.
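As a rough illustration of the idea in this abstract (not the paper's implementation), the sketch below estimates a log marginal likelihood by thermodynamic integration, using the identity log Z = ∫₀¹ E_β[log L] dβ, where the expectation is taken under the power posterior proportional to L^β times the prior. The coin-flip model, the grid of θ values, and all function names are assumptions chosen so that the expectation can be computed exactly by enumeration and checked against the directly computed marginal likelihood. In a real phylogenetic analysis the expectation at each β would be estimated by MCMC.

```python
import math

# Hypothetical toy model: a coin with unknown bias theta, restricted to a
# small grid so every expectation can be computed exactly by enumeration.
THETAS = [0.05 + 0.1 * i for i in range(10)]  # 0.05, 0.15, ..., 0.95


def log_likelihood(theta, heads, n):
    """Log-likelihood of one particular sequence with `heads` heads in `n` flips."""
    return heads * math.log(theta) + (n - heads) * math.log(1.0 - theta)


def exact_log_marginal(thetas, heads, n):
    """Log marginal likelihood under a uniform prior on the grid."""
    prior = 1.0 / len(thetas)
    return math.log(sum(prior * math.exp(log_likelihood(t, heads, n)) for t in thetas))


def expected_log_lik(beta, thetas, heads, n):
    """E_beta[log L] under the power posterior p_beta ∝ L^beta * prior."""
    logls = [log_likelihood(t, heads, n) for t in thetas]
    shift = max(beta * l for l in logls)  # subtract the max for numerical stability
    weights = [math.exp(beta * l - shift) for l in logls]  # uniform prior cancels
    total = sum(weights)
    return sum(w * l for w, l in zip(weights, logls)) / total


def thermodynamic_integration(thetas, heads, n, steps=1000):
    """Trapezoidal estimate of log Z = integral over beta in [0, 1] of E_beta[log L]."""
    u = [expected_log_lik(i / steps, thetas, heads, n) for i in range(steps + 1)]
    return (sum(u) - 0.5 * (u[0] + u[-1])) / steps


def log_bayes_factor(log_z_a, log_z_b):
    """Log Bayes factor of model A over model B from their log marginal likelihoods."""
    return log_z_a - log_z_b


if __name__ == "__main__":
    ti = thermodynamic_integration(THETAS, heads=7, n=10)
    exact = exact_log_marginal(THETAS, heads=7, n=10)
    # Compare against a fixed fair-coin model, whose "marginal" is just its likelihood.
    fair = log_likelihood(0.5, 7, 10)
    print(ti, exact, log_bayes_factor(ti, fair))
```

On this toy problem the two estimates of the log marginal likelihood should agree closely, which is only possible to verify because the parameter space is small enough to enumerate.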
Structured statistical models of inductive reasoning
Cited by 13 (2 self)
Everyday inductive inferences are often guided by rich background knowledge. Formal models of induction should aim to incorporate this knowledge, and should explain how different kinds of knowledge lead to the distinctive patterns of reasoning found in different inductive contexts. We present a Bayesian framework that attempts to meet both goals and describe four applications of the framework: a taxonomic model, a spatial model, a threshold model, and a causal model. Each model makes probabilistic inferences about the extensions of novel properties, but the priors for the four models are defined over different kinds of structures that capture different relationships between the categories in a domain. Our framework therefore shows how statistical inference can operate over structured background knowledge, and we argue that this interaction between structure and statistics is critical for explaining the power and flexibility of human reasoning.
An Investigation of Phylogenetic Likelihood Methods
, 2003
Cited by 12 (0 self)
We analyze the performance of likelihood-based approaches used to reconstruct phylogenetic trees. Unlike other techniques such as Neighbor-Joining (NJ) and Maximum Parsimony (MP), relatively little is known regarding the behavior of algorithms founded on the principle of likelihood.
Bayesian models of cognition
Cited by 11 (0 self)
For over 200 years, philosophers and mathematicians have been using probability theory to describe human cognition. While the theory of probabilities was first developed as a means of analyzing games of chance, it quickly took on a larger and deeper significance as a formal account of how rational agents should reason in situations of uncertainty.
Parallel Metropolis-Coupled Markov Chain Monte Carlo for Bayesian Phylogenetic Inference
, 2003
Cited by 11 (1 self)
Motivation: Bayesian estimation of phylogeny is based on the posterior probability distribution of trees. Currently, the only numerical method that can effectively approximate posterior probabilities of trees is Markov Chain Monte Carlo (MCMC). Standard implementations of MCMC can be prone to entrapment in local optima. A variant of MCMC, known as Metropolis-Coupled MCMC, allows multiple peaks in the landscape of trees to be more readily explored, but at the cost of increased execution time. Results: This paper presents a parallel algorithm for Metropolis-Coupled MCMC. The proposed parallel algorithm retains the ability to explore multiple peaks in the posterior distribution of trees while maintaining a fast execution time. The algorithm has been implemented using two popular parallel programming models: message passing and shared memory. Performance results indicate nearly linear speed improvement in both programming models for small and large data sets. Availability: MrBayes v3.0 is available at http://morphbank.ebc.uu.se/mrbayes3/.
Very fast algorithms for evaluating the stability of ML and Bayesian phylogenetic trees from sequence data
- Genome Informatics
, 2002
Cited by 11 (0 self)
Evolutionary trees sit at the core of all realistic models describing a set of related sequences, including alignment, homology search, ancestral protein reconstruction and 2D/3D structural change. It is important to assess the stochastic error when estimating a tree, including models using the most realistic likelihood-based optimizations, yet computation times may be many days or weeks. If so, the bootstrap is computationally prohibitive. Here we show that the extremely fast “resampling of estimated log likelihoods” or RELL method behaves well under more general circumstances than previously examined. RELL approximates the bootstrap (BP) proportions of trees better than some bootstrap methods that rely on fast heuristics to search the tree space. The BIC approximation of the Bayesian posterior probability (BPP) of trees is made more accurate by including an additional term related to the determinant of the information matrix (which may also be obtained as a product of gradient or score vectors). Such estimates are shown to be very close to MCMC chain values. Our analysis of mammalian mitochondrial amino acid sequences suggests that when model breakdown occurs, as it typically does for sequences separated by more than a few million years, the BPP values are far too peaked and the real fluctuations in the likelihood of the data
Phylogenetic models of rate heterogeneity: A high performance computing perspective
- In Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS
, 2006
Cited by 9 (3 self)
Inference of phylogenetic trees using the maximum likelihood (ML) method is NP-hard. Furthermore, the computation of the likelihood function for huge trees of more than 1,000 organisms is computationally intensive due to a large amount of floating point operations and high memory consumption. Within this context, the present paper compares two competing mathematical models that account for evolutionary rate heterogeneity: the Γ and CAT models. The intention of this paper is to show that, from a purely empirical point of view, CAT can be used instead of Γ. The main advantage of CAT over Γ consists in significantly lower memory consumption and faster inference times. An experimental study using RAxML has been performed on 19 real-world datasets comprising from 73 up to 1,663 DNA sequences. Results show that CAT is on average 5.5 times faster than Γ and, surprisingly enough, also yields trees with slightly superior Γ likelihood values. The usage of the CAT model decreases the average number of L2 and L3 cache misses by a factor of 8.55.
topoSNP: a topographic database of non-synonymous single nucleotide polymorphisms with and without known disease association
, 2004
Cited by 9 (1 self)
The database of topographic mapping of Single Nucleotide Polymorphism (topoSNP) provides an online resource for analyzing non-synonymous SNPs (nsSNPs) that can be mapped onto known 3D structures of proteins. These include disease-associated nsSNPs derived from the Online Mendelian Inheritance in Man (OMIM) database and other nsSNPs derived from dbSNP, a resource at the National Center for Biotechnology Information that catalogs SNPs. TopoSNP further classifies each nsSNP site into three categories based on its geometric location: those located in a surface pocket or an interior void of the protein, those on a convex region or a shallow depressed region, and those that are completely buried in the interior of the protein structure. These unique geometric descriptions provide more detailed mapping of nsSNPs to protein structures. The current release also includes relative entropy of SNPs calculated from multiple sequence alignment as obtained from the Pfam database (a database of protein families and conserved protein motifs) as well as manually adjusted multiple alignments obtained from ClustalW. These structural and conservational data can be useful for studying whether nsSNPs in coding regions are likely to lead to phenotypic changes. TopoSNP includes an interactive structural visualization web interface, as well as downloadable batch data. The database will be updated at regular intervals and can be accessed at: http://gila.bioengr.uic.edu/snp/toposnp.
Settings in social networks: A measurement model
- Sociological Methodology
, 2003
Cited by 8 (1 self)
A class of statistical models is proposed which aims to recover latent settings structures in social networks. Settings may be regarded as clusters of vertices. The measurement model builds on two assumptions. The observed network is assumed to be generated by hierarchically nested latent transitive structures, expressed by ultrametrics. It is assumed that expected tie strength decreases with ultrametric distance. The approach could be described as model-based clustering with an ultrametric space as the underlying metric to capture the dependence in the observations. Maximum likelihood methods as well as Bayesian methods are applied for statistical inference. Both approaches are implemented using Markov chain Monte Carlo methods.

