Results 1  10
of
24
Steps Toward Accurate Reconstructions of Phylogenies from GeneOrder Data
 J. COMPUT. SYST. SCI
, 2002
"... ..."
(Show Context)
Industrial Applications of HighPerformance Computing for Phylogeny Reconstruction
, 2001
"... Phylogenies (that is, treeoflife relationships) derived from gene order data may prove crucial in answering some fundamental open questions in biomolecular evolution. Realworld interest is strong in determining these relationships. For example, pharmaceutical companies may use phylogeny reconstru ..."
Abstract

Cited by 31 (4 self)
 Add to MetaCart
Phylogenies (that is, treeoflife relationships) derived from gene order data may prove crucial in answering some fundamental open questions in biomolecular evolution. Realworld interest is strong in determining these relationships. For example, pharmaceutical companies may use phylogeny reconstruction in drug discovery for finding plants with similar gene production. Health organizations study the evolution and spread of viruses such as HIV to gain understanding of future outbreaks. And governments are interested in aiding the production of foodstuffs like rice, wheat, and corn, by understanding the genetic code. Yet very few techniques are available for such phylogenetic reconstructions. Appropriate tools for analyzing such data may help resolve some difficult phylogenetic reconstruction problems; indeed, this new source of data has been embraced by many biologists in their phylogenetic work. With the rapid accumulation of whole genome sequences for a wide diversity of taxa, phylogenetic reconstruction based on changes in gene order and gene content is showing promise, particularly for resolving deep (i.e., old) branches. However, reconstruction from geneorder data is even more computationally intensive than reconstruction from sequence data, particularly in groups with large numbers of genes and highly rearranged genomes. We have developed a software suite, GRAPPA, that extends the breakpoint analysis (BPAnalysis) method of Sankoff and Blanchette while running much faster: in a recent analysis of a collection of chloroplast data for species of Campanulaceae on a 512processor Linux supercluster with Myrinet, we achieved a onemillionfold speedup over BPAnalysis. GRAPPA currently can use either breakpoint or inversion distance (computed exactly) for its computati...
Adaptive Allocation of Independent Tasks to Maximize Throughput
"... www.library.drexel.edu The following item is made available as a courtesy to scholars by the author(s) and Drexel University Library and may contain materials and content, including computer code and tags, artwork, text, graphics, images, and illustrations (Material) which may be protected by copyri ..."
Abstract

Cited by 17 (0 self)
 Add to MetaCart
(Show Context)
www.library.drexel.edu The following item is made available as a courtesy to scholars by the author(s) and Drexel University Library and may contain materials and content, including computer code and tags, artwork, text, graphics, images, and illustrations (Material) which may be protected by copyright law. Unless otherwise noted, the Material is made available for non profit and educational purposes, such as research, teaching and private study. For these limited purposes, you may reproduce (print, download or make copies) the Material without prior permission. All copies must include any copyright notice originally included with the Material. You must seek permission from the authors or copyright owners for all uses that are not allowed by fair use and other provisions of the U.S. Copyright Law. The responsibility for making an independent legal assessment and securing any necessary permission rests with persons desiring to reproduce or use the Material.
Parallel algorithms for Bayesian phylogenetic inference
 Journal of Parallel Distributed Computing
, 2003
"... The combination of a Markov chain Monte Carlo (MCMC) method with likelihoodbased assessment of phylogenies is becoming a popular alternative to direct likelihood optimization. However, MCMC, like maximum likelihood, is a computationallyexpensive method. To approximate the posterior distribution of ..."
Abstract

Cited by 11 (3 self)
 Add to MetaCart
(Show Context)
The combination of a Markov chain Monte Carlo (MCMC) method with likelihoodbased assessment of phylogenies is becoming a popular alternative to direct likelihood optimization. However, MCMC, like maximum likelihood, is a computationallyexpensive method. To approximate the posterior distribution of phylogenies, a Markov chain is constructed, using the Metropolis algorithm, such that the chain has the posterior distribution of the parameters of phylogenies as its stationary distribution. This paper describes parallel algorithms and their MPIbased parallel implementation for MCMCbased Bayesian phylogenetic inference. Bayesian phylogenetic inference is computationally expensive both in time and in memory requirements. Our variations on MCMC and their implementation were done to permit the studyof large phylogenetic problems. In our approach, we can distribute either entire chains or parts of a chain to different processors, since in current models the columns of the data are independent. Evaluations on a 32node Beowulf cluster suggest the problem scales well. A number of important points are identified, including a superlinear speedup due to more effective cache usage and the point at which additional processors slow down the process due to communication overhead.
Computational grand challenges in assembling the tree of life: Problems & solutions
 THE IEEE AND ACM SUPERCOMPUTING CONFERENCE 2005 (SC2005) TUTORIAL
, 2005
"... The computation of ever larger as well as more accurate phylogenetic (evolutionary) trees with the ultimate goal to compute the tree of life represents one of the grand challenges in High Performance Computing (HPC) Bioinformatics. Unfortunately, the size of trees which can be computed in reasonable ..."
Abstract

Cited by 9 (0 self)
 Add to MetaCart
(Show Context)
The computation of ever larger as well as more accurate phylogenetic (evolutionary) trees with the ultimate goal to compute the tree of life represents one of the grand challenges in High Performance Computing (HPC) Bioinformatics. Unfortunately, the size of trees which can be computed in reasonable time based on elaborate evolutionary models is limited by the severe computational cost inherent to these methods. There exist two orthogonal research directions to overcome this challenging computational burden: First, the development of novel, faster, and more accurate heuristic algorithms and second, the application of high performance computing techniques. The goal of this chapter is to provide a comprehensive introduction to the field of computational evolutionary biology to an audience with computing background, interested in participating in research and/or commercial applications of this field. Moreover, we will cover leadingedge technical and algorithmic developments in the field and discuss open problems and potential solutions.
PBPI: a High Performance Implementation of Bayesian Phylogenetic Inference
 In Proc. of Supercomputing’2006
, 2006
"... This paper describes the implementation and performance of PBPI, a parallel implementation of Bayesian phylogenetic inference method for DNA sequence data. By combining the Markov Chain Monte Carlo (MCMC) method with likelihoodbased assessment of phylogenies, Bayesian phylogenetic inferences can in ..."
Abstract

Cited by 8 (3 self)
 Add to MetaCart
(Show Context)
This paper describes the implementation and performance of PBPI, a parallel implementation of Bayesian phylogenetic inference method for DNA sequence data. By combining the Markov Chain Monte Carlo (MCMC) method with likelihoodbased assessment of phylogenies, Bayesian phylogenetic inferences can incorporate complex statistic models into the process of phylogenetic tree estimation. However, Bayesian analyses are extremely computationally expensive. PBPI uses algorithmic improvements and parallel processing to achieve significant performance improvement over comparable Bayesian phylogenetic inference programs. We evaluated the performance and accuracy of PBPI using a simulated dataset on System X, a terascale supercomputer at Virginia Tech. Our results show that PBPI identifies equivalent tree estimates 1424 times faster on 256 processors than a widelyused, bestavailable (albeit sequential), Bayesian phylogenetic inference program. PBPI also achieves linear speedup with the number of processors for large problem sizes. Most importantly, the PBPI framework enables Bayesian phylogenetic analysis of large datasets previously impracticable. 1.
Reconstructing optimal phylogenetic trees: a challenge in experimental algorithmics
 Experimental Algorithmics, volume 2547 of Lecture Notes in Computer Science
, 2002
"... ..."
(Show Context)
Building the Tree of Life on Terascale Systems
 In Proc. of the 21st International Parallel and Distributed Processing Symposium
, 2007
"... Bayesian phylogenetic inference is an important alternative to maximum likelihoodbased phylogenetic method. However, inferring large trees using the Bayesian approach is computationally demanding—requiring huge amounts of memory and months of computational time. With a combination of novel parallel ..."
Abstract

Cited by 5 (1 self)
 Add to MetaCart
(Show Context)
Bayesian phylogenetic inference is an important alternative to maximum likelihoodbased phylogenetic method. However, inferring large trees using the Bayesian approach is computationally demanding—requiring huge amounts of memory and months of computational time. With a combination of novel parallel algorithms and latest system technology, terascale phylogenetic tools will provide biologists the computational power necessary to conduct experiments on very large dataset, and thus aid construction of the tree of life. In this work we evaluate the performance of PBPI, a parallel application that reconstructs phylogenetic trees using MCMCbased Bayesian methods, on two terascale systems, Blue Gene/L at IBM Rochester and System X at Virginia Tech. Our results confirm that for a benchmark dataset with 218 taxa and 10000 characters, PBPI can achieve linear speedup on 1024 or more processors for both systems. 1.