A Linear-Time Algorithm for Computing Inversion Distance between Signed Permutations with an Experimental Study
 Journal of Computational Biology
, 2001
Cited by 148 (15 self)
Hannenhalli and Pevzner gave the first polynomial-time algorithm for computing the inversion distance between two signed permutations, as part of the larger task of determining the shortest sequence of inversions needed to transform one permutation into the other. Their algorithm (restricted to distance calculation) proceeds in two stages: in the first stage, the overlap graph induced by the permutation is decomposed into connected components; then, in the second stage, certain graph structures (hurdles and others) are identified. Berman and Hannenhalli avoided the explicit computation of the overlap graph and gave an O(n α(n)) algorithm, based on a Union-Find structure, to find its connected components, where α(n) is the inverse Ackermann function. Since for all practical purposes α(n) is a constant no larger than four, this algorithm has been the fastest practical algorithm to date. In this paper, we present a new linear-time algorithm for computing the connected components, which is more efficient than that of Berman and Hannenhalli in both theory and practice. Our algorithm uses only a stack and is very easy to implement. We give the results of computational experiments over a large range of permutation pairs produced through simulated evolution; our experiments show a speedup by a factor of 2 to 5 in the computation of the connected components and by a factor of 1.3 to 2 in the overall distance computation.
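To make the Union-Find idea in this abstract concrete, here is a minimal Python sketch. It is not the Berman-Hannenhalli algorithm and not the stack-based linear-time algorithm of the paper: it assumes the intervals of the overlap graph are already given as (start, end) pairs, and it tests every pair naively in quadratic time. The names `UnionFind` and `overlap_components` are hypothetical, chosen only for illustration. Two intervals are treated as adjacent when they intersect and neither contains the other.

```python
class UnionFind:
    """Disjoint-set forest with path compression and union by rank."""

    def __init__(self, size):
        self.parent = list(range(size))
        self.rank = [0] * size

    def find(self, x):
        # First pass: locate the root of x's tree.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every node on the path directly at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        # Attach the shallower tree under the deeper one.
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1


def overlap_components(intervals):
    """Group interval indices into connected components of the overlap graph.

    Assumption: naive O(m^2) pairwise test, used here only to show how
    Union-Find collects components; the published algorithms avoid it.
    """
    uf = UnionFind(len(intervals))
    for i, (s1, e1) in enumerate(intervals):
        for j in range(i + 1, len(intervals)):
            s2, e2 = intervals[j]
            # Overlap means intersecting without containment.
            if s1 < s2 < e1 < e2 or s2 < s1 < e2 < e1:
                uf.union(i, j)
    groups = {}
    for i in range(len(intervals)):
        groups.setdefault(uf.find(i), []).append(i)
    return sorted(groups.values())


example = [(0, 3), (2, 5), (4, 7), (8, 9), (1, 10)]
components = overlap_components(example)
```

In this example the interval (8, 9) is nested inside (1, 10) and overlaps nothing, so it forms a component of its own, while the other four intervals are chained together by overlaps.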
Steps Toward Accurate Reconstructions of Phylogenies from Gene-Order Data
 J. COMPUT. SYST. SCI
, 2002
Reconstructing Phylogenies from Gene-Content and Gene-Order Data
 MATHEMATICS OF EVOLUTION AND PHYLOGENY, OLIVIER GASCUEL (ED.)
Scaling up accurate phylogenetic reconstruction from gene-order data
, 2002
Cited by 36 (14 self)
Motivation: Phylogenetic reconstruction from gene-order data has attracted increasing attention from both biologists and computer scientists over the last few years. Methods used in reconstruction include distance-based methods (such as neighbor-joining), parsimony methods using sequence-based encodings, Bayesian approaches, and direct optimization. The latter, pioneered by Sankoff and extended by us with the software suite GRAPPA, is the most accurate approach, but cannot handle more than about 15 genomes of limited size (e.g., organelles). Results: We report here on our successful efforts to scale up direct optimization through a two-step approach: the first step decomposes the dataset into smaller pieces and runs the direct optimization (GRAPPA) on the smaller pieces, while the second step builds a tree from the results obtained on the smaller pieces. We used the sophisticated disk-covering method (DCM) pioneered by Warnow and her group, suitably modified to take into account the computational limitations of GRAPPA. We find that DCM-GRAPPA scales gracefully to at least 1,000 genomes of a few hundred genes each and retains surprisingly high accuracy throughout the range: in our experiments, the topological error rate rarely exceeded a few percent. Thus, reconstruction based on gene-order data can now be accomplished with high accuracy on datasets of significant size. Availability: All of our software is available in source form under GPL at www.compbio.unm.edu Contact:
Inversion medians outperform breakpoint medians in phylogeny reconstruction from gene-order data
, 2002
Industrial Applications of High-Performance Computing for Phylogeny Reconstruction
, 2001
Cited by 31 (4 self)
Phylogenies (that is, tree-of-life relationships) derived from gene order data may prove crucial in answering some fundamental open questions in biomolecular evolution. Real-world interest is strong in determining these relationships. For example, pharmaceutical companies may use phylogeny reconstruction in drug discovery for finding plants with similar gene production. Health organizations study the evolution and spread of viruses such as HIV to gain understanding of future outbreaks. And governments are interested in aiding the production of foodstuffs like rice, wheat, and corn, by understanding the genetic code. Yet very few techniques are available for such phylogenetic reconstructions. Appropriate tools for analyzing such data may help resolve some difficult phylogenetic reconstruction problems; indeed, this new source of data has been embraced by many biologists in their phylogenetic work. With the rapid accumulation of whole genome sequences for a wide diversity of taxa, phylogenetic reconstruction based on changes in gene order and gene content is showing promise, particularly for resolving deep (i.e., old) branches. However, reconstruction from gene-order data is even more computationally intensive than reconstruction from sequence data, particularly in groups with large numbers of genes and highly rearranged genomes. We have developed a software suite, GRAPPA, that extends the breakpoint analysis (BPAnalysis) method of Sankoff and Blanchette while running much faster: in a recent analysis of a collection of chloroplast data for species of Campanulaceae on a 512-processor Linux supercluster with Myrinet, we achieved a one-million-fold speedup over BPAnalysis. GRAPPA currently can use either breakpoint or inversion distance (computed exactly) for its computati...
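As a rough illustration of the breakpoint distance mentioned in this abstract, the following Python sketch counts breakpoints between two signed gene orders. It is not taken from GRAPPA or BPAnalysis. It assumes linear genomes on the same genes 1..n, framed by 0 and n+1, and treats an adjacency (a, b) as conserved when the other genome contains either a followed by b or -b followed by -a. The function name `breakpoint_distance` is hypothetical.

```python
def breakpoint_distance(p, q):
    """Count adjacencies of signed permutation p that are broken in q.

    Assumptions (not from the source): both genomes are linear, share the
    genes 1..n, and are framed by 0 on the left and n+1 on the right.
    """
    n = len(p)
    expected = list(range(1, n + 1))
    if sorted(abs(g) for g in p) != expected or sorted(abs(g) for g in q) != expected:
        raise ValueError("p and q must be signed permutations of 1..n")

    framed_p = [0] + list(p) + [n + 1]
    framed_q = [0] + list(q) + [n + 1]

    # Record each adjacency of q in both reading directions.
    conserved = set()
    for a, b in zip(framed_q, framed_q[1:]):
        conserved.add((a, b))
        conserved.add((-b, -a))

    # A breakpoint is an adjacency of p that q does not preserve.
    return sum((a, b) not in conserved for a, b in zip(framed_p, framed_p[1:]))


identity = [1, 2, 3]
one_gene_flipped = breakpoint_distance([1, -2, 3], identity)
whole_genome_flipped = breakpoint_distance([-3, -2, -1], identity)
```

Flipping a single gene breaks the two adjacencies around it. Reversing the whole genome keeps every internal adjacency and breaks only the two at the frame, which is a consequence of the linear-genome assumption made here.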
Finding an optimal inversion median: experimental results
 In Proc. 1st Workshop on Algs. in Bioinformatics WABI 2001
, 2001
Cited by 29 (10 self)
We derive a branch-and-bound algorithm to find an optimal inversion median of three signed permutations. The algorithm prunes to manageable size an extremely large search tree using simple geometric properties of the problem and a newly available linear-time routine for inversion distance. Our experiments on simulated data sets indicate that the algorithm finds optimal medians in reasonable time for genomes of medium size when distances are not too large, as commonly occurs in phylogeny reconstruction. In addition, we have compared inversion and breakpoint medians, and found that inversion medians generally score significantly better and tend to be far more unique, which should make them valuable in median-based tree-building algorithms.
DUPCAR: Reconstructing contiguous ancestral regions with duplications
 Journal of Computational Biology
Cited by 24 (0 self)
Accurately reconstructing the large-scale gene order in an ancestral genome is a critical step to better understand genome evolution. In this paper, we propose a heuristic algorithm, called DUPCAR, for reconstructing ancestral genomic orders with duplications. The method starts from the order of genes in modern genomes and predicts predecessor and successor relationships in the ancestor. Then a greedy algorithm is used to reconstruct the ancestral orders by connecting genes into contiguous regions based on predicted adjacencies. Computer simulation was used to validate the algorithm. We also applied the method to reconstruct the ancestral chromosome X of placental mammals and the ancestral genomes of the ciliate Paramecium tetraurelia. Key words: contiguous ancestral region, duplication, gene-order reconstruction, genome rearrangement, isometric reconciliation.
Reconstructing Ancestral Gene Orders Using Conserved Intervals
 Proc. Fourth Int’l Workshop Algorithms in Bioinformatics (WABI ’04)
, 2004
Cited by 15 (5 self)
Conserved intervals were recently introduced as a measure of similarity between genomes whose genes have been shuffled during evolution by genomic rearrangements. Phylogenetic reconstruction based on such similarity measures raises many biological, formal and algorithmic questions, in particular the labelling of internal nodes with putative ancestral gene orders, and the selection of a good tree topology. In this paper, we investigate the properties of sets of permutations associated to conserved intervals as a representation of putative ancestral gene orders for a given tree topology. We define set-theoretic operations on sets of conserved intervals, together with the associated algorithms, and we apply these techniques, in a manner similar to the Fitch-Hartigan algorithm for parsimony, to a subset of chloroplast genes of 13 species.