Results 1  10
of
24
A LinearTime Algorithm for Computing Inversion Distance between Signed Permutations with an Experimental Study
 Journal of Computational Biology
, 2001
"... Hannenhalli and Pevzner gave the first polynomialtime algorithm for computing the inversion distance between two signed permutations, as part of the larger task of determining the shortest sequence of inversions needed to transform one permutation into the other. Their algorithm (restricted to dist ..."
Abstract

Cited by 109 (14 self)
 Add to MetaCart
Hannenhalli and Pevzner gave the first polynomialtime algorithm for computing the inversion distance between two signed permutations, as part of the larger task of determining the shortest sequence of inversions needed to transform one permutation into the other. Their algorithm (restricted to distance calculation) proceeds in two stages: in the first stage, the overlap graph induced by the permutation is decomposed into connected components; then, in the second stage, certain graph structures (hurdles and others) are identified. Berman and Hannenhalli avoided the explicit computation of the overlap graph and gave an O(n alpha(n)) algorithm, based on a UnionFind structure, to find its connected components, where a is the inverse Ackerman function. Since for all practical purposes alpha(n) is a constant no larger than four, this algorithm has been the fastest practical algorithm to date. In this paper, we present a new lineartime algorithm for computing the connected components, which is more efficient than that of Berman and Hannenhalli in both theory and practice. Our algorithm uses only a stack and is very easy to implement. We give the results of computational experiments over a large range of permutation pairs produced through simulated evolution; our experiments show a speedup by a factor of 2 to 5 in the computation of the connected components and by a factor of 1.3 to 2 in the overall distance computation.
Performance study of phylogenetic methods: (unweighted) quartet methods and neighborjoining
, 2003
"... ..."
RecIDCM3: A fast algorithmic technique for reconstructing large phylogenetic trees
 In Proc. IEEE Computer Society Bioinformatics Conference (CSB 2004
, 2004
"... ..."
Industrial Applications of HighPerformance Computing for Phylogeny Reconstruction
, 2001
"... Phylogenies (that is, treeoflife relationships) derived from gene order data may prove crucial in answering some fundamental open questions in biomolecular evolution. Realworld interest is strong in determining these relationships. For example, pharmaceutical companies may use phylogeny reconstru ..."
Abstract

Cited by 29 (4 self)
 Add to MetaCart
Phylogenies (that is, treeoflife relationships) derived from gene order data may prove crucial in answering some fundamental open questions in biomolecular evolution. Realworld interest is strong in determining these relationships. For example, pharmaceutical companies may use phylogeny reconstruction in drug discovery for finding plants with similar gene production. Health organizations study the evolution and spread of viruses such as HIV to gain understanding of future outbreaks. And governments are interested in aiding the production of foodstuffs like rice, wheat, and corn, by understanding the genetic code. Yet very few techniques are available for such phylogenetic reconstructions. Appropriate tools for analyzing such data may help resolve some difficult phylogenetic reconstruction problems; indeed, this new source of data has been embraced by many biologists in their phylogenetic work. With the rapid accumulation of whole genome sequences for a wide diversity of taxa, phylogenetic reconstruction based on changes in gene order and gene content is showing promise, particularly for resolving deep (i.e., old) branches. However, reconstruction from geneorder data is even more computationally intensive than reconstruction from sequence data, particularly in groups with large numbers of genes and highly rearranged genomes. We have developed a software suite, GRAPPA, that extends the breakpoint analysis (BPAnalysis) method of Sankoff and Blanchette while running much faster: in a recent analysis of a collection of chloroplast data for species of Campanulaceae on a 512processor Linux supercluster with Myrinet, we achieved a onemillionfold speedup over BPAnalysis. GRAPPA currently can use either breakpoint or inversion distance (computed exactly) for its computati...
HighPerformance Algorithm Engineering for Computational Phylogenetics
 J. Supercomputing
, 2002
"... A phylogeny is the evolutionary history of a group of organisms; systematists (and other biologists) attempt to reconstruct this history from various forms of data about contemporary organisms. Phylogeny reconstruction is a crucial step in the understanding of evolution as well as an important tool ..."
Abstract

Cited by 21 (7 self)
 Add to MetaCart
A phylogeny is the evolutionary history of a group of organisms; systematists (and other biologists) attempt to reconstruct this history from various forms of data about contemporary organisms. Phylogeny reconstruction is a crucial step in the understanding of evolution as well as an important tool in biological, pharmaceutical, and medical research. Phylogeny reconstruction from molecular data is very difficult: almost all optimization models give rise to NPhard (and thus computationally intractable) problems. Yet approximations must be of very high quality in order to avoid outright biological nonsense. Thus many biologists have been willing to run farms of processors for many months in order to analyze just one dataset. Highperformance algorithm engineering offers a battery of tools that can reduce, sometimes spectacularly, the running time of existing phylogenetic algorithms, as well as help designers produce better algorithms. We present an overview of algorithm engineering techniques, illustrating them with an application to the "breakpoint analysis" method of Sankoff et al., which resulted in the GRAPPA software suite. GRAPPA demonstrated a speedup in running time by over eight orders of magnitude over the original implementation on a variety of real and simulated datasets. We show how these algorithmic engineering techniques are directly applicable to a large variety of challenging combinatorial problems in computational biology.
Using PRAM Algorithms on a UniformMemoryAccess SharedMemory Architecture
 Proc. 5th Int’l Workshop on Algorithm Engineering (WAE 2001), volume 2141 of Lecture Notes in Computer Science
, 2001
"... The ability to provide uniform sharedmemory access to a significant number of processors in a single SMP node brings us much closer to the ideal PRAM parallel computer. In this paper, we develop new techniques for designing a uniform sharedmemory algorithm from a PRAM algorithm and present the res ..."
Abstract

Cited by 20 (11 self)
 Add to MetaCart
The ability to provide uniform sharedmemory access to a significant number of processors in a single SMP node brings us much closer to the ideal PRAM parallel computer. In this paper, we develop new techniques for designing a uniform sharedmemory algorithm from a PRAM algorithm and present the results of an extensive experimental study demonstrating that the resulting programs scale nearly linearly across a significant range of processors (from 1 to 64) and across the entire range of instance sizes tested. This linear speedup with the number of processors is, to our knowledge, the first ever attained in practice for intricate combinatorial problems. The example we present in detail here is a graph decomposition algorithm that also requires the computation of a spanning tree; this problem is not only of interest in its own right, but is representative of a large class of irregular combinatorial problems that have simple and efficient sequential implementations and fast PRAM algorithms, but have no known efficient parallel implementations. Our results thus offer promise for bridging the gap between the theory and practice of sharedmemory parallel algorithms.
Performance of supertree methods on various dataset decompositions
 Phylogenetic Supertrees: Combining Information to Reveal the Tree of Life, volume 3 of Computational Biology
, 2004
"... Many largescale phylogenetic reconstruction methods attempt to solve hard optimization problems (such as Maximum Parsimony (MP) and Maximum Likelihood (ML)), but they are severely limited by the number of taxa that they can handle in a reasonable time frame. A standard heuristic approach to this pr ..."
Abstract

Cited by 17 (9 self)
 Add to MetaCart
Many largescale phylogenetic reconstruction methods attempt to solve hard optimization problems (such as Maximum Parsimony (MP) and Maximum Likelihood (ML)), but they are severely limited by the number of taxa that they can handle in a reasonable time frame. A standard heuristic approach to this problem is the divideandconquer strategy: decompose the dataset into smaller subsets, solve the subsets (i.e., use MP or ML on each subset to obtain trees), then combine the solutions to the subsets into a solution to the original dataset. This last step, combining given trees into a single tree, is known as supertree construction in computational phylogenetics. The traditional application of supertree methods is to combine existing, published phylogenies into a single phylogeny. Here, we study supertree construction in the context of divideandconquer methods for largescale tree reconstruction. We study several divideandconquer approaches and experimentally demonstrate their advantage over Matrix Representation Parsimony (MRP), a traditional supertree technique, and over global heuristics such as the parsimony ratchet. On the ten large biological datasets under investigation, our study shows that the techniques used for dividing the dataset into subproblems as well as those used for merging them into a single solution strongly influence the quality of the supertree construction. In most cases, our merging technique—the Strict Consensus Merger (SCM)—outperforms MRP with respect to MP scores and running time. Divideandconquer techniques are also a highly competitive alternative to global heuristics such as the parsimony ratchet, especially on the more challenging datasets. 1
Computing Tutte Polynomials
 THE FIRST CENSUS ON THE TERTIARY INDUSTRY IN CHINA: SUMMARY STATISTICS, CHINA STATISTICAL
, 1996
"... The Tutte polynomial of a graph, also known as the partition function of the qstate Potts model, is a 2variable polynomial graph invariant of considerable importance in both combinatorics and statistical physics. It contains several other polynomial invariants, such as the chromatic polynomial an ..."
Abstract

Cited by 9 (1 self)
 Add to MetaCart
The Tutte polynomial of a graph, also known as the partition function of the qstate Potts model, is a 2variable polynomial graph invariant of considerable importance in both combinatorics and statistical physics. It contains several other polynomial invariants, such as the chromatic polynomial and flow polynomial as partial evaluations, and various numerical invariants such as the number of spanning trees as complete evaluations. However despite its ubiquity, there are no widelyavailable effective computational tools able to compute the Tutte polynomial of a general graph of reasonable size. In this paper we describe the implementation of a program that exploits isomorphisms in the computation tree to extend the range of graphs for which it is feasible to compute their Tutte polynomials. We also consider edgeselection heuristics which give good performance in practice. We empirically demonstrate the utility of our program on random graphs. More evidence of its usefulness arises from our success in finding counterexamples to a conjecture of Welsh on the location of the real flow roots of a graph.