
A linear-time algorithm for computing inversion distance between signed permutations with an experimental study (2001)

by D A Bader, B M E Moret, M Yan
Venue: Journal of Computational Biology

Results 1 - 10 of 68

A Very Elementary Presentation of the Hannenhalli-Pevzner Theory

by Anne Bergeron, 2003
Cited by 51 (5 self)
In 1995, Hannenhalli and Pevzner gave a first polynomial solution to the problem of finding the minimum number of reversals needed to sort a signed permutation. Their solution, as well as subsequent ones, relies on many intermediary constructions, such as simulations with permutations on 2n elements, and manipulation of various graphs.
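
The operation behind this problem can be illustrated with a short sketch. A reversal takes a contiguous segment of a signed permutation, reverses its order, and flips the sign of every element in it. The function names below (`reverse_segment`, `brute_force_reversal_distance`) are illustrative assumptions and do not come from any of the listed papers. The distance routine is a plain breadth-first search that takes exponential time, so it is only usable on very small inputs; it is not the Hannenhalli-Pevzner algorithm, nor the linear-time algorithm of Bader, Moret, and Yan.

```python
from collections import deque


def reverse_segment(perm, i, j):
    """Apply a signed reversal to perm[i..j] (inclusive, 0-based).

    The segment is reversed and every element in it changes sign.
    """
    perm = tuple(perm)
    flipped = tuple(-x for x in reversed(perm[i:j + 1]))
    return perm[:i] + flipped + perm[j + 1:]


def brute_force_reversal_distance(perm):
    """Minimum number of reversals that sort perm into (1, 2, ..., n).

    Illustrative breadth-first search over all signed permutations;
    exponential time, intended only for very small n.
    """
    start = tuple(perm)
    target = tuple(range(1, len(start) + 1))
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        current, dist = queue.popleft()
        if current == target:
            return dist
        n = len(current)
        for i in range(n):
            for j in range(i, n):
                nxt = reverse_segment(current, i, j)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    raise ValueError("input is not a signed permutation of 1..n")
```

For example, `reverse_segment((1, -3, -2, 4), 1, 2)` yields the identity `(1, 2, 3, 4)`, so that permutation is at distance 1.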

Steps Toward Accurate Reconstructions of Phylogenies from Gene-Order Data

by Bernard M.E. Moret, Jijun Tang, Li-San Wang, Tandy Warnow - J. COMPUT. SYST. SCI, 2002
Cited by 39 (16 self)
Abstract not found

BioPerf: A benchmark suite to evaluate high-performance computer architecture on bioinformatics applications

by David A. Bader, Yue Li, Tao Li, Vipin Sachdeva - In Proceedings of the IEEE International Symposium on Workload Characterization (IISWC
Cited by 32 (4 self)
The exponential growth in the amount of genomic data has spurred growing interest in large-scale analysis of genetic information. Bioinformatics applications, which explore computational methods to allow researchers to sift through the massive biological data and extract useful information, are becoming increasingly important computer workloads. This paper presents BioPerf, a benchmark suite of representative bioinformatics applications to facilitate the design and evaluation of high-performance computer architectures for these emerging workloads. Currently, the BioPerf suite contains codes from 10 highly popular bioinformatics packages and covers the major fields of study in computational biology such as sequence comparison, phylogenetic reconstruction, protein structure prediction, and sequence homology & gene finding. We demonstrate the use of BioPerf by providing simulation points of pre-compiled Alpha binaries and with a performance study on IBM Power using IBM Mambo simulations cross-compared with Apple G5 executions. The BioPerf suite (available from www.bioperf.org) includes benchmark source code, input datasets of various sizes, and information for compiling and using the benchmarks. Our benchmark suite includes parallel codes where available.

Assignment of orthologous genes via genome rearrangement

by Xin Chen, Jie Zheng, Zheng Fu, Peng Nan, Yang Zhong, Stefano Lonardi, Tao Jiang - IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2005
Cited by 28 (3 self)
The assignment of orthologous genes between a pair of genomes is a fundamental and challenging problem in comparative genomics. Existing methods that assign orthologs based on the similarity between DNA or protein sequences may make erroneous assignments when sequence similarity does not clearly delineate the evolutionary relationship among genes of the same families. In this paper, we present a new approach to ortholog assignment that takes into account both sequence similarity and evolutionary events at a genome level, where orthologous genes are assumed to correspond to each other in the most parsimonious evolving scenario under genome rearrangement. First, the problem is formulated as that of computing the signed reversal distance with duplicates between the two genomes of interest. Then, the problem is decomposed into two new optimization problems, called minimum common partition and maximum cycle decomposition, for which efficient heuristic algorithms are given. Following this approach, we have implemented a high-throughput system for assigning orthologs on a genome scale, called SOAR, and tested it on both simulated data and real genome sequence data. Compared to a recent ortholog assignment method based entirely on homology search (called INPARANOID), SOAR shows a marginally better performance in terms of sensitivity on the real data set because it is able to identify several correct orthologous pairs that are missed by INPARANOID. The simulation results demonstrate that SOAR, in general, performs better than the iterated exemplar algorithm in terms of computing the reversal distance and assigning correct orthologs. Index Terms: Ortholog, paralog, gene duplication, genome rearrangement, reversal, comparative genomics.

On the Similarity of Sets of Permutations and its Applications to Genome Comparison

by Anne Bergeron, Jens Stoye, 2003
Cited by 26 (6 self)
The comparison of genomes with the same gene content relies on our ability to compare permutations, either by measuring how much they differ, or by measuring how much they are alike. With the notable exception of the breakpoint distance, which is based on the concept of conserved adjacencies, measures of distance do not generalize easily to sets of more than two permutations. In this paper, we present a basic unifying notion, conserved intervals, as a powerful generalization of adjacencies, and as a key feature of genome rearrangement theories. We also show that sets of conserved intervals have elegant nesting and chaining properties that allow the development of compact graphic representations, and linear time algorithms to manipulate them.
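
The breakpoint distance mentioned in this abstract can be sketched in a few lines. The sketch assumes linear signed permutations over the same genes 1..n, framed by 0 and n+1, and treats an adjacency (a, b) as conserved when the other permutation contains either (a, b) or its reversed, sign-flipped form (-b, -a). The function names are hypothetical, and the code covers only plain adjacencies, not the conserved intervals the paper introduces.

```python
def framed_adjacencies(perm):
    """Return consecutive pairs of perm after framing it with 0 and n+1."""
    n = len(perm)
    framed = [0] + list(perm) + [n + 1]
    return list(zip(framed, framed[1:]))


def breakpoint_distance(perm_a, perm_b):
    """Count adjacencies of perm_a that are not conserved in perm_b.

    Assumes both inputs are signed permutations of the same genes 1..n.
    """
    conserved = set()
    for a, b in framed_adjacencies(perm_b):
        conserved.add((a, b))
        # The same adjacency read on the opposite strand.
        conserved.add((-b, -a))
    return sum(1 for adj in framed_adjacencies(perm_a) if adj not in conserved)
```

With `perm_a = (1, -3, -2, 4)` and `perm_b = (1, 2, 3, 4)`, the adjacencies (1, -3) and (-2, 4) are not conserved, so the distance is 2.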

Finding an optimal inversion median: experimental results

by Adam C. Siepel, Bernard M. E. Moret - In Proc. 1st Workshop on Algs. in Bioinformatics WABI 2001, 2001
Cited by 23 (10 self)
We derive a branch-and-bound algorithm to find an optimal inversion median of three signed permutations. The algorithm prunes to manageable size an extremely large search tree using simple geometric properties of the problem and a newly available linear-time routine for inversion distance. Our experiments on simulated data sets indicate that the algorithm finds optimal medians in reasonable time for genomes of medium size when distances are not too large, as commonly occurs in phylogeny reconstruction. In addition, we have compared inversion and breakpoint medians, and found that inversion medians generally score significantly better and tend to be far more unique, which should make them valuable in median-based tree-building algorithms.

A 1.375-Approximation Algorithm for Sorting by Transpositions

by Isaac Elias, Tzvika Hartman - Proceedings of 5th Workshop on Algorithms in Bioinformatics (WABI’05), LNBI 3692, 2005
Cited by 22 (0 self)
Sorting permutations by transpositions is an important problem in genome rearrangements. A transposition is a rearrangement operation in which a segment is cut out of the permutation and pasted in a different location. The complexity of this problem is still open and it has been a ten-year-old open problem to improve the best known 1.5-approximation algorithm. In this paper we provide a 1.375-approximation algorithm for sorting by transpositions. The algorithm is based on a new upper bound on the diameter of 3-permutations. In addition, we present some new results regarding the transposition diameter: We improve the lower bound for the transposition diameter of the symmetric group, and determine the exact transposition diameter of 2-permutations and simple permutations.
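
The transposition operation described in this abstract can be shown with a minimal sketch. Cutting a segment out and pasting it elsewhere is equivalent to exchanging two adjacent blocks, which is how the hypothetical `transpose` helper below is written. The name and the index convention (0-based, half-open blocks) are assumptions made for illustration; the sketch shows only the operation itself, not the 1.375-approximation algorithm.

```python
def transpose(perm, i, j, k):
    """Exchange the adjacent blocks perm[i:j] and perm[j:k].

    Equivalent to cutting perm[i:j] out and pasting it back after the
    element that was at position k - 1. Requires 0 <= i < j < k <= len(perm).
    """
    if not 0 <= i < j < k <= len(perm):
        raise ValueError("indices must satisfy 0 <= i < j < k <= len(perm)")
    perm = list(perm)
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]
```

For instance, `transpose([1, 4, 5, 2, 3, 6], 1, 3, 5)` moves the block `4 5` past `2 3` and returns the sorted permutation `[1, 2, 3, 4, 5, 6]`.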

Perfect sorting by reversals is not always difficult

by Sèverine Bérard, Anne Bergeron, Cedric Chauve, Christophe Paul - IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2007
Cited by 19 (10 self)
We propose new algorithms for computing pairwise rearrangement scenarios that conserve the combinatorial structure of genomes. More precisely, we investigate the problem of sorting signed permutations by reversals without breaking common intervals. We describe a combinatorial framework for this problem that allows us to characterize classes of signed permutations for which one can compute, in polynomial time, a shortest reversal scenario that conserves all common intervals. In particular, we define a class of permutations for which this computation can be done in linear time with a very simple algorithm that does not rely on the classical Hannenhalli-Pevzner theory for sorting by reversals. We apply these methods to the computation of rearrangement scenarios between permutations obtained from 16 synteny blocks of the X chromosomes of the human, mouse, and rat.

Sequence Distance Embeddings

by Graham Cormode, 2003
Cited by 17 (1 self)
Abstract not found

Phylogenetic reconstruction from gene rearrangement data with unequal gene contents

by Jijun Tang, Bernard M. E. Moret - in Algorithms and Data Structures, 8th International Workshop, WADS 2003, 2003
Cited by 15 (2 self)
Phylogenetic reconstruction from gene-rearrangement data has seen increased attention over the last five years. Existing methods are limited computationally and by the assumption (highly unrealistic in practice) that all genomes have the same gene content. We have recently shown that we can scale our reconstruction tool, GRAPPA, to instances with up to a thousand genomes with no loss of accuracy and at minimal computational cost. Computing genomic distances between two genomes with unequal gene contents has seen much progress recently, but that progress has not yet been reflected in phylogenetic reconstruction methods. In this paper, we present extensions to our GRAPPA approach that can handle limited numbers of duplications (one of the main requirements for analyzing genomic data from organelles) and a few deletions. Although GRAPPA is based on exhaustive search, we show that, in practice, our bounding functions suffice to prune away almost all of the search space (our pruning rates never fall below 99.995%), resulting in high accuracy and fast running times. The range of values within which we have tested our approach encompasses mitochondria and chloroplast organellar genomes, whose phylogenetic analysis is providing new insights on evolution. Keywords: computational biology, phylogenetic reconstruction, gene-order data, whole-genome data, signed