A framework for orthology assignment from gene rearrangement data
- Lecture Notes in Bioinformatics
, 2005
Cited by 7 (1 self)
Abstract. Gene rearrangements have successfully been used in phylogenetic reconstruction and comparative genomics, but usually under the assumption that all genomes have the same gene content and that no gene is duplicated. While these assumptions allow one to work with organellar genomes, they are too restrictive when comparing nuclear genomes. The main challenge is how to deal with gene families, specifically, how to identify orthologs. While searching for orthologies is a common task in computational biology, it is usually done using sequence data. We approach that problem using gene rearrangement data, provide an optimization framework in which to phrase the problem, and present some preliminary theoretical results.
Quartet methods for phylogeny reconstruction from gene orders
- Dept. CS and Engin., Univ. South-Carolina
, 2005
Cited by 4 (1 self)
Abstract. Phylogenetic reconstruction from gene-rearrangement data has attracted increasing attention from biologists and computer scientists. Methods used in reconstruction include distance-based methods, parsimony methods using sequence-based encodings, and direct optimization. The latter, pioneered by Sankoff and extended by us with the software suite GRAPPA, is the most accurate approach; however, its exhaustive approach means that it can be applied only to small datasets of fewer than 15 taxa. While we have successfully scaled it up to 1,000 genomes by integrating it with a disk-covering method (DCM-GRAPPA), the recursive decomposition may need many levels of recursion to handle datasets with 1,000 or more genomes. We thus investigated quartet-based approaches, which directly decompose the datasets into subsets of four taxa each; such approaches have been well studied for sequence data, but not for gene-rearrangement data. We give an optimization algorithm for the NP-hard problem of computing optimal trees for each quartet, present a variation of the dyadic method (using heuristics to choose suitable short quartets), and use both in simulation studies. We find that our quartet-based method can handle more genomes than the base version of GRAPPA, thus enabling us to reduce the number of levels of recursion in DCM-GRAPPA, but is more sensitive to the rate of evolution, with error rates rapidly increasing when saturation is approached.
Lower Bounds for Maximum Parsimony with Gene Order Data
- RECOMB Comparative Genomics
, 2005
Cited by 3 (0 self)
Abstract. In this paper, we study lower bound techniques for branch-and-bound algorithms for maximum parsimony, with a focus on gene order data. We give a simple O(n^3) time dynamic programming algorithm for computing the maximum circular ordering lower bound, where n is the number of leaves. The well-known gene order phylogeny program, GRAPPA, currently implements two heuristic approximations to this lower bound. Our experiments show a significant improvement over both these methods in practice. Next, we show that the linear programming-based lower bound of Tang and Moret (Tang and Moret, 2005) can be greatly simplified, allowing us to solve the LP in O*(n^3) time in the worst case, and in O*(n^2.5) time amortized over all binary trees. Finally, we formalize the problem of computing the circular ordering lower bound, when the tree topologies are generated bottom-up, as a Path-Constrained Traveling Salesman Problem, and give a polynomial-time 3-approximation algorithm for it. This is a special case of the more general Precedence-Constrained Traveling Salesman Problem and has not previously been studied, to the best of our knowledge.
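As a rough illustration of the circular ordering lower bound mentioned in this abstract: for a metric distance and a leaf ordering that matches a planar drawing of the tree, a tour around the leaves crosses every tree edge twice, so half the sum of consecutive leaf distances cannot exceed the tree length. The sketch below only evaluates that bound for one given ordering; it is not the paper's O(n^3) dynamic program, and the function name `circular_lower_bound` and the toy distance matrix are hypothetical.

```python
from typing import Sequence


def circular_lower_bound(dist: Sequence[Sequence[float]],
                         order: Sequence[int]) -> float:
    """Half the sum of distances between consecutive leaves on a circle.

    Assumes `dist` is a symmetric metric on the leaves and `order` is a
    circular ordering consistent with the tree being bounded.
    """
    k = len(order)
    total = sum(dist[order[i]][order[(i + 1) % k]] for i in range(k))
    return total / 2


# Toy pairwise distances between four genomes (made-up values).
toy_dist = [
    [0, 3, 4, 5],
    [3, 0, 5, 4],
    [4, 5, 0, 3],
    [5, 4, 3, 0],
]

bound_a = circular_lower_bound(toy_dist, [0, 1, 2, 3])
bound_b = circular_lower_bound(toy_dist, [0, 2, 1, 3])
```

Different orderings give different bounds, which is why a branch-and-bound search benefits from finding the ordering that maximizes the value for each candidate tree.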
An Experimental Evaluation of Inversion- and Transposition-Based Genomic Distances through Simulations
- Intelligence in Bioinformatics and Computational Biology (CIBCB 2007)
Abstract — Rearrangements of genes and other syntenic blocks have become a topic of intensive study by phylogenists, comparative genomicists, and computational biologists: they are a feature of many cancers, must be taken into account to align highly divergent sequences, and constitute a phylogenetic marker of great interest. The mathematics of rearrangements is far more complex than for indels and mutations in sequences. Inversions have been well characterized through 20 years of work, but transpositions still await comparable results. We can compute inversion and DCJ (a combination of inversions and block exchanges) distances, and bounds on the transposition distance. The first has been extensively used in comparative genomics and phylogenetics, the second is quite new, and the third has not seen significant use to date. We present here a detailed experimental study of these three distance measures within the context of genome comparison (pairwise distances) and phylogenetic reconstruction. We used data generated through simulated evolution along various trees, using various evolutionary rates and various mixes of inversions and transpositions. Our main finding is that inversion and DCJ measures return very similar results even on data generated using only transpositions, while the measure based on Hartman’s bound is often too loose to provide comparable accuracy in genomic comparisons or phylogenetic reconstruction.
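To make the simulated evolution described in this abstract concrete, here is a minimal sketch of applying a random mix of inversions and transpositions to a signed gene order. It assumes a linear genome stored as a list of signed integers and uniformly random breakpoints; the abstract does not specify the authors' simulator, and the names `apply_inversion`, `apply_transposition`, and `evolve` are hypothetical.

```python
import random
from typing import List


def apply_inversion(genome: List[int], i: int, j: int) -> List[int]:
    """Reverse the segment genome[i:j] and flip the sign of each gene in it."""
    return genome[:i] + [-g for g in reversed(genome[i:j])] + genome[j:]


def apply_transposition(genome: List[int], i: int, j: int, k: int) -> List[int]:
    """Exchange the adjacent blocks genome[i:j] and genome[j:k] (i < j < k)."""
    return genome[:i] + genome[j:k] + genome[i:j] + genome[k:]


def evolve(genome: List[int], steps: int, p_inversion: float,
           rng: random.Random) -> List[int]:
    """Apply `steps` random events; each is an inversion with probability
    `p_inversion`, otherwise a transposition. Needs at least two genes."""
    g = list(genome)
    n = len(g)
    for _ in range(steps):
        if rng.random() < p_inversion:
            i, j = sorted(rng.sample(range(n + 1), 2))
            g = apply_inversion(g, i, j)
        else:
            i, j, k = sorted(rng.sample(range(n + 1), 3))
            g = apply_transposition(g, i, j, k)
    return g


ancestor = list(range(1, 11))
descendant = evolve(ancestor, steps=5, p_inversion=0.5, rng=random.Random(0))
```

Setting `p_inversion` to 0.0 gives transposition-only data, the setting in which the abstract reports that inversion and DCJ measures still behave similarly.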
A fast algorithm for the multiple genome rearrangement problem with weighted reversals and transpositions
- BMC Bioinformatics (BioMed Central), Methodology article
, 2008

