A Linear-Time Algorithm for Computing Inversion Distance between Signed Permutations with an Experimental Study
 Journal of Computational Biology
, 2001
Cited by 151 (15 self)
Hannenhalli and Pevzner gave the first polynomial-time algorithm for computing the inversion distance between two signed permutations, as part of the larger task of determining the shortest sequence of inversions needed to transform one permutation into the other. Their algorithm (restricted to distance calculation) proceeds in two stages: in the first stage, the overlap graph induced by the permutation is decomposed into connected components; then, in the second stage, certain graph structures (hurdles and others) are identified. Berman and Hannenhalli avoided the explicit computation of the overlap graph and gave an O(n α(n)) algorithm, based on a Union-Find structure, to find its connected components, where α is the inverse Ackermann function. Since for all practical purposes α(n) is a constant no larger than four, this algorithm has been the fastest practical algorithm to date. In this paper, we present a new linear-time algorithm for computing the connected components, which is more efficient than that of Berman and Hannenhalli in both theory and practice. Our algorithm uses only a stack and is very easy to implement. We give the results of computational experiments over a large range of permutation pairs produced through simulated evolution; our experiments show a speedup by a factor of 2 to 5 in the computation of the connected components and by a factor of 1.3 to 2 in the overall distance computation.
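To make the quantity being computed concrete, the following is a minimal sketch of the inversion operation on a signed permutation, together with a brute-force breadth-first search for the inversion distance. It is not the linear-time algorithm described in the abstract: it is exponential in the permutation length and only meant for very small inputs. The function names `invert` and `inversion_distance_bfs` are hypothetical.

```python
from collections import deque


def invert(perm, i, j):
    """Reverse the segment perm[i..j] (inclusive) and flip every sign in it."""
    return perm[:i] + tuple(-x for x in reversed(perm[i:j + 1])) + perm[j + 1:]


def inversion_distance_bfs(source, target):
    """Exact inversion distance by exhaustive search (tiny inputs only)."""
    source, target = tuple(source), tuple(target)
    if source == target:
        return 0
    n = len(source)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        perm, dist = queue.popleft()
        # Try every possible inversion of a contiguous segment.
        for i in range(n):
            for j in range(i, n):
                nxt = invert(perm, i, j)
                if nxt == target:
                    return dist + 1
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    raise ValueError("target is not a signed permutation of source")
```

For example, `inversion_distance_bfs((1, -2, 3), (1, 2, 3))` needs a single inversion, whereas `(2, 1)` needs three to reach `(1, 2)`, because both elements carry the correct sign but sit in the wrong order.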
GRIMM: genome rearrangements web server
 Bioinformatics
, 2002
Cited by 77 (5 self)
Summary: GRIMM is a tool for analyzing rearrangements of gene orders in pairs of unichromosomal and multichromosomal genomes, with either signed or unsigned gene data. Although there are several programs for analyzing rearrangements in unichromosomal genomes, this is the first to analyze rearrangements in multichromosomal genomes. GRIMM also provides a new algorithm for analyzing comparative maps for which gene directions are unknown. Availability: A web server, with instructions and sample data, is available at
Formulations and Hardness of Multiple Sorting by Reversals
 Proc. 3rd Conf. Computational Molecular Biology RECOMB99, ACM
, 1998
Cited by 54 (1 self)
We consider two generalizations of signed Sorting By Reversals (SBR), both aimed at formalizing the problem of reconstructing the evolutionary history of a set of species. In particular, we address Multiple SBR (MSBR), calling for a signed permutation at minimum reversal distance from a given set of signed permutations, and Tree SBR, calling for a tree with the minimum number of edges spanning a given set of nodes in the complete graph where each node corresponds to a signed permutation and there is an edge between each pair of signed permutations one reversal away from each other. We describe a graph-theoretic relaxation of MSBR, which is the counterpart of the so-called alternating-cycle decomposition relaxation for SBR, illustrating a convenient mathematical formulation for this relaxation. Moreover, we use this relaxation to show that, even if the number of given permutations equals 3, MSBR is NP-hard, and hence so is Tree SBR. In fact, we show that the two problems are APX-hard, i.e. the...
Assignment of orthologous genes via genome rearrangement
 IEEE/ACM Transactions on Computational Biology and Bioinformatics
, 2005
Cited by 47 (4 self)
The assignment of orthologous genes between a pair of genomes is a fundamental and challenging problem in comparative genomics. Existing methods that assign orthologs based on the similarity between DNA or protein sequences may make erroneous assignments when sequence similarity does not clearly delineate the evolutionary relationship among genes of the same families. In this paper, we present a new approach to ortholog assignment that takes into account both sequence similarity and evolutionary events at a genome level, where orthologous genes are assumed to correspond to each other in the most parsimonious evolving scenario under genome rearrangement. First, the problem is formulated as that of computing the signed reversal distance with duplicates between the two genomes of interest. Then, the problem is decomposed into two new optimization problems, called minimum common partition and maximum cycle decomposition, for which efficient heuristic algorithms are given. Following this approach, we have implemented a high-throughput system for assigning orthologs on a genome scale, called SOAR, and tested it on both simulated data and real genome sequence data. Compared to a recent ortholog assignment method based entirely on homology search (called INPARANOID), SOAR shows a marginally better performance in terms of sensitivity on the real data set because it is able to identify several correct orthologous pairs that are missed by INPARANOID. The simulation results demonstrate that SOAR, in general, performs better than the iterated exemplar algorithm in terms of computing the reversal distance and assigning correct orthologs. Index Terms: Ortholog, paralog, gene duplication, genome rearrangement, reversal, comparative genomics.
A 1.375-Approximation Algorithm for Sorting by Transpositions
Cited by 46 (2 self)
Sorting permutations by transpositions is an important problem in genome rearrangements. A transposition is a rearrangement operation in which a segment is cut out of the permutation and pasted in a different location. The complexity of this problem is still open and it has been a ten-year-old open problem to improve the best known 1.5-approximation algorithm. In this paper we provide a 1.375-approximation algorithm for sorting by transpositions. The algorithm is based on a new upper bound on the diameter of 3-permutations. In addition, we present some new results regarding the transposition diameter: We improve the lower bound for the transposition diameter of the symmetric group, and determine the exact transposition diameter of 2-permutations and simple permutations.
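The transposition operation defined above can be sketched in a few lines. The snippet below is only an illustration of the operation and of the elementary breakpoint lower bound (one transposition changes at most three adjacencies, so at least a third of the breakpoints, rounded up, many transpositions are needed); it is not the 1.375-approximation algorithm. The names `transpose`, `breakpoints` and `transposition_lower_bound`, and the indexing convention, are assumptions made for this example.

```python
def transpose(perm, i, j, k):
    """Cut the segment perm[i:j] and paste it just before original index k.

    Equivalent to exchanging the adjacent blocks perm[i:j] and perm[j:k].
    Requires 0 <= i < j < k <= len(perm).
    """
    if not 0 <= i < j < k <= len(perm):
        raise ValueError("need 0 <= i < j < k <= len(perm)")
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]


def breakpoints(perm):
    """Count positions where consecutive elements are not consecutive values,
    after framing the permutation with 0 and n + 1."""
    ext = (0,) + tuple(perm) + (len(perm) + 1,)
    return sum(1 for a, b in zip(ext, ext[1:]) if b != a + 1)


def transposition_lower_bound(perm):
    """Each transposition removes at most three breakpoints."""
    return -(-breakpoints(perm) // 3)  # ceiling division
```

For instance, `(1, 4, 5, 2, 3, 6)` has three breakpoints, and the single transposition `transpose(perm, 1, 3, 5)` sorts it, so the bound of one is attained here.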
A 1.5-Approximation Algorithm for Sorting by Transpositions and Transreversals
, 2003
Cited by 45 (2 self)
One of the most promising ways to determine evolutionary distance between two organisms is to compare the order of appearance of orthologous genes in their genomes. The resulting genome rearrangement problem calls for finding a shortest sequence of rearrangement operations that sorts one genome into the other. In this paper we provide a 1.5-approximation algorithm for the problem of sorting by transpositions and transreversals, improving on a five-years-old 1.75 ratio for this problem. Our algorithm is also faster than current approaches and requires O(n^{3/2} sqrt(log n)) time for n genes.
On the Tightness of the Alternating-Cycle Lower Bound for Sorting by Reversals
 Journal of Combinatorial Optimization
, 1998
Cited by 27 (8 self)
We give a theoretical answer to a natural question arising from a few years of computational experiments on the problem of sorting a permutation by the minimum number of reversals, which has relevant applications in computational molecular biology. The experiments carried out on the problem showed that the so-called alternating-cycle lower bound is equal to the optimal solution value in almost all cases, and this is the main reason why the state-of-the-art algorithms for the problem are quite effective in practice. Since worst-case analysis cannot give an adequate justification for this observation, we focus our attention on estimating the probability that, for a random permutation of n elements, the above lower bound is not tight. We show that this probability is low even for small n, and asymptotically O(1/n^5). This gives a satisfactory explanation to empirical observations and shows that the problem of sorting by reversals and its alternating-cycle relaxation are essentially the same problem, with the exception of a small fraction of "pathological" instances, justifying the use of algorithms which are heavily based on this relaxation. From our analysis we obtain convenient sufficient conditions to test if the alternating-cycle lower bound is tight for a given instance. We also consider the case of signed permutations, for which the analysis is much simpler, and show that the probability that the alternating-cycle lower bound is not tight for a random signed permutation of m elements is asymptotically O(1/m^2).
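For the signed case, the alternating-cycle lower bound is simple to compute, and a short sketch may help fix ideas. The code below uses the standard breakpoint-graph construction (each signed element x becomes the pair 2|x|-1, 2|x| in an order given by its sign, framed by 0 and 2m+1) and returns m + 1 - c, where c is the number of alternating cycles. This is an illustrative sketch under that assumed encoding, not code from the paper; the name `cycle_lower_bound` is hypothetical, and the unsigned case, which requires choosing a maximum cycle decomposition, is not covered.

```python
def cycle_lower_bound(perm):
    """Alternating-cycle lower bound m + 1 - c for a signed permutation."""
    m = len(perm)
    # Unsigned encoding: +x -> (2x-1, 2x), -x -> (2x, 2x-1), framed by 0 and 2m+1.
    seq = [0]
    for x in perm:
        a = abs(x)
        seq += [2 * a - 1, 2 * a] if x > 0 else [2 * a, 2 * a - 1]
    seq.append(2 * m + 1)

    # Black edges join the vertices at positions (0,1), (2,3), ... of seq.
    black = {}
    for p in range(0, len(seq), 2):
        u, v = seq[p], seq[p + 1]
        black[u], black[v] = v, u

    # Gray edges join the values 2k and 2k+1, i.e. v and v ^ 1.
    seen = set()
    cycles = 0
    for start in seq:
        if start in seen:
            continue
        cycles += 1
        v = start
        while v not in seen:
            seen.add(v)
            w = black[v]   # follow the black edge
            seen.add(w)
            v = w ^ 1      # then follow the gray edge
    return m + 1 - cycles
```

On `(1, -2, 3)` the bound is 1, which equals the true reversal distance. On `(2, 1)` the bound is 2 while three reversals are actually required, a small instance where the bound is not tight.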
Opportunities for Combinatorial Optimization In Computational Biology
, 2003
Cited by 26 (1 self)
This is a survey designed for mathematical programming people who do not know molecular biology and want to learn the kinds of combinatorial optimization problems that arise. After a brief introduction to the biology, we present optimization models pertaining to sequencing, evolutionary explanations, structure prediction and recognition. Additional biology is given in the context of the problems, including some motivation for disease diagnosis and drug discovery. Open problems are cited with an extensive bibliography, and we offer a guide to getting started in this exciting frontier.
Empirical Analysis of Locality, Heritability and Heuristic Bias in Evolutionary Algorithms: A Case Study for the Multidimensional Knapsack Problem
, 2004
Cited by 23 (6 self)
Five different representations and associated variation operators are studied in the context of a steady-state evolutionary algorithm (EA) for the multidimensional knapsack problem. Four of them are indirect decoder-based techniques, and the fifth is a direct encoding including heuristic initialization, repair, and local improvement. The complex decoders and the local improvement and repair strategies make it practically impossible to completely analyze such EAs in a fully theoretical way. After comparing the general performance of the EA variants on two benchmark suites, we present a hands-on approach for empirically analyzing important aspects of initialization, mutation, and crossover in an isolated fashion. Static, inexpensive measurements based on randomly created solutions are performed in order to quantify and visualize specific properties with respect to heuristic bias, locality, and heritability. These tests shed light onto the complex behavior of such EAs and point out reasons for good or bad performance. In addition, the proposed measures are also examined during actual EA runs, which gives further insight into dynamic aspects of evolutionary search and verifies the validity of the isolated static measurements. All measurements are described in a general way, allowing for an easy adaptation to other representations and combinatorial problems.
The Reversal Median Problem
 INFORMS Journal on Computing
, 2003
Cited by 23 (0 self)
In this paper, we study the Reversal Median Problem (RMP), which arises in computational biology and is a basic model for the reconstruction of evolutionary trees. Given q genomes, RMP calls for another genome such that the sum of the reversal distances between this genome and the given ones is minimized. So far, the problem was considered too complex to derive mathematical models useful for its analysis and solution. We provide a powerful graph theoretic relaxation of RMP, essentially calling for a perfect matching in a graph that forms the maximum number of cycles jointly with q given perfect matchings.
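The objective of RMP can be stated directly in code for toy instances. The sketch below enumerates all signed permutations reachable by reversals, tabulates exact reversal distances from each given genome by breadth-first search, and returns a genome minimizing the sum of distances. It is an exhaustive baseline that assumes very small genomes (three or four genes) and has nothing to do with the graph-theoretic relaxation proposed in the paper; the names `neighbors`, `distances_from` and `reversal_median` are hypothetical.

```python
from collections import deque


def neighbors(perm):
    """All signed permutations one reversal away from perm."""
    n = len(perm)
    for i in range(n):
        for j in range(i, n):
            yield perm[:i] + tuple(-x for x in reversed(perm[i:j + 1])) + perm[j + 1:]


def distances_from(source):
    """Exact reversal distance from source to every signed permutation (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        p = queue.popleft()
        for q in neighbors(p):
            if q not in dist:
                dist[q] = dist[p] + 1
                queue.append(q)
    return dist


def reversal_median(genomes):
    """Return (median, score) minimizing the sum of reversal distances."""
    tables = [distances_from(tuple(g)) for g in genomes]
    # Every table covers the same set of signed permutations.
    best = min(tables[0], key=lambda p: sum(t[p] for t in tables))
    return best, sum(t[best] for t in tables)
```

With the three genomes `(1, 2, 3)`, `(1, -2, 3)` and `(1, 2, -3)`, the best achievable score is 2, attained by taking `(1, 2, 3)` itself as the median.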