Minimal Conflicting Sets for the Consecutive Ones Property in ancestral genome reconstruction
, 912
Cited by 9 (4 self)
Abstract. A binary matrix has the Consecutive Ones Property (C1P) if its columns can be ordered in such a way that all 1’s in each row are consecutive. A Minimal Conflicting Set is a set of rows that does not have the C1P, but every proper subset of which has the C1P. Such submatrices have been considered in comparative genomics applications, but very little is known about their combinatorial structure or about efficient algorithms to compute them. We first describe an algorithm that detects rows that belong to Minimal Conflicting Sets. This algorithm has polynomial time complexity when the number of 1’s in each row of the considered matrix is bounded by a constant. Next, we show that the problem of computing all Minimal Conflicting Sets can be reduced to the joint generation of all minimal true clauses and maximal false clauses for some monotone boolean function. We use these methods on simulated data related to ancestral genome reconstruction to show that computing Minimal Conflicting Sets is useful in discriminating between true positive and false positive ancestral syntenies. We also study a dataset of yeast genomes and address the reliability of an ancestral genome proposal of the Saccharomycetaceae yeasts. Draft, do not distribute. Version of December 21, 2009.
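The two definitions in this abstract can be illustrated with a small brute-force sketch. This is not the algorithm the paper describes; it simply tries every column order, so it is only usable on tiny matrices. The function names are hypothetical and chosen for illustration.

```python
from itertools import permutations

def row_is_consecutive(row, order):
    """True if the 1s of `row` are contiguous when columns are read in `order`."""
    positions = [i for i, col in enumerate(order) if row[col] == 1]
    return not positions or positions[-1] - positions[0] + 1 == len(positions)

def has_c1p(rows):
    """Brute-force C1P test: try every column order (exponential time)."""
    if not rows:
        return True
    n_cols = len(rows[0])
    return any(
        all(row_is_consecutive(r, order) for r in rows)
        for order in permutations(range(n_cols))
    )

def is_minimal_conflicting_set(rows):
    """True if `rows` lacks the C1P but every proper subset of rows has it.

    The C1P is inherited by subsets of rows, so it is enough to check
    the subsets obtained by removing a single row.
    """
    if has_c1p(rows):
        return False
    return all(has_c1p(rows[:i] + rows[i + 1:]) for i in range(len(rows)))

# Column 0 would need three distinct neighbours, which no linear order allows.
rows = [[1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 1]]
print(has_c1p(rows), is_minimal_conflicting_set(rows))
```

In the example, any two of the three rows can be made consecutive, so the three rows together form a Minimal Conflicting Set under the definition quoted above.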
Average-case analysis of perfect sorting by reversals
, 2009
Cited by 7 (4 self)
A sequence of reversals that takes a signed permutation to the identity is perfect if at no step a common interval is broken. Determining a parsimonious perfect sequence of reversals that sorts a signed permutation is NP-hard. Here we show that, despite this worst-case analysis, with probability one, sorting can be done in polynomial time. Further, we find asymptotic expressions for the average length and number of reversals in commuting permutations, an interesting subclass of signed permutations. hal-00354235, version 1, 19 Jan 2009.
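The notions of signed reversal and common interval can be made concrete with a short sketch. It assumes that common intervals are taken with respect to the identity (a segment whose absolute values are consecutive integers) and that a reversal "breaks" a common interval when the two index ranges overlap without one containing the other. Both readings are assumptions made here for illustration, and the function names are hypothetical.

```python
def apply_reversal(perm, i, j):
    """Reverse perm[i..j] (inclusive) and flip the sign of each element."""
    return perm[:i] + [-x for x in reversed(perm[i:j + 1])] + perm[j + 1:]

def common_intervals(perm):
    """Index ranges (i, j), i < j, whose absolute values are consecutive integers."""
    found = []
    for i in range(len(perm)):
        lo = hi = abs(perm[i])
        for j in range(i + 1, len(perm)):
            v = abs(perm[j])
            lo, hi = min(lo, v), max(hi, v)
            if hi - lo == j - i:  # j - i + 1 distinct values spanning exactly that range
                found.append((i, j))
    return found

def overlaps(a, b):
    """True if two index ranges intersect and neither contains the other."""
    (a0, a1), (b0, b1) = a, b
    disjoint = a1 < b0 or b1 < a0
    nested = (a0 <= b0 and b1 <= a1) or (b0 <= a0 and a1 <= b1)
    return not disjoint and not nested

def reversal_breaks_common_interval(perm, i, j):
    """True if reversing perm[i..j] overlaps some common interval of perm."""
    return any(overlaps((i, j), ci) for ci in common_intervals(perm))

perm = [3, -1, 2, 4]
print(common_intervals(perm))
print(reversal_breaks_common_interval(perm, 2, 3))
print(apply_reversal(perm, 1, 2))
```

Under these assumptions, reversing positions 2..3 cuts across the common interval at positions 0..2, while reversing positions 1..2 stays inside every common interval that contains it.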
Tractability Results for the Consecutive-Ones Property with Multiplicity
Cited by 3 (3 self)
Abstract. A binary matrix has the Consecutive-Ones Property (C1P) if its columns can be ordered in such a way that all 1’s in each row are consecutive. We consider here a variant of the C1P where columns can appear multiple times in the ordering. Although the general problem of deciding the C1P with multiplicity is NP-complete, we present here a case of interest in comparative genomics that is tractable.
Evolution of Genome Organization by Duplication and Loss: an Alignment Approach
Cited by 2 (0 self)
Abstract. We present a comparative genomics approach for inferring ancestral genome organization and evolutionary scenarios, based on a model accounting for content-modifying operations. More precisely, we focus on comparing two ordered gene sequences with duplicated genes that have evolved from a common ancestor through duplications and losses; our model can be grouped in the class of “Block Edit” models. From a combinatorial point of view, the main consequence is the possibility of formulating the problem as an alignment problem. On the other hand, in contrast to symmetrical metrics such as the inversion distance, duplications and losses are asymmetrical operations that are applicable to one of the two aligned sequences. Consequently, an ancestral genome can directly be inferred from a duplication-loss scenario attached to a given alignment. Although alignments are a priori simpler to handle than rearrangements, we show that a direct approach based on dynamic programming leads, at best, to an efficient heuristic. We present an exact pseudo-boolean linear programming algorithm to search for the optimal alignment along with an optimal scenario of duplications and losses. Although exponential in the worst case, we show low running times on real datasets as well as synthetic data. We apply our algorithm in a phylogenetic context to the evolution of stable RNA (tRNA and rRNA) gene content and organization in Bacillus genomes. Our results lead to various biological insights, such as rates of ribosomal RNA proliferation among lineages, their role in altering tRNA gene content, and evidence of tRNA class conversion.
On the Gapped Consecutive Ones Property
Cited by 2 (0 self)
Abstract. Motivated by problems of comparative genomics and paleogenomics, we introduce the Gapped Consecutive-Ones Property Problem (k,δ)-C1P: given a binary matrix M and two integers k and δ, can the columns of M be permuted such that each row contains at most k sequences of 1’s and no two consecutive sequences of 1’s are separated by a gap of more than δ 0’s? The classical C1P problem, which is known to be polynomial, is equivalent to the (1,0)-C1P Problem. We show that the (2,δ)-C1P Problem is NP-complete for δ ≥ 2. We conjecture that the (k,δ)-C1P Problem is NP-complete for k ≥ 2, δ ≥ 1, (k,δ) ≠ (2,1). We also show that the (k,δ)-C1P problem can be reduced to a graph bandwidth problem parameterized by a function of k, δ and of the maximum number s of 1’s in a row of M, and hence is polynomial-time solvable if all three parameters are constant.
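The (k,δ) condition is easy to verify for one given column order, even though deciding whether a suitable order exists is hard in general. The sketch below only checks a candidate order; it does not search for one. The function names are hypothetical.

```python
def blocks_and_max_gap(row):
    """Return (number of blocks of 1s, largest run of 0s between two blocks)."""
    ones = [i for i, x in enumerate(row) if x == 1]
    if not ones:
        return 0, 0
    # A gap is a run of zeros strictly between two successive 1s.
    gaps = [b - a - 1 for a, b in zip(ones, ones[1:]) if b - a > 1]
    return len(gaps) + 1, max(gaps, default=0)

def order_satisfies_gapped_c1p(matrix, order, k, delta):
    """Check the (k, delta) condition for every row under a fixed column order."""
    for row in matrix:
        permuted = [row[c] for c in order]
        blocks, gap = blocks_and_max_gap(permuted)
        if blocks > k or gap > delta:
            return False
    return True

matrix = [[1, 0, 1, 1], [0, 1, 1, 0]]
print(order_satisfies_gapped_c1p(matrix, [0, 1, 2, 3], k=2, delta=1))
print(order_satisfies_gapped_c1p(matrix, [0, 1, 2, 3], k=1, delta=0))
print(order_satisfies_gapped_c1p(matrix, [0, 3, 2, 1], k=1, delta=0))
```

With k = 1 and δ = 0 the check reduces to the classical consecutive-ones condition for that order, matching the equivalence stated in the abstract.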
A faster algorithm for finding minimum Tucker submatrices
Cited by 2 (1 self)
Abstract. A binary matrix has the Consecutive Ones Property (C1P) if its columns can be ordered in such a way that all 1s in each row are consecutive. Algorithmic issues of the C1P are central in computational molecular biology, in particular for physical mapping and ancestral genome reconstruction. In 1972, Tucker gave a characterization of matrices that have the C1P by a set of forbidden submatrices, and a substantial amount of research has been devoted to the problem of efficiently finding such a minimum size forbidden submatrix. This paper presents a new $O(\Delta^3 m^2 (m\Delta + n^3))$ time algorithm for this particular task for an m×n binary matrix with at most Δ 1-entries per row, thereby improving the $O(\Delta^3 m^2 (mn + n^3))$ time algorithm of Dom et al. [17].
Insights into the structural evolution of amniote genomes
Cited by 1 (1 self)
We investigate the problem of inferring contiguous ancestral regions (CARs) of the genome of the last common ancestor of all extant amniotes. We use the complete genome sequences and assemblies of 14 vertebrate species: 11 amniote genomes as ingroups and 3 teleost fish genomes as outgroups. We infer large regions or syntenies of three ancestral genomes: the amniote, therian and boreoeutherian ones. We then infer patterns of genome structural evolution in amniotes: types of rearrangements, long and short branches of the underlying phylogenetic tree, convergent and divergent karyotypic evolution. We encounter and explore several methodological issues: the construction of good conserved orthology blocks among all amniotes; the detection of conserved synteny signals between amniotes and teleost fishes based on the principle of Doubly Conserved Syntenies (DCS) used in (Jaillon et al. 2004) and taking into account the whole genome duplication in the teleost lineage; the detection of conserved contiguity and synteny signals between amniotes and the construction of large Contiguous Ancestral Regions (CARs) of ancestral genomes (as in Chauve and Tannier, PLoS Comput Biol 2008) and the linkage of CARs according to the DCS signal; the detection of reliable ancestral genome rearrangements (as in Zhao and Bourque, Genome Res 2009). The ancestral boreoeutherian genome we infer is in almost complete agreement with previous cytogenetics and computational studies. Therian and amniote ancestral genomes still lack good references, as two previous studies (Nakatani et al. Genome Res 2007, Kohn et al. Trends in Genetics 2004) gave divergent results. Still, the amniote ancestral genome is found relatively close to the chicken genome in all studies, including this one. We analyse every karyotypic change in the chicken and therian branches. Input data (species tree and gene trees): Pecan 12-amniote-vertebrates multiple alignments from EnsemblCompara 54; gene trees from EnsemblCompara 54.
Prediction of Contiguous Regions in the Amniote Ancestral Genome
Cited by 1 (0 self)
Abstract. We investigate the problem of inferring contiguous ancestral regions (CARs) of the genome of the last common ancestor of all extant amniotes, based on the currently sequenced and assembled amniote genomes as ingroups and three teleost fish genomes as outgroups. We combine a methodological framework using conserved syntenies computed from whole genome alignments of amniote species together with double conserved syntenies (DCS) using gene families from amniote and fish genomes, to take into account the whole genome duplication that occurred in the teleost lineage. From these comparisons, ancestral genome segments are computed using techniques inspired by physical mapping. Due to the difficulty caused by the whole genome duplication and the large evolutionary distance to the closest assembled outgroup, very few methods have been published with a reconstruction of the amniote ancestral genome. This one is the first which is founded on a simple and formal methodological framework, whose good stability is shown and whose CARs cover large regions of the human and chicken genomes.
Hardness Results for the Gapped Consecutive-Ones Property Problem
, 2009
Cited by 1 (1 self)
Motivated by problems of comparative genomics and paleogenomics, in [6] the authors introduced the Gapped Consecutive-Ones Property Problem (k,δ)-C1P: given a binary matrix M and two integers k and δ, can the columns of M be permuted such that each row contains at most k blocks of ones and no two consecutive blocks of ones are separated by a gap of more than δ zeros? The classical C1P problem, which is known to be polynomial, is equivalent to the (1,0)-C1P problem. They showed that the (2,δ)-C1P Problem is NP-complete for all δ ≥ 2 and that the (3,1)-C1P problem is NP-complete. They also conjectured that the (k,δ)-C1P Problem is NP-complete for k ≥ 2, δ ≥ 1 and (k,δ) ≠ (2,1). Here, we prove that this conjecture is true. The only remaining case is the (2,1)-C1P Problem, which could be polynomial-time solvable.
Breakpoint Distance and PQ-Trees
Cited by 1 (1 self)
Abstract. The PQ-tree is a fundamental data structure that can encode large sets of permutations. It has recently been used in comparative genomics to model ancestral genomes with some uncertainty: given a phylogeny for some species, extant genomes are represented by permutations on the leaves of the tree, and each internal node in the phylogenetic tree represents an extinct ancestral genome, represented by a PQ-tree. An open problem related to this approach is then to quantify the evolution between genomes represented by PQ-trees. In this paper we present results for two problems of PQ-tree comparison motivated by this application. First, we show that the problem of comparing two PQ-trees by computing the minimum breakpoint distance among all pairs of permutations generated respectively by the two considered PQ-trees is NP-complete for unsigned permutations. Next, we consider a generalization of the classical Breakpoint Median problem, where an ancestral genome is represented by a PQ-tree and p permutations are given, with p ≥ 1, and we want to compute a permutation generated by the PQ-tree that minimizes the sum of the breakpoint distances to the p permutations. We show that this problem is Fixed-Parameter Tractable with respect to the breakpoint distance value. This last result applies to both signed and unsigned permutations, and to both unichromosomal and multichromosomal permutations.
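The objective in the median problem can be sketched for unsigned, unichromosomal permutations. The sketch assumes one common convention for the breakpoint distance (adjacencies are unordered pairs, with no extra endpoint markers), which may differ from the paper's exact definition. It also replaces the PQ-tree with an explicit list of candidate permutations, so it illustrates the quantity being minimized rather than the fixed-parameter algorithm. All names are hypothetical.

```python
def adjacencies(perm):
    """Unordered pairs of neighbouring elements in a linear, unsigned permutation."""
    return {frozenset(pair) for pair in zip(perm, perm[1:])}

def breakpoint_distance(p, q):
    """Number of adjacencies of p that are absent from q (assumed convention)."""
    if sorted(p) != sorted(q):
        raise ValueError("permutations must be over the same elements")
    return len(adjacencies(p) - adjacencies(q))

def best_median(candidates, targets):
    """Pick the candidate minimizing the sum of breakpoint distances to targets.

    `candidates` stands in for the set of permutations a PQ-tree would generate.
    """
    def total(c):
        return sum(breakpoint_distance(c, t) for t in targets)
    best = min(candidates, key=total)
    return best, total(best)

candidates = [[1, 2, 3, 4], [1, 3, 2, 4]]
targets = [[1, 2, 3, 4], [4, 3, 2, 1], [1, 2, 4, 3]]
print(breakpoint_distance([1, 2, 3, 4, 5], [1, 2, 4, 3, 5]))
print(best_median(candidates, targets))
```

Because adjacencies are unordered, a permutation and its mirror image are at distance zero under this convention.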