Haplotyping as Perfect Phylogeny: Conceptual Framework and Efficient Solutions (Extended Abstract)
2002
Cited by 126 (10 self)
The next high-priority phase of human genomics will involve the development of a full Haplotype Map of the human genome [12]. It will be used in large-scale screens of populations to associate specific haplotypes with specific complex genetic-influenced diseases. A prototype Haplotype Mapping strategy is presently being finalized by an NIH working group. The biological key to that strategy is the surprising fact that genomic DNA can be partitioned into long blocks where genetic recombination has been rare, leading to strikingly fewer distinct haplotypes in the population than previously expected [12, 6, 21, 7]. In this paper ...
A Polynomial-Time Algorithm for the Perfect Phylogeny Problem when the Number of Character States Is Fixed
SIAM JOURNAL ON COMPUTING, 1994
Cited by 61 (4 self)
We present a polynomial-time algorithm for determining whether a set of species, described by the characters they exhibit, has a perfect phylogeny, assuming the maximum number of possible states for a character is fixed. This solves a long-standing open problem. Our result should be contrasted with the proofs by Steel and by Bodlaender, Fellows, and Warnow that the perfect phylogeny problem is NP-complete in general.
A Fast Algorithm for the Computation and Enumeration of Perfect Phylogenies
SIAM JOURNAL ON COMPUTING, 1995
Cited by 49 (8 self)
The Perfect Phylogeny Problem is a classical problem in computational evolutionary biology, in which a set of species/taxa is described by a set of qualitative characters. In recent years, the problem has been shown to be NP-complete in general, while the different fixed-parameter versions can each be solved in polynomial time. In particular, Agarwala and Fernández-Baca have developed an O(2^{3r}(nk^3 + k^4)) algorithm for the perfect phylogeny problem for n species defined by k r-state characters. Since commonly the character data is drawn from alignments of molecular sequences, k is the length of the sequences and can thus be very large (in the hundreds or thousands). Thus, it is imperative to develop algorithms which run efficiently for large values of k. In this paper we make additional observations about the structure of the problem and produce an algorithm for the problem that runs in time O(2^{2r} k^2 n). We also show how it is possible to efficiently build a...
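To make the problem statement concrete, here is a minimal sketch for the two-state special case (r = 2) only. It relies on the well-known pairwise characterization that a binary character matrix admits an (unrooted) perfect phylogeny exactly when no two characters exhibit all four state combinations 00, 01, 10, 11. This is not the algorithm of the paper above, and the function name and input layout are assumptions made for illustration.

```python
from itertools import combinations


def has_binary_perfect_phylogeny(matrix):
    """Pairwise compatibility test for binary (two-state) characters.

    matrix: list of rows, one per species; each row is a sequence of 0/1
    values, one per character. This layout is an assumption of the sketch.
    """
    if not matrix:
        return True
    num_chars = len(matrix[0])
    for i, j in combinations(range(num_chars), 2):
        # Collect the distinct state pairs that characters i and j show.
        observed = {(row[i], row[j]) for row in matrix}
        if len(observed) == 4:
            # All of 00, 01, 10, 11 occur: the two characters conflict.
            return False
    return True
```

The check examines every pair of the k characters across n species, so it takes O(k^2 n) time; for r > 2 states pairwise compatibility is no longer sufficient, which is where the algorithms cited in this entry come in.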
Reconstructing a History of Recombinations From a Set of Sequences
Discrete Appl. Math, 1998
Cited by 39 (6 self)
One of the classic problems in computational biology is the reconstruction of evolutionary history. A recent trend in the area is to increase the explanatory power of the models that are considered by incorporating higher-order evolutionary events that more accurately reflect the mechanisms of mutation at the level of the chromosome. We take a step in this direction by considering the problem of reconstructing an evolutionary history for a set of genetic sequences that have evolved by recombination. Recombination is a non-tree-like event that produces a child sequence by crossing two parent sequences. We present polynomial-time algorithms for reconstructing a parsimonious history of such events for several models of recombination when all sequences, including those of ancestors, are present in the input. We also show that these models appear to be near the limit of what can be solved in polynomial time, in that several natural generalizations are NP-complete. Keywords: Computational bio...
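The abstract does not spell out its recombination models, so the following is only a sketch of the simplest reading: a single crossover, where the child takes a prefix from one parent and the remaining suffix from the other, with all sequences of equal length. Both function names are hypothetical.

```python
def crossover_points(parent_a, parent_b, child):
    """Return every k with child == parent_a[:k] + parent_b[k:].

    Assumes a single-crossover model on equal-length sequences; this is an
    illustrative assumption, not a model taken from the paper.
    """
    if not (len(parent_a) == len(parent_b) == len(child)):
        raise ValueError("sequences must have equal length")
    return [
        k
        for k in range(len(child) + 1)
        if child == parent_a[:k] + parent_b[k:]
    ]


def is_single_crossover_child(parent_a, parent_b, child):
    """True if child arises from one crossover, in either parent order."""
    return bool(
        crossover_points(parent_a, parent_b, child)
        or crossover_points(parent_b, parent_a, child)
    )
```

When every sequence, ancestors included, is present in the input, a test of this kind can be applied to each triple of sequences to list the candidate recombination events from which a history would be assembled.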
Fast and simple algorithms for perfect phylogeny and triangulating colored graphs
INTERNATIONAL JOURNAL OF FOUNDATIONS OF COMPUTER SCIENCE, 1996
Efficient approximation of convex recolorings
In Proceedings APPROX 2005: 8th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, published with Proceedings RANDOM 2005, 2005
Cited by 19 (2 self)
A coloring of a tree is convex if the vertices that pertain to any color induce a connected subtree; a partial coloring (which assigns colors to some of the vertices) is convex if it can be completed to a convex (total) coloring. Convex coloring of trees arises in areas such as phylogenetics and linguistics; for example, a perfect phylogenetic tree is one in which the states of each character induce a convex coloring of the tree. Research on perfect phylogeny is usually focused on finding a tree so that a few predetermined partial colorings of its vertices are convex. When a coloring of a tree is not convex, it is desirable to know "how far" it is from a convex one. In [18], a natural measure for this distance, called the recoloring distance, was defined: the minimal number of color changes at the vertices needed to make the coloring convex. This can be viewed as minimizing the number of "exceptional vertices" with respect to a closest convex coloring. The problem was proved to be NP-hard even for colored strings. In this paper we continue the work of [18], and present a 2-approximation algorithm for convex recoloring of strings with running time O(cn), where c is the number of colors and n is the size of the input, and an O(cn^2)-time 3-approximation algorithm for convex recoloring of trees. ∗ A preliminary version of the results in this paper appeared in [19].
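For a colored string (a path), the convexity definition above reduces to each color occupying one contiguous block. A minimal sketch of that check follows; the function name and the choice to represent a coloring as a Python sequence are assumptions for illustration, and this is not one of the approximation algorithms of the paper.

```python
def is_convex_string(colors):
    """Check that every color in the sequence forms one contiguous block."""
    finished = set()      # colors whose block has already been entered
    previous = object()   # sentinel that equals no real color
    for color in colors:
        if color != previous:
            if color in finished:
                # The color reappears after a different color interrupted it.
                return False
            finished.add(color)
            previous = color
    return True
```

The scan visits each position once, so it runs in time linear in the length of the string (assuming hashable colors).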
Convex recolorings of strings and trees: Definitions, hardness results, and algorithms
Proceedings WADS 2005: 9th International Workshop on Algorithms and Data Structures, 2005
Cited by 19 (3 self)
A coloring of a tree is convex if the vertices that pertain to any color induce a connected subtree; a partial coloring (which assigns colors to some of the vertices) is convex if it can be completed to a convex (total) coloring. Convex colorings of trees arise in areas such as phylogenetics and linguistics; for example, a perfect phylogenetic tree is one in which the states of each character induce a convex coloring of the tree. When a coloring of a tree is not convex, it is desirable to know "how far" it is from a convex one, and which convex colorings are "closest" to it. In this paper we study a natural definition of this distance, the recoloring distance, which is the minimal number of color changes at the vertices needed to make the coloring convex. We show that finding this distance is NP-hard even for a colored string (a path), and for some other interesting variants of the problem. On the positive side, we present algorithms for computing the recoloring distance under some natural generalizations of this concept: the first generalization is the uniform weighted model, where each vertex has a weight which is the cost of changing its color. The other is the non-uniform model, in which the cost of coloring a vertex v by a color d is an arbitrary non-negative number cost(v, d). Our first algorithms find optimal convex recolorings of strings and bounded-degree trees under the non-uniform model in time which, for any fixed number of colors, is linear in the input size. Next we improve these algorithms for the uniform model to run in time which is linear in the input size for a fixed number of bad colors, which are colors that violate convexity in some natural sense. Finally, we generalize the above result to hold for trees of unbounded degree. ∗ A preliminary version of some of the results in this paper appeared in [17].
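The recoloring distance of a short string can be illustrated by exhaustive search. The sketch below is exponential in the string length and is meant only to make the definition concrete; it is not one of the paper's algorithms. It assumes unit cost per changed vertex and, as a simplifying assumption, only recolors with colors already present in the string. Both function names are hypothetical.

```python
from itertools import product


def is_convex(colors):
    """True if every color in the sequence forms one contiguous block."""
    finished = set()
    previous = object()  # sentinel that equals no real color
    for color in colors:
        if color != previous:
            if color in finished:
                return False
            finished.add(color)
            previous = color
    return True


def recoloring_distance(colors):
    """Minimum number of positions to change so the string becomes convex.

    Brute force over all recolorings that use the colors already present
    (an assumption of this sketch); suitable only for very short strings.
    """
    colors = list(colors)
    palette = list(dict.fromkeys(colors))  # distinct colors, original order
    best = len(colors)
    for candidate in product(palette, repeat=len(colors)):
        if is_convex(candidate):
            changes = sum(a != b for a, b in zip(colors, candidate))
            best = min(best, changes)
    return best
```

For instance, the string "abab" becomes convex after a single change (to "abbb"), while "aabb" is already convex and needs none.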
The Hardness of Perfect Phylogeny, Feasible Register Assignment and Other Problems on Thin Colored Graphs
Cited by 17 (4 self)
In this paper, we consider the complexity of a number of combinatorial problems; namely, Intervalizing Colored Graphs (DNA physical mapping), Triangulating Colored Graphs (perfect phylogeny), (Directed) (Modified) Colored Cutwidth, Feasible Register Assignment and Module Allocation for graphs of bounded pathwidth. Each of these problems has as a characteristic a uniform upper bound on the treewidth or pathwidth of the graphs in "yes"-instances. For all of these problems with the exceptions of Feasible Register Assignment and Module Allocation, a vertex or edge coloring is given as part of the input. Our main results are that the parameterized variant of each of the considered problems is hard for the complexity classes W[t] for all t ∈ N. We also show that Intervalizing Colored Graphs, Triangulating Colored Graphs, and Colored Cutwidth are NP-complete. 1 Introduction This paper focuses on a number of graph decision problems which share the characteristic that all have a uniform upper bo...
Reconstructing the evolutionary history of natural languages
1995
Cited by 15 (5 self)
In this paper we present a new methodology for determining the evolutionary history of related languages. Our methodology uses linguistic information encoded as qualitative characters, so that prospective trees can be evaluated according to various optimization criteria, much as is done in the practice of inferring evolutionary history for biological species. By contrast with biology, however, we find that the linguistic data support evolutionary trees with extremely good compatibility scores, and that for such data it is possible to find optimal trees quickly. We have applied this method to the classification of Indo-European (IE) languages; we have been able to resolve one long-standing open problem (the Indo-Hittite hypothesis), and have indicated exactly what needs to be established in order to resolve another long-standing open problem (the Italo-Celtic hypothesis). We have also discovered rather surprising facts about the history of Germanic within this family. Thus, this method provides an ability to resolve difficult questions in Historical Linguistics that have proved resistant to traditional character-based methodologies and to the more recent distance-based approaches of lexicostatistics. The results of our methodology also indicate weaknesses in methods currently accepted and practiced in historical linguistics. One of our more important results is the ability to detect and handle loan words that are not distinguishable from true cognates by traditional methods. Finally, this methodology permits the linguist to develop and test assumptions about the evolutionary relevance of different linguistic characters.