Haplotyping as Perfect Phylogeny: Conceptual Framework and Efficient Solutions (Extended Abstract)
2002
Abstract

Cited by 126 (10 self)
The next high-priority phase of human genomics will involve the development of a full Haplotype Map of the human genome [12]. It will be used in large-scale screens of populations to associate specific haplotypes with specific complex genetic-influenced diseases. A prototype Haplotype Mapping strategy is presently being finalized by an NIH working group. The biological key to that strategy is the surprising fact that genomic DNA can be partitioned into long blocks where genetic recombination has been rare, leading to strikingly fewer distinct haplotypes in the population than previously expected [12, 6, 21, 7]. In this paper ...
Optimal, efficient reconstruction of phylogenetic networks with constrained recombination
J. Bioinformatics and Computational Biology, 2003
Abstract

Cited by 115 (14 self)
A phylogenetic network is a generalization of a phylogenetic tree, allowing structural properties that are not tree-like. With the growth of genomic data, much of which does not fit ideal tree models, there is greater need to understand the algorithmics and combinatorics of phylogenetic networks [10, 11]. However, to date, very little has been published on this, with the notable exception of the paper by Wang et al. [12]. Other related papers include [4, 5, 7]. We consider the problem introduced in [12], of determining whether the sequences can be derived on a phylogenetic network where the recombination cycles are node-disjoint. In this paper, we call such a phylogenetic network a “galled-tree”. By more deeply analysing the combinatorial constraints on cycle-disjoint phylogenetic networks, we obtain an efficient algorithm that is guaranteed to be both a necessary and sufficient test for the existence of a galled-tree for the data. If there is a galled-tree, the algorithm constructs one and obtains an implicit representation of all the galled-trees for the data, and can create these in linear time for each one. We also note two additional results related to galled-trees: first, any set of sequences that can be derived on a galled-tree can be derived on a true tree (without recombination cycles), where at most one back mutation is allowed per site; second, the site compatibility problem (which is NP-hard in general) can be solved in linear time for any set of sequences that can be derived on a galled-tree. The combinatorial constraints we develop apply (for the most part) to node-disjoint cycles in any phylogenetic network (not just galled-trees), and can be used for example to prove that a given site cannot be on a node-disjoint cycle in any phylogenetic network. Perhaps more important than the specific results about galled-trees, we introduce an approach that can be used to study recombination in phylogenetic networks that go beyond galled-trees.
Efficient reconstruction of haplotype structure via perfect phylogeny
Journal of Bioinformatics and Computational Biology, 2003
Abstract

Cited by 75 (12 self)
Each person’s genome contains two copies of each chromosome, one inherited from the father and the other from the mother. A person’s genotype specifies the pair of bases at each site, but does not specify which base occurs on which chromosome. The sequence of each chromosome separately is called a haplotype. The determination of the haplotypes within a population is essential for understanding genetic variation and the inheritance of complex diseases. The haplotype mapping project, a successor to the human genome project, seeks to determine the common haplotypes in the human population. Since experimental determination of a person’s genotype is less expensive than determining its component haplotypes, algorithms are required for computing haplotypes from genotypes. Two observations aid in this process: first, the human genome contains short blocks within which only a few different haplotypes occur; second, as suggested by Gusfield, it is reasonable to assume that the haplotypes observed within a block have evolved according to a perfect phylogeny, in which at most one mutation event has occurred at any site, and no recombination occurred at the given region. We present a simple and efficient polynomial-time algorithm for inferring haplotypes from the genotypes of a set of individuals assuming a perfect phylogeny. Using a reduction to 2-SAT we extend this algorithm to handle constraints that apply when we have genotypes from both parents and child. We also present a hardness result for the problem of removing the minimum number of individuals from a population to ensure that the genotypes of the remaining individuals are consistent with a perfect phylogeny. Our algorithms have been tested on real data and give biologically meaningful results. Our webserver ...
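The perfect phylogeny assumption in the abstract above can be illustrated with the classical four-gamete test: a set of binary haplotypes fits a perfect phylogeny exactly when no pair of sites shows all four combinations 00, 01, 10 and 11. The sketch below is not the paper's algorithm (which starts from unphased genotypes); it only checks already-resolved haplotypes, and the function name and the '0'/'1' string encoding are assumptions made for illustration.

```python
from itertools import combinations


def passes_four_gamete_test(haplotypes):
    """Return True if no pair of sites exhibits all four gametes.

    `haplotypes` is a list of equal-length strings over '0' and '1'
    (hypothetical encoding: one string per haplotype, one character per site).
    """
    if not haplotypes:
        return True
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct (site i, site j) allele pairs seen in the data.
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            # 00, 01, 10 and 11 all occur: sites i and j conflict.
            return False
    return True


compatible = ["000", "010", "011", "100"]
conflicting = ["00", "01", "10", "11"]
print(passes_four_gamete_test(compatible))
print(passes_four_gamete_test(conflicting))
```

The check compares every pair of sites, so it is quadratic in the number of sites; it is meant only to make the definition concrete, not to be efficient.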
Parameterized Computational Feasibility
Feasible Mathematics II, 1994
Abstract

Cited by 66 (21 self)
Many natural computational problems have input consisting of two or more parts. For example, the input might consist of a graph and a positive integer. For many natural problems we may view one of the inputs as a parameter and study how the complexity of the problem varies if the parameter is held fixed. For many applications of computational problems involving such a parameter, only a small range of parameter values is of practical significance, so that fixed-parameter complexity is a natural concern. In studying the complexity of such problems, it is therefore important to have a framework in which we can make qualitative distinctions about the contribution of the parameter to the complexity of the problem. In this paper we survey one such framework for investigating parameterized computational complexity and present a number of new results for this theory.
A Polynomial-Time Algorithm for the Perfect Phylogeny Problem when the Number of Character States Is Fixed
SIAM Journal on Computing, 1994
Abstract

Cited by 61 (4 self)
We present a polynomial-time algorithm for determining whether a set of species, described by the characters they exhibit, has a perfect phylogeny, assuming the maximum number of possible states for a character is fixed. This solves a long-standing open problem. Our result should be contrasted with the proofs by Steel and by Bodlaender, Fellows, and Warnow that the perfect phylogeny problem is NP-complete in general.
Reconstructing reticulate evolution in species: theory and practice
In Proc. of 8th Annual International Conference on Computational Molecular Biology, 2004
Abstract

Cited by 58 (7 self)
We present new methods for reconstructing reticulate evolution of species due to events such as horizontal transfer or hybrid speciation; both methods are based upon extensions of Wayne Maddison’s approach in his seminal 1997 paper. Our first method is a polynomial-time algorithm for constructing phylogenetic networks from two gene trees contained inside the network. We allow the network to have an arbitrary number of reticulations, but we limit the reticulation in the network so that the cycles in the network are node-disjoint (“galled”). Our second method is a polynomial-time algorithm for constructing networks with one reticulation, where we allow for errors in the estimated gene trees. Using simulations, we demonstrate improved performance of this method over both NeighborNet and Maddison’s method.
A Fast Algorithm for the Computation and Enumeration of Perfect Phylogenies
SIAM Journal on Computing, 1995
Abstract

Cited by 49 (8 self)
The Perfect Phylogeny Problem is a classical problem in computational evolutionary biology, in which a set of species/taxa is described by a set of qualitative characters. In recent years, the problem has been shown to be NP-complete in general, while the different fixed-parameter versions can each be solved in polynomial time. In particular, Agarwala and Fernandez-Baca have developed an $O(2^{3r}(nk^3 + k^4))$ algorithm for the perfect phylogeny problem for $n$ species defined by $k$ $r$-state characters. Since commonly the character data is drawn from alignments of molecular sequences, $k$ is the length of the sequences and can thus be very large (in the hundreds or thousands). Thus, it is imperative to develop algorithms which run efficiently for large values of $k$. In this paper we make additional observations about the structure of the problem and produce an algorithm for the problem that runs in time $O(2^{2r} k^2 n)$. We also show how it is possible to efficiently build a...
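To get a feel for the gap between the two running-time bounds quoted in the abstract above, one can plug sample sizes into both expressions. The values n = 50, k = 1000 and r = 4 below are illustrative assumptions, not figures from the paper, and the constant factors hidden by the O-notation are ignored, so the result is only a rough indication.

```python
def earlier_bound(n, k, r):
    """Operation count suggested by the O(2^{3r} (n k^3 + k^4)) bound."""
    return 2 ** (3 * r) * (n * k ** 3 + k ** 4)


def improved_bound(n, k, r):
    """Operation count suggested by the O(2^{2r} k^2 n) bound."""
    return 2 ** (2 * r) * k ** 2 * n


# Hypothetical instance: 50 species, 1000 characters, 4 states per character.
n, k, r = 50, 1000, 4
old = earlier_bound(n, k, r)
new = improved_bound(n, k, r)
print(old, new, old // new)  # ratio is 336000 for these sample values
```

The ratio simplifies to 2^r (k + k^2 / n), so the advantage of the second bound grows with the sequence length k, which is the regime the abstract is concerned with.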
Constructing a Tree from Homeomorphic Subtrees, with Applications to Computational Evolutionary Biology
Abstract

Cited by 48 (3 self)
We are given a set $\mathcal{T} = \{T_1, T_2, \ldots, T_k\}$ of rooted binary trees, each $T_i$ leaf-labeled by a subset $L(T_i) \subset \{1, 2, \ldots, n\}$. If $T$ is a tree on $\{1, 2, \ldots, n\}$, we let $T|L$ denote the minimal subtree of $T$ induced by the nodes of $L$ and all their ancestors. The consensus tree problem asks whether there exists a tree $T$ such that for every $i$, $T|L(T_i)$ is homeomorphic to $T_i$. We present algorithms which test if a given set of trees has a consensus tree and if so, construct one. The deterministic algorithm takes time $\min\{O(N n^{1/2}), O(N + n^2 \log n)\}$, where $N = \sum_i |T_i|$, and uses linear space. The randomized algorithm takes time $O(N \log^3 n)$ and uses linear space. The previous best for this problem was a 1981 $O(Nn)$ algorithm by Aho et al. Our faster deterministic algorithm uses a new efficient algorithm for the following interesting dynamic graph problem: Given a graph $G$ with $n$ nodes and $m$ edges and a sequence of $b$ batches of one or more edge deletions, then a...
Assessment of the accuracy of matrix representation with parsimony analysis supertree construction
Syst. Biol., 2001
Abstract

Cited by 48 (5 self)
Despite the growing popularity of supertree construction for combining phylogenetic information to produce more inclusive phylogenies, large-scale performance testing of this method has not been done. Through simulation, we tested the accuracy of the most widely used supertree method, matrix representation with parsimony analysis (MRP), with respect to a (maximum parsimony) total evidence solution and a known model tree. When source trees overlap completely, MRP provided a reasonable approximation of the total evidence tree; agreement was usually >85%. Performance improved slightly when using smaller, more numerous, or more congruent source trees, and especially when elements were weighted in proportion to the bootstrap frequencies of the nodes they represented on each source tree (“weighted MRP”). Although total evidence always estimated the model tree slightly better than nonweighted MRP methods, weighted MRP in turn usually outperformed total evidence slightly. When source studies were even moderately nonoverlapping (i.e., sharing only three-quarters of the taxa), the high proportion of missing data caused a loss in resolution that severely degraded the performance for all methods, including total evidence. In such cases, even combining more trees, which had positive effects elsewhere, did not improve accuracy. Instead, “seeding” the supertree or total evidence analyses with a single largely complete study improved ...
A Fundamental Decomposition Theory for Phylogenetic Networks and Incompatible Characters
In Proc. Research in Computational Molecular Biology, 2005