A Classification of Consensus Methods for Phylogenetics
Cited by 21 (0 self)
A consensus tree method takes a collection of phylogenetic trees and outputs a single "representative" tree. The first consensus method was proposed by Adams in 1972. Since then a large variety of different methods have been developed, and there has been considerable debate over how they should be used. This paper has two goals...
Incomplete directed perfect phylogeny
SIAM Journal on Computing, 2000
Cited by 15 (2 self)
Abstract. Perfect phylogeny is one of the fundamental models for studying evolution. We investigate the following generalization of the problem: The input is a species-characters matrix. The characters are binary and directed, i.e., a species can only gain characters. The difference from standard perfect phylogeny is that for some species the state of some characters is unknown. The question is whether one can complete the missing states in a way admitting a perfect phylogeny. The problem arises in classical phylogenetic studies, when some states are missing or undetermined. Quite recently, studies that infer phylogenies using inserted repeat elements in DNA gave rise to the same problem. The best known algorithm for the problem requires O(n^2 m) time for m characters and n species. We provide a near-optimal Õ(nm)-time algorithm for the problem.

1 Introduction

When studying evolution, the divergence patterns leading from a single ancestor species to its contemporary descendants are usually modeled by a tree structure. Extant species correspond to the tree leaves, while their common progenitor corresponds to the root of this phylogenetic tree. Internal nodes correspond to hypothetical ancient species, which putatively split up and evolved into distinct species. Tree branches model changes through time of the hypothetical ancestor species. The common case is that one has information regarding the leaves, from which the phylogenetic tree is to be inferred. This task, called phylogenetic reconstruction (cf. [7]), was one of the first algorithmic challenges posed by biology, and the computational community has been dealing with problems of this flavor for over three decades (see, e.g., [12]).

In the character-based approach to tree reconstruction, contemporary species are described by their attributes or characters. Each character takes on one of several possible states. The input is represented by a matrix A where a_ij is the state of character j in species i, and the i-th row is the character vector of species i. The output sought is a hypothesis regarding evolution, i.e., a phylogenetic tree along with the suggested character vectors of the internal nodes. This output must satisfy properties specified by the problem variant.
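The species-characters matrix described above can be illustrated with a small sketch. It covers only the simpler complete case (no unknown states), not the incomplete problem the paper solves, and it uses the standard pairwise test: a complete binary matrix with directed characters admits a perfect phylogeny exactly when, for every two characters, the sets of species carrying them are disjoint or nested. The function name and the example matrices are hypothetical, and the check is a naive quadratic one over character pairs, not the Õ(nm) algorithm of the abstract.

```python
from itertools import combinations


def admits_directed_perfect_phylogeny(matrix):
    """Check a complete binary species-by-characters matrix.

    matrix[i][j] is 1 if species i has gained character j, else 0.
    Assumption: no entries are unknown (the complete case only).
    """
    if not matrix:
        return True
    n_chars = len(matrix[0])
    # For each character, collect the set of species that carry it.
    carriers = [
        frozenset(i for i, row in enumerate(matrix) if row[j] == 1)
        for j in range(n_chars)
    ]
    # The carrier sets must be pairwise disjoint or nested.
    for a, b in combinations(carriers, 2):
        if a & b and not (a <= b or b <= a):
            return False
    return True


# Hypothetical example data: rows are species, columns are characters.
tree_like = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
conflicting = [
    [1, 1],
    [1, 0],
    [0, 1],
]
```

In `tree_like` the carrier sets are nested or disjoint, so the check succeeds; in `conflicting` the two characters share species 0 but neither set contains the other, so no directed perfect phylogeny exists.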
RIATA-HGT: A fast and accurate heuristic for reconstructing horizontal gene transfer
Proceedings of the Eleventh International Computing and Combinatorics Conference (COCOON 05). LNCS #3595, 2005
Cited by 14 (9 self)
Abstract. Horizontal gene transfer (HGT) plays a major role in microbial genome diversification, and is claimed to be rampant among various groups of genes in bacteria. Further, HGT is a major confounding factor for any attempt to reconstruct bacterial phylogenies. As a result, detecting and reconstructing HGT events in groups of organisms has become a major endeavor in biology. The problem of detecting HGT events based on incongruence between a species tree and a gene tree is computationally very hard (NP-hard). Efficient algorithms exist for solving restricted cases of the problem. We propose RIATA-HGT, the first polynomial-time heuristic to handle all HGT scenarios, without any restrictions. The method accurately infers HGT events based on analyzing incongruence among species and gene trees. Empirical performance of the method on synthetic and biological data is outstanding. Being a heuristic, RIATA-HGT may overestimate the optimal number of HGT events; empirical performance, however, shows that such overestimation is very mild. We have implemented our method and run it on biological and synthetic data. The results we obtained demonstrate very high accuracy of the method. The current version of RIATA-HGT uses the PAUP tool, and we are in the process of implementing a stand-alone version, with a graphical user interface, which will be made public. The tool, in its current implementation, is available from the authors upon request.
An Investigation of Phylogenetic Likelihood Methods
2003
Cited by 12 (0 self)
We analyze the performance of likelihood-based approaches used to reconstruct phylogenetic trees. Unlike other techniques such as Neighbor-Joining (NJ) and Maximum Parsimony (MP), relatively little is known regarding the behavior of algorithms founded on the principle of likelihood.
A linear-time majority tree algorithm
In: Proc. 3rd Workshop Algs. in Bioinformatics (WABI’03), 2003
Cited by 11 (0 self)
Abstract. We give a randomized linear-time algorithm for computing the majority rule consensus tree. The majority rule tree is widely used for summarizing a set of phylogenetic trees, which is usually a postprocessing step in constructing a phylogeny. We are implementing the algorithm as part of an interactive visualization system for exploring distributions of trees, where speed is a serious concern for real-time interaction. The linear running time is achieved by using succinct representation of the subtrees and efficient methods for the final tree reconstruction.
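To make the majority rule idea concrete, here is a minimal sketch under stated assumptions: rooted trees are written as nested Python tuples with string leaf names, and a cluster (the leaf set below an internal node) is kept when it appears in more than half of the input trees. The helper names and the example trees are hypothetical. This is a naive counting approach, not the randomized linear-time algorithm of the abstract, and it does not use the succinct subtree representation mentioned there.

```python
from collections import Counter


def clusters(tree):
    """Return (leaf set of `tree`, set of clusters of its internal nodes).

    Assumption: a tree is a nested tuple whose leaves are strings.
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clusters = clusters(child)
        leaves |= child_leaves
        found |= child_clusters
    found.add(leaves)  # the cluster of this internal node
    return leaves, found


def majority_clusters(trees):
    """Return the clusters present in strictly more than half of the trees."""
    counts = Counter()
    for tree in trees:
        _, found = clusters(tree)
        counts.update(found)
    threshold = len(trees) / 2
    return {c for c, k in counts.items() if k > threshold}


# Hypothetical input: three rooted trees on the taxa A, B, C, D.
trees = [
    (("A", "B"), ("C", "D")),
    ((("A", "B"), "C"), "D"),
    (("A", "C"), ("B", "D")),
]
```

Here the cluster {A, B} occurs in two of the three trees and the root cluster {A, B, C, D} in all three, so those two clusters make up the majority rule tree; every other cluster occurs only once and is dropped.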
Grid-Flow: A grid-enabled scientific workflow system with a Petri-net-based interface
2006
PRec-I-DCM3: a parallel framework for fast and accurate large scale phylogeny reconstruction
International Journal on Bioinformatics Research and Applications (IJBRA), 2005
Cited by 4 (3 self)
Accurate reconstruction of phylogenetic trees very often involves solving hard optimization problems, particularly the maximum parsimony (MP) and maximum likelihood (ML) problems. Various heuristics have been devised for solving these two problems; however, they obtain good results within reasonable time only on small datasets. This has been a major impediment for large-scale phylogeny reconstruction, particularly for the effort to assemble the Tree of Life—the evolutionary relationship of all organisms on earth. Roshan et al. recently introduced Rec-I-DCM3, an efficient and accurate meta-method for solving the MP problem on large datasets of up to 14,000 taxa. Nonetheless, a drastic improvement in Rec-I-DCM3’s performance is still needed in order to achieve similar (or better) accuracy on datasets at the scale of the Tree of Life. In this paper, we improve the performance of Rec-I-DCM3 via parallelization. Experimental results demonstrate that our parallel
Partitioned Rendering Infrastructure for Stable Accordion Navigation
2005
Cited by 4 (2 self)
My thesis presents a new rendering infrastructure for information visualization applications that use the accordion drawing navigation metaphor. Accordion drawing techniques use rubber-sheet navigation methods, with the borders tacked down, and provide guaranteed visibility for marked areas of interest. Our accordion drawing algorithms are based on screen-space partitioning, which eliminates overculling and tightly bounds overdrawing. By eliminating the overculling effects of rendering dense regions of data, we guarantee a correct visual representation of any dataset. Also, our pixel-based drawing infrastructure improves the rendering performance of dense dataset regions with strict drawing constraints, which are based on application-specific drawing requirements. The generic infrastructure provides an interface to numerically stable navigation of datasets, with full support for multiple concurrent regions of navigation motion. To evaluate our generic infrastructure, I benchmark our tree comparison application with the performance of TreeJuxtaposer, a previous accordion drawing application with identical features. I describe our tree traversal algorithms, which we use for efficient rendering, culling, and layout of tree datasets. I also discuss tree node marking techniques, which offer several improvements over previous range storage and retrieval techniques, reducing memory requirements and increasing rendering speed. Finally, I evaluate tree-specific navigation techniques from our winning entry in the InfoVis 2003 contest, with TreeJuxtaposer supported by an incremental search feature and an improved user interface.
Evolution of the core and pan-genome of Streptococcus: positive selection, recombination, and genome composition
2007
Cited by 4 (0 self)
The electronic version of this article is the complete one and can be
Picking fruit from the tree of life: Comments on taxonomic sampling and the quartet method
In Proceedings of the 16th ACM Symposium on Applied Computing (2001), ACM, 2001
Cited by 3 (1 self)
A topic of recent interest and controversy in the field of systematic biology is the value of “taxonomic sampling”, the practice of adding additional sequences (taxa) to an analysis to improve the accuracy of the inferred evolutionary tree. In terms of tree inference algorithms that construct trees from four-taxon subtrees (quartet topologies), the value of taxonomic sampling can be rephrased as the question “are quartet topologies more accurately estimated when embedded within a larger set of taxa?”. Here we show that the answer to this question is negative, based on an analysis of nine 40-taxon trees with varying amounts of sequence divergence sampled from the Ribosomal Database Project. This result complements and contrasts previous research that examined the effects of taxonomic sampling on a single pathological quartet topology using artificially generated data. Our result is based on an experimental study using real data and examines the effect of taxonomic sampling on all quartet topologies induced by an evolutionary tree.
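The phrase "quartet topologies induced by an evolutionary tree" can be illustrated with a small sketch. It assumes a tree is given as its list of internal clusters (each the set of taxa on one side of an internal edge); a quartet {a, b, c, d} is resolved as ab|cd when some cluster contains exactly a and b among the four taxa. The function names and the example tree are hypothetical and are not taken from the paper, which estimates quartets from sequence data instead of reading them off a known tree.

```python
from itertools import combinations


def quartet_topology(clusters, quartet):
    """Return the split of `quartet` induced by the tree, or None if unresolved.

    Assumption: `clusters` lists the taxon sets below the tree's internal edges.
    """
    q = frozenset(quartet)
    for cluster in clusters:
        side = q & frozenset(cluster)
        if len(side) == 2:
            # This edge separates two of the four taxa from the other two.
            return frozenset([side, q - side])
    return None


def induced_quartets(clusters, taxa):
    """Map every four-taxon subset to its induced topology."""
    return {
        frozenset(q): quartet_topology(clusters, q)
        for q in combinations(sorted(taxa), 4)
    }


# Hypothetical tree (((A,B),C),(D,E)) written as its internal clusters.
example_clusters = [{"A", "B"}, {"A", "B", "C"}, {"D", "E"}]
example_taxa = {"A", "B", "C", "D", "E"}
```

For the example tree, the quartet {A, C, D, E} is resolved as AC|DE, because the cluster {A, B, C} places A and C on one side and D and E on the other.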

