Genomics of bacteria and archaea: the emerging dynamic view of the prokaryotic world
- 2008
Orthologs, paralogs, and evolutionary genomics
- Annual Review of Genetics, 2005
- Cited by 112 (12 self)
Orthologs and paralogs are two fundamentally different types of homologous genes that evolved, respectively, by vertical descent from a single ancestral gene and by duplication. Orthology and paralogy are key concepts of evolutionary genomics. A clear distinction between orthologs and paralogs is critical for the construction of a robust evolutionary classification of genes and reliable functional annotation of newly sequenced genomes. Genome comparisons show that orthologous relationships with genes from taxonomically distant species can be established for the majority of the genes from each sequenced genome. This review examines in depth the definitions and subtypes of orthologs and paralogs, outlines the principal methodological approaches employed for identification of orthology and paralogy, and considers evolutionary and functional implications of these concepts.
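The distinction drawn in this abstract can be illustrated with a small sketch. It assumes a gene tree whose internal nodes have already been labelled as speciation or duplication events; the `Node` class, the `classify` function, and the gene names are hypothetical and are not taken from the review. Two genes are called orthologs when their last common ancestor is a speciation node and paralogs when it is a duplication node.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Gene-tree node; leaves carry a gene name, internal nodes an event label."""
    name: str = ""
    event: Optional[str] = None  # "speciation" or "duplication" (assumed pre-labelled)
    children: List["Node"] = field(default_factory=list)


def leaves(node):
    """Return the set of gene names below (or at) this node."""
    if not node.children:
        return {node.name}
    names = set()
    for child in node.children:
        names |= leaves(child)
    return names


def lca(node, a, b):
    """Return the deepest node whose subtree contains both genes a and b."""
    for child in node.children:
        below = leaves(child)
        if a in below and b in below:
            return lca(child, a, b)
    return node


def classify(root, a, b):
    """Classify two distinct genes by the event at their last common ancestor."""
    names = leaves(root)
    if a == b or a not in names or b not in names:
        raise ValueError("need two distinct genes that are both in the tree")
    return "orthologs" if lca(root, a, b).event == "speciation" else "paralogs"


# Hypothetical family: one ancient duplication (A/B), then a human/mouse split.
root = Node(event="duplication", children=[
    Node(event="speciation", children=[Node("human_A"), Node("mouse_A")]),
    Node(event="speciation", children=[Node("human_B"), Node("mouse_B")]),
])

print(classify(root, "human_A", "mouse_A"))  # separated by speciation
print(classify(root, "human_A", "human_B"))  # separated by duplication
```

In practice the event labels would come from reconciling the gene tree with a species tree, which this sketch does not attempt.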
OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups
- Nucleic Acids Res, 2006
Probabilistic model of the human protein-protein interaction network
- Nat Biotechnol 23(8), 2005
- Cited by 55 (0 self)
A catalog of all human protein-protein interactions would provide scientists with a framework to study protein deregulation in complex diseases such as cancer. Here we demonstrate that a probabilistic analysis integrating model organism interactome data, protein domain data, genome-wide gene expression data and functional annotation data predicts nearly 40,000 protein-protein interactions in humans, a result comparable to those obtained with experimental and computational approaches in model organisms. We validated the accuracy of the predictive model on an independent test set of known interactions and also experimentally confirmed two predicted interactions relevant to human cancer, implicating uncharacterized proteins into definitive pathways. We also applied the human interactome network to cancer genomics data and identified several interaction subnetworks activated in cancer. This integrative analysis provides a comprehensive framework for exploring the human protein interaction network.
We began by assembling a collection of genomic and proteomic data potentially useful in predicting human protein-protein interactions that included model organism protein-protein interactions [1], protein domain assignments [2], gene expression measurements in human tissue samples [3] and biological function annotations [4].
A gold standard positive set (GSP) of 11,678 distinct protein-protein interactions among 5,505 proteins was queried from the Human Protein Reference Database (HPRD) [12], a resource that contains known protein-protein interactions manually curated from the literature by expert biologists. A gold standard negative set (GSN) of 3,106,928 protein pairs was defined, in which one protein was assigned to the plasma membrane cellular component and the other to the nuclear cellular component by the Gene Ontology Consortium [4]. Although it is known that membrane proteins can occasionally interact with nuclear proteins, we demonstrated that there are far fewer known interactions within GSN than would be expected by chance.
Model organism protein-protein interactions
From the Database of Interacting Proteins (DIP) [1], we queried high-throughput interactome data from three model organisms: Saccharomyces cerevisiae
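The construction of the gold standard negative set described above can be sketched as follows. This is a minimal illustration, assuming cellular-component annotations are available as a mapping from protein to a set of component names; the function name `build_gsn`, the component strings, and the toy proteins are hypothetical and do not come from the paper.

```python
from itertools import product


def build_gsn(component_of):
    """Pair every plasma-membrane protein with every nuclear protein.

    component_of maps a protein ID to the set of cellular components it is
    annotated with (an assumed input format). Pairs are unordered, so each
    is stored as a frozenset.
    """
    membrane = sorted(p for p, comps in component_of.items() if "plasma membrane" in comps)
    nuclear = sorted(p for p, comps in component_of.items() if "nucleus" in comps)
    return {frozenset((m, n)) for m, n in product(membrane, nuclear) if m != n}


# Toy annotations (hypothetical proteins).
component_of = {
    "P1": {"plasma membrane"},
    "P2": {"nucleus"},
    "P3": {"nucleus"},
    "P4": {"cytoplasm"},
}
gsn = build_gsn(component_of)

# Known interactions (a toy stand-in for the GSP) that fall inside the GSN.
gsp = {frozenset(("P1", "P2")), frozenset(("P2", "P3"))}
overlap = gsn & gsp
print(len(gsn), len(overlap))
```

Counting the overlap between the two sets is the kind of check the text alludes to when it compares known interactions within the GSN against chance expectation; the statistical comparison itself is not shown here.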
Pairwise alignment of protein interaction networks
- Journal of Computational Biology, 2006
- Cited by 51 (4 self)
With an ever-increasing amount of available data on protein–protein interaction (PPI) networks and research revealing that these networks evolve at a modular level, discovery of conserved patterns in these networks becomes an important problem. Although available data on protein–protein interactions is currently limited, recently developed algorithms have been shown to convey novel biological insights through employment of elegant mathematical models. The main challenge in aligning PPI networks is to define a graph theoretical measure of similarity between graph structures that captures underlying biological phenomena accurately. In this respect, modeling of conservation and divergence of interactions, as well as the interpretation of resulting alignments, are important design parameters. In this paper, we develop a framework for comprehensive alignment of PPI networks, which is inspired by duplication/divergence models that focus on understanding the evolution of protein interactions. We propose a mathematical model that extends the concepts of match, mismatch, and gap in sequence alignment to that of match, mismatch, and duplication in network alignment and evaluates similarity between graph structures through a scoring function that accounts for evolutionary events. By relying on evolutionary models, the proposed framework facilitates interpretation of resulting alignments in terms of not only conservation but also divergence of modularity in PPI networks. Furthermore, as in the case of sequence alignment, our model allows flexibility in adjusting parameters to quantify underlying evolutionary relationships. Based on the proposed model, we formulate PPI network alignment as an optimization problem and present fast algorithms to solve this problem. 
Detailed experimental results from an implementation of the proposed framework show that our algorithm is able to discover conserved interaction patterns very effectively, in terms of both accuracy and computational cost. Key words: protein–protein interactions, network alignment, evolutionary models.
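The idea of scoring an alignment by match, mismatch, and duplication events can be sketched roughly as below. This is only an illustration under stated assumptions: the weights, the input format, and the function `alignment_score` are hypothetical, and the paper's actual scoring function is not given in the abstract. Here a match is an interaction conserved between two aligned protein pairs, a mismatch is an interaction present in only one network, and each aligned within-network duplicate pair incurs a fixed penalty.

```python
from itertools import combinations


def alignment_score(edges_g, edges_h, aligned_pairs, duplicate_pairs,
                    match=1.0, mismatch=0.5, duplication=0.25):
    """Score an alignment between networks G and H (illustrative weights).

    edges_g, edges_h : sets of frozenset({u, v}) interactions
    aligned_pairs    : set of (u, v) with u in G aligned to v in H
    duplicate_pairs  : set of frozenset({u, u2}) within one network that are
                       assumed to be duplicates of each other
    """
    score = 0.0
    for (u, v), (u2, v2) in combinations(sorted(aligned_pairs), 2):
        if u == u2 or v == v2:
            continue  # need two distinct proteins in each network
        in_g = frozenset((u, u2)) in edges_g
        in_h = frozenset((v, v2)) in edges_h
        if in_g and in_h:
            score += match      # interaction conserved in both networks
        elif in_g != in_h:
            score -= mismatch   # interaction present in only one network
    score -= duplication * len(duplicate_pairs)
    return score


# Toy networks (hypothetical proteins).
edges_g = {frozenset(("a", "b")), frozenset(("b", "c"))}
edges_h = {frozenset(("x", "y"))}
aligned = {("a", "x"), ("b", "y"), ("c", "z")}
duplicates = {frozenset(("b", "c"))}

print(alignment_score(edges_g, edges_h, aligned, duplicates))
```

Keeping the three weights as parameters mirrors the abstract's point that the model allows flexibility in adjusting parameters, much as gap and mismatch penalties are tuned in sequence alignment.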
Graph-based analysis and visualization of experimental results with ONDEX
- Bioinformatics, 2006
- doi:10.1093/bioinformatics/btl081
Assignment of orthologous genes via genome rearrangement
- IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2005
- Cited by 47 (4 self)
The assignment of orthologous genes between a pair of genomes is a fundamental and challenging problem in comparative genomics. Existing methods that assign orthologs based on the similarity between DNA or protein sequences may make erroneous assignments when sequence similarity does not clearly delineate the evolutionary relationship among genes of the same families. In this paper, we present a new approach to ortholog assignment that takes into account both sequence similarity and evolutionary events at a genome level, where orthologous genes are assumed to correspond to each other in the most parsimonious evolving scenario under genome rearrangement. First, the problem is formulated as that of computing the signed reversal distance with duplicates between the two genomes of interest. Then, the problem is decomposed into two new optimization problems, called minimum common partition and maximum cycle decomposition, for which efficient heuristic algorithms are given. Following this approach, we have implemented a high-throughput system for assigning orthologs on a genome scale, called SOAR, and tested it on both simulated data and real genome sequence data. Compared to a recent ortholog assignment method based entirely on homology search (called INPARANOID), SOAR shows a marginally better performance in terms of sensitivity on the real data set because it is able to identify several correct orthologous pairs that are missed by INPARANOID. The simulation results demonstrate that SOAR, in general, performs better than the iterated exemplar algorithm in terms of computing the reversal distance and assigning correct orthologs. Index Terms: Ortholog, paralog, gene duplication, genome rearrangement, reversal, comparative genomics.
Pairwise Local Alignment of Protein Interaction Networks Guided by Models of Evolution
- In RECOMB, 2005
- Cited by 45 (5 self)
With an ever-increasing amount of available data on protein-protein interaction (PPI) networks and research revealing that these networks evolve at a modular level, discovery of conserved patterns in these networks becomes an important problem. Recent algorithms on aligning PPI networks target simplified structures such as conserved pathways to render these problems computationally tractable. However, since conserved structures that are parts of functional modules and protein complexes generally correspond to dense subnets of the network, algorithms that are able to extract conserved patterns in terms of general graphs are necessary. With this motivation, we focus here on discovering protein sets that induce subnets that are highly conserved in the interactome of a pair of species. For this purpose, we develop a framework that formally defines the pairwise local alignment problem for PPI networks, models the problem as a graph optimization problem, and presents fast algorithms for this problem. In order to capture the underlying biological processes correctly, we base our framework on duplication/divergence models that focus on understanding the evolution of PPI networks. Experimental results from an implementation of the proposed framework show that our algorithm is able to discover conserved interaction patterns very effectively (in terms of accuracy and computational cost). While we focus on pairwise local alignment of PPI networks in this paper, the proposed algorithm can be easily adapted to finding matches for a subnet query in a database of PPI networks.
Rational association of genes with traits using a genome-scale gene network for Arabidopsis thaliana.
- Nat. Biotechnol., 2010
- Cited by 45 (10 self)
We introduce a rational approach for associating genes with plant traits by combined use of a genome-scale functional network and targeted reverse genetic screening. We present a probabilistic network (AraNet) of functional associations among 19,647 (73%) genes of the reference flowering plant Arabidopsis thaliana. AraNet associations are predictive for diverse biological pathways, and outperform predictions derived only from literature-based protein interactions, achieving 21% precision for 55% of genes. AraNet prioritizes genes for limited-scale functional screening, resulting in a hit-rate tenfold greater than screens of random insertional mutants, when applied to early seedling development as a test case. By interrogating network neighborhoods, we identify AT1G80710 (now DROUGHT SENSITIVE 1; DRS1) and AT3G05090 (now LATERAL ROOT STIMULATOR 1; LRS1) as regulators of drought sensitivity and lateral root development, respectively. AraNet (http://www.functionalnet.org/aranet/) provides a resource for plant gene function identification and genetic dissection of plant traits.
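Prioritizing genes by interrogating network neighborhoods can be sketched as a simple guilt-by-association ranking. The scoring rule below (sum of edge weights linking a candidate to genes already known for the trait) is an assumption made for illustration, not a description of how AraNet computes its rankings; the function `rank_candidates` and the gene labels are hypothetical.

```python
from collections import defaultdict


def rank_candidates(edges, seeds):
    """Rank non-seed genes by total edge weight to the seed genes.

    edges : dict mapping (gene_u, gene_v) to a functional-association weight
    seeds : genes already associated with the trait of interest
    """
    seeds = set(seeds)
    scores = defaultdict(float)
    for (u, v), weight in edges.items():
        if u in seeds and v not in seeds:
            scores[v] += weight
        elif v in seeds and u not in seeds:
            scores[u] += weight
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy weighted network (hypothetical genes and weights).
edges = {
    ("g1", "g2"): 2.0,
    ("g1", "g3"): 1.0,
    ("g4", "g3"): 1.5,
    ("g2", "g5"): 0.5,
}
ranking = rank_candidates(edges, seeds={"g1", "g4"})
print(ranking)
```

The top of such a ranking is what would be passed on to a limited-scale screen; genes with no direct link to any seed receive no score and are left out.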