Conference Review From databases to modelling of functional pathways (2004)
BibTeX
@MISC{Nasi04conferencereview,
author = {Sergio Nasi},
title = {Conference Review From databases to modelling of functional pathways},
year = {2004}
}
Abstract
This short review comments on current informatics resources and methodologies in the study of functional pathways in cell biology. It highlights recent achievements in unveiling the structural design of protein and gene networks and discusses current approaches to model and simulate the dynamics of regulatory pathways in the cell.

Understanding how genes interact to perform specific biological processes is a major challenge in biology. This is becoming possible thanks to the large amount of information generated by genomic sequencing, protein interaction and gene expression studies, and stored in public databases (www.ncbi.nlm.nih.gov/GenBank; www.ncbi.nlm.nih.gov/LocusLink; www.ncbi.nlm.nih.gov/UniGene; http://us.expasy.org/sprot; www.ensembl.org; www.ebi.ac.uk; www.yeastgenome.org/; www.arabidopsis.org/; www.wormbase.org/; http://flybase.bio.indiana.edu/; www.informatics.jax.org; http://rgd.mcw.edu/; http://genome-www5.stanford.edu/MicroArray/SMD/). Achieving this objective will require data to be organized in a more understandable structure. Representing data as networks or functional pathways, and modelling their dynamic behaviour, is expected to give better insight into the complex patterns of gene-protein interactions. At the same time, such models are expected to revolutionize drug screening, and the identification of functional pathways involved in pathogenesis will facilitate the rational design of therapies.

Databases

The effort of creating biological pathway databases and providing informatics tools for their analysis has been undertaken by public and private initiatives, such as Transpath (www.biobase.de), Biocarta (www.biocarta.com), GenMAPP (www.genmapp.org), aMaze (www.amaze.ulb.ac.be) and the Alliance for Cellular Signaling (AfCS: www.afcs.org).
The AfCS consortium, which is presently focused on lymphocyte and cardiac myocyte signalling, has the overall goal of understanding the relationships between sets of inputs and outputs that vary both temporally and spatially. This will involve identifying all the proteins that comprise the various signalling systems, assessing information flow in both normal and pathological states, and reducing the data to a set of theoretical models. The aMaze project is a comprehensive, object-orientated data model implemented in both MySQL and Oracle. It aims to represent functional and physical interactions among biochemical entities, mapped onto their cellular and tissue locations. It also attempts to provide a workbench for analysing networks of cellular processes, such as metabolic pathways, protein-protein interactions, gene regulation, transport and signal transduction. Most of the pathway data presently stored in the database relate to yeast and bacterial cells. A complication in pathway analysis results from network component compartmentalization in space and time, both at the cellular level

Gene and protein network architecture

Gene or protein networks are more easily understood when represented as graphs, in which nodes are genes or proteins, and arcs (edges) are relationships between nodes. Depending on the case, edges can have direction and weight. Data from high-throughput protein interaction screens and DNA microarray experiments, as well as tools for mining information in the scientific literature, have supported the elucidation of the structural design of networks, an important step towards modelling and understanding cellular control systems. By employing controlled vocabularies (www.geneontology.org) linked to gene symbols, it is possible to mine qualitative information: automatic query methods have been used to extract and structure knowledge from publicly available gene/protein and text databases.
This allows the creation of a cocitation network.

Due to their importance in cell physiology, considerable efforts are being devoted to large-scale mapping of protein interaction networks by yeast two-hybrid screens jsp).

Microarray data analysis presents the challenge of revealing functional patterns in the apparent chaos of gene expression data. The starting point is a gene expression data matrix, which clustering algorithms use to identify co-expressed genes; these are thought to be regulated by shared transcription factors (http://genexpress.stanford.edu). Although powerful for organizing data, such algorithms are, by themselves, unfit for model building, since they do not relate gene expression values to a given functional state. Graph theory, supervised learning and other statistical and computational approaches have been adopted to make predictions and to reconstruct gene regulation networks from microarray data. Although it might be possible in principle, network reconstruction based solely on microarray experiments has proved very hard to achieve, pointing to the utility of incorporating information on transcription factor binding to gene promoters ([31] www.math.uah.edu/stat). Interactions between transcription factors and their DNA binding sites may be deduced from computational analysis of binding sites in promoter sequences. Methods have been devised to extract regulatory information from binding data and to find synergistic motif combinations in the promoters of co-regulated genes ([19,22,23] http://web.wi.mit.edu/young/regulator network). More advanced methods, such as the genetic regulatory modules (GRAM

Both protein and gene interaction networks appear to be scale-free, the connectivity of their nodes following a power law; therefore, they have small-world properties like many other networks found in nature.
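The scale-free claim above concerns the degree distribution P(k), the fraction of nodes with exactly k interaction partners, which in a scale-free network falls off roughly as k^-gamma. The following is a minimal Python sketch of how that distribution could be tabulated from a list of interactions. It is not taken from the review: the function names, the toy edge list and the protein labels P1 to P6 are all hypothetical, and the log-log least-squares fit is only a crude illustrative estimate of gamma.

```python
import math
from collections import Counter


def node_degrees(edges):
    """Count the interaction partners of each node (edges are undirected)."""
    degrees = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return dict(degrees)


def degree_distribution(degrees):
    """Return P(k): the fraction of nodes that have exactly k partners."""
    counts = Counter(degrees.values())
    n = len(degrees)
    return {k: c / n for k, c in sorted(counts.items())}


def power_law_exponent(distribution):
    """Estimate gamma in P(k) ~ k**-gamma from the least-squares slope of
    log P(k) against log k. Needs at least two distinct degrees."""
    xs = [math.log(k) for k in distribution]
    ys = [math.log(p) for p in distribution.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Hypothetical toy network: one hub (P1) and several sparsely connected proteins.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
         ("P1", "P5"), ("P2", "P3"), ("P5", "P6")]
degrees = node_degrees(edges)
distribution = degree_distribution(degrees)
```

A network this small cannot support any statement about scale-free behaviour; the sketch only shows the bookkeeping. On a real interaction map, a heavy-tailed `distribution` with a few highly connected hubs is the pattern the power-law description refers to.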
Such global views, although fascinating, do not always appear of immediate utility for biologists, since they give only a general impression of the network operation and lack crucial details

Modelling of cellular pathways

Depicting sets of molecular interactions as static graphs does not reveal the dynamics of events within cells. The wealth of data now available has stimulated attempts to design a computer replica of a living cell by including everything that is known in one description of an entire cell biological network. Several projects aim to develop theoretical supports, technologies and software platforms for whole cell simulation.