Accomplishments and Challenges in Literature Data Mining for Biology
, 2002
Cited by 119 (8 self)
We review recent results in literature data mining for biology and discuss the need and the steps for a challenge evaluation for this field. Literature data mining has progressed from simple recognition of terms to extraction of interaction relationships from complex sentences, and has broadened from recognition of protein interactions to a range of problems such as improving homology search, identifying cellular location, and so on. To encourage participation and accelerate progress in this expanding field, we propose creating challenge evaluations, and we describe two specific applications in this context.
MIPS: analysis and annotation of proteins from whole genomes
- Nucleic Acids Res
, 2004
Cited by 100 (8 self)
resources related to genome information. Manually curated databases for several reference organisms are maintained. Several of these databases are described elsewhere in this and other recent NAR database issues. In a complementary effort, a comprehensive set of >400 genomes automatically annotated with the PEDANT system are maintained. The main goal of our current work on creating and maintaining genome databases is to extend gene centered information to information on interactions within a generic comprehensive framework. We have concentrated our efforts along three lines: (i) the development of suitable comprehensive data structures and database technology, communication and query tools to include a wide range of different types of information enabling the representation of complex information such as functional modules or networks (Genome Research Environment System), (ii) the development of databases covering computable information such as the basic evolutionary relations among all genes, namely SIMAP, the sequence similarity matrix and the CABiNet network analysis framework and (iii) the compilation and manual annotation of information related to interactions such as protein–protein interactions or other types of relations (e.g. MPCDB, MPPI, CYGD). All databases described and the detailed descriptions of our projects can be accessed through the MIPS WWW server
Mining the Biomedical Literature in the Genomic Era: An Overview
- JOURNAL OF COMPUTATIONAL BIOLOGY
, 2003
Cited by 75 (2 self)
The past decade has seen a tremendous growth in the amount of experimental and computational biomedical data, specifically in the areas of Genomics and Proteomics. This growth is accompanied by an accelerated increase in the number of biomedical publications discussing the findings. In the last few years there has been a lot of interest within the scientific community in literature-mining tools to help sort through this abundance of literature and find the nuggets of information most relevant and useful for specific analysis tasks. This paper
Taverna: a tool for building and running workflows of services
- Nucleic Acids Res
, 2006
Cited by 71 (4 self)
Taverna is an application that eases the use and integration of the growing number of molecular biology tools and databases available on the web, especially web services. It allows bioinformaticians to construct workflows or pipelines of services to perform a range of different analyses, such as sequence analysis and genome annotation. These high-level workflows can integrate many different resources into a single analysis. Taverna is available freely under the terms of the GNU Lesser General Public License (LGPL) from
The European Bioinformatics Institute’s data resources: towards systems biology
- Nucleic Acids Res
, 2005
Cited by 39 (5 self)
Genomic and post-genomic biological research has provided fine-grain insights into the molecular processes of life, but also threatens to drown biomedical researchers in data. Moreover, as new high-throughput technologies are developed, the types of data that are gathered en masse are diversifying. The need to collect, store and curate all this information in ways that allow its efficient retrieval and exploitation is greater than ever. The European Bioinformatics Institute’s (EBI’s) databases and tools have evolved to meet the changing needs of molecular biologists: since we last wrote about our services in the 2003 issue of Nucleic Acids Research, we have launched new databases covering protein–protein interactions (IntAct), pathways (Reactome) and small molecules (ChEBI). Our existing core databases have continued to evolve to meet the changing needs of biomedical researchers, and we have developed new data-access tools that help biologists to move intuitively through the different data types, thereby helping them to put the parts together to understand biology at the systems level. The EBI’s data resources are all available on our website at
Combining microarrays and biological knowledge for estimating gene networks via Bayesian networks
- In Proceedings of the IEEE Computer Society Bioinformatics Conference (CSB 03)
, 2003
Cited by 38 (4 self)
We propose a statistical method for estimating a gene network based on Bayesian networks from microarray gene expression data together with biological knowledge including protein-protein interactions, protein-DNA interactions, binding site information, existing literature and so on. Unfortunately, microarray data do not contain enough information for constructing gene networks accurately in many cases. Our method adds biological knowledge to the estimation method of gene networks under a Bayesian statistical framework, and also controls the trade-off between microarray information and biological knowledge automatically. We conduct Monte Carlo simulations to show the effectiveness of the proposed method. We analyze Saccharomyces cerevisiae gene expression data as an application.
ELM server: a new resource for investigating short functional sites in modular eukaryotic proteins
- Nucleic Acids Res
, 2003
Cited by 37 (6 self)
Multidomain proteins predominate in eukaryotic proteomes. Individual functions assigned to different sequence segments combine to create a complex function for the whole protein. While on-line resources are available for revealing globular domains in sequences, there has hitherto been no comprehensive collection of small functional sites/motifs comparable to the globular domain resources, yet these are as important for the function of multidomain proteins. Short linear peptide motifs are used for cell compartment targeting, protein–protein interaction, regulation by phosphorylation, acetylation, glycosylation and a host of other post-translational modifications. ELM, the Eukaryotic Linear Motif server at
Predicting protein complex membership using probabilistic network reliability
- Genome Res
, 2004
STRING: known and predicted protein-protein associations, integrated and transferred across organisms
- Database Issue
, 2005
Cited by 29 (5 self)
associations, integrated and transferred across organisms
An overview of data models for the analysis of biochemical pathways
- Briefings in Bioinformatics
, 2003
Cited by 23 (5 self)
Various forms of data models can be used for the analysis of biochemical pathways such as metabolic, regulatory, or signal transduction pathways. This paper overviews and classifies the different forms of data models found in the literature, and describes how these models have been used in the analysis of biochemical pathways. The quantity of available information on biochemical pathways for different organisms is increasing very rapidly, and it has now become possible to perform detailed analyses of metabolic pathway structures for entire organisms. However, such analyses face difficulties due to the nature of the databases, which are often heterogeneous, incomplete, or inconsistent. This makes pathway analysis a challenging problem in systems biology and in bioinformatics. In this overview, we concentrate on models of network structure, focusing on the analysis of existing information, collected from experiments and stored in databases. We overview and classify the different forms of data models found in the literature using a unified framework. We describe how these models have been used in the analysis of biochemical pathways. This enables us to underline the strengths and weaknesses of the different approaches, and at the same time highlights some relevant future research directions.

