by Ian Donaldson, Joel Martin, Berry De Bruijn, Cheryl Wolting, Brigitte Tuekam, Shudong Zhang, Berivan Baskin, Gary D Bader, Vicki Lay, Katerina Michalickova, Tony Pawson, Christopher WV Hogue
BMC Bioinformatics
Abstract:
Background: Most experimentally verified molecular interaction and biological pathway data reside in the unstructured text of biomedical journal articles, where they are inaccessible to computational methods. The Biomolecular Interaction Network Database (BIND) seeks to capture these data in a machine-readable format. We hypothesized that the formidable task of backfilling the database could be reduced by first using a Support Vector Machine (SVM) to locate interaction information in the literature. We present an information extraction system designed to locate protein-protein interaction data in the literature and to present these data to curators and the public for review and entry into BIND.
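To make the idea of using a Support Vector Machine to locate interaction text concrete, here is a minimal sketch of a linear SVM trained on bag-of-words features. It is an illustration only, not the system described in the abstract: the abstract does not specify the features, kernel, or training procedure, so the toy sentences, the function names (`bag_of_words`, `score`, `train_linear_svm`, `is_interaction`), the omission of a bias term, and the parameter values are all assumptions.

```python
import re

# Hypothetical toy training sentences, invented for illustration only.
# Label +1 = sentence mentions an interaction, -1 = it does not.
TRAINING_DATA = [
    ("protein a binds protein b", 1),
    ("kinase interacts with receptor", 1),
    ("the two proteins form a complex", 1),
    ("the gene is expressed in liver", -1),
    ("sequence of the gene was determined", -1),
    ("the protein is located in the nucleus", -1),
]


def bag_of_words(text):
    """Map a sentence to {token: count} over lowercase alphabetic tokens."""
    counts = {}
    for token in re.findall(r"[a-z]+", text.lower()):
        counts[token] = counts.get(token, 0) + 1
    return counts


def score(text, weights):
    """Linear decision value; positive suggests an interaction sentence."""
    feats = bag_of_words(text)
    return sum(weights.get(tok, 0.0) * n for tok, n in feats.items())


def train_linear_svm(data, epochs=50, learning_rate=0.1, reg=0.01):
    """Fit weights by subgradient descent on the L2-regularised hinge loss.

    Assumption: no bias term and a fixed learning rate, to keep the sketch short.
    """
    weights = {}
    for _ in range(epochs):
        for text, label in data:
            margin = label * score(text, weights)
            # Regularisation step: shrink every existing weight slightly.
            for tok in weights:
                weights[tok] *= 1.0 - learning_rate * reg
            # Hinge-loss step: only examples inside the margin move the weights.
            if margin < 1.0:
                for tok, n in bag_of_words(text).items():
                    weights[tok] = weights.get(tok, 0.0) + learning_rate * label * n
    return weights


def is_interaction(text, weights):
    """Classify a sentence as interaction-related when its score is positive."""
    return score(text, weights) > 0.0


if __name__ == "__main__":
    model = train_linear_svm(TRAINING_DATA)
    for sentence in ("receptor binds kinase", "gene expressed in liver"):
        print(sentence, "->", is_interaction(sentence, model))
```

In a curation setting like the one the abstract describes, a classifier of this kind would only rank or filter candidate text; the flagged passages would still go to curators for review before entry into a database.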