Comparative Experiments on Learning Information Extractors for Proteins and their Interactions
2004
Cited by 55 (7 self)
Automatically extracting information from biomedical text holds the promise of easily consolidating large amounts of biological knowledge in computer-accessible form. This strategy is particularly attractive for extracting data relevant to genes of the human genome from the 11 million abstracts in Medline. However, extraction efforts have been frustrated by the lack of conventions for describing human genes and proteins. We have developed and evaluated a variety of learned information extraction systems for identifying human protein names in Medline abstracts and subsequently extracting information on interactions between the proteins. We demonstrate that machine learning approaches using support vector machines and maximum entropy are able to identify human proteins with higher accuracy than several previous approaches. We also demonstrate that various rule induction methods are able to identify protein interactions with higher precision than manually-developed rules.
Learning to Extract Proteins and their Interactions from Medline Abstracts
In: ICML-2003 Workshop on Machine Learning in Bioinformatics, 2003
Cited by 10 (0 self)
We present results from a variety of learned information extraction systems for identifying human protein names in Medline abstracts and subsequently extracting interactions between the proteins. We demonstrate that machine learning approaches using support vector machines and hidden Markov models are able to identify human proteins with higher accuracy than several previous approaches. We also demonstrate that various rule induction methods are able to identify protein interactions with higher precision than manually-developed rules.
Annotation Guidelines for Machine Learning-Based Named Entity Recognition in Microbiology
Cited by 1 (1 self)
Recent challenges on applying machine learning to named-entity recognition in biology have triggered discussion of the manual annotation guidelines used to annotate the learning corpora. Some sources of potential inconsistency have been identified by corpus annotators and challenge participants. We go one step further by proposing specific annotation guidelines for biology and evaluating their effect on the performance of machine learning methods. We show that a significant improvement can be achieved in this way, and that it is due neither to the feature set nor to the ML methods. Keywords: Named-entity recognition, annotation guidelines, machine learning, biology.

