Incorporating non-local information into information extraction systems by Gibbs sampling
In ACL, 2005
Cited by 192 (15 self)
Most current statistical natural language processing models use only local features so as to permit dynamic programming in inference, but this makes them unable to fully account for the long distance structure that is prevalent in language use. We show how to solve this dilemma with Gibbs sampling, a simple Monte Carlo method used to perform approximate inference in factored probabilistic models. By using simulated annealing in place of Viterbi decoding in sequence models such as HMMs, CMMs, and CRFs, it is possible to incorporate non-local structure while preserving tractable inference. We use this technique to augment an existing CRF-based information extraction system with long-distance dependency models, enforcing label consistency and extraction template consistency constraints. This technique results in an error reduction of up to 9% over state-of-the-art systems on two established information extraction tasks.
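The idea of replacing Viterbi decoding with annealed Gibbs sampling can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring function, the toy sentence, the label set, the consistency penalty, and the geometric cooling schedule are all assumptions chosen for illustration. The sketch resamples one label at a time from a temperature-scaled conditional, so the score is free to contain non-local terms (here, a penalty when two occurrences of the same token receive different labels).

```python
import math
import random


def annealed_gibbs_decode(n_positions, labels, score, n_sweeps=50,
                          t_start=2.0, t_end=0.1, seed=0):
    """Approximate the best label sequence by Gibbs sampling with annealing.

    `score(seq)` is a hypothetical callable returning an unnormalized
    log-score for a full label sequence; it may include non-local terms.
    """
    rng = random.Random(seed)
    seq = [rng.choice(labels) for _ in range(n_positions)]
    best, best_score = list(seq), score(seq)
    for sweep in range(n_sweeps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        frac = sweep / max(1, n_sweeps - 1)
        temp = t_start * (t_end / t_start) ** frac
        for i in range(n_positions):
            # Score each candidate label at position i, all others held fixed.
            cand_scores = []
            for lab in labels:
                seq[i] = lab
                cand_scores.append(score(seq))
            # Sample from the tempered conditional: softmax of score / temp.
            top = max(cand_scores)
            weights = [math.exp((s - top) / temp) for s in cand_scores]
            seq[i] = rng.choices(labels, weights=weights, k=1)[0]
        current = score(seq)
        if current > best_score:
            best, best_score = list(seq), current
    return best, best_score


def make_toy_score(tokens, local, penalty=2.0):
    """Local per-position scores plus a non-local label-consistency penalty."""
    def score(seq):
        total = sum(local[i][lab] for i, lab in enumerate(seq))
        for i in range(len(tokens)):
            for j in range(i + 1, len(tokens)):
                if tokens[i] == tokens[j] and seq[i] != seq[j]:
                    total -= penalty
        return total
    return score


# Hypothetical toy input: the second "Paris" is locally ambiguous.
tokens = ["Paris", "said", "Paris"]
local = [
    {"O": 0.0, "LOC": 3.0},
    {"O": 2.0, "LOC": 0.0},
    {"O": 0.5, "LOC": 0.0},
]
best, best_score = annealed_gibbs_decode(
    len(tokens), ["O", "LOC"], make_toy_score(tokens, local))
```

In this toy setup the local scores alone would label the second "Paris" as O; the consistency penalty is what makes the sequence that labels both occurrences LOC the highest-scoring one.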
Recognizing names in biomedical texts: A machine learning approach
Bioinformatics, 2004
Cited by 30 (0 self)
Motivation: With an overwhelming amount of textual information in molecular biology and biomedicine, there is a need for effective and efficient literature mining and knowledge discovery that can help biologists to gather and make use of the knowledge encoded in text documents. In order to make organized and structured information available, automatically recognizing biomedical entity names becomes critical and is important for information retrieval, information extraction and automated knowledge acquisition. Results: In this paper, we present a named entity recognition system in the biomedical domain, called PowerBioNE. In order to deal with the special phenomena of naming conventions in the biomedical domain, we
An effective two-stage model for exploiting non-local dependencies in named entity recognition
In ACL-COLING’06: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, 2006
Cited by 24 (0 self)
This paper shows that a simple two-stage approach to handle non-local dependencies in Named Entity Recognition (NER) can outperform existing approaches that handle non-local dependencies, while being much more computationally efficient. NER systems typically use sequence models for tractable inference, but this makes them unable to capture the long distance structure present in text. We use a Conditional Random Field (CRF) based NER system using local features to make predictions and then train another CRF which uses both local information and features extracted from the output of the first CRF. Using features capturing non-local dependencies from the same document, our approach yields a 12.6% relative error reduction on the F1 score over state-of-the-art NER systems using local information alone, compared to the 9.3% relative error reduction offered by the best systems that exploit non-local information. Our approach also makes it easy to incorporate non-local information from other documents in the test corpus, and this gives us a 13.3% error reduction over NER systems using local information alone. Additionally, our running time for inference is just the inference time of two sequential CRFs, which is much less than that of other more complicated approaches that directly model the dependencies and do approximate inference.
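For readers unfamiliar with the metric, relative error reduction on F1 is conventionally computed by treating 100 minus F1 as the error. The short sketch below shows that arithmetic. The function name and the example F1 values are hypothetical and are not taken from the paper.

```python
def relative_error_reduction(f1_baseline, f1_new):
    """Relative error reduction in percent, with error defined as 100 - F1.

    Both arguments are F1 scores on a 0-100 scale.
    """
    err_base = 100.0 - f1_baseline
    err_new = 100.0 - f1_new
    return 100.0 * (err_base - err_new) / err_base


# Hypothetical scores: raising F1 from 80.0 to 82.0 cuts the error from 20 to 18.
example = relative_error_reduction(80.0, 82.0)
```

Under this definition, a two-point F1 gain matters more when the baseline is already strong, because the remaining error is smaller.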
SVM Based Learning System For Information Extraction
In Proceedings of Sheffield Machine Learning Workshop, Lecture Notes in Computer Science, 2005
Cited by 16 (7 self)
This paper presents an SVM-based learning system for information extraction (IE). One distinctive feature of our system is the use of a variant of the SVM, the SVM with uneven margins, which is particularly helpful for small training datasets. In addition, our approach needs fewer SVM classifiers to be trained than other recent SVM-based systems. The paper also compares our approach to several state-of-the-art systems (including rule learning and statistical learning algorithms) on three IE benchmark datasets: CoNLL-2003, CMU seminars, and the software jobs corpus. The experimental results show that our system outperforms a recent SVM-based system on CoNLL-2003, achieves the highest score on eight out of 17 categories on the jobs corpus, and is second best on the remaining nine.
Protein-Protein Interaction Extraction: A Supervised Learning Approach
In Proc Symp on Semantic Mining in Biomedicine, 2005
Cited by 5 (0 self)
In this paper, we propose using Maximum Entropy to extract protein-protein interaction (PPI) information from the literature, which overcomes the limitations of the state-of-the-art co-occurrence-based and rule-based approaches. It incorporates corpus statistics of various lexical, syntactic and semantic features. We find that shallow lexical features account for a large portion of the performance improvement, in contrast to parsing or partial parsing information. Yet such lexical features have never been used before in other PPI extraction systems. As a result, this new approach achieves a very encouraging 93.9% recall and 88.0% precision on the provided IEPA corpus. To the best of our knowledge, not only is this the first systematic study of supervised learning and the first attempt at feature-based supervised learning for PPI extraction, but it also provides useful features, such as surrounding words, key words and abbreviations, to extend the supervised learning capability for relation extraction to other domains such as news.
Machine Learning for Information Extraction in Genomics - State of the Art and Perspectives
Nédellec C., Ould Abdel Vetah M. and Bessières P. In Text Mining and its Applications: Results of the NEMIS Launch Conference, Series: Studies in Fuzziness and Soft Computing, Sirmakessis, Spiros (Ed.), Springer Verlag, 2004
Cited by 4 (1 self)
The considerable development of multimedia communication goes along with an exponentially increasing volume of
Context and Domain Knowledge Enhanced Entity Spotting In Informal Text
Cited by 3 (0 self)
This paper explores the application of restricted relationship graphs (RDF) and statistical NLP techniques to improve named entity annotation in challenging Informal English domains. We validate our approach using on-line forums discussing popular music. Named entity annotation is particularly difficult in this domain because it is characterized by a large number of ambiguous entities, such as the Madonna album “Music” or Lily Allen’s pop hit “Smile”. We evaluate improvements in annotation accuracy that can be obtained by restricting the set of possible entities using real-world constraints. We find that constrained domain entity extraction raises the annotation accuracy significantly, making an infeasible task practical. We then show that we can further improve annotation accuracy by over 50% by applying SVM-based NLP systems trained on word usages in this domain.
A systematic cross-comparison of sequence classifiers
In SDM 2006
Cited by 2 (0 self)
In the CoNLL 2003 NER shared task, more than two-thirds of the submitted systems used a feature-rich representation of the task. Most of them used the maximum entropy principle to combine the features together. Others used large margin linear classifiers, such as SVM and RRM. In this paper, we compare several common classifiers under exactly the same conditions, demonstrating that the ranking of systems in the shared task is due to feature selection and other causes and not due to inherent qualities of the algorithms, which should be ranked otherwise. We demonstrate that whole-sequence models generally outperform local models, and that large margin classifiers generally outperform maximum entropy-based classifiers.
NIL is not Nothing: Recognition of Chinese Network Informal Language Expressions
4th SIGHAN Workshop at IJCNLP'05, 2005
Cited by 2 (2 self)
Informal language is actively used in network-mediated communication, e.g. chat rooms, BBS, email and text messages. We refer to the anomalous terms used in such contexts as network informal language (NIL) expressions. For example, “偶(ou3)” is used to replace “我(wo3)” in Chinese ICQ. Without unconventional resources, knowledge and techniques, the existing natural language processing approaches are less effective in dealing with NIL text. We propose to study NIL expressions with a NIL corpus and investigate techniques in processing NIL expressions. Two methods for Chinese NIL expression

