Language Independent Methodologies to Tackle Multilinguality
Abstract
Until now, Natural Language Processing (NLP) research and development has mainly been conducted for the English-speaking community. However, the European Union, with its 25 member states, already has 22 different official languages. As a consequence, multilinguality is certainly the most important challenge of this century for the European NLP community. In this paper, we show how the Centre for Human Language Technology and Bioinformatics has been dealing with the problem of multilinguality by proposing language-independent systems instead of language-tailored architectures.
Propositional Term Extraction over Short Text using Word Cohesiveness and Conditional Random Fields with Multi-Level Features
Abstract
Propositional terms in a research abstract (RA) generally convey the most important information, letting readers quickly glean the contribution of a research article. This paper treats propositional term extraction from RAs as a sequence labeling task using the IOB (Inside, Outside, Beginning) encoding scheme. In this study, conditional random fields (CRFs) are used to initially detect the propositional terms, and the combined association measure (CAM) is then applied to adjust the term boundaries. By combining multi-level features and inner lexical cohesion, this method can extract more than just NP-based propositional terms. Experimental results show that CRFs can significantly increase the recall rate of imperfect boundary term extraction and that the CAM can further improve the term boundaries.
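The IOB encoding mentioned in the abstract can be illustrated with a minimal sketch. It only shows how term spans map to per-token B/I/O labels and back; it does not implement the CRF or the CAM. The function names, the example sentence, and the chosen term spans are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# A span is (start, end) in token indices, with end exclusive (an assumption of this sketch).
Span = Tuple[int, int]


def spans_to_iob(n_tokens: int, spans: List[Span]) -> List[str]:
    """Label each token B (begins a term), I (inside a term), or O (outside)."""
    tags = ["O"] * n_tokens
    for start, end in spans:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


def iob_to_spans(tags: List[str]) -> List[Span]:
    """Recover term spans from a B/I/O tag sequence."""
    spans: List[Span] = []
    start = None  # index where the currently open term began, if any
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:
                spans.append((start, i))  # close the previous term
            start = i
        elif tag == "I":
            if start is None:
                start = i  # tolerate an I that is not preceded by a B
        else:  # "O"
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


# Illustrative sentence and term spans (hypothetical, not from the paper).
tokens = "conditional random fields are used to detect the propositional terms".split()
term_spans = [(0, 3), (8, 10)]

tags = spans_to_iob(len(tokens), term_spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

In a sequence labeling setup, a model such as a CRF would predict the tag column from token features, and a decoder like `iob_to_spans` would turn the predicted tags back into candidate terms whose boundaries could then be adjusted.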

