Word Sense Disambiguation Using a Second Language Monolingual Corpus (1994)
| Venue: | Computational Linguistics |
| Citations: | 129 - 1 self |
BibTeX
@ARTICLE{Dagan94wordsense,
author = {Ido Dagan and Alon Itai},
title = {Word Sense Disambiguation Using a Second Language Monolingual Corpus},
journal = {Computational Linguistics},
year = {1994},
volume = {20},
pages = {563--596}
}
Abstract
This paper presents a new approach for resolving lexical ambiguities in one language using statistical data from a monolingual corpus of another language. This approach exploits the differences between mappings of words to senses in different languages. The paper concentrates on the problem of target word selection in machine translation, for which the approach is directly applicable. The presented algorithm identifies syntactic relationships between words, using a source language parser, and maps the alternative interpretations of these relationships to the target language, using a bilingual lexicon. The preferred senses are then selected according to statistics on lexical relations in the target language. The selection is based on a statistical model and on a constraint propagation algorithm, which handles simultaneously all ambiguities in the sentence. The method was evaluated using three sets of Hebrew and German examples and was found to be very useful for disambiguation. The paper includes a detailed comparative analysis of statistical sense disambiguation methods.
1. Introduction
The resolution of lexical ambiguities in non-restricted text is one of the most difficult tasks of natural language processing. A related task in machine translation, on which we focus in this paper, is target word selection. This is the task of deciding which target language word is the most appropriate equivalent of a source language word in context. In addition to the alternatives introduced by the different word senses of the source language word, the target language may specify additional alternatives that differ mainly in their usage. Traditionally several linguistic levels were used to deal with this problem: syntactic, semantic and pragmatic. Computationally the syntactic methods...
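The selection step outlined in the abstract (map a source-language syntactic relation to its alternative target-language renderings through a bilingual lexicon, then prefer the rendering best supported by target-language statistics) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the lexicon entries, the counts, the names `LEXICON`, `TARGET_COUNTS` and `select_translation`, and the simple count-ratio test, which stands in for the paper's statistical model and constraint propagation algorithm and is not a reproduction of either.

```python
from itertools import product

# Hypothetical bilingual lexicon: source word -> candidate target words.
LEXICON = {
    "src_verb": ["sign", "seal", "finish"],
    "src_noun": ["treaty", "contract"],
}

# Made-up counts of (relation, head, dependent) tuples, standing in for
# statistics gathered from a parsed target-language monolingual corpus.
TARGET_COUNTS = {
    ("verb-obj", "sign", "treaty"): 40,
    ("verb-obj", "sign", "contract"): 12,
    ("verb-obj", "seal", "treaty"): 2,
    ("verb-obj", "finish", "contract"): 1,
}


def select_translation(relation, head, dependent, min_ratio=2.0):
    """Pick the target word pair with the strongest corpus support.

    Returns None when no alternative was observed, or when the best
    alternative does not beat the runner-up by at least min_ratio
    (a crude stand-in for a statistical confidence test).
    """
    # Enumerate every alternative target-language rendering of the relation.
    candidates = [
        ((h, d), TARGET_COUNTS.get((relation, h, d), 0))
        for h, d in product(LEXICON[head], LEXICON[dependent])
    ]
    # Rank alternatives by how often they occur in the target corpus.
    candidates.sort(key=lambda item: item[1], reverse=True)
    best, best_count = candidates[0]
    runner_up = candidates[1][1] if len(candidates) > 1 else 0
    if best_count == 0:
        return None  # no evidence for any alternative
    if runner_up and best_count / runner_up < min_ratio:
        return None  # evidence too weak to commit to a choice
    return best


print(select_translation("verb-obj", "src_verb", "src_noun"))
```

Declining to choose when the evidence is weak reflects the idea that a disambiguation decision should be made only when the statistics support it; a full treatment would also let a confident decision for one relation constrain the alternatives for other relations in the same sentence.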







