Improving statistical machine translation using word sense disambiguation (2007)
| Venue: | In The 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007) |
| Citations: | 49 - 5 self |
BibTeX
@INPROCEEDINGS{Carpuat07improvingstatistical,
author = {Marine Carpuat and Dekai Wu},
title = {Improving statistical machine translation using word sense disambiguation},
booktitle = {In The 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007)},
year = {2007},
pages = {61--72}
}
Abstract
We show for the first time that incorporating the predictions of a word sense disambiguation system within a typical phrase-based statistical machine translation (SMT) model consistently improves translation quality across all three different IWSLT Chinese-English test sets, as well as producing statistically significant improvements on the larger NIST Chinese-English MT task, and moreover never hurts performance on any test set, according not only to BLEU but to all eight most commonly used automatic evaluation metrics. Recent work has challenged the assumption that word sense disambiguation (WSD) systems are useful for SMT. Yet SMT translation quality still obviously suffers from inaccurate lexical choice. In this paper, we address this problem by investigating a new strategy for integrating WSD into an SMT system that performs fully phrasal multi-word disambiguation. Instead of directly incorporating a Senseval-style WSD system, we redefine the WSD task to match the exact same phrasal translation disambiguation task faced by phrase-based SMT systems. Our results provide the first known empirical evidence that lexical semantics are indeed useful for SMT, despite claims to the contrary.
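The general idea of letting a context-dependent WSD prediction influence phrasal lexical choice can be illustrated with a toy log-linear scorer. This is a minimal sketch under stated assumptions, not the authors' system: the feature names (`phrase_table`, `wsd_context`), the candidate phrases, the probabilities, and the weights are all hypothetical, and the abstract does not specify how the WSD predictions are actually combined with the SMT model.

```python
import math

def score_candidate(features, weights):
    """Log-linear score: weighted sum of log feature values."""
    return sum(weights[name] * math.log(value) for name, value in features.items())

def best_translation(candidates, weights):
    """Return the target phrase whose log-linear score is highest."""
    best = max(candidates, key=lambda c: score_candidate(c["features"], weights))
    return best["target"]

# Hypothetical translation candidates for one ambiguous source phrase.
# "phrase_table" stands in for a context-independent translation probability;
# "wsd_context" stands in for a WSD model's probability given the sentence context.
candidates = [
    {"target": "bank", "features": {"phrase_table": 0.6, "wsd_context": 0.2}},
    {"target": "shore", "features": {"phrase_table": 0.4, "wsd_context": 0.8}},
]

# Baseline: the WSD feature has zero weight, so only the phrase table decides.
baseline_weights = {"phrase_table": 1.0, "wsd_context": 0.0}
# WSD-augmented: the context-dependent prediction also contributes to the score.
wsd_weights = {"phrase_table": 1.0, "wsd_context": 1.0}

baseline_choice = best_translation(candidates, baseline_weights)
wsd_choice = best_translation(candidates, wsd_weights)
```

In this toy setting the baseline picks the candidate favored by the phrase table alone, while the WSD-augmented scorer can switch to the candidate favored by the sentence context. Because the WSD feature here scores whole candidate phrases rather than single words, it mirrors the abstract's point that the disambiguation task is defined over the same phrasal units the SMT system translates.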







