Improving Statistical Machine Translation Accuracy Using Bilingual Lexicon Extraction with Paraphrases
Abstract
Statistical machine translation (SMT) suffers from an accuracy problem: the translation pairs and their feature scores in the translation model can be inaccurate. This problem stems from the quality of the unsupervised methods used to learn the translation model. Previous studies propose estimating comparable features for the translation pairs in the translation model from comparable corpora, to improve the accuracy of the translation model. Comparable feature estimation is based on bilingual lexicon extraction (BLE) technology. However, BLE suffers from the data sparseness problem, which makes the comparable features inaccurate. In this paper, we propose using paraphrases to address this problem. Paraphrases are used to smooth the vectors used in comparable feature estimation with BLE. In this way, we improve the quality of the comparable features, which can improve the accuracy of the translation model and thus improve SMT performance. Experiments conducted on Chinese-English phrase-based SMT (PBSMT) verify the effectiveness of our proposed method.
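The abstract does not give the smoothing formula, so the following is only a minimal sketch of how paraphrases might smooth a sparse context vector before a BLE-style similarity is computed. It assumes a simple linear interpolation between a phrase's own vector and the weighted average of its paraphrases' vectors. The function names, the `alpha` parameter, the paraphrase weights, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def smooth_with_paraphrases(phrase, vectors, paraphrases, alpha=0.5):
    """Interpolate a phrase's context vector with its paraphrases' vectors.

    ASSUMPTION: linear interpolation with weight `alpha` on the phrase's own
    vector; the paper's actual smoothing scheme may differ.
    `paraphrases` maps a phrase to a list of (paraphrase, weight) pairs.
    """
    own = vectors.get(phrase, {})
    # Keep only paraphrases for which a context vector is available.
    usable = [(p, w) for p, w in paraphrases.get(phrase, []) if p in vectors]
    total = sum(w for _, w in usable)
    if total == 0.0:
        return dict(own)  # nothing to smooth with

    smoothed = defaultdict(float)
    for dim, value in own.items():
        smoothed[dim] += alpha * value
    for p, w in usable:
        for dim, value in vectors[p].items():
            smoothed[dim] += (1.0 - alpha) * (w / total) * value
    return dict(smoothed)


# Toy data (hypothetical): vectors are assumed to be already projected into
# a shared space, e.g. through a seed dictionary.
source_vectors = {
    "rare phrase": {"bank": 1.0},                 # sparse: seen only once
    "common phrase": {"bank": 3.0, "money": 4.0},
}
paraphrase_table = {"rare phrase": [("common phrase", 1.0)]}
target_vector = {"bank": 3.0, "money": 4.0}

raw = source_vectors["rare phrase"]
smoothed = smooth_with_paraphrases("rare phrase", source_vectors, paraphrase_table)
raw_score = cosine(raw, target_vector)
smoothed_score = cosine(smoothed, target_vector)
```

In this toy case the smoothed vector picks up the `money` dimension that the sparse vector lacked, so its similarity to the target vector is higher than that of the raw vector. A score of this kind could then serve as one comparable feature for a translation pair.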