Smaller Alignment Models for Better Translations: Unsupervised Word Alignment with the ℓ0-norm
Two decades after their invention, the IBM word-based translation models, widely available in the GIZA++ toolkit, remain the dominant approach to word alignment and an integral part of many statistical translation systems. Although many models have surpassed them in accuracy, none has supplanted them in practice. In this paper, we propose a simple extension to the IBM models: an ℓ0 prior to encourage sparsity in the word-to-word translation model. We explain how to implement this extension efficiently for large-scale data (to be released as a modification to GIZA++) and demonstrate, in experiments on Czech, Arabic, Chinese, and Urdu to English translation, significant improvements over IBM Model 4 in both word alignment (up to +6.7 F1) and translation quality (up to +1.4 BLEU).
Improving the IBM Alignment Models Using Variational Bayes
Bayesian approaches have been shown to reduce the amount of overfitting that occurs when running the EM algorithm, by placing prior probabilities on the model parameters. We apply one such Bayesian technique, variational Bayes, to the IBM models of word alignment for statistical machine translation. We show that using variational Bayes improves the performance of the widely used GIZA++ software, as well as improving the overall performance of the Moses machine translation system in terms of BLEU score.
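To illustrate the idea in the abstract above, here is a minimal sketch of a variational Bayes update for IBM Model 1 only. It is not the authors' GIZA++ modification. It assumes a symmetric Dirichlet prior over each translation distribution, with a hypothetical hyperparameter `alpha`; under that assumption the usual normalized-count M-step is replaced by the standard mean-field form exp(ψ(count + α)) / exp(ψ(total + Vα)), where ψ is the digamma function and V is the source vocabulary size. The function names, the toy data, and the hand-written `digamma` approximation are all illustrative choices.

```python
import math
from collections import defaultdict


def digamma(x):
    """Approximate the digamma function for x > 0.

    Uses the recurrence psi(x) = psi(x + 1) - 1/x to shift x upward,
    then a truncated asymptotic series.
    """
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 * inv
    result -= inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def train_model1_vb(bitext, alpha=0.01, iterations=5):
    """Train IBM Model 1 with a variational Bayes M-step (illustrative sketch).

    bitext: list of (source_words, target_words) pairs.
    alpha:  assumed symmetric Dirichlet hyperparameter (hypothetical default).
    Returns a dict mapping (source_word, target_word) to a translation weight.
    """
    vocab_size = len({f for fs, _ in bitext for f in fs})
    # Uniform initialization for every pair on first access.
    t = defaultdict(lambda: 1.0 / vocab_size)
    for _ in range(iterations):
        counts = defaultdict(float)
        totals = defaultdict(float)
        # E-step: collect expected alignment counts, as in ordinary EM.
        for fs, es in bitext:
            es = ["NULL"] + list(es)
            for f in fs:
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    posterior = t[(f, e)] / norm
                    counts[(f, e)] += posterior
                    totals[e] += posterior
        # M-step: variational Bayes update instead of count normalization.
        t = {
            (f, e): math.exp(
                digamma(c + alpha) - digamma(totals[e] + vocab_size * alpha)
            )
            for (f, e), c in counts.items()
        }
    return dict(t)


if __name__ == "__main__":
    toy_bitext = [
        ("das haus".split(), "the house".split()),
        ("das buch".split(), "the book".split()),
        ("ein buch".split(), "a book".split()),
    ]
    table = train_model1_vb(toy_bitext)
    for pair in sorted(table, key=table.get, reverse=True)[:5]:
        print(pair, round(table[pair], 4))
```

Because exp(ψ(x)) falls well below x when x is small, pairs with low expected counts are discounted much more heavily than under plain EM, which is one way a prior of this kind can curb overfitting on rare words. The resulting weights need not sum to one for a given target word.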

