Identifying Cross Language Term Equivalents Using Statistical Machine Translation and Distributional Association Measures
BibTeX
@MISC{Hjelm_identifyingcross,
author = {Hans Hjelm},
title = {Identifying Cross Language Term Equivalents Using Statistical Machine Translation and Distributional Association Measures},
year = {}
}
Abstract
This article compares the accuracy of several approaches to identifying cross language term equivalents (translations). The methods investigated are, on the one hand, associative measures commonly used in word-space models or in Information Retrieval and, on the other hand, a Statistical Machine Translation (SMT) approach. I have performed tests on six language pairs, using the JRC-Acquis parallel corpus as training material and Eurovoc as a gold standard. The SMT approach is shown to be more effective than the associative measures. The best results are achieved by taking a weighted average of the scores of the SMT approach and disparate associative measures.
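The weighted-average combination mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the article's implementation: the function name, the example scores, the weights, and the choice to treat a missing candidate as scoring 0.0 are all assumptions made for illustration, and the sketch assumes the scores from each method are already on a comparable scale.

```python
from typing import Dict, List


def weighted_average_scores(
    score_sets: List[Dict[str, float]], weights: List[float]
) -> Dict[str, float]:
    """Combine per-candidate scores from several methods by weighted average.

    Hypothetical helper: each dict maps a candidate translation to the score
    one method assigned it. A candidate missing from a method's dict is
    treated as scoring 0.0 for that method (an assumption of this sketch).
    """
    if len(score_sets) != len(weights):
        raise ValueError("need exactly one weight per score set")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    # Collect every candidate proposed by at least one method.
    candidates = set().union(*score_sets)

    combined = {}
    for candidate in candidates:
        weighted_sum = sum(
            weight * scores.get(candidate, 0.0)
            for scores, weight in zip(score_sets, weights)
        )
        combined[candidate] = weighted_sum / total_weight
    return combined


# Illustrative scores only, not results from the article.
smt_scores = {"house": 0.9, "home": 0.4}
association_scores = {"house": 0.5, "home": 0.7}

combined = weighted_average_scores([smt_scores, association_scores], [0.75, 0.25])
best_candidate = max(combined, key=combined.get)
```

Giving the SMT scores the larger weight here only mirrors the abstract's finding that SMT alone outperforms the associative measures; the actual weights would have to be tuned on held-out data.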







