Pitman-Yor Process-Based Language Models for Machine Translation
Abstract - Cited by 1 (1 self)
The hierarchical Pitman-Yor process-based smoothing method for language models was proposed by Goldwater and by Teh; in terms of perplexity, its performance has been shown to be comparable to that of the modified Kneser-Ney method. Although this method was presented four years ago, no paper has reported that this language model actually improves translation quality in Machine Translation (MT). This matters to the MT community because an improvement in perplexity does not always lead to an improvement in BLEU score; for example, better word alignment as measured by Alignment Error Rate (AER) often does not lead to an improvement in BLEU. This paper reports that, in the context of MT, an improvement in perplexity does lead to an improvement in BLEU score. Applying the Hierarchical Pitman-Yor Language Model (HPYLM) turned out to require a minor change in the conventional decoding process. In addition, we propose a new Pitman-Yor process-based statistical smoothing method similar to the Good-Turing method, although its performance is inferior to that of HPYLM. In our experiments, HPYLM improved BLEU by 1.03 points absolute and 6% relative for 50k EN-JP, which was statistically significant.
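As a rough illustration of the smoothing idea named in this abstract, the sketch below computes the predictive probability of a word in a single Pitman-Yor "restaurant" using the standard Chinese-restaurant formula. It is not the paper's implementation: the function name, the argument names, and the default `discount` and `strength` values are assumptions made for this example. In a hierarchical model, `base_prob` would itself be the same predictive probability computed for the shorter context.

```python
def pitman_yor_predictive(word, counts, tables, base_prob,
                          discount=0.75, strength=1.0):
    """Predictive probability of `word` in one Pitman-Yor restaurant.

    counts[w]  : customers (tokens) of word w seen in this context
    tables[w]  : tables serving w (assumed given; normally sampled)
    base_prob  : probability of `word` under the back-off distribution
    discount, strength : hypothetical default hyperparameters
    """
    total_customers = sum(counts.values())
    total_tables = sum(tables.values())
    if total_customers == 0:
        # Empty context: fall back entirely to the base distribution.
        return base_prob
    c_w = counts.get(word, 0)
    t_w = tables.get(word, 0)
    denom = strength + total_customers
    # Discounted count mass for the word itself.
    seen = (c_w - discount * t_w) / denom
    # Mass reserved for the back-off distribution.
    backoff = (strength + discount * total_tables) / denom
    return seen + backoff * base_prob


# Toy usage with a uniform base distribution over a 4-word vocabulary.
vocab = ["a", "b", "c", "d"]
counts = {"a": 3, "b": 1}
tables = {"a": 2, "b": 1}
probs = {w: pitman_yor_predictive(w, counts, tables, 1.0 / len(vocab))
         for w in vocab}
```

With `strength` set to 0 and exactly one table per seen word, this expression takes the same form as interpolated Kneser-Ney, which is one way to read the comparable perplexity mentioned above.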
Gap Between Theory and Practice: Noise Sensitive Word Alignment in Machine Translation (Workshop on Applications of Pattern Analysis)
Abstract
Word alignment estimates a lexical translation probability p(e|f), or a correspondence g(e, f) whose value is either 0 or 1, between a source word f and a target word e for given bilingual sentences. In practice, this formulation does not consider the existence of 'noise' (or outliers), which may cause problems depending on the corpus. N-to-m mapping objects, such as paraphrases, non-literal translations, and multiword expressions, may appear both as noise and as valid training data. From this perspective, this paper tries to answer the following two questions: 1) how to detect stable patterns where noise seems legitimate, and 2) how to reduce such noise, where applicable, by supplying extra information as prior knowledge to a word aligner. Keywords: Probability density estimation problem, Noise.
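To make the two quantities in this formulation concrete, here is a minimal sketch that estimates p(e|f) with IBM Model 1 expectation-maximization and then derives a 0/1 correspondence g(e, f) from it. This is a textbook baseline substituted for illustration, not the noise-sensitive aligner the paper proposes. The function names and the toy corpus are assumptions, and the NULL source word is omitted for brevity.

```python
from collections import defaultdict


def ibm_model1(bitext, iterations=10):
    """Estimate t[(e, f)] ~ p(e|f) from (source_words, target_words) pairs."""
    target_vocab = {e for _, es in bitext for e in es}
    uniform = 1.0 / len(target_vocab)
    t = defaultdict(lambda: uniform)  # uniform initialization
    for _ in range(iterations):
        count = defaultdict(float)   # expected co-occurrence counts
        total = defaultdict(float)   # expected counts per source word
        for fs, es in bitext:
            for e in es:
                norm = sum(t[(e, f)] for f in fs)
                for f in fs:
                    frac = t[(e, f)] / norm  # posterior that f generated e
                    count[(e, f)] += frac
                    total[f] += frac
        # M-step: renormalize expected counts per source word.
        t = defaultdict(float, {(e, f): c / total[f]
                                for (e, f), c in count.items()})
    return t


def hard_alignment(t, fs, es):
    """g(e, f) = 1 if f is the most probable source word for e, else 0."""
    return {(e, f): int(f == max(fs, key=lambda x: t[(e, x)]))
            for e in es for f in fs}


# Toy corpus (hypothetical data, not from the paper).
bitext = [(["das", "haus"], ["the", "house"]),
          (["das", "buch"], ["the", "book"]),
          (["ein", "buch"], ["a", "book"])]
t = ibm_model1(bitext)
g = hard_alignment(t, ["das", "haus"], ["the", "house"])
```

A one-to-one model like this has no way to mark an n-to-m unit such as a multiword expression as a single link, which is the kind of case the abstract describes as appearing both as noise and as valid data.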
Statistical Machine Translation with Factored Translation Model: MWEs, Separation of Affixes, and Others
Abstract
This paper discusses Statistical Machine Translation when the target side is a morphologically richer language. It addresses issues that are not covered by the factored translation model of Moses, especially for EN–JP translation: the effect of Multi-Word Expressions, the separation of affixes, and other monolingual morphological issues. We discuss these in the setting of a factored translation model.
Given Bilingual Terminology in Statistical Machine Translation: MWE-sensitive Word Alignment and Hierarchical Pitman-Yor Process-based Translation Model Smoothing
Abstract
This paper considers a scenario in which we are given almost perfect knowledge of the bilingual terminology of a test corpus in Statistical Machine Translation (SMT). When the given terminology is part of a training corpus, one natural strategy in SMT is to use the trained translation model and ignore the given terminology. Two questions then arise. 1) Can a word aligner capture the given terminology? Even if the terminology is in the training corpus, the resulting translation model often does not include it. 2) Are the probabilities in the translation model correctly calculated? To answer these questions, we conducted experiments introducing a Multi-Word Expression-sensitive (MWE-sensitive) word aligner and hierarchical Pitman-Yor process-based translation model smoothing. Using a 200k JP–EN NTCIR corpus, our experimental results show that introducing the MWE-sensitive word aligner and the new translation model smoothing yields an overall improvement of 1.35 BLEU points absolute and 6.0% relative compared with the case where neither is introduced.

