Using Monolingual Human Computation to Improve Language Translation via Targeted Paraphrase
Abstract
We introduce a new approach to the problem of obtaining cost-effective, reasonable-quality translation by exploiting simple and inexpensive human computations by monolingual speakers. The key insight behind the process is that it is possible to spot likely translation errors with only monolingual knowledge of the target language, and it is possible to generate new ways to say the same thing (i.e., paraphrases) with only monolingual knowledge of the source language. Initial evaluation demonstrates substantial improvements in translation quality.
Improved Statistical Machine Translation for Resource-Poor Languages Using Related Resource-Rich Languages
Abstract
We propose a novel language-independent approach for improving statistical machine translation for resource-poor languages by exploiting their similarity to resource-rich ones. More precisely, we improve the translation from a resource-poor source language X1 into a resource-rich language Y given a bi-text containing a limited number of parallel sentences for X1-Y and a larger bi-text for X2-Y for some resource-rich language X2 that is closely related to X1. The evaluation for Indonesian→English (using Malay) and Spanish→English (using Portuguese and pretending Spanish is resource-poor) shows an absolute gain of up to 1.35 and 3.37 BLEU points, respectively, which is an improvement over rival approaches, while using much less additional data.
Improved Statistical Machine Translation with Hybrid Phrasal Paraphrases Derived from Monolingual Text and a Shallow Lexical Resource
Abstract
Paraphrase generation is useful for various NLP tasks. But pivoting techniques for paraphrasing have limited applicability due to their reliance on parallel texts, although they benefit from linguistic knowledge implicit in the sentence alignment. Distributional paraphrasing has wider applicability, but doesn’t benefit from any linguistic knowledge. We combine a distributional semantic distance measure (based on a non-annotated corpus) with a shallow linguistic resource to create a hybrid semantic distance measure of words, which we extend to phrases. We embed this extended hybrid measure in a distributional paraphrasing technique, benefiting from both linguistic knowledge and independence from parallel texts. Evaluated in statistical machine translation tasks by augmenting translation models with paraphrase-based translation rules, our novel technique proves superior to the non-augmented baseline and to both the distributional and pivot paraphrasing techniques. We train models on both a full-size dataset and a simulated “low-density” small dataset.

