Mixing multiple translation models in statistical machine translation
In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Jeju, Republic of Korea, 2012
Cited by 3 (2 self)
Statistical machine translation is often faced with the problem of combining training data from many diverse sources into a single translation model which then has to translate sentences in a new domain. We propose a novel approach, ensemble decoding, which combines a number of translation systems dynamically at the decoding step. In this paper, we evaluate performance on a domain adaptation setting where we translate sentences from the medical domain. Our experimental results show that ensemble decoding outperforms various strong baselines including mixture models, the current state-of-the-art for domain adaptation in machine translation.
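The idea of combining several translation systems at decoding time can be illustrated with a small sketch. The weighted-sum rule, the function names, and the toy probabilities below are all assumptions made for illustration; the abstract does not specify how ensemble decoding combines its component scores.

```python
def combine_scores(model_probs, weights):
    """Weighted-sum mixture of the probabilities that several models
    assign to the same candidate (assumed combination rule)."""
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, model_probs)) / total


def best_candidate(candidates, models, weights):
    """Return the candidate with the highest combined score.

    models is a list of dicts mapping candidate -> probability;
    a model that lacks a candidate contributes 0.0 for it.
    """
    def score(candidate):
        return combine_scores([m.get(candidate, 0.0) for m in models], weights)

    return max(candidates, key=score)


# Hypothetical toy models: one in-domain (medical), one general-domain.
in_domain = {"heart attack": 0.7, "cardiac arrest": 0.3}
general = {"heart attack": 0.2, "heart strike": 0.8}
candidates = ["heart attack", "cardiac arrest", "heart strike"]

choice = best_candidate(candidates, [in_domain, general], [0.8, 0.2])
```

Because the combination happens per candidate while decoding, a candidate known to only one model can still compete, which is the practical difference from merging all training data into a single model beforehand.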
A New Minimally-Supervised Framework for Domain Word Sense Disambiguation
Cited by 2 (2 self)
We present a new minimally-supervised framework for performing domain-driven Word Sense Disambiguation (WSD). Glossaries for several domains are iteratively acquired from the Web by means of a bootstrapping technique. The acquired glosses are then used as the sense inventory for fully unsupervised domain WSD. Our experiments, on new and gold-standard datasets, show that our wide-coverage framework enables high-performance results on dozens of domains at a coarse and fine-grained level.
Perplexity Minimization for Translation Model Domain Adaptation in Statistical Machine Translation
Cited by 1 (0 self)
We investigate the problem of domain adaptation for parallel data in Statistical Machine Translation (SMT). While techniques for domain adaptation of monolingual data can be borrowed for parallel data, we explore conceptual differences between translation model and language model domain adaptation and their effect on performance, such as the fact that translation models typically consist of several features that have different characteristics and can be optimized separately. We also explore adapting multiple (4–10) data sets with no a priori distinction between in-domain and out-of-domain data except for an in-domain development set.
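As a rough illustration of what minimizing perplexity on an in-domain development set can look like, the sketch below fits linear interpolation weights for several component models with expectation-maximization. This is a standard technique chosen here as an assumption; the abstract does not state which optimizer or model form the paper uses, and the function names and numbers are hypothetical.

```python
import math


def perplexity(weights, component_probs):
    """Perplexity of the interpolated model on the development events.

    component_probs[i][k] is the probability that component model k
    assigns to development event i. Every event must receive non-zero
    probability from at least one component with non-zero weight.
    """
    log_sum = 0.0
    for probs in component_probs:
        mixed = sum(w * p for w, p in zip(weights, probs))
        log_sum += math.log(mixed)
    return math.exp(-log_sum / len(component_probs))


def em_weights(component_probs, iterations=50):
    """Fit interpolation weights by EM, starting from uniform weights."""
    n_models = len(component_probs[0])
    weights = [1.0 / n_models] * n_models
    for _ in range(iterations):
        expected = [0.0] * n_models
        for probs in component_probs:
            mixed = sum(w * p for w, p in zip(weights, probs))
            # E-step: posterior responsibility of each component for this event.
            for k in range(n_models):
                expected[k] += weights[k] * probs[k] / mixed
        # M-step: new weights are the average responsibilities.
        weights = [e / len(component_probs) for e in expected]
    return weights


# Hypothetical development set: three events scored by two component models.
dev = [[0.5, 0.1], [0.4, 0.2], [0.6, 0.1]]
fitted = em_weights(dev)
```

Each EM iteration cannot increase development-set perplexity, so the fitted weights are never worse than the uniform starting point. Since only the development set drives the weights, no component has to be labeled in-domain or out-of-domain in advance.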
Discriminative Feature-Tied Mixture Modeling for Statistical Machine Translation
In this paper we present a novel discriminative mixture model for statistical machine translation (SMT). We model the feature space with a log-linear combination of multiple mixture components. Each component contains a large set of features trained in a maximum-entropy framework. All features within the same mixture component are tied and share the same mixture weights, where the mixture weights are trained discriminatively to maximize the translation performance. This approach aims at bridging the gap between the maximum-likelihood training and the discriminative training for SMT. It is shown that the feature space can be partitioned in a variety of ways, such as based on feature types, word alignments, or domains, for various applications. The proposed approach improves the translation performance significantly on a large-scale Arabic-to-English MT task.
M.J. Castro-Bleda
This paper describes the system presented for the English-Spanish translation task by the collaboration between CEU-UCH and UPV for WMT 2011. Interpolation of independent phrase-based translation models, one for each available training corpus, was tested, giving an improvement of 0.4 BLEU points over the baseline. Output N-best lists were rescored via a target Neural Network Language Model. An improvement of one BLEU point over the baseline was obtained by adding the two features, giving 31.5 BLEU and 57.9 TER for the primary system, computed over lowercased and detokenized outputs. The system was positioned second in the final ranking.
ISV
Instance-weighting has been shown to be effective in statistical machine translation (Foster et al., 2010), as well as cross-language adaptation of dependency parsers (Søgaard, 2011). This paper presents new methods to do instance-weighting in state-of-the-art dependency parsers. The methods are evaluated on Danish and English data with consistent improvements over unadapted baselines.
IWPT 2011 Proceedings of the 12th International Conference on Parsing Technologies
2011
Interest Group on Parsing, serving as the primary specialized forum for research on natural language parsing. This year we received a total of 64 valid submissions, 42 long papers and 22 short papers, 6 of which were later withdrawn after being accepted for publication elsewhere. Of the remaining 58 submissions, 28 were accepted for presentation at the conference, which gives an acceptance rate of 48%. After notification, 2 more papers were withdrawn, which brings the final number of accepted papers to 26, all of which are published in these proceedings and presented at the conference in one of two ways: (i) as a long talk (long papers only) or (ii) as a short talk and a poster (short papers and some long papers). In this way, we were able to accommodate as many papers as possible and still give all the authors the opportunity of an oral presentation. In addition to the contributed papers, IWPT 2011 will as usual feature invited talks on topics relevant to natural language parsing. This year we are delighted to welcome three very distinguished researchers: Ina Bornkessel-Schlesewsky, Michael Collins, and Mark Steedman. You will find the abstracts of their talks in the proceedings. There will also be a special workshop devoted to parsing of morphologically rich languages on the second day of the conference, a workshop that has had its own program committee
LIMSI @ WMT11
This paper describes LIMSI’s submissions to the Sixth Workshop on Statistical Machine Translation. We report results for the French-English and German-English shared translation tasks in both directions. Our systems use n-code, an open source Statistical Machine Translation system based on bilingual n-grams. For the French-English task, we focussed on finding efficient ways to take advantage of the large and heterogeneous training parallel data. In particular, using a simple filtering strategy helped to improve both processing time and translation quality. To translate from English to French and German, we also investigated the use of the SOUL language model in Machine Translation and showed significant improvements with a 10-gram SOUL model. We also briefly report experiments with several alternatives to the standard n-best MERT procedure, leading to a significant speed-up.
Does more data always yield better translations?
Nowadays, there are large amounts of data available to train statistical machine translation systems. However, it is not clear whether all the training data actually help or not. A system trained on a subset of such huge bilingual corpora might outperform the use of all the bilingual data. This paper studies such issues by analysing two training data selection techniques: one based on approximating the probability of an in-domain corpus; and another based on infrequent n-gram occurrence. Experimental results not only report significant improvements over random sentence selection but also an improvement over a system trained with the whole available data. Surprisingly, the improvements are obtained with just a small fraction of the data that accounts for less than 0.5% of the sentences. Afterwards, we show that a much larger room for improvement exists, although this is done under non-realistic conditions.
Adapting Translation Models to Translationese Improves SMT
Translation models used for statistical machine translation are compiled from parallel corpora; such corpora are manually translated, but the direction of translation is usually unknown, and is consequently ignored. However, much research in Translation Studies indicates that the direction of translation matters, as translated language (translationese) has many unique properties. Specifically, phrase tables constructed from parallel corpora translated in the same direction as the translation task perform better than ones constructed from corpora translated in the opposite direction. We reconfirm that this is indeed the case, but emphasize the importance of using also texts translated in the ‘wrong’ direction. We take advantage of information pertaining to the direction of translation in constructing phrase tables, by adapting the translation model to the special properties of translationese. We define entropy-based measures that estimate the correspondence of target-language phrases to translationese, thereby eliminating the need to annotate the parallel corpus with information pertaining to the direction of translation. We show that incorporating these measures as features in the phrase tables of statistical machine translation systems results in consistent, statistically significant improvement in the quality of the translation.

