Results 1 -
6 of
6
Discriminative Instance Weighting for Domain Adaptation in Statistical Machine Translation
"... We describe a new approach to SMT adaptation that weights out-of-domain phrase pairs according to their relevance to the target domain, determined by both how similar to it they appear to be, and whether they belong to general language or not. This extends previous work on discriminative weighting b ..."
Abstract
-
Cited by 11 (1 self)
- Add to MetaCart
We describe a new approach to SMT adaptation that weights out-of-domain phrase pairs according to their relevance to the target domain, determined by both how similar to it they appear to be, and whether they belong to general language or not. This extends previous work on discriminative weighting by using a finer granularity, focusing on the properties of instances rather than corpus components, and using a simpler training procedure. We incorporate instance weighting into a mixture-model framework, and find that it yields consistent improvements over a wide range of baselines. 1
Better hypothesis testing for statistical machine translation: Controlling for optimizer instability
- In Proc. of ACL
, 2011
"... In statistical machine translation, a researcher seeks to determine whether some innovation (e.g., a new feature, model, or inference algorithm) improves translation quality in comparison to a baseline system. To answer this question, he runs an experiment to evaluate the behavior of the two systems ..."
Abstract
-
Cited by 8 (3 self)
- Add to MetaCart
In statistical machine translation, a researcher seeks to determine whether some innovation (e.g., a new feature, model, or inference algorithm) improves translation quality in comparison to a baseline system. To answer this question, he runs an experiment to evaluate the behavior of the two systems on held-out data. In this paper, we consider how to make such experiments more statistically reliable. We provide a systematic analysis of the effects of optimizer instability—an extraneous variable that is seldom controlled for—on experimental outcomes, and make recommendations for reporting results more accurately. 1
Structured Ramp Loss Minimization for Machine Translation
"... This paper seeks to close the gap between training algorithms used in statistical machine translation and machine learning, specifically the framework of empirical risk minimization. We review well-known algorithms, arguing that they do not optimize the loss functions they are assumed to optimize wh ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
This paper seeks to close the gap between training algorithms used in statistical machine translation and machine learning, specifically the framework of empirical risk minimization. We review well-known algorithms, arguing that they do not optimize the loss functions they are assumed to optimize when applied to machine translation. Instead, most have implicit connections to particular forms of ramp loss. We propose to minimize ramp loss directly and present a training algorithm that is easy to implement and that performs comparably to others. Most notably, our structured ramp loss minimization algorithm, RAMPION, is less sensitive to initialization and random seeds than standard approaches. 1
The CMU-ARK German-English Translation System
"... This paper describes the German-English translation system developed by the ARK research group at Carnegie Mellon University for the Sixth Workshop on Machine Translation (WMT11). We present the results of several modeling and training improvements to our core hierarchical phrase-based translation s ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
This paper describes the German-English translation system developed by the ARK research group at Carnegie Mellon University for the Sixth Workshop on Machine Translation (WMT11). We present the results of several modeling and training improvements to our core hierarchical phrase-based translation system, including: feature engineering to improve modeling of the derivation structure of translations; better handing of OOVs; and using development set translations into other languages to create additional pseudoreferences for training. 1
Maximum Rank Correlation Training for Statistical Machine Translation
"... We propose Maximum Ranking Correlation (MRC) as an objective function in discriminative tuning of parameters in a linear model of Statistical Machine Translation (SMT). We try to maximize the ranking correlation between sentence level BLEU (SBLEU) scores and model scores of the N-best list, while th ..."
Abstract
- Add to MetaCart
We propose Maximum Ranking Correlation (MRC) as an objective function in discriminative tuning of parameters in a linear model of Statistical Machine Translation (SMT). We try to maximize the ranking correlation between sentence level BLEU (SBLEU) scores and model scores of the N-best list, while the MERT paradigm focuses on the potential 1-best candidates of the N-best list. After we optimize the MER and the MRC objectives using an multiple objective optimization algorithm at the same time, we interpolate them to obtain parameters which outperform both. Experimental results on WMT French–English data set confirm that our method significantly outperforms MERT on out-of-domain data sets, and performs marginally better than MERT on in-domain data sets, which validates the usefulness of MRC on both domain specific and general domain data.
Optimizing for Sentence-Level BLEU+1 Yields Short Translations
"... We study a problem with pairwise ranking optimization (PRO): that it tends to yield too short translations. We find that this is partially due to the inadequate smoothing in PRO’s BLEU+1, which boosts the precision component of BLEU but leaves the brevity penalty unchanged, thus destroying the balan ..."
Abstract
- Add to MetaCart
We study a problem with pairwise ranking optimization (PRO): that it tends to yield too short translations. We find that this is partially due to the inadequate smoothing in PRO’s BLEU+1, which boosts the precision component of BLEU but leaves the brevity penalty unchanged, thus destroying the balance between the two, compared to BLEU. It is also partially due to PRO optimizing for a sentence-level score without a global view on the overall length, which introducing a bias towards short translations; we show that letting PRO optimize a corpus-level BLEU yields a perfect length. Finally, we find some residual bias due to the interaction of PRO with BLEU+1: such a bias does not exist for a version of MIRA with sentence-level BLEU+1. We propose several ways to fix the length problem of PRO, including smoothing the brevity penalty, scaling the effective reference length, grounding the precision component, and unclipping the brevity penalty, which yield sizable improvements in test BLEU on two Arabic-English datasets: IWSLT (+0.65) and NIST (+0.37).

