The Best Lexical Metric for Phrase-Based Statistical MT System Optimization
Cited by 3 (0 self)
Translation systems are generally trained to optimize BLEU, but many alternative metrics are available. We explore how optimizing toward various automatic evaluation metrics (BLEU, METEOR, NIST, TER) affects the resulting model. We train a state-of-the-art MT system using MERT on many parameterizations of each metric and evaluate the resulting models on the other metrics and also using human judges. In accordance with popular wisdom, we find that it’s important to train on the same metric used in testing. However, we also find that training to a newer metric is only useful to the extent that the MT model’s structure and features allow it to take advantage of the metric. In contrast to TER’s good correlation with human judgments, we show that people tend to prefer BLEU- and NIST-trained models to those trained on edit-distance-based metrics like TER or WER. Human preferences for METEOR-trained models vary depending on the source language. Since using BLEU or NIST produces models that are more robust to evaluation by other metrics and perform well in human judgments, we conclude they are still the best choice for training.
Combining Machine Translation Output with Open Source: The Carnegie Mellon Multi-Engine Machine Translation Scheme
2010
Cited by 1 (0 self)
The Carnegie Mellon multi-engine machine translation software merges output from several machine translation systems into a single improved translation. This improvement is significant: in the recent NIST MT09 evaluation, the combined Arabic-English output scored 5.22 BLEU points higher than the best individual system. Concurrent with this paper, we release the source code behind this result, consisting of a recombining beam search decoder, the combination search space and features, and several accessories. Here we describe how the released software works and how to use it.
Candidacy Examination
What empirical evidence is there that adding syntactic constraints to MT decoding, in particular PMT decoding, will lead to improvements in translation quality? Your proposal claims that your method for adding syntactic constraints will result not only in a more complete search of the space of string permutations involved in PMT but also in an improved ability to discriminate between good and bad translations. In Section 3 you claim that the ability to account for syntactically governed re-ordering patterns is an advantage, and in Section 4 you claim, on the basis of a constructed example, that your proposed method will improve quality by removing ungrammatical but high-scoring distractor analyses, and that the completeness of the search will be improved by reducing the need for aggressive heuristics about re-ordering. Do you anticipate that separate constraints on re-ordering will still be required? If not, say why not. If so, briefly sketch how these constraints will be implemented and the means by which they will interact with the new syntactic constraints. Statistical MT (SMT) systems are based on the source-channel model of communication (Weaver, 1949; Brown et al., 1993, 1990) whereby an output string is modelled as being

