## Assessing Phrase-Based Translation Models with Oracle Decoding

Citations: | 2 - 1 self |

### BibTeX

@MISC{Wisniewski_assessingphrase-based,

author = {Guillaume Wisniewski and Re Allauzen},

title = {Assessing Phrase-Based Translation Models with Oracle Decoding},

year = {}

}

### Abstract

Extant Statistical Machine Translation (SMT) systems are very complex pieces of software that embed multiple layers of heuristics and involve very large numbers of numerical parameters. As a result, it is difficult to analyze output translations, and there is a real need for tools that could help developers better understand the various causes of errors. In this study, we take a step in that direction and present an attempt to evaluate the quality of the phrase-based translation model. In order to identify those translation errors that stem from deficiencies in the phrase table (PT), we propose to compute the oracle BLEU-4 score, that is, the best score that a system based on this PT can achieve on a reference corpus. By casting the computation of the oracle BLEU-1 as an Integer Linear Programming (ILP) problem, we show that it is possible to efficiently compute accurate lower bounds of this score, and we report measurements performed on several standard benchmarks. Various other applications of these oracle decoding techniques are also reported and discussed.

### Citations

1477 | Bleu: a method for automatic evaluation of machine translation
- Papineni, Roukos, et al.
- 2001
Citation Context ... achievable performances of a PBTS. We aim at both studying the expressive power of PBTS and at providing tools for identifying and quantifying causes of failure. Under standard metrics such as BLEU (Papineni et al., 2002), oracle scores are difficult (if not impossible) to compute, but, by casting the computation of the oracle unigram recall and precision as an Integer Linear Programming (ILP) problem, we show that i... |
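The unigram recall and precision mentioned in this context can be illustrated with a minimal sketch. It assumes whitespace tokenization, a single reference, and clipped counts (each reference word can be matched at most as many times as it occurs); the function name `unigram_precision_recall` is hypothetical and is not taken from the paper.

```python
from collections import Counter


def unigram_precision_recall(hypothesis: str, reference: str) -> tuple[float, float]:
    """Clipped unigram precision and recall of a hypothesis against one reference.

    Assumption: tokens are whitespace-separated and compared case-sensitively.
    """
    hyp_counts = Counter(hypothesis.split())
    ref_counts = Counter(reference.split())
    # Counter intersection keeps, for each word, the minimum of the two counts.
    matched = sum((hyp_counts & ref_counts).values())
    hyp_len = sum(hyp_counts.values())
    ref_len = sum(ref_counts.values())
    precision = matched / hyp_len if hyp_len else 0.0
    recall = matched / ref_len if ref_len else 0.0
    return precision, recall


if __name__ == "__main__":
    print(unigram_precision_recall("the cat sat", "the cat sat on the mat"))
```

Because both measures depend only on bags of words, they ignore word order, which is what makes them easier to optimize than higher-order BLEU.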

1255 | A Systematic Comparison of Various Statistical Alignment Models
- Och, Ney
- 2003
Citation Context ...e best one has the highest score. A PBTS is learned from a parallel corpus in two independent steps. In a first step, the corpus is aligned at the word level, by using alignment tools such as Giza++ (Och and Ney, 2003) and some symmetrisation heuristics; phrases are then extracted by other heuristics (Koehn et al., 2003) and assigned numerical weights. In the second step, the parameters of the scoring function are... |

883 | Moses: Open source toolkit for statistical machine translation
- Koehn, Hoang, et al.
- 2007
Citation Context ...ing (Koehn, 2004). Moreover, to reduce the overall complexity of decoding, the search space is typically pruned using simple heuristics. For instance, the state-of-the-art phrase-based decoder Moses (Koehn et al., 2007) considers only a restricted number of translations for each source sequence and enforces a distortion limit over which phrases can be reordered. As a consequence, the best translation hypothesis r... |

635 | Statistical phrase-based translation
- Koehn, Och, et al.
- 2003
Citation Context ... a first step, the corpus is aligned at the word level, by using alignment tools such as Giza++ (Och and Ney, 2003) and some symmetrisation heuristics; phrases are then extracted by other heuristics (Koehn et al., 2003) and assigned numerical weights. In the second step, the parameters of the scoring function are estimated, typically through Minimum Error Rate training (Och, 2003). Translating a sentence amounts to... |

133 | The tradeoffs of large scale learning - Bottou, Bousquet - 2008 |

120 | An end-to-end discriminative approach to machine translation
- Liang, Bouchard, et al.
- 2006
Citation Context ...ed to assess the expressive power of phrase-based systems (Auli et al., 2009). Other applications include computing acceptable pseudo-references for discriminative training (Tillmann and Zhang, 2006; Liang et al., 2006; Arun and Koehn, 2007) or combining ... (Footnote 5: The oracle decoding problem can be extended to the case of multiple references. For the sake of simplicity, we only describe the case of a single reference.) |

113 | Fast Decoding and Optimal Decoding for Machine Translation
- Germann, Jahr, et al.
- 2001
Citation Context ...section, we propose to cast it into an Integer Linear Programming (ILP) problem, for which many generic solvers exist. ILP has already been used in SMT to find the optimal translation for word-based models (Germann et al., 2001) and to study the complexity of learning phrase alignments (DeNero and Klein, 2008). Following the latter reference, we introduce the following variables: $f_{i,j}$ (resp. $e_{k,l}$) is a binary indica... |
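To make the idea of oracle decoding concrete, the following sketch replaces the ILP solver with plain exhaustive search, which is only feasible on toy inputs. It is not the authors' formulation: the phrase table format (a dict from source-word tuples to candidate target strings), the function names, and the objective (maximize clipped unigram matches with the reference, breaking ties in favor of shorter hypotheses) are all assumptions made for illustration, and reordering constraints such as a distortion limit are not modeled.

```python
from collections import Counter
from itertools import product


def segmentations(source, phrase_table, max_len=3):
    """Yield every split of `source` (a tuple of words) into phrases present in the table."""
    if not source:
        yield []
        return
    for end in range(1, min(max_len, len(source)) + 1):
        head = source[:end]
        if head in phrase_table:
            for rest in segmentations(source[end:], phrase_table, max_len):
                yield [head] + rest


def oracle_unigram(source, reference, phrase_table):
    """Return (matched, hyp_len, derivation) for the best reachable hypothesis, or None.

    Unigram matches do not depend on word order, so target phrases are
    simply pooled into a bag of words (an assumption of this sketch).
    """
    ref_counts = Counter(reference.split())
    best_key, best_derivation = None, None
    for seg in segmentations(tuple(source.split()), phrase_table):
        # Try every combination of one target phrase per source phrase.
        for choice in product(*(phrase_table[p] for p in seg)):
            hyp_counts = Counter(w for target in choice for w in target.split())
            matched = sum((hyp_counts & ref_counts).values())
            key = (matched, -sum(hyp_counts.values()))
            if best_key is None or key > best_key:
                best_key, best_derivation = key, list(zip(seg, choice))
    if best_key is None:
        return None
    return best_key[0], -best_key[1], best_derivation


if __name__ == "__main__":
    toy_table = {  # hypothetical toy phrase table
        ("das",): ["the", "that"],
        ("haus",): ["house", "home"],
        ("das", "haus"): ["the house"],
        ("ist",): ["is"],
        ("klein",): ["small", "little"],
    }
    print(oracle_unigram("das haus ist klein", "the house is small", toy_table))
```

When the reference contains a word that no phrase-table entry can produce, the oracle score drops below the reference length, which is the kind of phrase-table deficiency the oracle is meant to expose.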

69 | Integer linear programming inference for conditional random fields
- Roth, Yih
- 2005
Citation Context ...ore than d source words that were skipped. Note that the constraints introduced above are not all linear in the problem variables; however, they can easily be linearized using standard ILP techniques (Roth and Yih, 2005). 3 Oracle Decoding for Failure Analysis. 3.1 Experimental Setting. We propose to use our oracle decoder to study the ability of a PBTS to translate from English to French and from German to English. T... |

36 | A discriminative global training algorithm for statistical MT
- Tillmann, Zhang
- 2006
Citation Context ...ing problem can also be used to assess the expressive power of phrase-based systems (Auli et al., 2009). Other applications include computing acceptable pseudo-references for discriminative training (Tillmann and Zhang, 2006; Liang et al., 2006; Arun and Koehn,... (Footnote 5: The oracle decoding problem can be extended to the case of multiple references. For the sake of simplicity, we only describe the case of a single reference.) |

27 | The complexity of phrase alignment problems - DeNero, Klein - 2008 |

25 | Exploiting source similarity for SMT using context-informed features - Stroppa, Bosch, et al. - 2007 |

22 | Online learning methods for discriminative training of phrase based statistical machine translation - Arun, Koehn - 2007 |

12 | Rich source-side context for statistical machine translation
- Gimpel, Smith
- 2008
Citation Context ...ive to improve on the way phrases and hypotheses are scored during training. This gives further support to attempts aimed at designing context-dependent scoring functions as in (Stroppa et al., 2007; Gimpel and Smith, 2008), or at attempts to perform discriminative training of feature-rich models (Bangalore et al., 2007). We have shown that the examination of difficult-to-translate sentences was an effective way to det... |

10 | A systematic analysis of translation model search spaces
- Auli, Lopez, et al.
- 2009
Citation Context ...are combined, the size of these models, and the high complexity of the various decision making processes. For an SMT system, three different kinds of errors can be distinguished (Germann et al., 2004; Auli et al., 2009): search errors, induction errors and model errors. The first corresponds to cases where the hypothesis with the best score is missed by the search procedure, either because of the use of an ap2 the... |

7 | Fast and optimal decoding for machine translation
- Germann, Jahr, et al.
- 2004
Citation Context ...number of models that are combined, the size of these models, and the high complexity of the various decision making processes. For an SMT system, three different kinds of errors can be distinguished (Germann et al., 2004; Auli et al., 2009): search errors, induction errors and model errors. The first corresponds to cases where the hypothesis with the best score is missed by the search procedure, either because of th... |

5 | Decomposability of translation metrics for improved evaluation and efficient algorithms - Chiang, DeNeefe, et al. - 2008 |

4 | Statistical machine translation through global lexical selection and sentence reconstruction - Bangalore, et al. - 2007 |

3 | Learning performance of a machine translation system: a statistical and computational analysis - Turchi, De Bie, Cristianini - 2008 |

2 | Complexity of finding the BLEU-optimal hypothesis in a confusion network
- Leusch, Matusov, et al.
- 2008
Citation Context ...as already been considered in the case of word-based models, in which all translation units are bound to contain only one word. The problem can then be solved by a bipartite graph matching algorithm (Leusch et al., 2008): given an n×m binary matrix describing possible translation links between source words and target words, this algorithm finds the subset of links maximizing the number of words of the reference tha... |
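The bipartite matching step described in this context can be sketched with a standard augmenting-path algorithm (often attributed to Kuhn). This is a generic maximum-matching routine, not necessarily the exact algorithm used by Leusch et al.; it assumes that each source word may be linked to at most one target word and vice versa, and the names `max_bipartite_matching` and `links` are hypothetical.

```python
def max_bipartite_matching(links):
    """Maximum matching in a bipartite graph given as an n x m 0/1 matrix.

    links[i][j] is truthy when source word i may be translated as target word j.
    Returns (size, match_of_target), where match_of_target[j] is the source
    index matched to target j, or -1 if target j is left unmatched.
    """
    n = len(links)
    m = len(links[0]) if n else 0
    match_of_target = [-1] * m

    def try_assign(i, visited):
        # Look for a free target for source i, re-routing earlier matches if needed.
        for j in range(m):
            if links[i][j] and not visited[j]:
                visited[j] = True
                if match_of_target[j] == -1 or try_assign(match_of_target[j], visited):
                    match_of_target[j] = i
                    return True
        return False

    size = sum(1 for i in range(n) if try_assign(i, [False] * m))
    return size, match_of_target


if __name__ == "__main__":
    toy_links = [  # hypothetical 3 x 3 link matrix
        [1, 1, 0],
        [1, 0, 0],
        [0, 0, 1],
    ]
    print(max_bipartite_matching(toy_links))
```

The recursion depth is bounded by the number of source words, so for sentence-length inputs this simple version is adequate; a Hopcroft-Karp implementation would be the usual choice for much larger graphs.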