## Exact Decoding of Phrase-based Translation Models through Lagrangian Relaxation (2011)

Venue: | To appear in Proc. of EMNLP |

Citations: | 17 - 1 self |

### BibTeX

@INPROCEEDINGS{Chang11exactdecoding,

author = {Yin-wen Chang and Michael Collins},

title = {Exact Decoding of Phrase-based Translation Models through Lagrangian Relaxation},

booktitle = {To appear in Proc. of EMNLP},

year = {2011}

}

### Abstract

This paper describes an algorithm for exact decoding of phrase-based translation models, based on Lagrangian relaxation. The method recovers exact solutions, with certificates of optimality, on over 99% of test examples. The method is much more efficient than approaches based on linear programming (LP) or integer linear programming (ILP) solvers: these methods are not feasible for anything other than short sentences. We compare our method to MOSES (Koehn et al., 2007), and give precise estimates of the number and magnitude of search errors that MOSES makes.
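
To illustrate what a "certificate of optimality" means in Lagrangian relaxation, here is a minimal Python sketch on a toy problem. It is not the paper's decoder: the function name `lagrangian_assign`, the toy score matrix, and the `1/t` step size are all assumptions made for illustration. Each position picks one word, and the hard constraint that every word is used exactly once is relaxed with multipliers `u`. If the relaxed solution happens to satisfy the constraint, the dual bound equals its score, so it is provably optimal.

```python
def lagrangian_assign(scores, max_iters=100):
    """Toy Lagrangian relaxation (illustrative only, not the paper's algorithm).

    scores[i][j] is the score of giving word j to position i. The hard
    constraint "each word is used exactly once" is relaxed with one
    multiplier u[j] per word and enforced by subgradient updates.
    Returns (choice, total_score, certified).
    """
    n = len(scores)
    u = [0.0] * n  # Lagrange multipliers, one per relaxed constraint
    choice = []
    for t in range(1, max_iters + 1):
        # Relaxed problem: every position independently picks its best word.
        choice = [max(range(n), key=lambda j: scores[i][j] + u[j])
                  for i in range(n)]
        counts = [choice.count(j) for j in range(n)]
        if all(c == 1 for c in counts):
            # The relaxed solution is feasible, so the multiplier terms
            # cancel and the dual bound equals this score: certified optimal.
            total = sum(scores[i][choice[i]] for i in range(n))
            return choice, total, True
        step = 1.0 / t  # assumed decreasing step size
        # Subgradient step: penalize over-used words, reward unused ones.
        u = [u[j] - step * (counts[j] - 1) for j in range(n)]
    return choice, None, False  # no certificate within max_iters


choice, total, certified = lagrangian_assign([[5, 4], [6, 1]])
print(choice, total, certified)
```

In this toy run both positions initially pick word 0; one subgradient step shifts the multipliers enough that the positions then agree on a valid assignment. When no certificate is found within `max_iters`, the sketch simply reports failure rather than tightening the relaxation.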

### Citations

1665 | BLEU: a Method for Automatic Evaluation of Machine Translation
- Papineni, Roukos, et al.
- 2001
Citation Context ... errors when the beam size is 200, 1,000, or 10,000. Table 6 shows statistics for the magnitude of the search errors that MOSES-gc and MOSES-nogc make. BLEU Scores Finally, table 7 gives BLEU scores (Papineni et al., 2002) for decoding using MOSES and our method. The BLEU scores under the two decoders are almost identical; hence while MOSES makes a significant proportion of search errors, these search errors appear to... |

1287 | The mathematics of statistical machine translation: Parameter estimation
- Brown, Pietra, et al.
- 1993
Citation Context ...ased models by Tillmann and Ney (2003) and Tillmann (2006). Several works attempt exact decoding, but efficiency remains an issue. Exact decoding via integer linear programming (ILP) for IBM model 4 (Brown et al., 1993) has been studied by Germann et al. (2001), with experiments using a bigram language model for sentences up to eight words in length. Riedel and Clarke (2009) have improved the efficiency of this wor... |

1013 | Moses: Open source toolkit for statistical machine translation
- Koehn, Hoang, et al.
- 2007
Citation Context ...t than approaches based on linear programming (LP) or integer linear programming (ILP) solvers: these methods are not feasible for anything other than short sentences. We compare our method to MOSES (Koehn et al., 2007), and give precise estimates of the number and magnitude of search errors that MOSES makes. 1 Introduction Phrase-based models (Och et al., 1999; Koehn et al., 2003; Koehn et al., 2007) are a widely-... |

186 | Statistical phrase-based translation |

129 | Fast Decoding and Optimal Decoding for Machine Translation - Germann, Jahr, et al. - 2001 |

71 | Tightening LP relaxations for MAP using message passing
- Sontag, Meltzer, et al.
- 2008
Citation Context ... case of Lagrangian relaxation, has been applied to inference problems in NLP (Koo et al., 2010; Rush et al., 2010), and also to Markov random fields (Wainwright et al., 2005; Komodakis et al., 2007; Sontag et al., 2008). Earlier work on belief propagation (Smith and Eisner, 2008) is closely related to dual decomposition. Recently, Rush and Collins (2011) describe a Lagrangian relaxation algorithm for decoding for s... |

64 | Dependency parsing by belief propagation
- Smith, Eisner
- 2008
Citation Context ...nce problems in NLP (Koo et al., 2010; Rush et al., 2010), and also to Markov random fields (Wainwright et al., 2005; Komodakis et al., 2007; Sontag et al., 2008). Earlier work on belief propagation (Smith and Eisner, 2008) is closely related to dual decomposition. Recently, Rush and Collins (2011) describe a Lagrangian relaxation algorithm for decoding for syntactic translation; the algorithmic construction described ... |

61 | Dual decomposition for parsing with nonprojective head automata
- Koo, Rush, et al.
- 2010
Citation Context ... for solving combinatorial optimization problems (Korte and Vygen, 2008; Lemaréchal, 2001). Dual decomposition, a special case of Lagrangian relaxation, has been applied to inference problems in NLP (Koo et al., 2010; Rush et al., 2010), and also to Markov random fields (Wainwright et al., 2005; Komodakis et al., 2007; Sontag et al., 2008). Earlier work on belief propagation (Smith and Eisner, 2008) is closely re... |

10 | Exact Decoding of Syntactic Translation Models through Lagrangian Relaxation - Rush, Collins - 2011 |

6 | Large-scale statistical machine translation with weighted finite state transducers - Blackwood, Gispert, et al. - 2008 |

4 | MAP estimation via agreement on trees: Message-passing and linear programming |

3 | MRF optimization via dual decomposition: Message-passing revisited |

3 | Combinatorial Optimization: Theory and Algorithms
- Korte, Vygen
- 2008
Citation Context ... covered exact solutions for the type of phrase-based models used in MOSES. 2 Related Work Lagrangian relaxation is a classical technique for solving combinatorial optimization problems (Korte and Vygen, 2008; Lemaréchal, 2001). Dual decomposition, a special case of Lagrangian relaxation, has been applied to inference problems in NLP (Koo et al., 2010; Rush et al., 2010), and also to Markov random fields ... |

3 | Improved alignment models for statistical machine translation
- Och, Tillmann, et al.
- 1999
Citation Context ...han short sentences. We compare our method to MOSES (Koehn et al., 2007), and give precise estimates of the number and magnitude of search errors that MOSES makes. 1 Introduction Phrase-based models (Och et al., 1999; Koehn et al., 2003; Koehn et al., 2007) are a widely-used approach for statistical machine translation. The decoding problem for phrase-based models is NP-hard; because of this, previous work has g... |