Lattice Rescoring Methods for Statistical Machine Translation
BibTeX
@MISC{Blackwood_latticerescoring,
author = {Graeme Blackwood},
title = {Lattice Rescoring Methods for Statistical Machine Translation},
year = {}
}
Abstract
This dissertation is the result of my own work and includes nothing which is the outcome of work done in collaboration except where specifically indicated in the text. It has not been submitted in whole or in part for a degree at any other university. Some of the work has been published previously in conference proceedings (Blackwood et al., 2008a; Blackwood et al., 2008b; Blackwood et al., 2009; Kurimo et al., 2009) and a journal article (de Gispert et al., 2010), or accepted for publication in forthcoming conference proceedings (Blackwood and Byrne, 2010). The length of this thesis including appendices, references, footnotes, tables and equations is approximately 53,000 words and it contains 56 figures and 58 tables.

Summary

Modern statistical machine translation (SMT) systems include multiple interrelated components, statistical models, and processes. Translation is often factored as a cascaded series of modules such that the output of one module serves as the input to the next; this is the SMT pipeline. Simplifying assumptions, limited training data, and pruning during search mean that the maximum likelihood hypothesis may not represent the best translation. Since any errors will be propagated through the SMT pipeline, it is better to avoid hard decisions by







