Results 1 - 10 of 48
Exact Decoding of Syntactic Translation Models through Lagrangian Relaxation
, 2011
"... We describe an exact decoding algorithm for syntax-based statistical translation. The approach uses Lagrangian relaxation to decompose the decoding problem into tractable subproblems, thereby avoiding exhaustive dynamic programming. The method recovers exact solutions, with certificates of optimalit ..."
Abstract
-
Cited by 15 (3 self)
- Add to MetaCart
We describe an exact decoding algorithm for syntax-based statistical translation. The approach uses Lagrangian relaxation to decompose the decoding problem into tractable subproblems, thereby avoiding exhaustive dynamic programming. The method recovers exact solutions, with certificates of optimality, on over 97 % of test examples; it has comparable speed to state-of-the-art decoders.
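The decomposition idea can be sketched generically: two tractable solvers are forced to agree through Lagrange multipliers updated by subgradient steps, and agreement yields a certificate of optimality. Below is a minimal Python sketch of this generic dual-decomposition loop, not the paper's specific syntax/LM decomposition; `solve_f`, `solve_g`, and the 0/1 agreement variables are illustrative assumptions.

```python
def dual_decompose(solve_f, solve_g, keys, iterations=100, step=1.0):
    """Generic dual decomposition by subgradient steps on the dual.
    Convention: solve_f(u) maximises f(y) + sum_k u[k]*y[k] and solve_g(u)
    maximises g(z) - sum_k u[k]*z[k]; both return {key: 0 or 1} assignments
    over the shared variables in `keys`.  If the two solutions agree, the
    relaxation is tight and the answer is provably optimal."""
    u = {k: 0.0 for k in keys}
    y = None
    for t in range(1, iterations + 1):
        y = solve_f(u)          # e.g. the syntactic (hypergraph) subproblem
        z = solve_g(u)          # e.g. the language-model (finite-state) subproblem
        if y == z:
            return y, True      # certificate of optimality
        # subgradient of the dual is (y - z); step size decays as step/t
        for k in keys:
            u[k] -= (step / t) * (y[k] - z[k])
    return y, False             # no certificate within the iteration limit
```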
Hierarchical Phrase-Based Translation Representations
"... This paper compares several translation representations for a synchronous context-free grammar parse including CFGs/hypergraphs, finite-state automata (FSA), and pushdown automata (PDA). The representation choice is shown to determine the form and complexity of target LM intersection and shortest-pa ..."
Abstract
-
Cited by 12 (4 self)
- Add to MetaCart
This paper compares several translation representations for a synchronous context-free grammar parse including CFGs/hypergraphs, finite-state automata (FSA), and pushdown automata (PDA). The representation choice is shown to determine the form and complexity of target LM intersection and shortest-path algorithms that follow. Intersection, shortest path, FSA expansion and RTN replacement algorithms are presented for PDAs. Chinese-to-English translation experiments using HiFST and HiPDT, FSA- and PDA-based decoders, are presented using admissible (or exact) search, which is possible for HiFST with compact SCFG rulesets and HiPDT with compact LMs. For large rulesets with large LMs, we introduce a two-pass search strategy which we then analyze in terms of search errors and translation performance.
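For the FSA case, the LM intersection referred to here reduces to a weighted product construction; the PDA algorithms additionally track a pushdown stack and are not shown. A minimal sketch, assuming a toy FSA representation (states, final states, arcs as (label, cost, next)) rather than any particular toolkit:

```python
from collections import deque

class FSA:
    """Toy weighted acceptor: arcs[state] is a list of (label, cost, next_state)."""
    def __init__(self, start, finals, arcs):
        self.start, self.finals, self.arcs = start, frozenset(finals), arcs

def intersect(a, b):
    """Product construction: the result accepts a string iff both inputs do,
    with arc costs added (e.g. translation cost + language-model cost)."""
    start = (a.start, b.start)
    arcs, finals = {}, set()
    seen, queue = {start}, deque([start])
    while queue:
        qa, qb = pair = queue.popleft()
        if qa in a.finals and qb in b.finals:
            finals.add(pair)
        out = []
        for label, ca, na in a.arcs.get(qa, []):
            for lb, cb, nb in b.arcs.get(qb, []):
                if label == lb:
                    nxt = (na, nb)
                    out.append((label, ca + cb, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
        arcs[pair] = out
    return FSA(start, finals, arcs)
```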
Context-free reordering, finite-state translation
- In Proc. of HLT-NAACL
, 2010
"... We describe a class of translation model in which a set of input variants encoded as a context-free forest is translated using a finite-state translation model. The forest structure of the input is well-suited to representing word order alternatives, making it straightforward to model translation as ..."
Abstract
-
Cited by 12 (2 self)
- Add to MetaCart
We describe a class of translation model in which a set of input variants encoded as a context-free forest is translated using a finite-state translation model. The forest structure of the input is well-suited to representing word order alternatives, making it straightforward to model translation as a two-step process: (1) tree-based source reordering and (2) phrase transduction. By treating the reordering process as a latent variable in a probabilistic translation model, we can learn a long-range source reordering model without example reordered sentences, which are problematic to construct. The resulting model has state-of-the-art translation performance, uses linguistically motivated features to effectively model long-range reordering, and is significantly smaller than a comparable hierarchical phrase-based translation model.
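The two-step process can be illustrated with a toy model in which a few scored source permutations stand in for the reordering forest and a word-level table stands in for the phrase transducer; the reordering is summed out as a latent variable. A sketch under those simplifying assumptions (names like `reorder_scores` and `phrase_table` are illustrative, not the authors' code):

```python
import itertools
from collections import defaultdict

def translate(source, reorder_scores, phrase_table):
    """Toy two-step model: `reorder_scores` maps a permutation of source
    positions (standing in for the reordering forest) to its probability;
    each reordered sentence is then translated word by word with
    `phrase_table`, and the reordering is marginalised out."""
    p_target = defaultdict(float)
    for perm, p_reorder in reorder_scores.items():
        reordered = [source[i] for i in perm]
        for choices in itertools.product(*(phrase_table[w].items() for w in reordered)):
            target = " ".join(tgt for tgt, _ in choices)
            p_trans = 1.0
            for _, p in choices:
                p_trans *= p
            p_target[target] += p_reorder * p_trans
    return dict(p_target)

# e.g. translate(["gestern", "kam", "er"],
#                {(0, 1, 2): 0.4, (0, 2, 1): 0.6},
#                {"gestern": {"yesterday": 1.0}, "kam": {"came": 1.0}, "er": {"he": 1.0}})
```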
Kriya – An end-to-end Hierarchical Phrase-based MT System
, 2012
"... This paper describes Kriya — a new statistical machine translation (SMT) system that uses hierarchical phrases, which were first introduced in the Hiero machine translation system (Chiang, 2007). Kriya supports both a grammar extraction module for synchronous context-free grammars (SCFGs) and a CKY- ..."
Abstract
-
Cited by 11 (8 self)
- Add to MetaCart
This paper describes Kriya, a new statistical machine translation (SMT) system that uses hierarchical phrases, which were first introduced in the Hiero machine translation system (Chiang, 2007). Kriya supports both a grammar extraction module for synchronous context-free grammars (SCFGs) and a CKY-based decoder. There are several re-implementations of Hiero in the machine translation community, but Kriya offers the following novel contributions: (a) grammar extraction in Kriya supports not only the full set of Hiero-style SCFG rules but also several types of compact rule sets, which lead to faster decoding for different language pairs without compromising BLEU scores. Kriya currently supports extraction of compact SCFGs such as grammars with one non-terminal and grammar pruning based on certain rule patterns; and (b) the Kriya decoder offers some unique improvements in the implementation of cube pruning, such as increasing diversity in the target language n-best output and novel methods for language model (LM) integration. The Kriya decoder can take advantage of parallelization using a networked cluster. Kriya supports KenLM and SRILM for language model queries and exploits n-gram history states in KenLM. This paper also provides several experimental results which demonstrate that the translation quality of Kriya compares favourably to the Moses (Koehn et al., 2007) phrase-based system in several language pairs, while showing a substantial improvement for Chinese-English similar to Chiang (2007). We also quantify the model sizes for phrase-based and Hiero-style systems, apart from presenting experiments comparing variants of Hiero models.
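The cube-pruning improvements build on the standard lazy k-best combination loop, which can be sketched as follows. This is a generic version with a heap over index pairs; `combine` and the assumption that both input lists are sorted best-first are illustrative, not Kriya's actual implementation.

```python
import heapq

def cube_prune(rules, ants, k, combine):
    """Lazy k-best combination of two best-first-sorted score lists (rule
    applications x antecedent hypotheses), the core loop of cube pruning.
    With a non-monotonic `combine` (e.g. one that adds LM scores) the result
    is approximate, which is the trade-off cube pruning accepts."""
    if not rules or not ants:
        return []
    best, seen = [], {(0, 0)}
    heap = [(-combine(rules[0], ants[0]), 0, 0)]   # max-heap via negated scores
    while heap and len(best) < k:
        neg, i, j = heapq.heappop(heap)
        best.append((-neg, rules[i], ants[j]))
        for ni, nj in ((i + 1, j), (i, j + 1)):    # push the two grid neighbours
            if ni < len(rules) and nj < len(ants) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (-combine(rules[ni], ants[nj]), ni, nj))
    return best

# e.g. cube_prune([5.0, 3.0, 1.0], [4.0, 2.0], k=3, combine=lambda r, a: r + a)
```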
The RWTH Aachen Machine Translation System for WMT 2010
- In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
, 2010
"... In this paper the statistical machine translation (SMT) systems of RWTH Aachen University developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2011 is presented. We participated in the MT (English-French, Arabic-English, Chinese-English) and SLT ..."
Abstract
-
Cited by 8 (3 self)
- Add to MetaCart
(Show Context)
In this paper, the statistical machine translation (SMT) systems developed at RWTH Aachen University for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2011 are presented. We participated in the MT (English-French, Arabic-English, Chinese-English) and SLT (English-French) tracks. Both hierarchical and phrase-based SMT decoders are applied. A number of different techniques are evaluated, including domain adaptation via monolingual and bilingual data selection, phrase training, different lexical smoothing methods, additional reordering models for the hierarchical system, various Arabic and Chinese segmentation methods, punctuation prediction for speech recognition output, and system combination. By applying these methods, we show considerable improvements over the respective baseline systems.
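The abstract does not say which data-selection criterion is used; one widely used choice for this kind of domain adaptation is cross-entropy difference (Moore-Lewis style) scoring. A sketch under that assumption, with trivial unigram language models standing in for real ones:

```python
import math
from collections import Counter

def unigram_lm(corpus):
    """Tiny stand-in for a real LM: add-one-smoothed unigram probabilities."""
    counts = Counter(w for sent in corpus for w in sent.split())
    total, vocab = sum(counts.values()), len(counts) + 1
    return lambda w: (counts.get(w, 0) + 1) / (total + vocab)

def cross_entropy(lm, sent):
    words = sent.split()
    return -sum(math.log(lm(w)) for w in words) / max(len(words), 1)

def select(pool, in_domain, general, threshold=0.0):
    """Keep pool sentences that look more in-domain than general-domain,
    i.e. whose cross-entropy difference H_in(s) - H_gen(s) is below threshold."""
    lm_in, lm_gen = unigram_lm(in_domain), unigram_lm(general)
    return [s for s in pool
            if cross_entropy(lm_in, s) - cross_entropy(lm_gen, s) < threshold]
```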
Discriminative reordering extensions for hierarchical phrase-based machine translation
- In Proceedings of the 16th Annual Conference of the European Association for Machine Translation (EAMT)
, 2012
"... In this paper, we propose novel extensions of hierarchical phrase-based systems with a discriminative lexicalized reordering model. We compare different feature sets for the discriminative reordering model and investigate combinations with three types of non-lexicalized reordering rules which are ad ..."
Abstract
-
Cited by 7 (5 self)
- Add to MetaCart
In this paper, we propose novel extensions of hierarchical phrase-based systems with a discriminative lexicalized reordering model. We compare different feature sets for the discriminative reordering model and investigate combinations with three types of non-lexicalized reordering rules which are added to the hierarchical grammar in order to allow for more reordering flexibility during decoding. All extensions are evaluated in standard hierarchical setups as well as in setups where the hierarchical recursion depth is restricted. We achieve improvements of up to +1.2% BLEU on a large-scale Chinese→English translation task.
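A discriminative lexicalized reordering model of this kind is typically trained on orientation-labelled phrase pairs. A hedged sketch of how such training examples might be extracted; the feature set and the monotone/swap labelling below are illustrative assumptions, not the paper's exact configuration.

```python
def orientation(prev_tgt, cur_tgt):
    """Label the order of two source-adjacent phrases on the target side."""
    return "monotone" if cur_tgt[0] >= prev_tgt[1] else "swap"

def reordering_examples(derivation, src_words):
    """derivation: phrase pairs (src_span, tgt_span) in source order, with spans
    given as half-open (start, end) index pairs.  Emits (features, label) pairs
    that a discriminative reordering classifier could be trained on."""
    examples = []
    for (s_prev, t_prev), (s_cur, t_cur) in zip(derivation, derivation[1:]):
        feats = {
            "prev_last_src": src_words[s_prev[1] - 1],   # boundary word features
            "cur_first_src": src_words[s_cur[0]],
            "src_gap": s_cur[0] - s_prev[1],
        }
        examples.append((feats, orientation(t_prev, t_cur)))
    return examples
```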
Context-dependent alignment models for Statistical Machine Translation
- In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
, 2009
"... We introduce alignment models for Machine Translation that take into account the context of a source word when determining its translation. Since the use of these contexts alone causes data sparsity problems, we develop a decision tree algorithm for clustering the contexts based on optimisation of t ..."
Abstract
-
Cited by 6 (1 self)
- Add to MetaCart
We introduce alignment models for Machine Translation that take into account the context of a source word when determining its translation. Since the use of these contexts alone causes data sparsity problems, we develop a decision tree algorithm for clustering the contexts based on optimisation of the EM auxiliary function. We show that our context-dependent models lead to an improvement in alignment quality, and an increase in translation quality when the alignments are used in Arabic-English and Chinese-English translation.
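The split criterion can be sketched as the gain in the EM auxiliary function (expected complete-data log-likelihood) when a cluster of contexts is divided by a yes/no question. A minimal sketch, assuming expected counts c(context, translation) from the E-step; the data layout is illustrative, not the paper's implementation:

```python
import math
from collections import defaultdict

def expected_log_likelihood(counts):
    """Contribution of one cluster to the EM auxiliary function, given the
    expected counts of each translation under the current model."""
    total = sum(counts.values())
    return sum(c * math.log(c / total) for c in counts.values() if c > 0)

def split_gain(cluster, question):
    """cluster: {context: {translation: expected_count}}.  Returns the gain in
    the auxiliary function from splitting the cluster with a yes/no question
    about the context; the best question becomes a node of the decision tree."""
    yes, no, merged = defaultdict(float), defaultdict(float), defaultdict(float)
    for context, tgt_counts in cluster.items():
        side = yes if question(context) else no
        for t, c in tgt_counts.items():
            side[t] += c
            merged[t] += c
    return (expected_log_likelihood(yes) + expected_log_likelihood(no)
            - expected_log_likelihood(merged))
```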
Hierarchical Phrase-based Translation Grammars Extracted from Alignment Posterior Probabilities
"... We report on investigations into hierarchical phrase-based translation grammars based on rules extracted from posterior distributions over alignments of the parallel text. Rather than restrict rule extraction to a single alignment, such as Viterbi, we instead extract rules based on posterior distrib ..."
Abstract
-
Cited by 6 (1 self)
- Add to MetaCart
We report on investigations into hierarchical phrase-based translation grammars based on rules extracted from posterior distributions over alignments of the parallel text. Rather than restrict rule extraction to a single alignment, such as Viterbi, we instead extract rules based on posterior distributions provided by the HMM word-to-word alignment model. We define translation grammars progressively by adding classes of rules to a basic phrase-based system. We assess these grammars in terms of their expressive power, measured by their ability to align the parallel text from which their rules are extracted, and the quality of the translations they yield. In Chinese-to-English translation, we find that rule extraction from posteriors gives translation improvements. We also find that grammars with rules with only one nonterminal, when extracted from posteriors, can outperform more complex grammars extracted from Viterbi alignments. Finally, we show that the best way to exploit source-to-target and target-to-source alignment models is to build two separate systems and combine their output translation lattices.
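The basic idea of extracting from posteriors rather than a single Viterbi alignment can be illustrated by thresholding the link posteriors and then applying the usual consistency check; the paper's grammars also include hierarchical rules and posterior-weighted counts, which this sketch omits. The threshold and length limit are illustrative.

```python
def extract_phrases(posteriors, threshold=0.5, max_len=5):
    """Extract phrase pairs consistent with the link set
    {(i, j): posteriors[i][j] > threshold}, where posteriors[i][j] is the
    posterior probability (e.g. from an HMM alignment model) that source
    word i aligns to target word j.  Assumes a non-empty matrix."""
    I, J = len(posteriors), len(posteriors[0])
    links = {(i, j) for i in range(I) for j in range(J) if posteriors[i][j] > threshold}
    pairs = []
    for i1 in range(I):
        for i2 in range(i1, min(i1 + max_len, I)):
            tgt = [j for (i, j) in links if i1 <= i <= i2]
            if not tgt:
                continue
            j1, j2 = min(tgt), max(tgt)
            if j2 - j1 >= max_len:
                continue
            # consistency: no target word in [j1, j2] links outside [i1, i2]
            if all(i1 <= i <= i2 for (i, j) in links if j1 <= j <= j2):
                pairs.append(((i1, i2), (j1, j2)))
    return pairs
```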
Efficient path counting transducers for minimum Bayes-risk decoding of statistical machine translation lattices
, 2010
"... This paper presents an efficient implementation of linearised lattice minimum Bayes-risk decoding using weighted finite state transducers. We introduce transducers to efficiently count lattice paths containing n-grams and use these to gather the required statistics. We show that these procedures can ..."
Abstract
-
Cited by 6 (4 self)
- Add to MetaCart
This paper presents an efficient implementation of linearised lattice minimum Bayes-risk decoding using weighted finite state transducers. We introduce transducers to efficiently count lattice paths containing n-grams and use these to gather the required statistics. We show that these procedures can be implemented exactly through simple transformations of word sequences to sequences of n-grams. This yields a novel implementation of lattice minimum Bayes-risk decoding which is fast and exact even for very large lattices.
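Over an n-best list, the linearised MBR decision rule is easy to write down: each hypothesis is scored by its length plus its n-gram counts weighted by the n-gram posteriors. The sketch below approximates the lattice computation with an n-best list (computing those posteriors over all lattice paths is what the counting transducers are for); the theta weights are placeholders rather than tuned values from the paper.

```python
from collections import Counter

def ngram_counts(words, n):
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

def mbr_decode(nbest, theta0=-0.3, thetas=(0.3, 0.2, 0.1, 0.05), max_n=4):
    """Linearised n-gram MBR over an n-best list of
    (hypothesis_string, posterior_probability) pairs.  Scores each hypothesis
    as theta0 * length + sum over n-grams u of theta_n * count_u(hyp) * p(u occurs),
    with p(u occurs) estimated from the n-best posteriors."""
    occurs = Counter()   # posterior probability that each n-gram occurs
    for hyp, p in nbest:
        words = hyp.split()
        for n in range(1, max_n + 1):
            for u in ngram_counts(words, n):
                occurs[u] += p
    best_hyp, best_score = None, float("-inf")
    for hyp, _ in nbest:
        words = hyp.split()
        score = theta0 * len(words)
        for n in range(1, max_n + 1):
            for u, c in ngram_counts(words, n).items():
                score += thetas[n - 1] * c * occurs[u]
        if score > best_score:
            best_hyp, best_score = hyp, score
    return best_hyp
```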
Two monolingual parses are better than one (synchronous parse)
- In Proc. of HLT-NAACL
, 2010
"... We describe a synchronous parsing algorithm that is based on two successive monolingual parses of an input sentence pair. Although the worst-case complexity of this algorithm is and must be O(n6) for binary SCFGs, its average-case run-time is far better. We demonstrate that for a number of common sy ..."
Abstract
-
Cited by 5 (2 self)
- Add to MetaCart
We describe a synchronous parsing algorithm that is based on two successive monolingual parses of an input sentence pair. Although the worst-case complexity of this algorithm is, and must be, O(n⁶) for binary SCFGs, its average-case run-time is far better. We demonstrate that for a number of common synchronous parsing problems, the two-parse algorithm substantially outperforms alternative synchronous parsing strategies, making it efficient enough to be utilized without resorting to a pruned search.
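The monolingual half of such an algorithm is an ordinary CKY pass run once per language with the corresponding projection of the SCFG; combining the two charts into synchronous items is omitted here. A minimal CKY recognizer sketch, with an illustrative grammar encoding (binary rules and a lexicon as dicts) that is not the paper's implementation:

```python
from itertools import product

def cky_items(words, binary_rules, lexicon):
    """CKY recognizer over one projection of an SCFG.  binary_rules maps a pair
    of child nonterminals (B, C) to the set of parents {A : A -> B C}; lexicon
    maps a word to the set of nonterminals that can cover it.  Returns the set
    of chart items (nonterminal, start, end)."""
    n = len(words)
    chart = {(i, i + 1): set(lexicon.get(w, set())) for i, w in enumerate(words)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            cell = set()
            for j in range(i + 1, k):
                for B, C in product(chart[(i, j)], chart[(j, k)]):
                    cell |= binary_rules.get((B, C), set())
            chart[(i, k)] = cell
    return {(A, i, k) for (i, k), cell in chart.items() for A in cell}
```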