Results 1 - 10
of
18
A simple and effective hierarchical phrase reordering model
- In Proceedings of EMNLP 2008
, 2008
"... While phrase-based statistical machine translation systems currently deliver state-of-theart performance, they remain weak on word order changes. Current phrase reordering models can properly handle swaps between adjacent phrases, but they typically lack the ability to perform the kind of long-dista ..."
Abstract
-
Cited by 19 (5 self)
- Add to MetaCart
While phrase-based statistical machine translation systems currently deliver state-of-theart performance, they remain weak on word order changes. Current phrase reordering models can properly handle swaps between adjacent phrases, but they typically lack the ability to perform the kind of long-distance reorderings possible with syntax-based systems. In this paper, we present a novel hierarchical phrase reordering model aimed at improving non-local reorderings, which seamlessly integrates with a standard phrase-based system with little loss of computational efficiency. We show that this model can successfully handle the key examples often used to motivate syntax-based systems, such as the rotation of a prepositional phrase around a noun phrase. We contrast our model with reordering models commonly used in phrase-based systems, and show that our approach provides statistically significant BLEU point gains for two language pairs: Chinese-English (+0.53 on MT05 and +0.71 on MT08) and Arabic-English (+0.55 on MT05). 1
Why Synchronous Tree Substitution Grammars?
"... Synchronous tree substitution grammars are a translation model that is used in syntax-based machine translation. They are investigated in a formal setting and compared to a competitor that is at least as expressive. The competitor is the extended multi bottom-up tree transducer, which is the bottom- ..."
Abstract
-
Cited by 5 (4 self)
- Add to MetaCart
Synchronous tree substitution grammars are a translation model that is used in syntax-based machine translation. They are investigated in a formal setting and compared to a competitor that is at least as expressive. The competitor is the extended multi bottom-up tree transducer, which is the bottom-up analogue with one essential additional feature. This model has been investigated in theoretical computer science, but seems widely unknown in natural language processing. The two models are compared with respect to standard algorithms (binarization, regular restriction, composition, application). Particular attention is paid to the complexity of the algorithms. 1
Joint Parsing and Translation
"... Tree-based translation models, which exploit the linguistic syntax of source language, usually separate decoding into two steps: parsing and translation. Although this separation makes tree-based decoding simple and efficient, its translation performance is usually limited by the number of parse tre ..."
Abstract
-
Cited by 4 (1 self)
- Add to MetaCart
Tree-based translation models, which exploit the linguistic syntax of source language, usually separate decoding into two steps: parsing and translation. Although this separation makes tree-based decoding simple and efficient, its translation performance is usually limited by the number of parse trees offered by parser. Alternatively, we propose to parse and translate jointly by casting tree-based translation as parsing. Given a source-language sentence, our joint decoder produces a parse tree on the source side and a translation on the target side simultaneously. By combining translation and parsing models in a discriminative framework, our approach significantly outperforms a forestbased tree-to-string system by 1.1 absolute BLEU points on the NIST 2005 Chinese-English test set. As a parser, our joint decoder achieves an F1 score of 80.6 % on the Penn Chinese Treebank. 1
Asynchronous Binarization for Synchronous Grammars
"... Binarization of n-ary rules is critical for the efficiency of syntactic machine translation decoding. Because the target side of a rule will generally reorder the source side, it is complex (and sometimes impossible) to find synchronous rule binarizations. However, we show that synchronous binarizat ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
Binarization of n-ary rules is critical for the efficiency of syntactic machine translation decoding. Because the target side of a rule will generally reorder the source side, it is complex (and sometimes impossible) to find synchronous rule binarizations. However, we show that synchronous binarizations are not necessary in a two-stage decoder. Instead, the grammar can be binarized one way for the parsing stage, then rebinarized in a different way for the reranking stage. Each individual binarization considers only one monolingual projection of the grammar, entirely avoiding the constraints of synchronous binarization and allowing binarizations that are separately optimized for each stage. Compared to n-ary forest reranking, even simple target-side binarization schemes improve overall decoding accuracy. 1
Two monolingual parses are better than one (synchronous parse
- In Proc. of HLT-NAACL
, 2010
"... We describe a synchronous parsing algorithm that is based on two successive monolingual parses of an input sentence pair. Although the worst-case complexity of this algorithm is and must be O(n6) for binary SCFGs, its average-case run-time is far better. We demonstrate that for a number of common sy ..."
Abstract
-
Cited by 2 (2 self)
- Add to MetaCart
We describe a synchronous parsing algorithm that is based on two successive monolingual parses of an input sentence pair. Although the worst-case complexity of this algorithm is and must be O(n6) for binary SCFGs, its average-case run-time is far better. We demonstrate that for a number of common synchronous parsing problems, the two-parse algorithm substantially outperforms alternative synchronous parsing strategies, making it efficient enough to be utilized without resorting to a pruned search. 1
On the Expressivity of Linear Transductions
"... We investigate the formal expressivity properties of linear transductions, the class of transductions generated by linear transduction grammars, linear inversion transduction grammars and preterminalized linear inversion transduction grammars. While empirical results such as those in previous work a ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
We investigate the formal expressivity properties of linear transductions, the class of transductions generated by linear transduction grammars, linear inversion transduction grammars and preterminalized linear inversion transduction grammars. While empirical results such as those in previous work are of course an ultimate test of modeling adequacy for machine translation applications, it is equally important to understand the formal theoretical properties of any such new representation. An important part of the expressivity of a transduction is the possibility to align tokens between the two languages generated. We refer to the number of different alignments that are allowed under a transduction as its weak alignment capacity. This aspect of expressivity is quantified for linear transductions using preterminalized linear inversion transduction grammars, and compared to the expressivity of finite-state transductions, inversion transductions and syntax-directed transductions. 1
Phrase Translation Probabilities with ITG Priors and Smoothing as Learning Objective
"... The conditional phrase translation probabilities constitute the principal components of phrase-based machine translation systems. These probabilities are estimated using a heuristic method that does not seem to optimize any reasonable objective function of the word-aligned, parallel training corpus. ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
The conditional phrase translation probabilities constitute the principal components of phrase-based machine translation systems. These probabilities are estimated using a heuristic method that does not seem to optimize any reasonable objective function of the word-aligned, parallel training corpus. Earlier efforts on devising a better understood estimator either do not scale to reasonably sized training data, or lead to deteriorating performance. In this paper we explore a new approach based on three ingredients (1) A generative model with a prior over latent segmentations derived from Inversion Transduction Grammar (ITG), (2) A phrase table containing all phrase pairs without length limit, and (3) Smoothing as learning objective using a novel Maximum-A-Posteriori version of Deleted Estimation working with Expectation-Maximization. Where others conclude that latent segmentations lead to overfitting and deteriorating performance, we show here that these three ingredients give performance equivalent to the heuristic method on reasonably sized training data.
Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems
"... We study the problem of finding the best headdriven parsing strategy for Linear Context-Free Rewriting System productions. A headdriven strategy must begin with a specified righthand-side nonterminal (the head) and add the remaining nonterminals one at a time in any order. We show that it is NP-hard ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
We study the problem of finding the best headdriven parsing strategy for Linear Context-Free Rewriting System productions. A headdriven strategy must begin with a specified righthand-side nonterminal (the head) and add the remaining nonterminals one at a time in any order. We show that it is NP-hard to find the best head-driven strategy in terms of either the time or space complexity of parsing. 1
A Systematic Comparison between Inversion Transduction Grammar and Linear Transduction Grammar for Word Alignment
"... We present two contributions to grammar driven translation. First, since both Inversion Transduction Grammar and Linear Inversion Transduction Grammars have been shown to produce better alignments then the standard word alignment tool, we investigate how the trade-off between speed and end-to-end tr ..."
Abstract
- Add to MetaCart
We present two contributions to grammar driven translation. First, since both Inversion Transduction Grammar and Linear Inversion Transduction Grammars have been shown to produce better alignments then the standard word alignment tool, we investigate how the trade-off between speed and end-to-end translation quality extends to the choice of grammar formalism. Second, we prove that Linear Transduction Grammars (LTGs) generate the same transductions as Linear Inversion Transduction Grammars, and present a scheme for arriving at LTGs by bilingualizing Linear Grammars. We also present a method for obtaining Inversion Transduction Grammars from Linear (Inversion) Transduction Grammars, which can speed up grammar induction from parallel corpora dramatically. 1
An Alternative to Synchronous Tree Substitution Grammars †
, 2010
"... Synchronous tree substitution grammars (stsg) are a (formal) tree transformation model that is used in the area of syntax-based machine translation. A competitor that is at least as expressive as stsg is proposed and compared to stsg. The competitor is the extended multi bottom-up tree transducer (m ..."
Abstract
- Add to MetaCart
Synchronous tree substitution grammars (stsg) are a (formal) tree transformation model that is used in the area of syntax-based machine translation. A competitor that is at least as expressive as stsg is proposed and compared to stsg. The competitor is the extended multi bottom-up tree transducer (mbot), which is the bottom-up analogue with the additional feature that states have non-unary ranks. Unweighted mbot have already been investigated with respect to their basic properties, but the particular properties of the constructions that are required in the machine translation task are largely unknown. stsg and mbot are compared with respect to binarization, regular restriction, and application. Particular attention is paid to the complexity of the constructions. 1

