Online Large-Margin Training of Syntactic and Structural Translation Features
Cited by 124 (12 self)
Minimum-error-rate training (MERT) is a bottleneck for current development in statistical machine translation because it is limited in the number of weights it can reliably optimize. Building on the work of Watanabe et al., we explore the use of the MIRA algorithm of Crammer et al. as an alternative to MERT. We first show that by parallel processing and exploiting more of the parse forest, we can obtain results using MIRA that match or surpass MERT in terms of both translation quality and computational cost. We then test the method on two classes of features that address deficiencies in the Hiero hierarchical phrase-based model: first, we simultaneously train a large number of Marton and Resnik’s soft syntactic constraints, and, second, we introduce a novel structural distortion model. In both cases we obtain significant improvements in translation performance. Optimizing them in combination, for a total of 56 feature weights, we improve performance by 2.6 BLEU on a subset of the NIST 2006 Arabic-English evaluation data.
Learning Translation Boundaries for Phrase-Based Decoding
Cited by 14 (3 self)
Constrained decoding is of great importance not only for speed but also for translation quality. Previous efforts explore soft syntactic constraints based on constituent boundaries deduced from parse trees of the source language. We present a new framework that establishes soft constraints based on a more natural alternative: translation boundaries rather than constituent boundaries. We propose simple classifiers to learn translation boundaries for any source sentence. The classifiers are trained directly on a word-aligned corpus without using any additional resources. We report the accuracy of our translation boundary classifiers. We show that using constraints based on translation boundaries predicted by our classifiers achieves significant improvements over the baseline on large-scale Chinese-to-English translation experiments. The new constraints also significantly outperform syntactic constraints based on constituent boundaries.
Syntactic Reordering in Preprocessing for Japanese→English Translation: MIT System Description for NTCIR-7 Patent Translation Task
In Proceedings of NTCIR-7 Workshop Meeting, 2008
Cited by 11 (0 self)
We experimented with a well-known technique of training a Japanese-English translation system on a Japanese training corpus that has been reordered into an English-like word order. We achieved surprisingly impressive results by naively reordering each Japanese sentence into reverse order. We also developed a reordering algorithm that transforms a Japanese dependency parse into English word order.
Non-Projective Parsing for Statistical Machine Translation
Cited by 9 (0 self)
We describe a novel approach for syntax-based statistical MT, which builds on a variant of tree adjoining grammar (TAG). Inspired by work in discriminative dependency parsing, the key idea in our approach is to allow highly flexible reordering operations during parsing, in combination with a discriminative model that can condition on rich features of the source-language string. Experiments on translation from German to English show improvements over phrase-based systems, both in terms of BLEU scores and in human evaluations.
Soft dependency constraints for reordering in hierarchical phrase-based translation
In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2011
Cited by 8 (0 self)
Long-distance reordering remains one of the biggest challenges facing machine translation. We derive soft constraints from the source dependency parse to directly address the reordering problem for the hierarchical phrase-based model. Our approach significantly improves Chinese–English machine translation on a large-scale task by 0.84 BLEU points on average. Moreover, when we switch the tuning function from BLEU to LRscore, which promotes reordering, we observe total improvements of 1.21 BLEU, 1.30 LRscore and 3.36 TER over the baseline. On average our approach improves reordering precision and recall by 6.9 and 0.3 absolute points, respectively, and is found to be especially effective for long-distance reordering.
A Source-side Decoding Sequence Model for Statistical Machine Translation
In Proceedings of the Conference of the Association for Machine Translation, 2010
Cited by 8 (3 self)
We propose a source-side decoding sequence language model for phrase-based statistical machine translation. This model is a reordering model in the sense that it helps the decoder find the correct decoding sequence. The model uses word-aligned bilingual training data. We show improved translation quality of up to 1.34% BLEU and 0.54% TER using this model compared to three other widely used reordering models.
A Syntax-Driven Bracketing Model for Phrase-Based Translation
In Proc. ACL, 2009
Cited by 8 (0 self)
Syntactic analysis influences the way in which the source sentence is translated. Previous efforts add syntactic constraints to phrase-based translation by directly rewarding/punishing a hypothesis whenever it matches/violates source-side constituents. We present a new model that automatically learns syntactic constraints, including but not limited to constituent matching/violation, from the training corpus. The model brackets a source phrase according to whether it satisfies the learnt syntactic constraints. The bracketed phrases are then translated as a whole unit by the decoder. Experimental results and analysis show that the new model outperforms previous methods and achieves a substantial improvement over the baseline, which is not syntactically informed.
Modeling Syntactic and Semantic Structures in Hierarchical Phrase-based Translation
Cited by 5 (0 self)
Incorporating semantic structure into a linguistics-free translation model is challenging, since semantic structures are closely tied to syntax. In this paper, we propose a two-level approach to exploiting predicate-argument structure reordering in a hierarchical phrase-based translation model. First, we introduce linguistically motivated constraints into a hierarchical model, guiding translation phrase choices in favor of those that respect syntactic boundaries. Second, based on such translation phrases, we propose a predicate-argument structure reordering model that predicts reordering not only between an argument and its predicate, but also between two arguments. Experiments on Chinese-to-English translation demonstrate that both advances significantly improve translation accuracy.
A framework for effectively integrating hard and soft syntactic rules into phrase based translation
Hong Kong. City University of Hong Kong, 2009
Cited by 2 (2 self)
In adding syntactic knowledge to phrase-based translation, using hard or soft syntactic rules to reorder the source language so that it closely approximates the target-language word order has been successful in improving translation quality. However, this approach propagates pre-reordering errors to the later translation step (decoding). In this paper, we propose a novel framework to integrate hard and soft syntactic rules into phrase-based translation more effectively. For a source sentence to be translated, hard or soft syntactic rules are first acquired from the source parse tree prior to translation. Then, instead of reordering the source sentence directly, the rules are used as a strong feature integrated into our elaborately designed model to help phrase reordering in the decoding stage. The experiments on NIST Chinese-to-English translation show that our approach, whether incorporating hard or soft rules, significantly outperforms the previous methods.
Hierarchical Chunk-to-String Translation
Cited by 2 (0 self)
We present a hierarchical chunk-to-string translation model, which can be seen as a compromise between the hierarchical phrase-based model and the tree-to-string model, to combine the merits of the two models. With the help of shallow parsing, our model learns rules consisting of words and chunks and meanwhile introduces syntax cohesion. Under the weighted synchronous context-free grammar defined by these rules, our model searches for the best translation derivation and yields the target translation simultaneously. Our experiments show that our model significantly outperforms the hierarchical phrase-based model and the tree-to-string model on English-Chinese translation tasks.