Results 1 -
8 of
8
Soft Syntactic Constraints for Hierarchical Phrased-Based Translation
"... In adding syntax to statistical MT, there is a tradeoff between taking advantage of linguistic analysis, versus allowing the model to exploit linguistically unmotivated mappings learned from parallel training data. A number of previous efforts have tackled this tradeoff by starting with a commitment ..."
Abstract
-
Cited by 21 (2 self)
- Add to MetaCart
In adding syntax to statistical MT, there is a tradeoff between taking advantage of linguistic analysis, versus allowing the model to exploit linguistically unmotivated mappings learned from parallel training data. A number of previous efforts have tackled this tradeoff by starting with a commitment to linguistically motivated analyses and then finding appropriate ways to soften that commitment. We present an approach that explores the tradeoff from the other direction, starting with a context-free translation model learned directly from aligned parallel text, and then adding soft constituent-level constraints based on parses of the source language. We obtain substantial improvements in performance for translation from Chinese and Arabic to English. 1
A tree sequence alignment-based tree-to-tree translation model
- In Proceedings of ACL
, 2008
"... This paper presents a translation model that is based on tree sequence alignment, where a tree sequence refers to a single sequence of subtrees that covers a phrase. The model leverages on the strengths of both phrase-based and linguistically syntax-based method. It automatically learns aligned tree ..."
Abstract
-
Cited by 15 (0 self)
- Add to MetaCart
This paper presents a translation model that is based on tree sequence alignment, where a tree sequence refers to a single sequence of subtrees that covers a phrase. The model leverages on the strengths of both phrase-based and linguistically syntax-based method. It automatically learns aligned tree sequence pairs with mapping probabilities from word-aligned biparsed parallel texts. Compared with previous models, it not only captures non-syntactic phrases and discontinuous phrases with linguistically structured features, but also supports multi-level structure reordering of tree typology with larger span. This gives our model stronger expressive power than other reported models. Experimental results on the NIST MT-2005 Chinese-English translation task show that our method statistically significantly outperforms the baseline systems. 1
The University of Maryland statistical machine translation system for the Fourth Workshop on Machine Translation
- In Proceedings of the EACL-2009 Workshop on Statistical Machine Translation
, 2009
"... This paper describes the techniques we explored to improve the translation of news text in the German-English and Hungarian-English tracks of the WMT09 shared translation task. Beginning with a convention hierarchical phrase-based system, we found benefits for using word segmentation lattices as inp ..."
Abstract
-
Cited by 6 (3 self)
- Add to MetaCart
This paper describes the techniques we explored to improve the translation of news text in the German-English and Hungarian-English tracks of the WMT09 shared translation task. Beginning with a convention hierarchical phrase-based system, we found benefits for using word segmentation lattices as input, explicit generation of beginning and end of sentence markers, minimum Bayes risk decoding, and incorporation of a feature scoring the alignment of function words in the hypothesized translation. We also explored the use of monolingual paraphrases to improve coverage, as well as co-training to improve the quality of the segmentation lattices used, but these did not lead to improvements. 1
Discriminative Word Alignment with a Function Word Reordering Model
"... We address the modeling, parameter estimation and search challenges that arise from the introduction of reordering models that capture non-local reordering in alignment modeling. In particular, we introduce several reordering models that utilize (pairs of) function words as contexts for alignment re ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
We address the modeling, parameter estimation and search challenges that arise from the introduction of reordering models that capture non-local reordering in alignment modeling. In particular, we introduce several reordering models that utilize (pairs of) function words as contexts for alignment reordering. To address the parameter estimation challenge, we propose to estimate these reordering models from a relatively small amount of manuallyaligned corpora. To address the search challenge, we devise an iterative local search algorithm that stochastically explores reordering possibilities. By capturing non-local reordering phenomena, our proposed alignment model bears a closer resemblance to stateof-the-art translation model. Empirical results show significant improvements in alignment quality as well as in translation performance over baselines in a large-scale Chinese-English translation task. 1
Linguistically Annotated BTG for Statistical Machine Translation
"... Bracketing Transduction Grammar (BTG) is a natural choice for effective integration of desired linguistic knowledge into statistical machine translation (SMT). In this paper, we propose a Linguistically Annotated BTG (LABTG) for SMT. It conveys linguistic knowledge of source-side syntax structures t ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
Bracketing Transduction Grammar (BTG) is a natural choice for effective integration of desired linguistic knowledge into statistical machine translation (SMT). In this paper, we propose a Linguistically Annotated BTG (LABTG) for SMT. It conveys linguistic knowledge of source-side syntax structures to BTG hierarchical structures through linguistic annotation. From the linguistically annotated data, we learn annotated BTG rules and train linguistically motivated phrase translation model and reordering model. We also present an annotation algorithm that captures syntactic information for BTG nodes. The experiments show that the LABTG approach significantly outperforms a baseline BTGbased system and a state-of-the-art phrasebased system on the NIST MT-05 Chineseto-English translation task. Moreover, we empirically demonstrate that the proposed method achieves better translation selection and phrase reordering. 1
Dependency-Based Bracketing Transduction Grammar for Statistical Machine Translation
"... In this paper, we propose a novel dependency-based bracketing transduction grammar for statistical machine translation, which converts a source sentence into a target dependency tree. Different from conventional bracketing transduction grammar models, we encode target dependency information into our ..."
Abstract
- Add to MetaCart
In this paper, we propose a novel dependency-based bracketing transduction grammar for statistical machine translation, which converts a source sentence into a target dependency tree. Different from conventional bracketing transduction grammar models, we encode target dependency information into our lexical rules directly, and then we employ two different maximum entropy models to determine the reordering and combination of partial dependency structures, when we merge two neighboring blocks. By incorporating dependency language model further, large-scale experiments on Chinese-English task show that our system achieves significant improvements over the baseline system on various test sets even with fewer phrases. 1
Enhancing Morphological Alignment for Translating Highly Inflected Languages ∗
"... We propose an unsupervised approach utilizing only raw corpora to enhance morphological alignment involving highly inflected languages. Our method focuses on closed-class morphemes, modeling their influence on nearby words. Our languageindependent model recovers important links missing in the IBM Mo ..."
Abstract
- Add to MetaCart
We propose an unsupervised approach utilizing only raw corpora to enhance morphological alignment involving highly inflected languages. Our method focuses on closed-class morphemes, modeling their influence on nearby words. Our languageindependent model recovers important links missing in the IBM Model 4 alignment and demonstrates improved end-toend translations for English-Finnish and English-Hungarian. 1
Generalizing Hierarchical Phrase-based Translation using Rules with Adjacent Nonterminals
"... Hierarchical phrase-based translation (Hiero, (Chiang, 2005)) provides an attractive framework within which both short- and longdistance reorderings can be addressed consistently and ef ciently. However, Hiero is generally implemented with a constraint preventing the creation of rules with adjacent ..."
Abstract
- Add to MetaCart
Hierarchical phrase-based translation (Hiero, (Chiang, 2005)) provides an attractive framework within which both short- and longdistance reorderings can be addressed consistently and ef ciently. However, Hiero is generally implemented with a constraint preventing the creation of rules with adjacent nonterminals, because such rules introduce computational and modeling challenges. We introduce methods to address these challenges, and demonstrate that rules with adjacent nonterminals can improve Hiero's generalization power and lead to signi cant performance gains in Chinese-English translation. 1

