Sampling alignment structure under a Bayesian translation model
In Empirical Methods in Natural Language Processing (EMNLP), 2008
Cited by 17 (3 self)
We describe the first tractable Gibbs sampling procedure for estimating phrase pair frequencies under a probabilistic model of phrase alignment. We propose and evaluate two nonparametric priors that successfully avoid the degenerate behavior noted in previous work, where overly large phrases memorize the training data. Phrase table weights learned under our model yield an increase in BLEU score over the word-alignment-based heuristic estimates used regularly in phrase-based translation systems.
Using Syntax to Improve Word Alignment Precision for Syntax-Based Machine Translation
Cited by 12 (0 self)
Word alignments that violate syntactic correspondences interfere with the extraction of string-to-tree transducer rules for syntax-based machine translation. We present an algorithm for identifying and deleting incorrect word alignment links, using features of the extracted rules. We obtain gains in both alignment quality and translation quality in Chinese-English and Arabic-English translation experiments relative to a GIZA++ union baseline.
Learning for Semantic Parsing and Natural Language Generation Using Statistical Machine Translation Techniques
2007
Analysis and Evaluation of Comparable Corpora for Under Resourced Areas of Machine Translation
In Proceedings of the 3rd Workshop on Building and Using Comparable Corpora. Applications of Parallel and Comparable Corpora in Natural Language Engineering and the Humanities, 2010
Cited by 2 (1 self)
Lack of sufficient linguistic resources and parallel corpora for many languages and domains is currently one of the major obstacles to further advancement of automated translation. The solution proposed in this paper is to exploit the fact that non-parallel bi- or multilingual text resources are much more widely available than parallel translation data. This position paper presents previous research in this field and the research plans of the ACCURAT project. Its goal is to find, analyze and evaluate novel methods that exploit comparable corpora in order to compensate for the shortage of linguistic resources, and ultimately to significantly improve MT quality for under-resourced languages and narrow domains.
Improved Tree-to-string Transducer for Machine Translation
Cited by 2 (0 self)
We propose three enhancements to the tree-to-string (TTS) transducer for machine translation: first-level expansion-based normalization for TTS templates, a syntactic alignment framework integrating the insertion of unaligned target words, and a subtree-based n-gram model addressing the tree decomposition probability. Empirical results show that these methods improve the performance of a TTS transducer based on the standard BLEU-4 metric. We also experiment with semantic labels in a TTS transducer, and achieve improvement over our baseline system.
Machine Translation Using Automatically Inferred Construction-based Correspondence and Language Models
We discuss the problem of translation in the wider context of the problem of meaning in cognition and describe a structural statistical machine translation (MT) method motivated by philosophical, cognitive, and computational considerations. Our approach relies on a recently published algorithm capable of learning from a raw corpus a limited yet effective grammar that can be used to construct probabilistic parsers and language models, and on cognitively motivated heuristics for learning construction-based translation models. A pilot system has been implemented and tested successfully on simple English to Hebrew and Spanish to English translation tasks.
Can Semantic Roles Improve Syntax-Based Machine Translation?
This paper compares the performance of a tree-to-string (TTS) transducer with automatically generated/gold-standard parse trees and semantic roles. Experimental results show that improving the parsing quality can lead to significant improvement in MT performance, whereas adding semantic roles to the syntax tree labels does not improve the TTS transducer. Another approach to using semantic roles, skeleton template extraction, is proposed and shown to be better than extracting straight long templates down to a certain depth.
Two Methods for Extending Hierarchical Rules from the Bilingual Chart Parsing
This paper studies two methods for training hierarchical MT rules independently of word alignments. Bilingual chart parsing and the EM algorithm are used to train bitext correspondences. The first method, rule arithmetic, constructs new rules as combinations of existing and reliable rules used in the bilingual chart, significantly improving the translation accuracy on the German-English and Farsi-English translation tasks. The second method is proposed to construct additional rules directly from the chart using inside and outside probabilities to determine the span of the rule and its non-terminals. The paper also presents evidence that rule arithmetic can recover from alignment errors, and that it can learn rules that are difficult to learn from bilingual alignments.

