Results 1 -
6 of
6
Effective use of linguistic and contextual information for statistical machine translation
- In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing
, 2008
"... Current methods of using lexical features in machine translation have difficulty in scaling up to realistic MT tasks due to a prohibitively large number of parameters involved. In this paper, we propose methods of using new linguistic and contextual features that do not suffer from this problem and ..."
Abstract
-
Cited by 8 (1 self)
- Add to MetaCart
Current methods of using lexical features in machine translation have difficulty in scaling up to realistic MT tasks due to a prohibitively large number of parameters involved. In this paper, we propose methods of using new linguistic and contextual features that do not suffer from this problem and apply them in a state-ofthe-art hierarchical MT system. The features used in this work are non-terminal labels, non-terminal length distribution, source string context and source dependency LM scores. The effectiveness of our techniques is demonstrated by significant improvements over a strong baseline. On Arabic-to-English translation, improvements in lower-cased BLEU are
Maximum Entropy based Rule Selection Model for Syntax-based Statistical Machine Translation
"... This paper proposes a novel maximum entropy based rule selection (MERS) model for syntax-based statistical machine translation (SMT). The MERS model combines local contextual information around rules and information of sub-trees covered by variables in rules. Therefore, our model allows the decoder ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
This paper proposes a novel maximum entropy based rule selection (MERS) model for syntax-based statistical machine translation (SMT). The MERS model combines local contextual information around rules and information of sub-trees covered by variables in rules. Therefore, our model allows the decoder to perform context-dependent rule selection during decoding. We incorporate the MERS model into a state-of-the-art linguistically syntax-based SMT model, the treeto-string alignment template model. Experiments show that our approach achieves significant improvements over the baseline system.
C&I Business Chinese Academy of Sciences
"... This paper presents a novel filtration criterion to restrict the rule extraction for the hierarchical phrase-based translation model, where a bilingual but relaxed wellformed dependency restriction is used to filter out bad rules. Furthermore, a new feature which describes the regularity that the so ..."
Abstract
- Add to MetaCart
This paper presents a novel filtration criterion to restrict the rule extraction for the hierarchical phrase-based translation model, where a bilingual but relaxed wellformed dependency restriction is used to filter out bad rules. Furthermore, a new feature which describes the regularity that the source/target dependency edge triggers the target/source word is also proposed. Experimental results show that, the new criteria weeds out about 40 % rules while with translation performance improvement, and the new feature brings another improvement to the baseline system, especially on larger corpus. 1
Maximum Entropy Based Phrase Reordering for Hierarchical Phrase-based Translation
"... Hierarchical phrase-based (HPB) translation provides a powerful mechanism to capture both short and long distance phrase reorderings. However, the phrase reorderings lack of contextual information in conventional HPB systems. This paper proposes a contextdependent phrase reordering approach that use ..."
Abstract
- Add to MetaCart
Hierarchical phrase-based (HPB) translation provides a powerful mechanism to capture both short and long distance phrase reorderings. However, the phrase reorderings lack of contextual information in conventional HPB systems. This paper proposes a contextdependent phrase reordering approach that uses the maximum entropy (MaxEnt) model to help the HPB decoder select appropriate reordering patterns. We classify translation rules into several reordering patterns, and build a MaxEnt model for each pattern based on various contextual features. We integrate the MaxEnt models into the HPB model. Experimental results show that our approach achieves significant improvements over a standard HPB system on large-scale translation tasks. On Chinese-to-English translation, the absolute improvements in BLEU (caseinsensitive) range from 1.2 to 2.1. 1
Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence Constraint Optimization Approach to Context Based Word Selection
"... Consistent word selection in machine translation is currently realized by resolving word sense ambiguity through the context of a single sentence or neighboring sentences. However, consistent word selection over the whole article has yet to be achieved. Consistency over the whole article is extremel ..."
Abstract
- Add to MetaCart
Consistent word selection in machine translation is currently realized by resolving word sense ambiguity through the context of a single sentence or neighboring sentences. However, consistent word selection over the whole article has yet to be achieved. Consistency over the whole article is extremely important when applying machine translation to collectively developed documents like Wikipedia. In this paper, we propose to consider constraints between words in the whole article based on their semantic relatedness and contextual distance. The proposed method is successfully implemented in both statistical and rule-based translators. We evaluate those systems by translating 100 articles in the English Wikipedia into Japanese. The results show that the ratio of appropriate word selection for common nouns increased to around 75 % with our method, while it was around 55% without our method. 1

