Indirect-HMM-based hypothesis alignment for combining outputs from machine translation systems (2008)

by X He, M Yang, J Gao, P Nguyen, R Moore
Venue: in Proc. EMNLP

Results 1 - 10 of 15

Incremental hypothesis alignment with flexible matching for building confusion networks: BBN system description for WMT09 system combination task

by Antti-Veikko I. Rosti, Bing Zhang, Spyros Matsoukas, Richard Schwartz - in Proc. WMT, 2009
Cited by 6 (2 self)
This paper describes the incremental hypothesis alignment algorithm used in the BBN submissions to the WMT09 system combination task. The alignment algorithm used a sentence-specific alignment order, flexible matching, and new shift heuristics. These refinements yield more compact confusion networks compared to using the pair-wise or incremental TER alignment algorithms. This should reduce the number of spurious insertions in the system combination output and make the system combination weight tuning converge faster. System combination experiments on the WMT09 test sets from five source languages to English are presented. The best BLEU scores were achieved by combining the English outputs of three systems from all five source languages.

Joint Decoding with Multiple Translation Models

by Yang Liu, Haitao Mi, Yang Feng, Qun Liu
Cited by 4 (0 self)
Current SMT systems usually decode with single translation models and cannot benefit from the strengths of other models in the decoding phase. We instead propose joint decoding, a method that combines multiple translation models in one decoder. Our joint decoder draws connections among multiple models by integrating the translation hypergraphs they produce individually. Therefore, one model can share translations and even derivations with other models. Comparable to the state-of-the-art system combination technique, joint decoding achieves an absolute improvement of 1.5 BLEU points over individual decoding.

Model combination for machine translation

by John DeNero, Shankar Kumar, Ciprian Chelba, Franz Och - In Proceedings of NAACL-HLT, 2010
Cited by 4 (0 self)
Machine translation benefits from two types of decoding techniques: consensus decoding over multiple hypotheses under a single model and system combination over hypotheses from different models. We present model combination, a method that integrates consensus decoding and system combination into a unified, forest-based technique. Our approach makes few assumptions about the underlying component models, enabling us to combine systems with heterogeneous structure. Unlike most system combination techniques, we reuse the search space of component models, which entirely avoids the need to align translation hypotheses. Despite its relative simplicity, model combination improves translation quality over a pipelined approach of first applying consensus decoding to individual systems, and then applying system combination to their output. We demonstrate BLEU improvements across data sets and language pairs in large-scale experiments.

Active Learning for Multilingual Statistical Machine Translation

by Gholamreza Haffari, Anoop Sarkar
Cited by 3 (1 self)
Statistical machine translation (SMT) models require bilingual corpora for training, and these corpora are often multilingual, with parallel text in multiple languages simultaneously. We introduce an active learning task of adding a new language to an existing multilingual set of parallel text and constructing high-quality MT systems from each language in the collection into this new target language. We show that adding a new language using active learning to the EuroParl corpus provides a significant improvement compared to a random sentence selection baseline. We also provide new, highly effective sentence selection methods that improve AL for phrase-based SMT in the multilingual and single-language-pair settings.

Combining multi-engine translations with Moses

by Yu Chen, Michael Jellinghaus, Andreas Eisele, Yi Zhang, Sabine Hunsicker, Silke Theison, Christian Federmann, Hans Uszkoreit - In Proceedings of the Fourth Workshop on Statistical Machine Translation, 2009
Cited by 2 (0 self)
We present a simple method for generating translations with the Moses toolkit (Koehn et al., 2007) from existing hypotheses produced by other translation engines. As the structures underlying these translation engines are not known, an evaluation-based strategy is applied to select systems for combination. The experiments show promising improvements in terms of BLEU.

Lattice-based system combination for statistical machine translation

by Yang Feng, Yang Liu, Haitao Mi, Qun Liu, Yajuan Lü - In Proceedings of EMNLP, 2009
Cited by 2 (0 self)
Current system combination methods usually use confusion networks to find consensus translations among different systems. Requiring one-to-one mappings between the words in candidate translations, confusion networks have difficulty in handling more general situations in which several words are connected to several other words. Instead, we propose a lattice-based system combination model that allows for such phrase alignments and uses lattices to encode all candidate translations. Experiments show that our approach achieves significant improvements over the state-of-the-art baseline system on Chinese-to-English translation test sets.

Sequential system combination for machine translation of speech

by Damianos Karakos, Sanjeev Khudanpur - in Proc. IEEE SLT-08, 2008
Cited by 2 (1 self)
System combination is a technique which has been shown to yield significant gains in speech recognition and machine translation. Most combination schemes perform an alignment between different system outputs in order to produce lattices (or confusion networks), from which a composite hypothesis is chosen, possibly with the help of a large language model. The benefit of this approach is two-fold: (i) whenever many systems agree with each other on a set of words, the combination output contains these words with high confidence; and (ii) whenever the systems disagree, the language model resolves the ambiguity based on the (probably correct) agreed-upon context. The case of machine translation system combination is more challenging because of the different word orders of the translations: the alignment has to incorporate computationally expensive movements of word blocks. In this paper, we show how one can combine translation outputs efficiently, extending the incremental alignment procedure of [1]. A comparison between different system combination design choices is performed on an Arabic speech translation task.

The RWTH System Combination System for WMT 2009

by Gregor Leusch, Evgeny Matusov, Hermann Ney
Cited by 1 (1 self)
RWTH participated in the System Combination task of the Fourth Workshop on Statistical Machine Translation (WMT 2009). Hypotheses from 9 German→English MT systems were combined into a consensus translation. This consensus translation scored 2.1% better in BLEU and 2.3% better in TER (abs.) than the best single system. In addition, cross-lingual output from 10 French, German, and Spanish→English systems was combined into a consensus translation, which gave an improvement of 2.0% in BLEU and 3.5% in TER (abs.) over the best single system.

Hypothesis Ranking and Two-Pass Approaches for Machine Translation System Combination

by Damianos Karakos, Jason Smith, Sanjeev Khudanpur
Given a number of machine translations of a source segment, the goal of system combination is to produce a new translation that has better quality than all of them. This paper describes a number of improvements that were recently added to the JHU system combination scheme: (i) a hypothesis ranking technique which orders the system outputs, on a per-segment basis, according to predicted translation quality, thus improving a subsequent incremental combination step; and (ii) a two-pass combination procedure, which first produces several combination outputs with the given translations, and then performs one more combination step with these new outputs. Results from the NIST MT09 informal system combination evaluation on Arabic-to-English and Urdu-to-English show that both approaches offer significant BLEU and TER gains over a baseline JHU combination scheme.

Incremental HMM Alignment for MT System Combination

by Chi-Ho Li, Yupeng Liu, Xiaodong He, Ning Xi
Inspired by the incremental TER alignment, we re-designed the Indirect HMM (IHMM) alignment, which is one of the best hypothesis alignment methods for conventional MT system combination, in an incremental manner. One crucial problem of incremental alignment is to align a hypothesis to a confusion network (CN). Our incremental IHMM alignment is implemented in three different ways: 1) treat CN spans as HMM states and define state transition as distortion over covered n-grams between two spans; 2) treat CN spans as HMM states and define state transition as distortion over words in component translations in the CN; and 3) use a consensus decoding algorithm over one hypothesis and multiple IHMMs, each of which corresponds to a component translation in the CN. All three approaches to incremental alignment based on IHMM are shown to be superior to both incremental TER alignment and conventional IHMM alignment in the setting of the Chinese-to-English track of the 2008 NIST Open MT evaluation.