Results 1-10 of 68
A Syntax-based Statistical Translation Model
, 2001
Cited by 338 (15 self)
We present a syntax-based statistical translation model. Our model transforms a source-language parse tree into a target-language string by applying stochastic operations at each node. These operations capture linguistic differences such as word order and case marking. Model parameters are estimated in polynomial time using an EM algorithm. The model produces word alignments that are better than those produced by IBM Model 5.
A survey of statistical machine translation
, 2007
Cited by 93 (6 self)
Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and many popular techniques have only emerged within the last few years. This survey presents a tutorial overview of state-of-the-art SMT at the beginning of 2007. We begin with the context of the current research, and then move to a formal problem description and an overview of the four main subproblems: translational equivalence modeling, mathematical modeling, parameter estimation, and decoding. Along the way, we present a taxonomy of some different approaches within these areas. We conclude with an overview of evaluation and notes on future directions.
Improvements in Phrase-Based Statistical Machine Translation
 In Proc. of the Human Language Technology Conf. (HLT-NAACL)
, 2004
Cited by 88 (16 self)
In statistical machine translation, the currently best performing systems are based in some way on phrases or word groups. We describe the baseline phrase-based translation system and various refinements. We describe a highly efficient monotone search algorithm with a complexity linear in the input sentence length. We present translation results for three tasks: Verbmobil, Xerox and the Canadian Hansards. For the Xerox task, it takes less than 7 seconds to translate the whole test set consisting of more than 10K words. The translation results for the Xerox and Canadian Hansards task are very promising. The system even outperforms the alignment template system.
A Decoder for Syntax-based Statistical MT
 PROCEEDINGS OF THE 40TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL-02)
, 2002
Cited by 62 (2 self)
This paper describes a decoding algorithm for a syntax-based translation model (Yamada and Knight, 2001). The model has been extended to incorporate phrasal translations as presented here. In contrast to a conventional word-to-word statistical model, a decoder for the syntax-based model builds up an English parse tree given a sentence in a foreign language.
Novel reordering approaches in phrase-based statistical machine translation
 Proceedings of the ACL Workshop on Building and Using Parallel Texts: Data-Driven Machine Translation and Beyond
, 2005
Cited by 44 (14 self)
This paper presents novel approaches to reordering in phrase-based statistical machine translation. We perform consistent reordering of source sentences in training and estimate a statistical translation model. Using this model, we follow a phrase-based monotonic machine translation approach, for which we develop an efficient and flexible reordering framework that makes it easy to introduce different reordering constraints. In translation, we apply source sentence reordering on word level and use a reordering automaton as input. We show how to compute reordering automata on-demand using IBM or ITG constraints, and also introduce two new types of reordering constraints. We further add weights to the reordering automata. We present detailed experimental results and show that reordering significantly improves translation quality.
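As a rough illustration of what ITG constraints mean for reordering, the sketch below checks whether a word permutation can be produced by recursively splitting a span into two adjacent blocks that are either kept in order or swapped. This is not the paper's reordering automaton; the function names `is_itg` and `count_itg` are hypothetical, and the brute-force enumeration is only meant for very short sentences.

```python
from itertools import permutations


def is_itg(perm):
    """Return True if perm (a tuple of distinct ints) satisfies ITG constraints.

    A permutation qualifies if it has length <= 1, or if it can be cut into
    two contiguous blocks whose values are entirely below (straight order)
    or entirely above (inverted order) each other, with both blocks
    qualifying recursively.
    """
    n = len(perm)
    if n <= 1:
        return True
    for k in range(1, n):
        left, right = perm[:k], perm[k:]
        straight = max(left) < min(right)
        inverted = min(left) > max(right)
        if (straight or inverted) and is_itg(left) and is_itg(right):
            return True
    return False


def count_itg(n):
    """Count the ITG-permissible permutations of n words by enumeration."""
    return sum(is_itg(p) for p in permutations(range(n)))


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, count_itg(n))
```

For four words, 22 of the 24 permutations pass; the two that fail are the so-called inside-out orders `(1, 3, 0, 2)` and `(2, 0, 3, 1)`.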
A Comparative Study on Reordering Constraints in Statistical Machine Translation
, 2003
Cited by 44 (1 self)
In statistical machine translation, the generation of a translation hypothesis is computationally expensive. If arbitrary word-reorderings are permitted, the search problem is NP-hard. On the other hand, if we restrict the possible word-reorderings in an appropriate way, we obtain a polynomial-time search algorithm.
Stochastic lexicalized inversion transduction grammar for alignment
 In Proc. of ACL
, 2005
Cited by 37 (1 self)
We present a version of Inversion Transduction Grammar where rule probabilities are lexicalized throughout the synchronous parse tree, along with pruning techniques for efficient training. Alignment results improve over unlexicalized ITG on short sentences for which full EM is feasible, but pruning seems to have a negative impact on longer sentences.
Statistical approaches to computer-assisted translation
 Computational Linguistics
, 2008
Cited by 35 (14 self)
Current machine translation (MT) systems are still not perfect. In practice, the output from these systems needs to be edited to correct errors. A way of increasing the productivity of the whole translation process (MT plus human work) is to incorporate the human correction activities within the translation process itself, thereby shifting the MT paradigm to that of computer-assisted translation. This model entails an iterative process in which the human translator activity is included in the loop: In each iteration, a prefix of the translation is validated (accepted or amended) by the human and the system computes its best (or n-best) translation suffix hypothesis to complete this prefix. A successful framework for MT is the so-called statistical (or pattern recognition) framework. Interestingly, within this framework, the adaptation of MT systems to the interactive scenario affects mainly the search process, allowing a great reuse of successful techniques and models. In this article, alignment templates, phrase-based models, and stochastic finite-state transducers are used to develop computer-assisted translation systems. These systems were assessed in a European project (TransType2) in two real tasks: the translation of printer manuals and the translation of the Bulletin of the European Union. In each task, the following three pairs of languages were involved (in both translation
Greedy Decoding for Statistical Machine Translation in Almost Linear Time
, 2003
Cited by 33 (2 self)
We present improvements to a greedy decoding algorithm for statistical machine translation that reduce its time complexity from at least cubic (O(n^6) when applied naively) to practically linear time without sacrificing translation quality. We achieve this by integrating hypothesis evaluation into hypothesis creation, tiling improvements over the translation hypothesis at the end of each search iteration, and by imposing restrictions on the amount of word reordering during decoding.
Distortion models for statistical machine translation
 In ACL
, 2006
Cited by 28 (0 self)
In this paper, we argue that n-gram language models are not sufficient to address word reordering required for Machine Translation. We propose a new distortion model that can be used with existing phrase-based SMT decoders to address those n-gram language model limitations. We present empirical results in Arabic to English Machine Translation that show statistically significant improvements when our proposed model is used. We also propose a novel metric to measure word order similarity (or difference) between any pair of languages based on word alignments.