Results 1–10 of 48
A survey of statistical machine translation
, 2007
"... Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of humanproduced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and many popular tec ..."
Abstract

Cited by 59 (5 self)
Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and many popular techniques have only emerged within the last few years. This survey presents a tutorial overview of state-of-the-art SMT at the beginning of 2007. We begin with the context of the current research, and then move to a formal problem description and an overview of the four main subproblems: translational equivalence modeling, mathematical modeling, parameter estimation, and decoding. Along the way, we present a taxonomy of some different approaches within these areas. We conclude with an overview of evaluation and notes on future directions.
Hierarchical phrase-based translation with weighted finite state transducers and . . .
 In Proceedings of HLT/NAACL
, 2010
"... In this article we describe HiFST, a latticebased decoder for hierarchical phrasebased translation and alignment. The decoder is implemented with standard Weighted FiniteState Transducer (WFST) operations as an alternative to the wellknown cube pruning procedure. We find that the use of WFSTs ra ..."
Abstract

Cited by 30 (14 self)
In this article we describe HiFST, a lattice-based decoder for hierarchical phrase-based translation and alignment. The decoder is implemented with standard Weighted Finite-State Transducer (WFST) operations as an alternative to the well-known cube pruning procedure. We find that the use of WFSTs rather than k-best lists requires less pruning in translation search, resulting in fewer search errors, better parameter optimization, and improved translation performance. The direct generation of translation lattices in the target language can improve subsequent rescoring procedures, yielding further gains when applying long-span language models and Minimum Bayes Risk decoding. We also provide insights as to how to control the size of the search space defined by hierarchical rules. We show that shallow-n grammars, low-level rule catenation, and other search constraints can help to match the power of the translation system to specific language pairs.
An efficient two-pass approach to synchronous-CFG driven statistical MT
 In Proc. of HLT-NAACL
, 2007
"... We present an efficient, novel twopass approach to mitigate the computational impact resulting from online intersection of an ngram language model (LM) and a probabilistic synchronous contextfree grammar (PSCFG) for statistical machine translation. In first pass CYKstyle decoding, we consider fi ..."
Abstract

Cited by 22 (4 self)
We present an efficient, novel two-pass approach to mitigate the computational impact resulting from online intersection of an n-gram language model (LM) and a probabilistic synchronous context-free grammar (PSCFG) for statistical machine translation. In first-pass CYK-style decoding, we consider first-best chart item approximations, generating a hypergraph of sentence-spanning target language derivations. In the second stage, we instantiate specific alternative derivations from this hypergraph, using the LM to drive this search process, recovering from search errors made in the first pass. Model search errors in our approach are comparable to those made by the state-of-the-art “Cube Pruning” approach in (Chiang, 2007) under comparable pruning conditions evaluated on both hierarchical and syntax-based grammars.
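The “Cube Pruning” baseline that this and several other entries compare against rests on one core idea: when two score-sorted hypothesis lists are combined with a monotonic score function, the k best combinations can be enumerated lazily with a heap instead of scoring the full grid. A minimal sketch of that idea, not of any paper's full decoder (the function name cube_top_k and the toy additive scores are illustrative assumptions):

```python
import heapq

def cube_top_k(a, b, combine, k):
    """Lazily enumerate the k best combinations of two score-sorted
    lists `a` and `b` (best first), expanding the grid of index pairs
    frontier-first instead of scoring all len(a)*len(b) cells."""
    heap = [(-combine(a[0], b[0]), 0, 0)]  # max-heap via negated scores
    seen = {(0, 0)}
    out = []
    while heap and len(out) < k:
        neg, i, j = heapq.heappop(heap)
        out.append(-neg)
        # Push the two grid neighbors of the popped cell.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(a) and nj < len(b) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (-combine(a[ni], b[nj]), ni, nj))
    return out

# Toy example with additive log-probabilities:
best = cube_top_k([0.0, -1.0, -4.0], [0.0, -2.0], lambda x, y: x + y, 3)
# best == [0.0, -1.0, -2.0]
```

In real decoders the combination cost includes non-monotonic language-model terms, which is why cube pruning is approximate and can commit the search errors these abstracts measure.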
Rule filtering by pattern for efficient hierarchical translation
 In Proceedings of the EACL
, 2009
"... We describe refinements to hierarchical translation search procedures intended to reduce both search errors and memory usage through modifications to hypothesis expansion in cube pruning and reductions in the size of the rule sets used in translation. Rules are put into syntactic classes based on th ..."
Abstract

Cited by 17 (3 self)
We describe refinements to hierarchical translation search procedures intended to reduce both search errors and memory usage through modifications to hypothesis expansion in cube pruning and reductions in the size of the rule sets used in translation. Rules are put into syntactic classes based on the number of nonterminals and the pattern, and various filtering strategies are then applied to assess the impact on translation speed and quality. Results are reported on the 2008 NIST Arabic-to-English evaluation task.
Restructuring, Relabeling, and Realigning for Syntax-Based Machine Translation
"... Language Weaver, Inc. This article shows that the structure of bilingual material from standard parsing and alignment tools is not optimal for training syntaxbased statistical machine translation (SMT) systems. We present three modifications to the MT training data to improve the accuracy of a stat ..."
Abstract

Cited by 15 (0 self)
Language Weaver, Inc. This article shows that the structure of bilingual material from standard parsing and alignment tools is not optimal for training syntax-based statistical machine translation (SMT) systems. We present three modifications to the MT training data to improve the accuracy of a state-of-the-art syntax MT system: restructuring changes the syntactic structure of training parse trees to enable reuse of substructures; relabeling alters bracket labels to enrich rule application context; and realigning unifies word alignment across sentences to remove bad word alignments and refine good ones. Better structures, labels, and word alignments are learned by the EM algorithm. We show that each individual technique leads to improvement as measured by BLEU, and we also show that the greatest improvement is achieved by combining them. We report an overall 1.48 BLEU improvement on the NIST08 evaluation set over a strong baseline in Chinese/English translation. Background: Syntactic methods have recently proven useful in statistical machine translation (SMT). In this article, we explore different ways of exploiting the structure of bilingual material for syntax-based SMT. In particular, we ask what kinds of tree structures, tree labels, and word alignments are best suited for improving end-to-end translation accuracy. We begin with structures from standard parsing and alignment tools, then use the EM algorithm to revise these structures in light of the translation task. We report an overall +1.48 BLEU improvement on a standard Chinese-to-English test.
Syntactic realignment models for machine translation
 In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007)
, 2007
"... We present a method for improving word alignment for statistical syntaxbased machine translation that employs a syntactically informed alignment model closer to the translation model than commonlyused word alignment models. This leads to extraction of more useful linguistic patterns and improved B ..."
Abstract

Cited by 14 (3 self)
We present a method for improving word alignment for statistical syntax-based machine translation that employs a syntactically informed alignment model closer to the translation model than commonly used word alignment models. This leads to extraction of more useful linguistic patterns and improved BLEU scores on translation experiments in Chinese and Arabic. Methods of statistical MT: Roughly speaking, there are two paths commonly taken in statistical machine translation (Figure 1). The idealistic path uses an unsupervised learning algorithm such as EM (Dempster et al., 1977) to learn parameters for some proposed translation model from a bitext training corpus, and then directly translates using the weighted model. Some examples of the idealistic approach are the direct IBM word model (Berger et al., 1994; Germann et al., 2001), the phrase-based approach of Marcu and Wong (2002), and the syntax approaches of Wu (1996) and Yamada and Knight (2001). Idealistic approaches are conceptually simple and thus easy to relate to observed phenomena. However, as more parameters are added to the model, the idealistic approach has not scaled well, for it is increasingly difficult to incorporate large amounts of training data efficiently over an increasingly large search space. Additionally, the EM procedure has a tendency to overfit its training data when the input units have varying explanatory powers, such as variable-size phrases or variable-height trees.
A Bayesian model of syntax-directed tree-to-string grammar induction
 In Proceedings of the Conference on Empirical Methods in Natural Language Processing
, 2009
"... Tree based translation models are a compelling means of integrating linguistic information into machine translation. Syntax can inform lexical selection and reordering choices and thereby improve translation quality. Research to date has focussed primarily on decoding with such models, but less ..."
Abstract

Cited by 12 (2 self)
Tree-based translation models are a compelling means of integrating linguistic information into machine translation. Syntax can inform lexical selection and reordering choices and thereby improve translation quality. Research to date has focussed primarily on decoding with such models, but less on the difficult problem of inducing the bilingual grammar from data. We propose a generative Bayesian model of tree-to-string translation which induces grammars that are both smaller and produce better translations than the previous heuristic two-stage approach, which employs a separate word alignment step.
Efficient Parsing for Transducer Grammars
"... The treetransducer grammars that arise in current syntactic machine translation systems are large, flat, and highly lexicalized. We address the problem of parsing efficiently with such grammars in three ways. First, we present a pair of grammar transformations that admit an efficient cubictime CKY ..."
Abstract

Cited by 11 (3 self)
The tree-transducer grammars that arise in current syntactic machine translation systems are large, flat, and highly lexicalized. We address the problem of parsing efficiently with such grammars in three ways. First, we present a pair of grammar transformations that admit an efficient cubic-time CKY-style parsing algorithm despite leaving most of the grammar in n-ary form. Second, we show how the number of intermediate symbols generated by this transformation can be substantially reduced through binarization choices. Finally, we describe a two-pass coarse-to-fine parsing approach that prunes the search space using predictions from a subset of the original grammar. In all, parsing time reduces by 81%. We also describe a coarse-to-fine pruning scheme for forest-based language model reranking that allows a 100-fold increase in beam size while reducing decoding time. The resulting translations improve by 1.3 BLEU.
Factorization of synchronous context-free grammars in linear time
 In NAACL Workshop on Syntax and Structure in Statistical Translation (SSST)
, 2007
"... Factoring a Synchronous ContextFree Grammar into an equivalent grammar with a smaller number of nonterminals in each rule enables synchronous parsing algorithms of lower complexity. The problem can be formalized as searching for the treedecomposition of a given permutation with the minimal branchi ..."
Abstract

Cited by 9 (5 self)
Factoring a Synchronous Context-Free Grammar into an equivalent grammar with a smaller number of nonterminals in each rule enables synchronous parsing algorithms of lower complexity. The problem can be formalized as searching for the tree decomposition of a given permutation with the minimal branching factor. In this paper, by modifying the algorithm of Uno and Yagiura (2000) for the closely related problem of finding all common intervals of two permutations, we achieve a linear-time algorithm for the permutation factorization problem. We also use the algorithm to analyze the maximum SCFG rule length needed to cover hand-aligned data from various language pairs.
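The intervals this abstract refers to can be made concrete: a span of a permutation is factorable into a sub-rule exactly when its values form a contiguous range. A quadratic-time sketch of that interval test, not the linear-time algorithm the paper derives (the helper name permutation_intervals is an illustrative assumption):

```python
def permutation_intervals(pi):
    """Return all spans (i, j) of permutation `pi` whose values form a
    contiguous range -- the candidate blocks an SCFG rule can be
    factored into. Quadratic-time version of the check; Uno and
    Yagiura's algorithm recovers the same intervals in linear time."""
    n = len(pi)
    spans = []
    for i in range(n):
        lo = hi = pi[i]
        for j in range(i, n):
            lo, hi = min(lo, pi[j]), max(hi, pi[j])
            if hi - lo == j - i:  # values pi[i..j] are one contiguous block
                spans.append((i, j))
    return spans

# For the permutation (2, 4, 3, 1): positions 1..2 carry values 4, 3,
# a contiguous block, so that span can be factored out as a sub-rule.
blocks = permutation_intervals([2, 4, 3, 1])
```

Singleton spans and the full span are always intervals; the interesting output is the nontrivial blocks in between, which determine how far the rule's rank can be reduced.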
Prior derivation models for formally syntax-based translation using linguistically syntactic parsing and tree kernels
 In Proceedings of ACL-08: HLT SSST-2
, 2008
"... This paper presents an improved formally syntaxbased SMT model, which is enriched by linguistically syntactic knowledge obtained from statistical constituent parsers. We propose a linguisticallymotivated prior derivation model to score hypothesis derivations on top of the baseline model during the ..."
Abstract

Cited by 6 (5 self)
This paper presents an improved formally syntax-based SMT model, which is enriched by linguistically syntactic knowledge obtained from statistical constituent parsers. We propose a linguistically motivated prior derivation model to score hypothesis derivations on top of the baseline model during translation decoding. Moreover, we devise a fast training algorithm to achieve such improved models based on tree kernel methods. Experiments on an English-to-Chinese task demonstrate that our proposed models outperformed the baseline formally syntax-based models, while both of them achieved ...