Results 1-10 of 99
A Systematic Comparison of Various Statistical Alignment Models
Computational Linguistics, 2003
Hierarchical phrase-based translation
Computational Linguistics, 2007
Cited by 588 (9 self)
We present a statistical machine translation model that uses hierarchical phrases (phrases that contain subphrases). The model is formally a synchronous context-free grammar but is learned from a parallel text without any syntactic annotations. Thus it can be seen as combining fundamental ideas from both syntax-based translation and phrase-based translation. We describe our system’s training and decoding methods in detail, and evaluate it for translation speed and translation accuracy. Using BLEU as a metric of translation accuracy, we find that our system performs significantly better than the Alignment Template System, a state-of-the-art phrase-based system.
Decoding Complexity in Word-Replacement Translation Models
Computational Linguistics, 1999
"... This paper looks at decoding complexity. ..."
Improving statistical machine translation using word sense disambiguation
In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2007
Cited by 125 (7 self)
We show for the first time that incorporating the predictions of a word sense disambiguation system within a typical phrase-based statistical machine translation (SMT) model consistently improves translation quality across all three different IWSLT Chinese-English test sets, as well as producing statistically significant improvements on the larger NIST Chinese-English MT task, and moreover never hurts performance on any test set, according not only to BLEU but to all eight most commonly used automatic evaluation metrics. Recent work has challenged the assumption that word sense disambiguation (WSD) systems are useful for SMT. Yet SMT translation quality still obviously suffers from inaccurate lexical choice. In this paper, we address this problem by investigating a new strategy for integrating WSD into an SMT system that performs fully phrasal multi-word disambiguation. Instead of directly incorporating a Senseval-style WSD system, we redefine the WSD task to match the exact same phrasal translation disambiguation task faced by phrase-based SMT systems. Our results provide the first known empirical evidence that lexical semantics are indeed useful for SMT, despite claims to the contrary.
Online Large-Margin Training of Syntactic and Structural Translation Features
Cited by 118 (12 self)
Minimum-error-rate training (MERT) is a bottleneck for current development in statistical machine translation because it is limited in the number of weights it can reliably optimize. Building on the work of Watanabe et al., we explore the use of the MIRA algorithm of Crammer et al. as an alternative to MERT. We first show that by parallel processing and exploiting more of the parse forest, we can obtain results using MIRA that match or surpass MERT in terms of both translation quality and computational cost. We then test the method on two classes of features that address deficiencies in the Hiero hierarchical phrase-based model: first, we simultaneously train a large number of Marton and Resnik’s soft syntactic constraints, and, second, we introduce a novel structural distortion model. In both cases we obtain significant improvements in translation performance. Optimizing them in combination, for a total of 56 feature weights, we improve performance by 2.6 BLEU on a subset of the NIST 2006 Arabic-English evaluation data.
Parsing Inside-Out
1998
Cited by 99 (2 self)
Probabilistic Context-Free Grammars (PCFGs) and variations on them have recently become some of the most common formalisms for parsing. It is common with PCFGs to compute the inside and outside probabilities. When these probabilities are multiplied together and normalized, they produce the probability that any given nonterminal covers any piece of the input sentence. The traditional use of these probabilities is to improve the probabilities of grammar rules. In this thesis we show that these values are useful for solving many other problems in Statistical Natural Language Processing. We give a framework for describing parsers. The framework generalizes the inside and outside values to semirings. It makes it easy to describe parsers that compute a wide variety of interesting quantities, including the inside and outside probabilities, as well as related quantities such as Viterbi probabilities and n-best lists. We also present three novel uses for the inside and outside probabilities. T...
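The semiring idea in the abstract above can be illustrated with a minimal sketch. This is not the thesis's own framework: it assumes a grammar in Chomsky normal form, and the names `Semiring`, `INSIDE`, `VITERBI`, and `inside_value` are hypothetical, chosen only for this illustration. The same chart loop yields the inside probability under (+, ×) and the Viterbi probability under (max, ×).

```python
from collections import namedtuple

# A semiring bundles the two operations and their identity elements.
# (Hypothetical helper type, not taken from the thesis.)
Semiring = namedtuple("Semiring", ["plus", "times", "zero", "one"])

INSIDE = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)
VITERBI = Semiring(max, lambda a, b: a * b, 0.0, 1.0)


def inside_value(words, lexical, binary, root, sr):
    """CKY-style chart computation for a grammar in Chomsky normal form.

    lexical: dict mapping (nonterminal, word) -> rule probability
    binary:  dict mapping (parent, left, right) -> rule probability
    Returns the semiring value of `root` spanning the whole sentence.
    """
    n = len(words)
    chart = {}  # (start, end, nonterminal) -> semiring value

    # Width-1 spans are filled from lexical rules.
    for i, word in enumerate(words):
        for (nt, w), p in lexical.items():
            if w == word:
                key = (i, i + 1, nt)
                chart[key] = sr.plus(chart.get(key, sr.zero), p)

    # Wider spans combine two adjacent sub-spans under a binary rule.
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for split in range(start + 1, end):
                for (parent, left, right), p in binary.items():
                    lv = chart.get((start, split, left), sr.zero)
                    rv = chart.get((split, end, right), sr.zero)
                    value = sr.times(p, sr.times(lv, rv))
                    key = (start, end, parent)
                    chart[key] = sr.plus(chart.get(key, sr.zero), value)

    return chart.get((0, n, root), sr.zero)
```

On a toy grammar with rules S → S S (0.4) and S → a (0.6), the string "a a a" has two parses of probability 0.03456 each, so `inside_value` returns about 0.06912 with `INSIDE` and about 0.03456 with `VITERBI`.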
Maximum Entropy Based Phrase Reordering Model for Statistical Machine Translation
In Proceedings of ACL-COLING 2006, 2006
Cited by 94 (24 self)
We propose a novel reordering model for phrase-based statistical machine translation (SMT) that uses a maximum entropy (MaxEnt) model to predict reorderings of neighbor blocks (phrase pairs). The model provides content-dependent, hierarchical phrasal reordering with generalization based on features automatically learned from a real-world bitext. We present an algorithm to extract all reordering events of neighbor blocks from bilingual data. In our experiments on Chinese-to-English translation, this MaxEnt-based reordering model obtains significant improvements in BLEU score on the NIST MT05 and IWSLT04 tasks.
A survey of statistical machine translation
2007
Cited by 93 (6 self)
Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and many popular techniques have only emerged within the last few years. This survey presents a tutorial overview of state-of-the-art SMT at the beginning of 2007. We begin with the context of the current research, and then move to a formal problem description and an overview of the four main subproblems: translational equivalence modeling, mathematical modeling, parameter estimation, and decoding. Along the way, we present a taxonomy of some different approaches within these areas. We conclude with an overview of evaluation and notes on future directions.
Statistical Machine Translation by Parsing
2004
Cited by 77 (6 self)
In an ordinary syntactic parser, the input is a string, and the grammar ranges over strings. This paper explores generalizations of ordinary parsing algorithms that allow the input to consist of string tuples and/or the grammar to range over string tuples. Such algorithms can infer the synchronous structures hidden in parallel texts. It turns out that these generalized parsers can do most of the work required to train and apply a syntax-aware statistical machine translation system.
A unigram orientation model for statistical machine translation
In Proceedings of HLT-NAACL 2004: Short Papers, 2004
Cited by 63 (1 self)
In this paper, we present a unigram segmentation model for statistical machine translation where the segmentation units are blocks: pairs of phrases without internal structure. The segmentation model uses a novel orientation component to handle swapping of neighbor blocks. During training, we collect block unigram counts with orientation: we count how often a block occurs to the left or to the right of some predecessor block. The orientation model is shown to improve translation performance over two baselines: (1) a model with no block reordering, and (2) a model in which block swapping is controlled only by a language model. We show experimental results on a standard Arabic-English translation task.
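The counting step described in the abstract above might look roughly like the following sketch. It rests on assumptions the abstract does not spell out: each training sentence pair is taken to be already segmented into blocks listed in target-side order, a block is represented by a hypothetical `(block_id, src_start, src_end)` tuple with a half-open source span, and "right" or "left" is decided by where the block's source span falls relative to its predecessor's. The function name `orientation_counts` is likewise illustrative, not the paper's.

```python
from collections import Counter


def orientation_counts(block_sequences):
    """Count how often each block occurs left or right of its predecessor.

    block_sequences: iterable of block lists, one per sentence pair.
    Each block is (block_id, src_start, src_end), listed in target order,
    with src_end exclusive. (Assumed representation, for illustration.)
    """
    counts = Counter()
    for blocks in block_sequences:
        # Pair every block with the block that precedes it on the target side.
        for prev, cur in zip(blocks, blocks[1:]):
            _, prev_start, prev_end = prev
            block_id, cur_start, cur_end = cur
            if cur_start >= prev_end:
                # Source span lies after the predecessor's span.
                counts[(block_id, "right")] += 1
            elif cur_end <= prev_start:
                # Source span lies before the predecessor's span (a swap).
                counts[(block_id, "left")] += 1
            # Overlapping spans are ignored in this sketch.
    return counts
```

The first block of each sequence has no predecessor and therefore contributes no count. Relative frequencies derived from such counts could then serve as the orientation component of a block unigram model.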