Results 1–10 of 56
Hierarchical phrase-based translation with weighted finite-state transducers and . . .
In Proceedings of HLT/NAACL, 2010
Cited by 41 (16 self)
Abstract
In this article we describe HiFST, a lattice-based decoder for hierarchical phrase-based translation and alignment. The decoder is implemented with standard Weighted Finite-State Transducer (WFST) operations as an alternative to the well-known cube pruning procedure. We find that the use of WFSTs rather than k-best lists requires less pruning in translation search, resulting in fewer search errors, better parameter optimization, and improved translation performance. The direct generation of translation lattices in the target language can improve subsequent rescoring procedures, yielding further gains when applying long-span language models and Minimum Bayes Risk decoding. We also provide insights as to how to control the size of the search space defined by hierarchical rules. We show that shallow-n grammars, low-level rule catenation, and other search constraints can help to match the power of the translation system to specific language pairs.
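The lattice-versus-k-best distinction the abstract draws can be sketched with a toy example: a translation lattice is a weighted DAG whose paths are hypotheses, so a handful of arcs compactly encodes alternatives that a k-best list must enumerate explicitly. All states, words, and costs below are invented for illustration; HiFST itself uses full WFST operations rather than this brute-force enumeration.

```python
# Toy translation lattice: state -> list of (next_state, word, cost).
# Lower total path cost is better. 5 arcs encode 4 full hypotheses;
# with more alternations the hypothesis count grows multiplicatively
# while the arc count grows only additively.
lattice = {
    0: [(1, "the", 0.1), (1, "a", 0.4)],
    1: [(2, "big", 0.3), (2, "large", 0.2)],
    2: [(3, "house", 0.1)],
}

def all_paths(state=0, words=(), cost=0.0):
    """Enumerate every hypothesis in the lattice with its total cost."""
    if state not in lattice:            # final state reached
        yield " ".join(words), round(cost, 2)
        return
    for nxt, word, c in lattice[state]:
        yield from all_paths(nxt, words + (word,), cost + c)

hyps = sorted(all_paths(), key=lambda h: h[1])
print(hyps[0])    # best-scoring hypothesis
print(len(hyps))  # 2 * 2 * 1 = 4 hypotheses from only 5 arcs
```

A k-best list is the first k entries of `hyps`; the lattice is the `lattice` dict itself, which downstream rescoring (e.g. with a long-span LM) can traverse without committing to any truncation.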
Bayesian Synchronous Grammar Induction
Cited by 34 (3 self)
Abstract
We present a novel method for inducing synchronous context-free grammars (SCFGs) from a corpus of parallel string pairs. SCFGs can model equivalence between strings in terms of substitutions, insertions and deletions, and the reordering of substrings. We develop a nonparametric Bayesian model and apply it to a machine translation task, using priors to replace the various heuristics commonly used in this field. Using a variational Bayes training procedure, we learn the latent structure of translation equivalence through the induction of synchronous grammar categories for phrasal translations, showing improvements in translation performance over maximum likelihood models.
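The synchronous rewriting that SCFGs perform can be sketched concretely: each rule pairs a source side with a target side, and nonterminals shared between the two sides are expanded together, so substring reordering falls out of the rules themselves. The toy Japanese-to-English grammar below is invented, handles each nonterminal at most once per rule, and always takes a rule's first expansion; it only illustrates the formalism, not the paper's Bayesian induction.

```python
# Toy SCFG: nonterminal -> list of (source side, target side) rule pairs.
# Tokens that are grammar keys are nonterminals; everything else is a word.
# In the VP rule, V appears last on the source side but first on the
# target side, so verb-final Japanese maps to verb-medial English.
rules = {
    "S":  [(["NP", "VP"], ["NP", "VP"])],                 # monotone
    "NP": [(["watashi", "wa"], ["I"])],
    "VP": [(["ringo", "o", "V"], ["V", "an", "apple"])],  # V reorders
    "V":  [(["tabeta"], ["ate"])],
}

def expand(symbol):
    """Expand a symbol into a (source, target) pair of word lists."""
    if symbol not in rules:                 # terminal: copy to both sides
        return [symbol], [symbol]
    src_side, tgt_side = rules[symbol][0]   # toy grammar: one rule each
    subs = {s: expand(s) for s in src_side if s in rules}
    src = [w for s in src_side for w in (subs[s][0] if s in subs else [s])]
    tgt = [w for s in tgt_side for w in (subs[s][1] if s in subs else [s])]
    return src, tgt

src, tgt = expand("S")
print(" ".join(src), "->", " ".join(tgt))
```

Induction, as the abstract describes, works in the opposite direction: given only the string pairs, infer which rules (and nonterminal categories) best explain them.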
A unified framework for phrase-based, hierarchical, and syntax-based statistical machine translation
In Proceedings of the International Workshop on Spoken Language Translation (IWSLT), 2009
Cited by 31 (4 self)
Abstract
Despite many differences between phrase-based, hierarchical, and syntax-based translation models, their training and testing pipelines are strikingly similar. Drawing on this fact, we extend the Moses toolkit to implement hierarchical and syntactic models, making it the first open source toolkit with end-to-end support for all three of these popular models in a single package. This extension substantially lowers the barrier to entry for machine translation research across multiple models.
Preference Grammars: Softening Syntactic Constraints to Improve Statistical Machine Translation
Cited by 27 (0 self)
Abstract
We propose a novel probabilistic synchronous context-free grammar formalism for statistical machine translation, in which syntactic nonterminal labels are represented as “soft” preferences rather than as “hard” matching constraints. This formalism allows us to efficiently score unlabeled synchronous derivations without forgoing traditional syntactic constraints. Using this score as a feature in a log-linear model, we are able to approximate the selection of the most likely unlabeled derivation. This helps reduce fragmentation of probability across differently labeled derivations of the same translation. It also allows the importance of syntactic preferences to be learned alongside other features (e.g., the language model) and for particular labeling procedures. We show improvements in translation quality on small and medium-sized Chinese-to-English translation tasks.
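The soft-versus-hard contrast can be sketched numerically. Under a hard constraint, a child rule plugs into a substitution site only if the labels are identical; under a soft preference, each side keeps a distribution over labels and the match is scored by how much the two distributions agree, and that score enters the log-linear model as one weighted feature. The distributions and the feature weight below are invented for illustration, not taken from the paper.

```python
# Soft label matching as a feature score. Invented preference
# distributions: what label the parent's substitution site expects,
# and what label the child derivation offers.
import math

parent_pref = {"NP": 0.7, "NN": 0.2, "DT": 0.1}
child_pref  = {"NP": 0.6, "NN": 0.4}

# Hard constraint: match only if the single best labels coincide.
hard_match = (max(parent_pref, key=parent_pref.get)
              == max(child_pref, key=child_pref.get))

# Soft preference: probability that both sides choose the same label,
# summed over all labels -- a graded agreement score in (0, 1].
soft_match = sum(p * child_pref.get(label, 0.0)
                 for label, p in parent_pref.items())

# One feature among others (LM, translation probabilities, ...) in a
# log-linear model; the weight is tuned, so the importance of syntactic
# preferences is learned rather than fixed.
weight = 0.5  # invented value
feature_contribution = weight * math.log(soft_match)
print(hard_match, round(soft_match, 2))
```

Where a hard constraint returns only match/no-match, the soft score degrades gracefully when the label distributions only partially overlap.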
Rule filtering by pattern for efficient hierarchical translation
In Proceedings of the EACL, 2009
Cited by 22 (3 self)
Abstract
We describe refinements to hierarchical translation search procedures intended to reduce both search errors and memory usage through modifications to hypothesis expansion in cube pruning and reductions in the size of the rule sets used in translation. Rules are put into syntactic classes based on the number of nonterminals and the pattern, and various filtering strategies are then applied to assess the impact on translation speed and quality. Results are reported on the 2008 NIST Arabic-to-English evaluation task.
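The pattern classification the abstract mentions can be sketched as follows: a hierarchical rule's source side is a mix of terminals and nonterminal gaps, and collapsing each run of terminals to a single marker yields a small set of patterns that can be kept or dropped as a class. The example rules, the `X1`/`X2` nonterminal naming, and the choice of which class to filter are all invented for illustration.

```python
# Map a rule's source side to its pattern: 'w' for a run of terminals,
# 'X' for each nonterminal gap. Nonterminals are tokens like X1, X2.
def pattern(source_side):
    out = []
    for tok in source_side.split():
        sym = "X" if tok[0] == "X" and tok[1:].isdigit() else "w"
        if sym == "w" and out and out[-1] == "w":
            continue                     # collapse adjacent terminals
        out.append(sym)                  # adjacent X's stay distinct
    return " ".join(out)

rules = [
    "the X1 house",   # pattern: w X w
    "X1 of X2",       # pattern: X w X
    "X1 X2 book",     # pattern: X X w (adjacent nonterminals)
]

# An invented filter policy: drop the class with adjacent nonterminals,
# which tends to explode the search space.
kept = [r for r in rules if pattern(r) != "X X w"]
print([pattern(r) for r in rules], len(kept))
```

Filtering whole pattern classes, rather than individual rules, makes the speed/quality trade-off easy to measure per class, as the paper's experiments do.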
Semantics-Based Machine Translation with . . .
Cited by 22 (7 self)
Abstract
We present an approach to semantics-based statistical machine translation that uses synchronous hyperedge replacement grammars to translate into and from graph-shaped intermediate meaning representations, to our knowledge the first work in NLP to make use of synchronous context-free graph grammars. We present algorithms for each step of the semantics-based translation pipeline, including a novel graph-to-word alignment algorithm and two algorithms for synchronous grammar rule extraction. We investigate the influence of syntactic annotations on semantics-based translation by presenting two alternative rule extraction algorithms, one that requires only semantic annotations and another that additionally relies on syntactic annotations, and explore the effect of syntax and language bias in meaning representation structures by running experiments with two different meaning representations, one biased toward an English syntax-like structure and another that is language neutral. While preliminary work, these experiments show promise for semantically informed machine translation.
Analysing Soft Syntax Features and Heuristics for Hierarchical Phrase-Based Machine Translation
In Proc. of the Int. Workshop on Spoken Language Translation (IWSLT), 2008
Cited by 16 (8 self)
Abstract
Similar to phrase-based machine translation, hierarchical systems produce a large proportion of phrases, most of which are supposedly junk and useless for the actual translation. For the hierarchical case, however, the amount of extracted rules is an order of magnitude bigger. In this paper, we investigate several soft constraints in the extraction of hierarchical phrases and whether these help as additional scores in the decoding to prune unneeded phrases. We show which methods help best.
Wider Pipelines: N-Best Alignments and Parses in MT Training
Cited by 14 (1 self)
Abstract
State-of-the-art statistical machine translation systems use hypotheses from several maximum a posteriori inference steps, including word alignments and parse trees, to identify translational structure and estimate the parameters of translation models. While this approach leads to a modular pipeline of independently developed components, errors made in these “single-best” hypotheses can propagate to downstream estimation steps that treat these inputs as clean, trustworthy training data. In this work we integrate N-best alignments and parses by using a probability distribution over these alternatives to generate posterior fractional counts for use in downstream estimation. Using these fractional counts in a DOP-inspired syntax-based translation system, we show significant improvements in translation quality over a single-best trained baseline.
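The fractional-count idea can be sketched in a few lines: rather than counting events from the single best alignment, each of the N best alternatives contributes its posterior probability as a fractional count, so downstream estimation sees the aligner's uncertainty. The model scores and alignment links below are invented toy values.

```python
# Posterior fractional counts from an N-best list of alignments.
# Each entry: (log-domain model score, aligned word pairs it contains).
import math
from collections import defaultdict

nbest = [
    (-1.0, [("la", "the"), ("casa", "house")]),
    (-1.5, [("la", "the"), ("casa", "home")]),
    (-3.0, [("la", "house"), ("casa", "the")]),
]

# Posteriors: exponentiate scores and normalize over the N-best list.
z = sum(math.exp(s) for s, _ in nbest)
counts = defaultdict(float)
for score, links in nbest:
    posterior = math.exp(score) / z
    for link in links:
        counts[link] += posterior        # fractional, not 0/1

# ("la", "the") is supported by the two strongest hypotheses, so its
# fractional count dominates; single-best training would count it as 1
# and discard the rest of the distribution entirely.
print(sorted(counts.items(), key=lambda kv: -kv[1]))
```

Downstream phrase or rule extraction then consumes these weighted counts exactly where it previously consumed integer counts from the one "clean" alignment.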
Context-free reordering, finite-state translation
In Proc. of HLT-NAACL, 2010
Cited by 11 (2 self)
Abstract
We describe a class of translation model in which a set of input variants encoded as a context-free forest is translated using a finite-state translation model. The forest structure of the input is well-suited to representing word order alternatives, making it straightforward to model translation as a two-step process: (1) tree-based source reordering and (2) phrase transduction. By treating the reordering process as a latent variable in a probabilistic translation model, we can learn a long-range source reordering model without example reordered sentences, which are problematic to construct. The resulting model has state-of-the-art translation performance, uses linguistically motivated features to effectively model long-range reordering, and is significantly smaller than a comparable hierarchical phrase-based translation model.
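The two-step decomposition can be sketched with a toy German-to-English example: step (1) produces word-order variants of the source (in the paper these come from a context-free forest; here they are simply listed), and step (2) translates each variant with a monotone phrase transducer. The reorderings, phrase table, and greedy bigram segmentation below are all invented simplifications.

```python
# Step 1 (stand-in): source word-order variants. In the paper these are
# the paths of a context-free reordering forest; here, a plain list.
reorderings = [
    ["das", "haus", "ist", "klein"],   # original German order
    ["das", "haus", "klein", "ist"],   # a tree-licensed variant
]

# Step 2: a monotone phrase table over bigrams (invented entries).
phrases = {
    ("das", "haus"): "the house",
    ("ist", "klein"): "is small",
    ("klein", "ist"): "small is",
}

def transduce(words):
    """Greedy monotone bigram transduction; None if no full cover."""
    out, i = [], 0
    while i < len(words):
        p = (phrases.get((words[i], words[i + 1]))
             if i + 1 < len(words) else None)
        if p is None:
            return None                # this variant cannot be covered
        out.append(p)
        i += 2
    return " ".join(out)

print([transduce(r) for r in reorderings])
```

Because all long-range movement happens in step 1, the transducer in step 2 can stay strictly monotone, which is what keeps the finite-state model small.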
Non-Projective Parsing for Statistical Machine Translation
Cited by 9 (0 self)
Abstract
We describe a novel approach for syntax-based statistical MT, which builds on a variant of tree adjoining grammar (TAG). Inspired by work in discriminative dependency parsing, the key idea in our approach is to allow highly flexible reordering operations during parsing, in combination with a discriminative model that can condition on rich features of the source-language string. Experiments on translation from German to English show improvements over phrase-based systems, both in terms of BLEU scores and in human evaluations.