Results 1–10 of 10
A unified framework for phrase-based, hierarchical, and syntax-based statistical machine translation
In Proceedings of the International Workshop on Spoken Language Translation (IWSLT), 2009
Abstract

Cited by 18 (3 self)
Despite many differences between phrase-based, hierarchical, and syntax-based translation models, their training and testing pipelines are strikingly similar. Drawing on this fact, we extend the Moses toolkit to implement hierarchical and syntactic models, making it the first open source toolkit with end-to-end support for all three of these popular models in a single package. This extension substantially lowers the barrier to entry for machine translation research across multiple models.
Monte Carlo techniques for phrase-based translation
Machine Translation, 2010
Abstract

Cited by 4 (1 self)
Recent advances in statistical machine translation have used approximate beam search for NP-complete inference within probabilistic translation models. We present an alternative approach of sampling from the posterior distribution defined by a translation model. We define a novel Gibbs sampler for sampling translations given a source sentence and show that it effectively explores this posterior distribution. In doing so we overcome the limitations of heuristic beam search and obtain theoretically sound solutions to inference problems such as finding the maximum probability translation and minimum risk training and decoding.
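The posterior-sampling idea in this abstract can be sketched in miniature. The toy sampler below is hypothetical and much simpler than the paper's operators (which work over alignments, segmentations, and reorderings): it visits each position in turn and resamples it from its conditional distribution under an arbitrary scoring function.

```python
import math
import random

def gibbs_sample(candidates, score, iters=200, rng=random):
    """Toy Gibbs sampler: visit each position in turn and resample it
    from its conditional distribution given the rest of the state."""
    state = [options[0] for options in candidates]
    samples = []
    for _ in range(iters):
        for i, options in enumerate(candidates):
            # Unnormalised conditional: score the full state under each option.
            weights = []
            for word in options:
                state[i] = word
                weights.append(math.exp(score(state)))
            # Draw one option with probability proportional to its weight.
            r = rng.random() * sum(weights)
            acc = 0.0
            for word, w in zip(options, weights):
                acc += w
                if r <= acc:
                    state[i] = word
                    break
        samples.append(tuple(state))
    return samples
```

Collecting samples this way lets one estimate, for instance, the maximum probability translation as the most frequent sample, which is the kind of inference the abstract refers to.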
SampleRank Training for Phrase-Based Machine Translation
Abstract

Cited by 3 (0 self)
Statistical machine translation systems are normally optimised for a chosen gain function (metric) by using MERT to find the best model weights. This algorithm suffers from stability problems and cannot scale beyond 20-30 features. We present an alternative algorithm for discriminative training of phrase-based MT systems, SampleRank, which scales to hundreds of features, equals or beats MERT on both small and medium-sized systems, and permits the use of sentence- or document-level features. SampleRank proceeds by repeatedly updating the model weights to ensure that the ranking of output sentences induced by the model is the same as that induced by the gain function.
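The core step described in the abstract, updating the weights whenever the model's ranking of a pair of hypotheses disagrees with the gain function's ranking, can be sketched as a perceptron-style update. This is a hypothetical simplification (feature vectors as plain dicts, a fixed learning rate), not the paper's exact update rule.

```python
def samplerank_update(weights, feats_a, feats_b, gain_a, gain_b, lr=0.1):
    """One SampleRank-style step: if the model ranks two hypotheses
    differently from the gain function (e.g. BLEU), nudge the weights
    towards the gain function's ranking."""
    def score(feats):
        return sum(weights.get(k, 0.0) * v for k, v in feats.items())
    if gain_a == gain_b:
        return weights  # no preference, nothing to enforce
    better, worse = (feats_a, feats_b) if gain_a > gain_b else (feats_b, feats_a)
    if score(better) <= score(worse):  # model ranking violates the gain ranking
        for k in set(better) | set(worse):
            weights[k] = weights.get(k, 0.0) + lr * (better.get(k, 0.0) - worse.get(k, 0.0))
    return weights
```

Because each update only compares one pair of sampled hypotheses, the number of features can grow without the per-step cost exploding, which is what lets this style of training scale past MERT's limits.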
Document-Wide Decoding for Phrase-Based Statistical Machine Translation
Abstract

Cited by 3 (1 self)
Independence between sentences is an assumption deeply entrenched in the models and algorithms used for statistical machine translation (SMT), particularly in the popular dynamic programming beam search decoding algorithm. This restriction is an obstacle to research on more sophisticated discourse-level models for SMT. We propose a stochastic local search decoding method for phrase-based SMT, which permits free document-wide dependencies in the models. We explore the stability and the search parameters of this method and demonstrate that it can be successfully used to optimise a document-level semantic language model.
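The stochastic local search idea can be illustrated with a minimal greedy sketch. This is hypothetical code: the paper's proposal operators and acceptance criterion are more elaborate, but the shape is the same: propose a local change to the document state and keep it if the document-wide score does not decrease.

```python
import random

def local_search(doc_state, propose, score, steps=100, rng=random):
    """Greedy stochastic local search sketch: repeatedly propose a random
    local change (e.g. re-translating one sentence) and keep it whenever
    the document-wide score does not decrease."""
    current = list(doc_state)
    for _ in range(steps):
        candidate = propose(current, rng)
        if score(candidate) >= score(current):
            current = candidate
    return current

def flip_one(doc, rng):
    """Hypothetical proposal: toggle one per-sentence decision in the document."""
    i = rng.randrange(len(doc))
    cand = list(doc)
    cand[i] = 1 - cand[i]
    return cand
```

The key property, as the abstract notes, is that `score` may inspect the whole document at once, something a sentence-by-sentence dynamic programming decoder cannot accommodate.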
A Unified Approach to Minimum Risk Training and Decoding
Abstract

Cited by 2 (0 self)
We present a unified approach to performing minimum risk training and minimum Bayes risk (MBR) decoding with BLEU in a phrase-based model. Key to our approach is the use of a Gibbs sampler that allows us to explore the entire probability distribution and maintain a strict probabilistic formulation across the pipeline. We also describe a new sampling algorithm, called corpus sampling, which allows us at training time to use BLEU instead of an approximation thereof. Our approach is theoretically sound and gives better (up to +0.6% BLEU) and more stable results than the standard MERT optimization algorithm. By comparing our approach to lattice MBR, we are also able to gain crucial insights about both methods.
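Sample-based MBR decoding, as used in work like this, reduces to picking the candidate whose expected gain against the other posterior samples is highest. A minimal sketch, with a hypothetical stand-in gain function where BLEU would be used in practice:

```python
def mbr_decode(samples, gain):
    """Sample-based MBR: among the sampled translations, return the one
    with the highest expected gain against all samples."""
    def expected_gain(hyp):
        return sum(gain(hyp, s) for s in samples) / len(samples)
    return max(samples, key=expected_gain)

def overlap_gain(hyp, ref):
    """Stand-in gain: number of shared word types (BLEU in practice)."""
    return len(set(hyp.split()) & set(ref.split()))
```

Because frequent samples contribute to many of the pairwise gains, the selected hypothesis minimises expected loss under the (sampled) posterior rather than simply maximising model score.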
Investigations in Exact Inference for Hierarchical Translation
Abstract
We present a method for inference in hierarchical phrase-based translation, where both optimisation and sampling are performed in a common exact inference framework related to adaptive rejection sampling. We also present a first implementation of that method along with experimental results shedding light on some fundamental issues. In hierarchical translation, inference needs to be performed over a high-complexity distribution defined by the intersection of a translation hypergraph and a target language model. We replace this intractable distribution by a sequence of tractable upper bounds for which exact optimisers and samplers are easy to obtain. Our experiments show that exact inference is then feasible using only a fraction of the time and space that would be required by the full intersection, without recourse to pruning techniques that only provide approximate solutions. While the current implementation is limited in the size of inputs it can handle in reasonable time, our experiments provide insights towards obtaining future speedups while staying in the same general framework.
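The envelope idea, replacing an intractable target distribution with a tractable upper bound and correcting by rejection, can be shown on a finite toy support. The paper's setting is a translation hypergraph and its bounds are refined adaptively after rejections; this sketch uses a fixed bound over an enumerable support, purely for illustration.

```python
import random

def rejection_sample(support, p, q, rng=random):
    """Exact sampling from an unnormalised target p using a tractable
    upper bound q with q(x) >= p(x) for all x: propose from q, then
    accept with probability p(x) / q(x)."""
    total = sum(q(x) for x in support)
    while True:
        # Draw a proposal x with probability proportional to q(x).
        r = rng.random() * total
        acc = 0.0
        for x in support:
            acc += q(x)
            if r <= acc:
                break
        # Accept or reject; in an adaptive scheme a rejection would
        # trigger a refinement of the bound q around x.
        if rng.random() * q(x) <= p(x):
            return x
```

Accepted draws are exact samples from p, which is why this family of methods can avoid the approximation error of pruning-based decoders, at the price of extra rejected proposals when the bound is loose.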
Type, 2009
Abstract
Project funded by the European Community under the Seventh Framework Programme for Research and Technological Development. Project ref no. Project acronym Project full title
A Corpus Level MIRA Tuning Strategy for Machine Translation
Abstract
MIRA-based tuning methods have been widely used in statistical machine translation (SMT) systems with large numbers of features. Since corpus-level BLEU is not decomposable, these MIRA approaches usually define a variety of heuristic-driven sentence-level BLEUs in their model losses. Instead, we present a new MIRA method which employs an exact corpus-level BLEU to compute the model loss. Our method is simpler in implementation. Experiments on Chinese-to-English translation show its effectiveness over two state-of-the-art MIRA implementations.
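The distinction this abstract draws, corpus-level versus sentence-level BLEU, comes down to pooling clipped n-gram counts over the whole corpus before taking precisions, rather than scoring each sentence separately and averaging. A simplified, unsmoothed, single-reference sketch of the corpus-level computation (an illustration of standard BLEU, not this paper's MIRA loss):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(hyps, refs, max_n=4):
    """Simplified single-reference, unsmoothed corpus BLEU: clipped n-gram
    counts are pooled over the whole corpus before precisions are taken,
    rather than computing one BLEU per sentence and averaging."""
    match = [0] * max_n
    total = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hyps, refs):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            h, r = ngrams(hyp, n), ngrams(ref, n)
            match[n - 1] += sum(min(c, r[g]) for g, c in h.items())
            total[n - 1] += sum(h.values())
    if 0 in match:
        return 0.0  # unsmoothed: an empty n-gram order zeroes the score
    log_prec = sum(math.log(m / t) for m, t in zip(match, total)) / max_n
    brevity = min(1.0, math.exp(1.0 - ref_len / hyp_len))
    return brevity * math.exp(log_prec)
```

Because the geometric mean and brevity penalty are applied only after pooling, the score of one sentence depends on every other sentence in the corpus, which is exactly the non-decomposability the abstract refers to.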
Spoken Language Systems
Abstract
We present a conditional-random-field approach to discriminatively trained phrase-based machine translation in which training and decoding are both cast in a sampling framework and are implemented uniformly in a new probabilistic programming language for factor graphs. In traditional phrase-based translation, decoding infers both a "Viterbi" alignment and the target sentence. In contrast, in our approach, a rich overlapping-phrase alignment is produced by a fast deterministic method, while probabilistic decoding infers only the target sentence, which is then able to leverage arbitrary features of the entire source sentence, target sentence and alignment. By using SampleRank for learning we could in principle efficiently estimate hundreds of thousands of parameters. Test-time decoding is done by MCMC sampling with annealing. To demonstrate the potential of our approach we show preliminary experiments leveraging alignments that may contain overlapping biphrases.
Confidence-based Rewriting of Machine Translation Output
Abstract
Numerous works in Statistical Machine Translation (SMT) have attempted to identify better translation hypotheses than those obtained by an initial decoding, using an improved but more costly scoring function. In this work, we introduce an approach that takes the hypotheses produced by a state-of-the-art, reranked phrase-based SMT system, and explores new parts of the search space by applying rewriting rules selected on the basis of posterior phrase-level confidence. In the medical domain, we obtain a 1.9 BLEU improvement over a reranked baseline exploiting the same scoring function, corresponding to a 5.4 BLEU improvement over the original Moses baseline. We show that if an indication of which phrases require rewriting is provided, our automatic rewriting procedure yields an additional improvement of 1.5 BLEU. Various analyses, including a manual error analysis, further illustrate the good performance and potential for improvement of our approach in spite of its simplicity.