Results 1 - 6 of 6
A discriminative latent variable model for statistical machine translation
In Proc. of the 46th Annual Conference of the Association for Computational Linguistics: Human Language Technologies (ACL08:HLT), 2008

Cited by 52 (4 self)
Large-scale discriminative machine translation promises to further the state-of-the-art, but has failed to deliver convincing gains over current heuristic frequency count systems. We argue that a principal reason for this failure is not dealing with multiple, equivalent translations. We present a translation model which models derivations as a latent variable, in both training and decoding, and is fully discriminative and globally optimised. Results show that accounting for multiple derivations does indeed improve performance. Additionally, we show that regularisation is essential for maximum conditional likelihood models in order to avoid degenerate solutions.
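As a rough illustration of what treating derivations as a latent variable can mean, the sketch below sums over all derivations that yield the same translation in a small log-linear model. The toy derivations, feature names, and weights are invented for illustration and are not taken from the paper; the function names are hypothetical.

```python
import math
from collections import defaultdict


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def derivation_score(features, weights):
    """Linear score w . phi for one derivation (missing weights count as 0)."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def translation_log_probs(derivations, weights):
    """Marginalise over derivations: log p(e|f) = log sum_d p(d, e|f).

    `derivations` is a list of (translation, feature_dict) pairs; several
    entries may share the same translation string.
    """
    grouped = defaultdict(list)
    for translation, features in derivations:
        grouped[translation].append(derivation_score(features, weights))
    # Normaliser over every derivation of every translation.
    log_z = logsumexp([s for scores in grouped.values() for s in scores])
    return {t: logsumexp(scores) - log_z for t, scores in grouped.items()}


# Toy example (assumed data): two derivations yield "the house".
weights = {"a": 1.0, "b": 0.5}
derivations = [
    ("the house", {"a": 1.0}),  # score 1.0
    ("house the", {"a": 1.2}),  # score 1.2, the best single derivation
    ("the house", {"b": 2.0}),  # score 1.0
]

log_probs = translation_log_probs(derivations, weights)
best_translation = max(log_probs, key=log_probs.get)
best_derivation = max(derivations, key=lambda d: derivation_score(d[1], weights))[0]
```

In this toy case the highest-scoring single derivation produces "house the", while summing over derivations favours "the house", which is the kind of difference the abstract attributes to accounting for multiple derivations.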
Determinization of weighted tree automata using factorizations
PRESENTATION AT 8TH INT. WORKSHOP FINITE-STATE METHODS AND NATURAL LANGUAGE PROCESSING, 2009

Cited by 3 (2 self)
We present a determinization construction for weighted tree automata using factorizations. Among others, this result subsumes a previous result for determinization of weighted string automata using factorizations (Kirsten and Mäurer, 2005) and two previous results for weighted tree automata, one of them not using factorizations (Borchardt, 2004) and one of them restricted to nonrecursive automata over the nonnegative reals (May and Knight, 2006).
A Beam-Search Extraction Algorithm for Comparable Data

Cited by 2 (0 self)
This paper extends previous work on extracting parallel sentence pairs from comparable data (Munteanu and Marcu, 2005). For a given source sentence S, a maximum entropy (ME) classifier is applied to a large set of candidate target translations. A beam-search algorithm is used to abandon target sentences as non-parallel early on during classification if they fall outside the beam. This way, our novel algorithm avoids any document-level prefiltering step. The algorithm increases the number of extracted parallel sentence pairs significantly, which leads to a BLEU improvement of about 1% on our Spanish-English data.
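The idea of abandoning candidates that fall outside a beam can be sketched as follows. This is a minimal sketch under assumptions, not the paper's algorithm: it assumes each candidate's classifier score accumulates in stages, and the candidate names, stage scores, and the `beam_filter` helper are all hypothetical.

```python
def beam_filter(candidates, beam_width):
    """Score candidates stage by stage, dropping those outside the beam.

    `candidates` maps a candidate id to a list of per-stage score increments
    (all lists are assumed to have the same length). After each stage, any
    candidate whose running score is more than `beam_width` below the best
    running score is abandoned and never scored again.
    """
    alive = {name: 0.0 for name in candidates}
    num_stages = len(next(iter(candidates.values()))) if candidates else 0
    for stage in range(num_stages):
        for name in alive:
            alive[name] += candidates[name][stage]
        best = max(alive.values())
        alive = {n: s for n, s in alive.items() if s >= best - beam_width}
    return alive


# Toy example (assumed scores): "t2" and "t3" are pruned before the last stage.
candidates = {
    "t1": [0.5, 0.4, 0.3],
    "t2": [-2.0, 3.0, 3.0],
    "t3": [0.4, -1.5, 0.2],
}
survivors = beam_filter(candidates, beam_width=1.0)
```

Note that "t2" would have reached the highest total score had it been scored to completion, so pruning of this kind trades exactness for speed.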
A Rule-Driven Dynamic Programming Decoder for Statistical MT

Cited by 1 (0 self)
The paper presents an extension of a dynamic programming (DP) decoder for phrase-based SMT (Koehn, 2004; Och and Ney, 2004) that tightly integrates POS-based reorder rules (Crego and Marino, 2006) into a left-to-right beam-search algorithm, rather than handling them in a preprocessing or reorder graph generation step. The novel decoding algorithm can handle tens of thousands of rules efficiently. An improvement over a standard phrase-based decoder is shown on an Arabic-English translation task with respect to translation accuracy and speed for large reorder window sizes.
Sinuhe — Statistical Machine Translation using a Globally Trained Conditional Exponential Family Translation Model

Cited by 1 (0 self)
We present a new phrase-based conditional exponential family translation model for statistical machine translation. The model operates on a feature representation in which sentence-level translations are represented by enumerating all the known phrase-level translations that occur inside them. This makes the model a good match with the commonly used phrase extraction heuristics. The model’s predictions are properly normalized probabilities. In addition, the model automatically takes into account information provided by phrase overlaps, and does not suffer from reference translation reachability problems. We have implemented an open source translation system, Sinuhe, based on the proposed translation model. Our experiments on the Europarl and GigaFrEn corpora demonstrate that finding the unique MAP parameters for the model on large-scale data is feasible with simple stochastic gradient methods. Sinuhe is fast and memory efficient, and its BLEU scores are only slightly inferior to those of Moses.
Two-Neighbor Orientation Model with Cross-Boundary Global Contexts
Long-distance reordering remains one of the greatest challenges in statistical machine translation research, as the key contextual information may well be beyond the confines of translation units. In this paper, we propose a Two-Neighbor Orientation (TNO) model that jointly models the orientation decisions between anchors and two neighboring multi-unit chunks which may cross phrase or rule boundaries. We explicitly model the longest span of such chunks, referred to as the Maximal Orientation Span, to serve as a global parameter that constrains underlying local decisions. We integrate our proposed model into a state-of-the-art string-to-dependency translation system and demonstrate the efficacy of our proposal in a large-scale Chinese-to-English translation task. On the NIST MT08 set, our most advanced model brings around +2.0 BLEU and -1.0 TER improvement.