## Josef van Genabith

### BibTeX

@MISC{He_josefvan,

author = {Yifan He and Yanjun Ma and Johann Roturier},

title = {Josef van Genabith},

year = {}

}

### Abstract

We report findings from a user study with professional post-editors using a translation recommendation framework (He et al., 2010) to integrate Statistical Machine Translation (SMT) output with Translation Memory (TM) systems. The framework recommends SMT outputs to a TM user when it predicts that SMT outputs are more suitable for post-editing than the hits provided by the TM. We analyze the effectiveness of the model as well as the reaction of potential users. Based on the performance statistics and the users’ comments, we find that translation recommendation can reduce the workload of professional post-editors and improve the acceptance of MT in the localization industry.
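
The recommendation step described in the abstract can be illustrated with a minimal Python sketch. It assumes, following the citation contexts on this page, that a binary classifier produces a raw decision value and that a Platt-style sigmoid turns it into a confidence score. The function names, the parameter values `a` and `b`, and the `threshold` are hypothetical placeholders, not values from the paper; in practice `a` and `b` would be fitted on held-out data.

```python
import math


def platt_probability(decision_value: float, a: float, b: float) -> float:
    """Standard Platt sigmoid: P(y=1|x) = 1 / (1 + exp(a * f(x) + b)).

    decision_value is the raw classifier output f(x); a and b are fitted
    parameters (hypothetical values are used in this sketch).
    """
    return 1.0 / (1.0 + math.exp(a * decision_value + b))


def recommend(decision_value: float, a: float = -1.0, b: float = 0.0,
              threshold: float = 0.5) -> tuple[str, float]:
    """Return which output to show the post-editor, plus the confidence.

    Assumption: the positive class means "the SMT output is more suitable
    for post-editing than the TM hit". The threshold rule is illustrative.
    """
    confidence = platt_probability(decision_value, a, b)
    source = "SMT" if confidence >= threshold else "TM"
    return source, confidence


if __name__ == "__main__":
    for f in (-2.0, 0.0, 2.0):
        choice, conf = recommend(f)
        print(f"f(x)={f:+.1f} -> recommend {choice} (confidence {conf:.3f})")
```

Raising `threshold` would make the sketch recommend SMT output only when the classifier is more confident, which trades recall for precision.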

### Citations

2319 | Support Vector Networks
- Cortes, Vapnik
- 1995
Citation Context ...stimate (the upper bound of) post-editing labour. (He et al., 2010) recast translation recommendation as a binary classification (rather than regression) problem using Support Vector Machines (SVMs: (Cortes and Vapnik, 1995)) max-margin binary classifiers, perform Radial Basis Function (RBF) kernel parameter optimization to find the optimal meta-parameters for the classifier, employ posterior probability-based confidenc... |

1285 | Binary codes capable of correcting deletions, insertions and reversals
- Levenshtein
- 1966
Citation Context ... The calculation of fuzzy match score itself is one of the core technologies in TM systems and varies among different vendors. (He et al., 2010) compute fuzzy match cost as the minimum Edit Distance (Levenshtein, 1966) between the source and TM entry, normalized by the length of the source as in (4), as most of the current implementations are based on edit distance while allowing some additional flexible matching.... |

1277 | A coefficient of agreement for nominal scales
- Cohen
- 1960
Citation Context ...ion results, we computed the inter-rater agreement measured by Fleiss’ Kappa coefficient (Fleiss, 1981) which can assess the agreement between multiple raters as opposed to Cohen’s Kappa coefficient (Cohen, 1960) which works with two raters. Fleiss’ Kappa coefficient for our five post-editors is 0.464 ± 0.024, indicating a moderate agreement. We also obtained Fleiss’ Kappa coefficient for each category as sh... |

1226 | The mathematics of statistical machine translation: Parameter estimation
- Brown, Pietra, et al.
- 1993
Citation Context ...Fuzzy Match Score: they translate the output back to obtain a pseudo source sentence. They compute the fuzzy match score between the original source sentence and this pseudo-source • The IBM Model 1 (Brown et al., 1993) scores in both directions. 4 Evaluation Methodology: We conduct a human evaluation on TM–MT integration with professional post-editors. In this section we introduce the evaluation data we use, the pos... |

953 | Open source toolkit for statistical machine translation
- Koehn, Hoang, et al.
- 2007
Citation Context ...the performance of MT, but is less relevant to our task in this paper. Moreover, (Koehn and Haddow, 2009) presents a post-editing environment using information from the phrase-based SMT system Moses (Koehn et al., 2007), instead of the fuzzy match information from TMs. Although all these approaches try to tackle the TM–MT integration task from different perspectives, we concentrate on evaluating the method of (He e... |

747 | Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods
- Platt
- 1999
Citation Context ... other. What is more preferable is a probabilistic confidence score (e.g. 90% confidence) which is better understood by post-editors and translators. (He et al., 2010) use the techniques proposed by (Platt, 1999) and improved by (Lin et al., 2007) to obtain the posterior probability of a classification as a recommendation confidence score. Platt’s method estimates the posterior probability with a sigmoid fun... |

397 | Statistical Methods for Rates and Proportions
- Fleiss
- 1981
Citation Context ...puts to post-edit than the other options. 6.2 Inter-annotator Agreement: To gauge the validity of human evaluation results, we computed the inter-rater agreement measured by Fleiss’ Kappa coefficient (Fleiss, 1981) which can assess the agreement between multiple raters as opposed to Cohen’s Kappa coefficient (Cohen, 1960) which works with two raters. Fleiss’ Kappa coefficient for our five post-editors is 0.464... |

393 | Discriminative training and maximum entropy models for statistical machine translation
- Och, Ney
- 2002
Citation Context ...idation on these 8K sentence pairs, and randomly select 300 from the cross validation test sets for human evaluation. More specifically, for the SMT system, we use a standard log-linear PB-SMT model (Och and Ney, 2002): GIZA++ implementation of IBM word alignment model 4, the refinement and phrase-extraction heuristics described in (Koehn et al., 2003), minimum-error-rate training (Och, 2003), a 5-gram language mo... |

393 | A study of translation edit rate with targeted human annotation
- Snover, Dorr, et al.
- 2006
Citation Context ...ic MT evaluation metrics to simulate post-editing effort. However, the evaluation in (He et al., 2010) suffers from lack of human-annotated data. Instead they use the TER automatic evaluation metric (Snover et al., 2006) to approximate human judgement. Despite the fact that the correlations between automatic evaluation metrics and human judgements are improving, professional post-editors are the ones that hold the f... |

290 | Improved backing-off for m-gram language modeling
- Kneser, Ney
- 1995
Citation Context ... word alignment model 4, the refinement and phrase-extraction heuristics described in (Koehn et al., 2003), minimum-error-rate training (Och, 2003), a 5-gram language model with Kneser-Ney smoothing (Kneser and Ney, 1995) trained with SRILM (Stolcke, 2002) on the target side of the training data, and Moses (Koehn et al., 2007) to decode. For the translation recommendation model, we output a confidence level using the... |

178 | Statistical phrase-based translation - Koehn, Och, et al. - 2003 |

116 | A note on Platt’s probabilistic outputs for support vector machines. Mach
- Lin, Lin, et al.
- 2003
Citation Context ...e is a probabilistic confidence score (e.g. 90% confidence) which is better understood by post-editors and translators. (He et al., 2010) use the techniques proposed by (Platt, 1999) and improved by (Lin et al., 2007) to obtain the posterior probability of a classification as a recommendation confidence score. Platt’s method estimates the posterior probability with a sigmoid function, as in (2): Pr(y = 1|x) ≈ PA... |

89 | Fast, cheap, and creative: Evaluating translation quality using Amazon’s Mechanical Turk - Callison-Burch - 2009 |

9 | Fuzzy matching in theory and practice
- Sikes
- 2007
Citation Context ...TMs are useful as long as they are maintained; 2) TMs represent considerable effort and investment by a company or (even more so) an individual translator; 3) translators accept that the fuzzy match (Sikes, 2007) score used in TMs offers a good approximation of post-editing effort, which is useful for translation cost estimation; 4) translators are used to working with TMs and using something else could pote... |

5 | Improving the Confidence of Machine Translation Quality Estimates - 2009b |

1 | Bridging TM and SMT with translation recommendation - He, Ma, et al. - 2010 |

1 | Estimating the sentence-level quality of machine translation systems - 2009a |
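
The citation context for Levenshtein (1966) above describes the fuzzy match cost as an edit distance between the source and a TM entry, normalized by the length of the source. A minimal Python sketch of that idea follows. It assumes whitespace tokenization and word-level edits, which the page does not specify; the function names are hypothetical, and commercial TM systems use their own, more flexible matching.

```python
from typing import Sequence


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance between two sequences (unit cost for
    insertion, deletion, and substitution), using two rolling rows."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if x == y else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


def fuzzy_match_cost(source: str, tm_entry: str) -> float:
    """Edit distance between the tokenized source and TM entry, divided by
    the source length. Word-level tokens are an assumption of this sketch."""
    src_tokens = source.split()
    tm_tokens = tm_entry.split()
    if not src_tokens:
        raise ValueError("source sentence must contain at least one token")
    return edit_distance(src_tokens, tm_tokens) / len(src_tokens)


if __name__ == "__main__":
    # One substituted word out of four source words gives a cost of 0.25.
    print(fuzzy_match_cost("click the save button", "click the cancel button"))
```

A lower cost means a closer TM hit; a cost of 0.0 corresponds to an exact match under this tokenization.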