N-Gram Posterior Probabilities for Statistical Machine Translation
- TECHNOLOGY CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (HLT-NAACL): PROC. OF THE WORKSHOP ON STATISTICAL MACHINE TRANSLATION
, 2006
Cited by 12 (1 self)
Word posterior probabilities are a common approach for confidence estimation in automatic speech recognition and machine translation. We will generalize this idea and introduce n-gram posterior probabilities and show how these can be used to improve translation quality. Additionally, we will introduce a sentence length model based on posterior probabilities. We will show
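As a rough illustration of the idea, the sketch below estimates n-gram posterior probabilities from an N-best list: each hypothesis score is normalized into a probability, and the posterior of an n-gram is the total probability of the hypotheses that contain it. This is one common formulation and an assumption here; the abstract does not give the paper's exact definition. The function name `ngram_posteriors` and the toy N-best list are hypothetical.

```python
import math
from collections import defaultdict


def ngram_posteriors(nbest, n):
    """Estimate n-gram posteriors from an N-best list.

    nbest: list of (tokens, log_score) pairs, one per hypothesis.
    Returns a dict mapping each n-gram (a tuple of tokens) to the summed
    normalized probability of the hypotheses that contain it.
    """
    # Normalize the log scores into a distribution (numerically stable softmax).
    max_score = max(score for _, score in nbest)
    weights = [math.exp(score - max_score) for _, score in nbest]
    total = sum(weights)

    posteriors = defaultdict(float)
    for (tokens, _), weight in zip(nbest, weights):
        # Count each n-gram at most once per hypothesis.
        seen = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        for ngram in seen:
            posteriors[ngram] += weight / total
    return dict(posteriors)


# Hypothetical 3-best list with log probabilities as scores.
nbest = [
    (["the", "house", "is", "small"], math.log(0.5)),
    (["the", "house", "is", "little"], math.log(0.3)),
    (["a", "house", "is", "small"], math.log(0.2)),
]
bigram_posteriors = ngram_posteriors(nbest, 2)
```

In this toy list the bigram `("house", "is")` occurs in every hypothesis and so receives the highest posterior, while bigrams found only in low-scoring hypotheses receive little mass. Setting `n=1` gives a position-independent form of word posterior probabilities.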
Active learning for interactive machine translation
Translation needs have increased greatly in recent years. In many situations, the text to be translated constitutes an unbounded stream of data that grows continually over time. An effective approach to translating text documents is to follow an interactive-predictive paradigm in which the system is guided by the user and the user is assisted by the system to generate error-free translations. Unfortunately, when processing such unbounded data streams, even this approach requires an overwhelming amount of manpower. It is in this scenario that the use of active learning techniques is compelling. In this work, we propose different active learning techniques for interactive machine translation. Results show that, for a given translation quality, the use of active learning allows us to greatly reduce the human effort required to translate the sentences in the stream.
Word Level Confidence Estimation for . . .
, 2011
Although Statistical Machine Translation (SMT) systems are being used in several real-world applications, there is no efficient way to determine the correctness of their output. Therefore, the goal of this project was to generate word-level confidence estimates for SMT output. These confidence estimates were used to detect erroneous words. Employing a supervised machine learning approach to confidence estimation, we used different binary classification techniques to classify each word in the SMT output as correct or incorrect. To classify a translated word, we formulated sixteen numerical features based on different information sources, such as the underlying SMT system and the morphological characteristics of the source and target languages. We experimentally assessed the performance of each feature on the task of word-level confidence estimation. Further, we evaluated the performance of different combinations of features for the prediction of confidence estimates. We also compared the performance of three classification algorithms (Naive Bayes, Decision Trees, Multi-Layer Perceptron) for confidence estimation at the word level. In summary, this project provides a framework for the generation of reliable confidence estimates
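To make the classification setup concrete, here is a minimal sketch of word-level confidence estimation as binary classification with a Gaussian Naive Bayes model, one of the three classifier families the abstract names. The two features (a word posterior and the target word length), the toy training data, and the function names are hypothetical; the abstract does not list the sixteen features the project actually used.

```python
import math


def fit_gaussian_nb(rows, labels):
    """Estimate a class prior plus per-feature mean and variance per class."""
    model = {}
    for cls in set(labels):
        cls_rows = [r for r, y in zip(rows, labels) if y == cls]
        n = len(cls_rows)
        means = [sum(col) / n for col in zip(*cls_rows)]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((x - m) ** 2 for x in col) / n + 1e-6
            for col, m in zip(zip(*cls_rows), means)
        ]
        model[cls] = (n / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one word."""
    def log_score(cls):
        prior, means, variances = model[cls]
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return score
    return max(model, key=log_score)


# Hypothetical training data: [word posterior, target word length] per word.
train_rows = [[0.9, 3], [0.8, 5], [0.85, 4], [0.2, 7], [0.3, 6], [0.1, 8]]
train_labels = ["correct"] * 3 + ["incorrect"] * 3
model = fit_gaussian_nb(train_rows, train_labels)
```

Calling `predict(model, [0.88, 4])` labels a word whose features resemble the "correct" training examples. A real system would train on words labeled against reference translations and would use a far richer feature set.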

