Results 1 - 10 of 24
A Word-to-Word Model of Translational Equivalence
, 1997
Cited by 73 (6 self)
Many multilingual NLP applications need to translate words between different languages, but cannot afford the computational expense of inducing or applying a full translation model. For these applications, we have designed a fast algorithm for estimating a partial translation model, which accounts for translational equivalence only at the word level. The model's precision/recall trade-off can be directly controlled via one threshold parameter. This feature makes the model more suitable for applications that are not fully statistical. The model's hidden parameters can be easily conditioned on information extrinsic to the model, providing an easy way to integrate pre-existing knowledge such as part-of-speech, dictionaries, word order, etc. Our model can link word tokens in parallel texts as well as other translation models in the literature. Unlike other translation models, it can automatically produce dictionary-sized translation lexicons, and it can do so with over 99% accuracy.
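The single-threshold precision/recall trade-off mentioned in this abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the association scores, the gold-standard links, and the function names `filter_lexicon` and `precision_recall` are all hypothetical, chosen only to show how raising one cutoff trades recall for precision.

```python
# Illustrative sketch only: the scores and gold links below are invented,
# and the scoring itself is NOT the model described in the abstract.

def filter_lexicon(scored_pairs, threshold):
    """Keep word pairs whose (hypothetical) association score meets the threshold."""
    return {pair for pair, score in scored_pairs.items() if score >= threshold}


def precision_recall(predicted, gold):
    """Return (precision, recall) of predicted pairs against a gold set."""
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 1.0
    recall = true_pos / len(gold) if gold else 1.0
    return precision, recall


# Hypothetical French-English candidate pairs with made-up scores.
scored_pairs = {
    ("chien", "dog"): 0.95,
    ("chat", "cat"): 0.90,
    ("maison", "house"): 0.60,
    ("chien", "cat"): 0.40,
    ("maison", "dog"): 0.10,
}
gold = {("chien", "dog"), ("chat", "cat"), ("maison", "house")}

for threshold in (0.05, 0.5, 0.92):
    p, r = precision_recall(filter_lexicon(scored_pairs, threshold), gold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

With these toy numbers, a low threshold admits every candidate (high recall, lower precision), while a high threshold keeps only the most confident pair (high precision, lower recall).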
A survey of statistical machine translation
, 2007
Cited by 30 (3 self)
Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and many popular techniques have only emerged within the last few years. This survey presents a tutorial overview of state-of-the-art SMT at the beginning of 2007. We begin with the context of the current research, and then move to a formal problem description and an overview of the four main subproblems: translational equivalence modeling, mathematical modeling, parameter estimation, and decoding. Along the way, we present a taxonomy of some different approaches within these areas. We conclude with an overview of evaluation and notes on future directions.
User-Friendly Text Prediction for Translators
- In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP)
, 2002
Cited by 23 (4 self)
Text prediction is a form of interactive machine translation that is well suited to skilled translators. In principle it can assist in the production of a target text with minimal disruption to a translator's normal routine. However, recent evaluations of a prototype prediction system showed that it significantly decreased the productivity of most translators who used it. In this paper, we analyze the reasons for this and propose a solution which consists in seeking predictions that maximize the expected benefit to the translator, rather than just trying to anticipate some amount of upcoming text. Using a model of a "typical translator" constructed from data collected in the evaluations of the prediction prototype, we show that this approach has the potential to turn text prediction into a help rather than a hindrance to a translator.
Efficient Language Independent Generation from Lexical Conceptual Structures
- Machine Translation
, 2002
Cited by 7 (5 self)
This paper describes a system for generating natural-language sentences from an interlingual representation, Lexical Conceptual Structure (LCS). The system has been developed as part of a Chinese-English Machine Translation system; however, it is designed to be used for many other MT language pairs and Natural Language applications. The contributions of this work include: (1) Development of a language-independent generation system that maximizes efficiency through the use of a hybrid rule-based/statistical module; (2) Enhancements to an interlingual representation and associated algorithms for interpretation of multiply ambiguous input sentences; (3) Development of an efficient reusable language-independent linearization module with a grammar description language that can be used with other systems; (4) Improvements to an earlier algorithm for hierarchically mapping thematic roles to surface positions; (5) Development of a diagnostic tool for lexicon coverage and correctness and use of the tool for verification of English, Spanish, and Chinese lexicons. An evaluation of translation quality shows comparable performance with a commercial translation system. The generation system can also be straightforwardly extended to other languages and this is demonstrated and evaluated for Spanish.
TransType: Text Prediction for Translators
- In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. Demonstration Description
, 2002
Cited by 5 (0 self)
Text prediction is a novel form of interactive machine translation that is well suited to skilled translators. It has the potential to assist in several ways: speeding typing, suggesting possible translations, and averting translator errors. However, recent evaluations of a prototype prediction system showed that predictions can also distract and hinder translators if made indiscriminately. We demonstrate an experimental prototype intended to address this problem by selecting the prediction that has maximal expected benefit to the user in any given context. This leads it to make longer predictions where it is more certain and shorter ones, or none at all, in contexts where it is less certain.
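The idea of choosing the prediction with maximal expected benefit can be sketched as follows. This is a toy model under stated assumptions, not the TransType user model: the benefit formula, the `read_cost_per_char` parameter, and the candidate probabilities are all hypothetical. It only shows why such a criterion can favour a longer prediction when confidence is high and no prediction at all when confidence is low.

```python
# Toy sketch: the benefit formula and all numbers are assumptions made for
# illustration; they are not taken from the TransType system.

def expected_benefit(text, p_correct, read_cost_per_char=0.3):
    """Expected characters saved, minus the expected cost of reading a wrong proposal."""
    gain = p_correct * len(text)
    loss = (1.0 - p_correct) * read_cost_per_char * len(text)
    return gain - loss


def choose_prediction(candidates):
    """Return the candidate text with the highest positive expected benefit, or None.

    `candidates` is a list of (text, probability_correct) tuples.
    """
    best_text, best_value = None, 0.0
    for text, p_correct in candidates:
        value = expected_benefit(text, p_correct)
        if value > best_value:
            best_text, best_value = text, value
    return best_text


confident = [("the", 0.9), ("the house", 0.6), ("the house of commons", 0.1)]
uncertain = [("zebra", 0.1)]

print(choose_prediction(confident))  # the medium-length candidate wins here
print(choose_prediction(uncertain))  # no candidate has positive expected benefit
```

Returning `None` when no candidate has positive expected benefit corresponds to the abstract's point that making no prediction can be the best choice in uncertain contexts.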
Retrospect and Prospect in Computer-Based Translation
, 1999
Cited by 4 (0 self)
At the last MT Summit conference this century, this paper looks back briefly at what has happened in the 50 years since MT began, reviews the present situation, and speculates on what the future may bring. Progress in the basic processes of computerized translation has not been as dramatic as developments in computer technology and software. There is still much scope for the improvement of the linguistic quality of MT output, which hopefully developments in both rule-based and corpus-based methods can bring. Greater
Scaling the ISLE framework: Use of existing corpus resources for validation of MT evaluation metrics across languages
- In Proceedings of LREC 2002. Las Palmas, Canary Islands
, 2002
Cited by 4 (1 self)
This paper describes a machine translation (MT) evaluation (MTE) research program which has benefited from the availability of two collections of source language texts and the results of processing these texts with several commercial MT engines (DARPA 1994, Doyon, Taylor, & White 1999). The methodology entails the systematic development of a predictive relationship between discrete, well-defined MTE metrics and specific information processing tasks that can be reliably performed with output of a given MT system. Unlike tests used in initial experiments on automated scoring (Jones and Rusk 2000), we employ traditional measures of MT output quality, selected from the International Standards for Language Engineering (ISLE) framework: Coherence, Clarity, Syntax, Morphology, General and Domain-specific Lexical robustness, to include Named-entity translation. Each test was originally validated on MT output produced by three Spanish-to-English systems (1994 DARPA MTE). We validate tests in the present work, however, with material taken from the MT Scale Evaluation research program produced by Japanese-to-English MT systems. Since Spanish and Japanese differ structurally on the morphological, syntactic, and discourse levels, a comparison of scores on tests measuring these output qualities should reveal how structural similarity, such as that enjoyed by Spanish and English, and structural contrast, such as that found between Japanese and English, affect the linguistic distinctions which must be accommodated by MT systems. Moreover, we show that metrics developed using Spanish-English MT output are equally effective when applied to Japanese-English MT output.
Building a dynamic lexicon from a digital library
- In JCDL ’08: Proceedings of the 8th ACM/IEEE-CS Joint Conference on Digital Libraries (ACM)
, 2008
Cited by 4 (2 self)
We describe here in detail our work toward creating a dynamic lexicon from the texts in a large digital library. By leveraging a small structured knowledge source (a 30,457 word treebank), we are able to extract selectional preferences for words from a 3.5 million word Latin corpus. This is promising news for low-resource languages and digital collections seeking to leverage a small human investment into much larger gain. The library architecture in which this work is developed allows us to query customized subcorpora to report on lexical usage by author, genre or era and allows us to continually update the lexicon as new texts are added to the collection.
Evaluation of MT Systems: A Programmatic View
- Machine Translation
, 1993
Cited by 3 (0 self)
Most MT systems seem to aim at simulation of the behaviour of human translators (although this is not very often stated as explicitly as in e.g. [JW87, page 136]). Like human translators they translate texts normally not written with the specific goal of translation in mind and just as in the case of human translators their translations will normally be revised prior to distribution to the end user. This predominance of what we will call the Human Translator Metaphor has – as a natural consequence – led to a view of MT evaluation where the main purpose of evaluation is to determine to what extent the makers of a system have succeeded in mimicking the human translator. Yet it has to be noted that this interpretation of the notion of MT system is the one least likely to be successful over the next 10 or 20 years. There has been no real breakthrough over the last decade, and there is little evidence that the next 10 years will present us with a revolution in MT. It is very tempting to object that progress has been made, and that
Scaling the ISLE Framework: Validating Tests of Machine Translation Quality for Multi-Dimensional Measurement
- In: Proceedings of the Fourth ISLE Evaluation Workshop, MT Summit VIII
, 2001
Cited by 3 (0 self)
Work on comparing a set of linguistic test scores for MT output to a set of the same tests' scores for naturally-occurring target language text (Jones and Rusk 2000) broke new ground in automating MT Evaluation. However, the tests used were selected on an ad hoc basis. In this paper, we report on work to extend our understanding, through refinement and validation, of suitable linguistic tests in the context of our novel approach to MTE. This approach was introduced in Miller and Vanni (2001a) and employs standard, rather than randomly-chosen, tests of MT output quality selected from the ISLE framework as well as a scoring system for predicting the type of information processing task performable with the output. Since the intent is to automate the scoring system, this work can also be viewed as the preliminary steps of algorithm design.

