Inducing Translation Templates for Example-Based Machine Translation
- In MTSummit VII, 1999
Cited by 26 (7 self)
This paper describes an example-based machine translation (EBMT) system which relies on various knowledge resources. Morphological analyses abstract the surface forms of the languages to be translated. A shallow syntactic rule formalism is used to percolate features in derivation trees. Translation examples guide the decomposition of the text to be translated and determine the transfer of lexical values into the target language. Translation templates determine the word order of the target language and the type of phrases (e.g. noun phrase, prepositional phrase, ...) to be generated in the target language. An induction mechanism generalizes translation templates from translation examples. The paper outlines the basic idea underlying the EBMT system and investigates the possibilities and limits of the translation template induction process.
A Full-Text Experiment in Example-Based Machine Translation
- In Proceedings of the International Conference on New Methods in Language Processing, 1994
Cited by 20 (0 self)
This paper describes an experiment in example-based machine translation (EBMT) on full text. The unit of translation is a text chunk of arbitrary length, in contrast to sentence-level EBMT experiments. Intra- and inter-language matching techniques and metrics used in the experiment are described. Keywords: Example-based MT, Corpus-based NLP. Introduction: The growth rate of theoretical studies of language structure and use stubbornly remains higher than the improvement rate of large-scale applications. It has been repeatedly shown that large-scale realistic NLP applications carry a prohibitive price tag of large-scale, routine acquisition of knowledge about language and about the world, collected in computational grammars, lexicons and domain models. Strategically, there are several ways of dealing with this problem: biting the bullet and going through a massive knowledge acquisition effort, either general-purpose (e.g., the CYC project, Lenat et al., 1990) or domain-specific (e.g....
Two Approaches to Matching in Example-Based Machine Translation
- In Proc. of the 5th International Conference on Theoretical and Methodological Issues in Machine Translation (TMI-93), 1993
Cited by 17 (0 self)
This paper describes two approaches to matching input strings with strings from a translation archive in the example-based machine translation paradigm: the more canonical "chunking + matching + recombination" method and an alternative method of matching at the level of complete sentences. The latter produces less exact matches while the former suffers from (often serious) translation quality lapses at the boundaries of recombined chunks. A set of text matching criteria was selected to reflect the trade-off between utility and computational price of each criterion. A metric for comparing text passages was devised and calibrated with the help of a specially constructed diagnostic example set. A partitioning algorithm was developed for finding an optimum "cover" of an input string by a set of best-matching shorter chunks. The results were evaluated in a monolingual setting using an existing MT post-editing tool: the distance between the input and its best match in the archive was calculated in terms of the number of keystrokes necessary to reduce the latter to the former. As a result, the metric was adjusted and an experiment was run to test the two EBMT methods, both on the training corpus and on the working corpus (or "archive") of some 6,500 sentences.
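The keystroke-based evaluation described in this abstract can be illustrated with a rough sketch. The abstract does not specify how keystrokes were counted, so the code below substitutes plain character-level Levenshtein distance as a stand-in; the function names `edit_distance` and `best_match` and the sample sentences are hypothetical and do not come from the paper.

```python
def edit_distance(source: str, target: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn source into target.
    Used here only as an assumed proxy for a keystroke count."""
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def best_match(input_sentence: str, archive: list[str]) -> tuple[str, int]:
    """Return the archive sentence needing the fewest edits to become
    the input sentence, together with that edit count."""
    scored = ((sentence, edit_distance(sentence, input_sentence))
              for sentence in archive)
    return min(scored, key=lambda pair: pair[1])


# Hypothetical toy archive, for illustration only.
archive = ["close the file", "open a file", "print the page"]
match, distance = best_match("open the file", archive)
```

A character-level distance is the simplest choice; a word-level variant, or one weighted by the cost of editor commands, may be closer to what a post-editing tool would actually measure.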
A Survey of Current Paradigms in Machine Translation
Cited by 11 (0 self)
This paper is a survey of the current machine translation research in the US, Europe and Japan. A short history of machine translation is presented first, followed by an overview of the current research work. Representative examples of a wide range of different approaches adopted by machine translation researchers are presented. These are described in detail along with a discussion of the practicalities of scaling up these approaches for operational environments. In support of this discussion, issues in, and techniques for, evaluating machine translation systems are addressed.
An example-based disambiguation of prepositional phrase attachment
- 1993
Cited by 4 (1 self)
Spoken language translation is a challenging new application that differs from written language translation in several ways, for instance, 1) human intervention (pre-edit or post-edit) should be avoided; 2) a real-time response is desirable for success. Example-based approaches meet these requirements, that is, they realize accurate structural disambiguation and target word selection, and respond quickly. This paper concentrates on structural disambiguation, particularly English prepositional phrase attachment (pp-attachment). Usually, a pp-attachment is hard to determine by syntactic analysis alone and many candidates remain. In machine translation, if a pp-attachment is not likely, the translation of the preposition, indeed, the whole translation, is not likely. In order to select the most likely attachment from many candidates, various methods have been proposed. This paper proposes a new method, Example-Based Disambiguation (EBD) of pp-attachment, which 1) collects examples (prepositional phrase-attachment pairs) from a corpus; 2) computes the semantic distance between an input expression and examples; 3) selects the most likely attachment based on the minimum-distance examples. Through experiments contrasting EBD and conventional methods, the authors show the EBD's superiority from the standpoint of success rates.
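The three EBD steps listed in this abstract (collect examples, compute semantic distance, select the minimum-distance attachment) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' method: the thesaurus `SEMANTIC_CLASS`, the distance values (0, 0.5, 1), the example tuples, and all function names are hypothetical.

```python
# Hypothetical toy thesaurus mapping a word to a semantic class.
SEMANTIC_CLASS = {
    "eat": "consume", "drink": "consume",
    "pizza": "food", "soup": "food", "cheese": "food", "anchovies": "food",
    "fork": "tool", "spoon": "tool",
}

# Step 1: examples collected from a corpus (invented here). Each pairs a
# (verb, object noun, preposition, pp noun) tuple with its attachment site.
EXAMPLES = [
    (("eat", "pizza", "with", "fork"), "verb"),
    (("eat", "pizza", "with", "cheese"), "noun"),
]


def word_distance(a: str, b: str) -> float:
    """Assumed distance: 0 for identical words, 0.5 for words in the
    same thesaurus class, 1 otherwise."""
    if a == b:
        return 0.0
    class_a = SEMANTIC_CLASS.get(a)
    if class_a is not None and class_a == SEMANTIC_CLASS.get(b):
        return 0.5
    return 1.0


def expression_distance(x: tuple, y: tuple) -> float:
    """Step 2: sum the word distances position by position."""
    return sum(word_distance(a, b) for a, b in zip(x, y))


def select_attachment(expression: tuple) -> str:
    """Step 3: return the attachment of the minimum-distance example."""
    _, attachment = min(
        EXAMPLES,
        key=lambda example: expression_distance(expression, example[0]),
    )
    return attachment


instrument_case = select_attachment(("drink", "soup", "with", "spoon"))
topping_case = select_attachment(("eat", "pizza", "with", "anchovies"))
```

In this toy setup an instrument-like pp noun lands nearer the verb-attachment example and a food-like pp noun nearer the noun-attachment example; a realistic system would need a much larger example base and a graded thesaurus distance.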
Latest developments in machine translation technology
- In: MT Summit, 1993
Cited by 3 (2 self)
which had been established in the late 1970s. These were the systems which had built upon experience gained in what may be called the 'quiet' decade of machine translation, the ten years after the publication of the ALPAC report in 1966 had brought to an end MT research in the United States and had profoundly affected its support elsewhere. Throughout the 1980s, it can be asserted without contradiction, the dominant framework of MT research was the essentially syntax-oriented 'transfer' approach exemplified by such systems as ARIANE at Grenoble University, METAL at Texas, SUSY at Saarbrücken, the Mu system at Kyoto University, and of course the multilingual Eurotra project of the European Communities. In addition, many of the commercial systems which appeared at this time were based on the same principles. For some time it appeared as if the 'interlingua' approach was not viable. Earlier efforts in the 1970s had been unsuccessful at Grenoble (the CETA system) and at the University of Texas. These were, however, essentially syntax-oriented approaches: while structural transfer was via interlingual ('universal') tree representations, lexical transfer was still via bilingual dictionary substitution. During the 1980s, new approaches to the interlingua model appeared. Some remained essentially lin-
Towards a Model of Competence for Corpus-Based Machine Translation
- IAI Working Papers, 1999
Cited by 1 (0 self)
A translation is a conversion from a source language into a target language that preserves the meaning. A huge number of techniques and computational approaches have been tried in order to translate natural languages automatically, yet no satisfactory solution has been found. This paper examines approaches to corpus-based machine translation (CBMT). In CBMT, a set of reference example translations is given to the MT system. These are analyzed and compiled into the system's internal representation according to the theory of meaning the system implements. The representations then serve as a basis for translating new sentences. This paper discusses three main approaches in the CBMT paradigm: the memory-based approach (e.g. translation memories (TM)), the example-based approach (EBMT) and the statistical-based approach (SBMT). Concrete CBMT systems are discussed in light of the theory of meaning (preservation) they implement. This discussion then leads to a model of competence for CBMT systems. The paper concludes that CBMT systems can be designed to achieve either high reliability or broad coverage, but that the two qualities seem to be mutually exclusive.
Confidence Factor Assignment to Translation Templates
- 1998
Cited by 1 (0 self)
that I have read this thesis and that in my opinion it is fully adequate, in scope

