An Introduction to Machine Translation
, 1992
Cited by 276 (7 self)
Abstract. In the last ten years there has been a significant amount of research in Machine Translation within a “new” paradigm of empirical approaches, often labelled collectively as “Example-based” approaches. The first manifestation of this approach caused some surprise and hostility among observers more used to different ways of working, but the techniques were quickly adopted and adapted by many researchers, often creating hybrid systems. This paper reviews the various research efforts within this paradigm reported to date, and attempts a categorisation of different manifestations of the general approach.
Robust Large-Scale EBMT with Marker-Based Segmentation
- In Proceedings of the Tenth Conference on Theoretical and Methodological Issues in Machine Translation (TMI-04)
, 2004
Cited by 26 (13 self)
Previous work on marker-based EBMT [Gough & Way, 2003, Way & Gough, 2004] suffered from problems such as data-sparseness and disparity between the training and test data. We have developed a large-scale robust EBMT system. In a comparison with the systems listed in [Somers, 2003], ours is the third largest EBMT system and certainly the largest English-French EBMT system. Previous work used the on-line MT system Logomedia to translate source language material as a means of populating the system’s database where bitexts were unavailable. We derive our sententially aligned strings from a Sun Translation Memory (TM) and limit the integration of Logomedia to the derivation of our word-level lexicon. We also use Logomedia to provide a baseline comparison for our system and observe that we outperform Logomedia and previous marker-based EBMT systems in a number of tests.
Instructions and Descriptions: some cognitive aspects of programming and similar activities
, 2000
Cited by 20 (0 self)
The Cognitive Dimensions framework outlined here is a generalised broad-brush approach to usability evaluation for all types of information artifact, from programming languages through interactive systems to domestic devices. It also has promise of interfacing successfully with organisational and sociological analyses.
Keywords: Usability evaluation, cognitive dimensions, notations, telephone, Prolog, spreadsheet, cognitive psychology.
1. INTRODUCTION We are living through a technological revolution, in which much research is necessarily dominated by immediate aims and short-term goals, and most research papers report some new accomplishment. The accomplishment may be useful but generalisations from one creation to another are very weak, unless the second is a direct descendant from the first. This paper is a contrast. Science-based engineering rests on idealisations (capacitance, gravity). Physical or chemical theory describing these idealisations is combined with experience and cra...
An Example-Based Approach to Translating Sign Language
- In Workshop Example-Based Machine Translation (MT X–05)
, 2005
Cited by 20 (3 self)
Users of sign languages are often forced to use a language in which they have reduced competence simply because documentation in their preferred format is not available. While some research exists on translating between natural and sign languages, we present here what we believe to be the first attempt to tackle this problem using an example-based (EBMT) approach. Having obtained a set of English–Dutch Sign Language examples, we employ an approach to EBMT using the ‘Marker Hypothesis’ (Green, 1979), analogous to the successful system of (Way & Gough, 2003), (Gough & Way, 2004a) and (Gough & Way, 2004b). In a set of experiments, we show that encouragingly good translation quality may be obtained using such an approach. Keywords: Example-based machine translation, sign languages, Marker Hypothesis, ECHO corpus.
MATREX: DCU Machine Translation System for IWSLT 2006
Cited by 17 (8 self)
In this paper, we give a description of the machine translation system developed at DCU that was used for our first participation in the evaluation campaign of the International Workshop on Spoken Language Translation (2006). This system combines two types of approaches. First, we use an EBMT approach to collect aligned chunks based on two steps: deterministic chunking of both sides and chunk alignment. We use several chunking and alignment strategies. We also extract SMT-style aligned phrases, and the two types of resources are combined. We participated in the Open Data Track for the following translation directions: Arabic-English and Italian-English, for which we translated both the single-best ASR hypotheses and the text input. We report the results of the system for the provided evaluation sets.
wEBMT: Developing and Validating an Example-Based Machine Translation System using the World Wide Web
- COMPUTATIONAL LINGUISTICS
, 2003
Gaijin: A Bootstrapping, Template-Driven Approach to Example-Based MT
- In International Conference, Recent Advances in Natural Language Processing, Tzigov Chark
, 1997
Cited by 12 (4 self)
Example-based Machine Translation (EBMT) is a recent approach to MT that offers robustness, scalability and graceful degradation, deriving as it does its competence not from explicit linguistic models of source and target languages, but from the wealth of bilingual corpora that are now available. Gaijin is such a system, employing statistical methods, string-matching, case-based reasoning and template-matching to provide a linguistics-lite EBMT solution. The only linguistics employed by Gaijin is a psycholinguistic constraint, the marker hypothesis, that is minimal, simple to apply, and arguably universal. The scope and current state of Gaijin are described, and some initial evaluation results are reported.
Controlled Generation in Example-Based Machine Translation
- In Proceedings of the Ninth Machine Translation Summit (MT Summit IX)
, 2003
Cited by 9 (3 self)
The theme of controlled translation is currently in vogue in the area of MT. Recent research (Schäler et al., 2003; Carl, 2003) hypothesises that EBMT systems are perhaps best suited to this challenging task. In this paper, we present an EBMT system where the generation of the target string is filtered by data written according to controlled language specifications. As far as we are aware, this is the only research available on this topic. In the field of controlled language applications, it is more usual to constrain the source language in this way rather than the target. We translate a small corpus of controlled English into French using the on-line MT system Logomedia, and seed the memories of our EBMT system with a set of automatically induced lexical resources using the Marker Hypothesis as a segmentation tool. We test our system on a large set of sentences extracted from a Sun Translation Memory, and provide both an automatic and a human evaluation. For comparative purposes, we also provide results for Logomedia itself.
A memory-based classification approach to marker-based EBMT
- Proceedings of the METIS-II Workshop on New Approaches to Machine Translation
, 2007
Cited by 7 (6 self)
We describe a novel approach to example-based machine translation that makes use of marker-based chunks, in which the decoder is a memory-based classifier. The classifier is trained to map trigrams of source-language chunks onto trigrams of target-language chunks; then, in a second decoding step, the predicted trigrams are rearranged according to their overlap. We present the first results of this method on a Dutch-to-English translation system using Europarl data. Sparseness of the class space causes the results to lag behind a baseline phrase-based SMT system. In a further comparison, we also apply the method to a word-aligned version of the same data, and report a smaller difference with a word-based SMT system. We explore the scaling abilities of the memory-based approach, and observe linear scaling behavior in training and classification speed and memory costs, and log-linear BLEU improvements in the amount of training examples.
MATREX: The DCU MT System for WMT 2009
Cited by 6 (4 self)
In this paper, we describe the machine translation system in the evaluation campaign of the Fourth Workshop on Statistical Machine Translation at EACL 2009. We describe the modular design of our multi-engine MT system with particular focus on the components used in this participation. We participated in the translation task for the following translation directions: French–English and English–French, in which we employed our multi-engine architecture to translate. We also participated in the system combination task which was carried out by the MBR decoder and Confusion Network decoder. We report results on the provided development and test sets.

