Translation Selection for Japanese-English Noun-Noun Compounds, 2003
Cited by 3 (0 self)
We present a method for compositionally translating Japanese NN compounds into English, using a word-level transfer dictionary and a target-language monolingual corpus. The method interpolates over fully specified and partial translation data, based on corpus evidence. In evaluation, we demonstrate that interpolation over the two data types is superior to using either one, and show that our method performs at an F-score of 0.68 over translation-aligned inputs and 0.66 over a random sample of 500 NN compounds.
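The idea of interpolating over fully specified and partial translation data can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's actual model: the function name `rank_candidates`, the linear interpolation weight `lam`, the use of relative frequencies, and the toy counts are all hypothetical. Each candidate translation is scored by combining evidence for the whole compound (full data) with evidence for its component words (partial data).

```python
def rank_candidates(candidates, full_freq, word_freq, lam=0.5):
    """Rank candidate two-word translations by interpolated corpus evidence.

    Hypothetical scoring: lam * P(full compound) + (1 - lam) * P(w1) * P(w2),
    where each P is a relative frequency in the target-language corpus.
    """
    total_full = sum(full_freq.values()) or 1
    total_word = sum(word_freq.values()) or 1
    scored = []
    for w1, w2 in candidates:
        # Evidence for the fully specified translation candidate.
        p_full = full_freq.get((w1, w2), 0) / total_full
        # Evidence from the component words taken separately.
        p_partial = (word_freq.get(w1, 0) / total_word) * (
            word_freq.get(w2, 0) / total_word
        )
        score = lam * p_full + (1 - lam) * p_partial
        scored.append((score, (w1, w2)))
    # Highest score first; sort on the score only.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in scored]


# Toy counts, invented for illustration only.
full_freq = {("machine", "translation"): 50, ("mechanical", "translation"): 2}
word_freq = {"machine": 300, "mechanical": 100, "translation": 200}
candidates = [("mechanical", "translation"), ("machine", "translation")]
ranked = rank_candidates(candidates, full_freq, word_freq)
```

With `lam=1.0` the sketch falls back to full-compound evidence alone, and with `lam=0.0` to component evidence alone, which mirrors the comparison of the two data types described in the abstract.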
Automatic Extraction of Chinese Multiword Expressions with a Statistical Tool
Cited by 3 (0 self)
In this paper, we report on our experiment to extract Chinese multiword expressions from corpus resources as part of a larger research effort to improve a machine translation (MT) system. For existing MT systems, the issue of multiword expression (MWE) identification and accurate interpretation from source to target language remains an unsolved problem. Our initial test on the Chinese-to-English translation functions of Systran and CCID’s Huan-Yu-Tong MT systems reveals that, where MWEs are involved, MT tools suffer in terms of both comprehensibility and adequacy of the translated texts. For MT systems to become of further practical use, they need to be enhanced with MWE processing capability. As part of our study towards this goal, we test and evaluate a statistical tool, which was developed for English, for identifying and extracting Chinese MWEs. In our evaluation, the tool achieved precisions ranging from 61.16% to 93.96% for different types of MWEs. Such results demonstrate that it is feasible to automatically identify many Chinese MWEs using our tool, although it needs further improvement.
SemEval-2010 Task 9: The Interpretation of Noun Compounds Using Paraphrasing Verbs and Prepositions
Cited by 2 (0 self)
We present a brief overview of the main challenges in understanding the semantics of noun compounds and consider some known methods. We introduce a new task to be part of SemEval-2010: the interpretation of noun compounds using paraphrasing verbs and prepositions. The task is meant to provide a standard testbed for future research on noun compound semantics. It should also promote paraphrase-based approaches to the problem, which can benefit many NLP applications.
Improved Statistical Machine Translation Using Monolingual Paraphrases
Cited by 1 (1 self)
We propose a novel monolingual sentence paraphrasing method for augmenting the training data for statistical machine translation systems “for free” – by creating it from data that is already available rather than having to create more aligned data. Starting with a syntactic tree, we recursively generate new sentence variants where noun compounds are paraphrased using suitable prepositions, and vice versa – preposition-containing noun phrases are turned into noun compounds. The evaluation shows an improvement equivalent to 33%-50% of that of doubling the amount of training data.
Translating Compounds by Learning Component Gloss Translation Models via Multiple Languages
This paper presents an approach to the translation of compound words without the need for bilingual training text, by modeling the mapping of literal component word glosses (e.g. “iron-path”) into fluent English (e.g. “railway”) across multiple languages. Performance is improved by adding component-sequence and learned-morphology models along with context similarity from monolingual text and optional combination with traditional bilingual-text-based translation discovery.
Bulgarian Academy of Sciences
The paper addresses an important challenge for the automatic processing of English written text: understanding noun compounds’ semantics. Following Downing (1977) [1], we define noun compounds as sequences of nouns acting as a single noun, e.g., bee honey, apple cake, stem cell, etc. In our view, they are best characterised by the set of all possible paraphrasing verbs that can connect the target nouns, with associated weights, e.g., malaria mosquito can be represented as follows: carry (23), spread (16), cause (12), transmit (9), etc. These verbs are directly usable as paraphrases, and using several of them simultaneously yields an appealing fine-grained semantic representation. In the present paper, we describe the process of constructing such representations for 250 noun-noun compounds previously proposed in the linguistic literature by Levi (1978) [2]. In particular, using human subjects recruited through Amazon Mechanical Turk Web Service, we create a valuable manually annotated resource for noun compound interpretation, which we make publicly available in the hope of inspiring further research in paraphrase-based noun compound interpretation. We further perform a number of experiments, including a comparison to automatically generated weight vectors, in order to assess the dataset quality and the feasibility of the idea of using paraphrasing verbs to characterise noun compounds’ semantics; the results are quite promising.
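The weighted-verb representation described above can be sketched as a sparse vector, with a similarity measure for comparing a human-derived vector against an automatically generated one. This is only an illustration under stated assumptions: the abstract does not say which comparison measure was used, so cosine similarity is a swapped-in choice, and the `automatic` vector and the function name `cosine` are hypothetical. Only the `human` weights for "malaria mosquito" come from the abstract.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse verb-weight vectors (dicts)."""
    dot = sum(weight * v.get(verb, 0) for verb, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector shares nothing with any other
    return dot / (norm_u * norm_v)


# Paraphrasing-verb weights for "malaria mosquito", as listed in the abstract.
human = {"carry": 23, "spread": 16, "cause": 12, "transmit": 9}

# Hypothetical automatically generated vector, invented for illustration.
automatic = {"carry": 10, "spread": 4, "infect": 3}

similarity = cosine(human, automatic)
```

A value near 1 would indicate that the two vectors rank the same verbs with similar relative weights, while verbs present in only one vector (here "infect") lower the score.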

