A Survey of Current Paradigms in Machine Translation
Cited by 11 (0 self)
This paper is a survey of the current machine translation research in the US, Europe and Japan. A short history of machine translation is presented first, followed by an overview of the current research work. Representative examples of a wide range of different approaches adopted by machine translation researchers are presented. These are described in detail along with a discussion of the practicalities of scaling up these approaches for operational environments. In support of this discussion, issues in, and techniques for, evaluating machine translation systems are addressed.
oxyGen: A Language Independent Linearization Engine
Cited by 8 (7 self)
This paper describes a language-independent linearization engine, oxyGen. The system compiles target-language grammars into programs that take feature graphs as input and generate word lattices, which can be passed to the statistical extraction module of the generation system Nitrogen. The grammars are written in oxyL, a flexible language that has the power of a programming language but focuses on natural language realization. The engine has been used successfully to create an English linearization program that is currently part of a Chinese-English machine translation system.
1 Introduction
This paper describes a language-independent realization engine, oxyGen. The system compiles linearization grammars into programs that run independently of the grammar and the compilation engine. The grammars are written in oxyL, a powerful and flexible natural language grammar description language. The syntax of oxyL is described in the paper. Currently,...
Hybrid Natural Language Generation from Lexical Conceptual Structures
Machine Translation, 2003
Cited by 8 (3 self)
This paper describes Lexogen, a system for generating natural-language sentences from Lexical Conceptual Structure, an interlingual representation. The system has been developed as part of a Chinese-English Machine Translation (MT) system; however, it is designed to be used for many other MT language pairs and natural language applications. The contributions of this work include: (1) development of a large-scale Hybrid Natural Language Generation system with language-independent components; (2) enhancements to an interlingual representation and associated algorithm for generation from ambiguous input; (3) development of an efficient reusable language-independent linearization module with a grammar description language that can be used with other systems; (4) improvements to an earlier algorithm for hierarchically mapping thematic roles to surface positions; and (5) development of a diagnostic tool for lexicon coverage and correctness and use of the tool for verification of English, Spanish, and Chinese lexicons. An evaluation of Chinese-English translation quality shows comparable performance with a commercial translation system. The generation system can also be extended to other languages and this is demonstrated and evaluated for Spanish.
Constraints on the Generation of Tense, Aspect, and Connecting Words from Temporal Expressions
2002
Cited by 5 (0 self)
Generating language that reflects the temporal organization of represented knowledge requires a language generation model that integrates contemporary theories of tense and aspect, temporal representations, and methods to plan text. This paper presents a model that produces event combinations and appropriate connecting words to relate them. We distinguish between inherent and non-inherent aspectual features of verbs and describe an algorithm that uses these features to select tense, aspect, and temporal connecting words for generating text based on time-stamped information. The main result of this work is the successful incorporation of constrained linguistic theories of tense and aspect in a self-contained module called CONGEN that produces a ranked list of temporal connectives and tense/aspect possibilities from pairs of time-stamped literals. We show that the theoretical results described herein have been verified in a large-scale corpus analysis. The framework serves as the basis of a component designed to enhance the English output of a constrained generation system.
Building a Chinese-English Mapping Between Verb Concepts for Multilingual Applications
Cited by 4 (1 self)
This paper addresses the problem of building conceptual resources for multilingual applications. We describe new techniques for large-scale construction of a Chinese-English lexicon for verbs, using thematic-role information to create links between Chinese and English conceptual information. We then present an approach to compensating for gaps in the existing resources. The resulting lexicon is used for multilingual applications such as machine translation and cross-language information retrieval.
Empirical Acquisition of Conceptual Distinctions via Dictionary Definitions
2004
Cited by 2 (0 self)
This thesis discusses the automatic acquisition of conceptual distinctions using empirical methods, with an emphasis on semantic relations. The goal is to improve semantic lexicons for computational linguistics, but the work can be applied to general-purpose knowledge bases as well.
A Reference Manual to the Linearization Engine oxyGen - Version 1.6
2001
Cited by 1 (1 self)
Meaning Representation
2.1.1 OxyL Basic Tokens
2.2 oxyL File
2.3 oxyL Rules
3 Sample oxyL Grammar for English
3.1 The oxyL File
3.2 Input and Output
4 oxyGen Reference
4.1 oxyGen Package
4.1.1 oxyGen Installation
4.1.2 oxyCompile
4.1.3 oxyRun
4.1.4 oxyLin
4.1.5 oxyDebug
4.2 Declarations
4.3 Built-in Functions
4.4 Built-in Recasts
4.5 Reserved Tokens
4.5.1 Reserved Variables
4.5.2 Reserved Roles
4.5.3 Reserved Functions
4.5.4 Reserved Strings
Chapter 1 oxyGen 1.1
Construction of a Chinese-English Verb Lexicon for Machine Translation
2002
Cited by 1 (0 self)
This paper addresses the problem of automatic acquisition of lexical knowledge for rapid construction of MT engines in multilingual applications. We describe new techniques for large-scale construction of a Chinese-English verb lexicon and we evaluate the coverage and effectiveness of the resulting lexicon. Building on an existing Chinese conceptual database called HowNet and a large, semantically rich English verb database, we use thematic-role information to create links between Chinese concepts and English classes. We apply the metrics of recall and precision to evaluate the coverage and effectiveness of the linguistic resources. The results of this work indicate that: (1) we are able to obtain reliable Chinese-English entries both with and without pre-existing semantic links between the two languages; (2) if we have pre-existing semantic links, we are able to produce a more robust lexical resource by merging these with our semantically rich English database; and (3) in comparisons with manual lexicon creation, our automatic techniques achieve 62% precision, compared to a much lower precision of 10% for arbitrary assignment of semantic links.
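For readers unfamiliar with the metrics named in this abstract, the following is a minimal sketch of how precision and recall could be computed for a set of automatically proposed lexicon links against a manually built reference. It is not the authors' evaluation code: the function name `precision_recall`, the link representation as (Chinese concept, English class) pairs, and the toy data are all assumptions made for illustration.

```python
def precision_recall(proposed, gold):
    """Return (precision, recall) of proposed links against gold links.

    Both arguments are iterables of hashable items; here each item is a
    hypothetical (chinese_concept, english_class) pair.
    """
    proposed, gold = set(proposed), set(gold)
    correct = proposed & gold  # links found in both sets
    # Guard against empty sets to avoid division by zero.
    precision = len(correct) / len(proposed) if proposed else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall


# Toy data, invented for illustration only (not from the paper).
proposed_links = {("c1", "A"), ("c1", "B"), ("c2", "C"), ("c3", "D")}
gold_links = {("c1", "A"), ("c2", "C"), ("c3", "E"), ("c4", "F")}

p, r = precision_recall(proposed_links, gold_links)
print(f"precision={p:.2f} recall={r:.2f}")  # prints "precision=0.50 recall=0.50"
```

In this toy case two of the four proposed links appear in the reference, so precision is 0.5, and two of the four reference links are recovered, so recall is 0.5. Under this reading, the 62% precision reported in the abstract would mean that 62% of the automatically proposed links agree with the manually created lexicon.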

