A survey of statistical machine translation
, 2007
Cited by 30 (3 self)
Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and many popular techniques have only emerged within the last few years. This survey presents a tutorial overview of state-of-the-art SMT at the beginning of 2007. We begin with the context of the current research, and then move to a formal problem description and an overview of the four main subproblems: translational equivalence modeling, mathematical modeling, parameter estimation, and decoding. Along the way, we present a taxonomy of some different approaches within these areas. We conclude with an overview of evaluation and notes on future directions.
Representation of American Sign Language for Machine Translation
, 2001
Cited by 8 (0 self)
This dissertation describes an approach to designing a machine translation system that generates a representation of American Sign Language (ASL) from English. ASL uses space and non-manual signals (NMSs) to encode grammatical features such as agreement, negation, and wh-questions. Previous computational systems for ASL are typically hindered by static representations of ASL signs, which make it computationally impractical to represent the large number of possible surface forms for each sign, and by the use of notation systems that cannot represent such variation. The approach developed here addresses these limitations. The representation of ASL is based on the Move-Hold (MH) model (Liddell and Johnson 1989), a sign notation system that allows for both precision of sign description and predictable variation of surface forms based on grammatical detail. Empty features are used in MH notations of lexical forms, which are instantiated with spatial data during generation. The generation system is implemented as an LFG correspondence architecture (Kaplan and Bresnan 1982, Kaplan et al. 1989). Correspondence functions are defined that convert an English f-structure into an ASL f-structure; build an ASL c-structure from the f-structure; and build the phonetic representation level (p-structure, where spatial and non-manual variations are revealed) from the c-structure. The concepts presented in this dissertation have been implemented in a software application, ASL Workbench. Possible future applications of this work include developing animated output, tagged corpora for linguistic analysis, and shared lexicons for gloss standardization.
MACHINE TRANSLATION BY PATTERN MATCHING
, 2008
Cited by 1 (0 self)
The best systems for machine translation of natural language are based on statistical models learned from data. Conventional representation of a statistical translation model requires substantial offline computation and representation in main memory. Therefore, the principal bottlenecks to the amount of data we can exploit and the complexity of models we can use are available memory and CPU time, and the current state of the art already pushes these limits. With data size and model complexity continually increasing, a scalable solution to this problem is central to future improvement. Callison-Burch et al. (2005) and Zhang and Vogel (2005) proposed a solution that we call translation by pattern matching, which we bring to fruition in this dissertation. The training data itself serves as a proxy for the model; rules and parameters are computed on demand. It achieves our desiderata of minimal offline computation and compact representation, but depends on fast pattern matching algorithms on text. They demonstrated its application to a common model based on the translation of contiguous substrings, but left some open problems. Among these is a question: can this approach match the performance of conventional methods despite unavoidable differences that it induces in the model? We show how to answer this question affirmatively. The main
Translation Shared Task on Statistical Machine Translation: A Comparison of the Systems Output
The ACL Workshop on Statistical Machine Translation proposed a translation shared task focused on European language pairs. Participants used their systems to translate a test set of unseen source-language sentences into the target language. The languages involved were French, English, Spanish, German, Czech and Hungarian. The goal of this work is to quantitatively compare the translations generated by different systems. In particular, a selection of submitted runs for the French-English, German-English and Spanish-English tasks was analyzed. The systems involved in our investigation cover all the main approaches to machine translation, that is, rule-based, statistical, example-based and hybrid.

