Results 1 - 8 of 8
The EuTRANS-I Speech Translation System
, 1999
Cited by 18 (10 self)
The EuTRANS project aims at using Example-Based approaches for the automatic development of Machine Translation systems (accepting text and speech input) for limited-domain applications. During the first phase of the project, a speech translation system based on automatically learnt Subsequential Transducers has been built. This paper contains a detailed and to a large extent self-contained overview of the transducer learning algorithms and system architecture, along with a new approach for using categories representing words or short phrases in both input and output languages. Experimental results using this approach are reported for a task involving the recognition and translation of sentences in the hotel reception communication domain, with a vocabulary of 683 words in Spanish. A translation word error rate of 1.97% is achieved at a real-time factor of 2.7 on a personal computer.
Language Understanding and Subsequential Transducer Learning
, 1998
Cited by 6 (3 self)
Language Understanding can be considered as the realization of a mapping from sentences of a natural language into a description of their meaning in an appropriate formal language. Under this viewpoint, the application of the Onward Subsequential Transducer Inference Algorithm (OSTIA) to Language Understanding is considered. The basic version of OSTIA is reviewed and a new version is presented in which syntactic restrictions of the domain and/or range of the target transduction can effectively be taken into account. For experimentation purposes, a task proposed by Feldman et al. for assessing the capabilities of Language Learning and Understanding systems has been adopted and three increasingly difficult-to-learn semantic coding schemes have been defined for this task. In all cases the basic version of OSTIA has consistently proved able to learn very compact and accurate transducers from relatively small training sets of input-output examples of the task. Moreover, if the input sentences are corrupted with syntactic incorrectness or errors, the new version of OSTIA still provides understanding results that only degrade in a gradual and natural way.
Spoken-Language Machine Translation in Limited-Domain Tasks
- In Proceedings in Artificial Intelligence: CRIM/FORWISS Workshop on Progress and Prospects of Speech Research and Technology
, 1994
Cited by 2 (2 self)
Subsequential transducers constitute a formal model for translation that may be considered too simple to model translation between natural languages. However, their capability can suffice in limited-domain translation tasks. The finite-state nature of subsequential transducers makes their integration with well-known Continuous Speech Recognition technology both easy and efficient. A recent algorithm allows the automatic learning of these transducers, given a sufficiently large set of examples of sentences and their corresponding translations, and it also allows the incorporation of syntactic restrictions of the input and/or output languages. In this paper, we describe an implementation of a Speech Translation System for limited domains which is fully trainable and capable of real-time translation from speech input.
Machine Translation using Neural Networks and Finite-State Models
, 1997
Cited by 1 (0 self)
Both Neural Networks and Finite-State Models have recently proved to be encouraging approaches to Example-Based Machine Translation. This paper compares the translation performance achieved with the two techniques as well as the corresponding resources required. To this end, both Elman Simple Recurrent Nets and Subsequential Transducers were trained to tackle a simple pseudo-natural machine translation task.
Automatic Language Identification with Sequences of Language-Independent Phoneme Clusters
, 1996
Cited by 1 (0 self)
Automatic language identification involves analyzing language-specific features in speech to determine the language of an utterance without regard to topic, speaker or length of speech. Although much progress has been made in recent years, language identification systems have not been built on underlying theory or linguistically meaningful design criteria. This thesis is motivated by the belief that features used to discriminate between languages should be linguistically sound; the result is a unique combination of design, theory and implementation. In this thesis a "word-spotting" algorithm is introduced motivated by a perceptual study [82] reporting that human subjects use language-dependent phonemes and short sequences to identify languages. In order to find an optimal set of phoneme-like tokens to represent speech in a linguistically meaningful way, a mathematical model of the discrimination between two languages is developed. This model permits the automatic design of a token representation of speech by selecting a list of discriminating "words" in a data-driven manner. The resulting system has the flexibility to automatically take into account the inherent structure of the languages to be discriminated. A second mathematical model is developed to measure the impact of inaccurate automatic alignment of tokens on language discrimination. This model indicates why some algorithms aiming to compensate for these inaccuracies have not been successful. The theoretical models and the "word"-spotting algorithms have been implemented and validated on both generated and real-world speech data. This dissertation makes several significant contributions: the design of a simple and linguistically sound language-identification module; a flexible automatic feature extraction algorithm; a mathematical model to estimate the discriminability of two languages; and a mathematical model to capture the impact of inaccurate alignment on the discriminability of two languages.
Machine Translation with Grammar Association:
Cited by 1 (0 self)
Grammar Association is a technique for Machine Translation and Language Understanding introduced in 1993 by Vidal, Pieraccini and Levin. All the statistical and structural models involved in the translation process are automatically built from bilingual examples, and the optimal translation of new sentences can be efficiently found by Dynamic Programming algorithms. This paper presents and discusses the state of the art of Grammar Association, including a new statistical model: Loco C.
Word Categorization in Statistical Translation
, 1999
Word clustering methods and, in general, categorization techniques have been successfully used to reduce the number of parameters to be estimated in language and translation models.
Learning Extended Finite State Models for Language Translation
, 1996
The use of Subsequential Transducers (a kind of Finite-State Models) in Automatic Translation applications is considered. A methodology that improves the performance of the learning algorithm by means of an automatic reordering of the output sentences is presented. This technique yields a greater degree of synchrony between the input and output samples. The proposed approach leads to a reduction in the number of samples necessary to learn the transducer and a reduction in the size of the model so obtained.

