The challenge of spoken language systems: Research directions for the nineties
- IEEE Transactions on Speech and Audio Processing
, 1995
Cited by 34 (5 self)
Footnote: This article is based on a February 1992 workshop sponsored by the National Science
JANUS 93: Towards Spontaneous Speech Translation
, 1994
Cited by 19 (12 self)
We present first results from our efforts toward translation of spontaneously spoken speech. Improvements include increased coverage, robustness, generality and speed of JANUS, the speech-to-speech translation system of Carnegie Mellon and Karlsruhe University. The recognition and machine translation engines have been upgraded to deal with requirements introduced by spontaneous human-to-human dialogs. To allow for development and evaluation of our system on adequate data, a large database of spontaneous scheduling dialogs is being gathered for English, German and Spanish. 1. OVERVIEW JANUS [1, 2] has been among the early systems to attempt the translation of spoken dialogs. It was initially built on a speech database of 12 read dialogs from the conference registration task, encompassing a vocabulary of around 500 words. It was designed as a speaker-independent system that translates spoken utterances from English and from German into one of German, English or Japanese. Speech...
Grammar Inference and Statistical Machine Translation
, 1998
Cited by 13 (0 self)
NLP researchers face a dilemma: on one side, it is widely accepted that languages have internal structure rather than being mere strings of words. On the other side, researchers find it very difficult and expensive to write grammars that have good coverage of language structures. Statistical machine translation tries to cope with this problem by ignoring language structures and using statistical models to depict the translation process. Most of the translation models are word-based. While the approach has achieved surprisingly good performance, comparable to the best commercial systems, many questions remain in the machine translation community. Can statistical word-based translation still perform well on language pairs with radically different linguistic structures? How would it function with less training data or with spoken languages? The thesis work investigated these questions. In summary, the word-based alignment model is a major cause of errors in German-English statistical spoken language...
Audio-visual and Multimodal Speech Systems
- In D. Gibbon (Ed.) Handbook of Standards and Resources for Spoken Language Systems - Supplement Volume
Cited by 12 (0 self)
[Figure 13: Multimodal Design Space (adapted from [224]); labels shown: Signal Level, Semantic Level]
... system in the design space is the pivotal center of its features. According to the characterization of an interaction along the two dimensions, fusion and use of modalities, four basic types of multimodal interaction can be distinguished: alternative, synergistic, exclusive, and concurrent multimodal interaction, as shown in Figure 13. Obviously, synergistic systems subsume the other three classes of multimodal systems. Therefore, architectural models of multimodal integration (as presented in the next subsection and in Section 9) are sufficient if they are able to model synergistic cooperation of modalities. 6.2.2 Fusion of Multimodal Input. Fusion of multimodal input events can occur on different levels, ranging from signal-level to semantic-level. Signal-level fusion (or lexical fusion [224]) performs the combination of multimodal input at the level of the input signal. Signal-level fusion has...
Workshop on Spoken Language Understanding - A Workshop sponsored by the National Science Foundation
, 1992
Cited by 4 (3 self)
This report describes the key research topics, the expected benefits of the research, and recommendations to NSF on the infrastructure needed to support the research.
Spoken-Language Machine Translation in Limited-Domain Tasks
- In Proceedings in Artificial Intelligence: CRIM/FORWISS Workshop on Progress and Prospects of Speech Research and Technology
, 1994
Cited by 2 (2 self)
Subsequential transducers constitute a formal model for translation that may be considered too simple to model translation between natural languages. However, their capability can suffice in limited-domain translation tasks. The finite-state nature of subsequential transducers makes their integration with well-known Continuous Speech Recognition technology both easy and efficient. A recent algorithm allows the automatic learning of these transducers, given a sufficiently large set of example sentences and their corresponding translations, and it also allows the incorporation of syntactic restrictions on the input and/or output languages. In this paper, we describe an implementation of a Speech Translation System for limited domains which is fully trainable and capable of real-time translation from speech input.
Multilinguality
Cited by 2 (0 self)
... the multilingual problems just identified, the only one that might possibly be treated with a character-oriented model is that of language identification. The remainder trade in an essential way on equivalences, or near equivalences, among words, sentences, and texts mediated through their meaning. Language processing of this kind is notoriously difficult and it behooves us to start by considering, however cursorily, why this is. We will do this in the context of translation, though what we say is true for the most part of the other tasks mentioned. The question of why translation should have been so successful in resisting the most determined efforts to automate it for close to forty years is complex and sometimes quite technical. But it is not a mystery. The basic problems have long been known, and the most important thing that has been learnt about them recently is that they are more severe and more widespread than was first thought...
An Experimental Japanese / English Interpreting Video Phone System
In this paper we report on the architectural design issues and experiences gained while building and demonstrating an experimental interpreting video phone (IVP) system. The IVP system was demonstrated in an Internet home shopping simulation, simultaneously before live audiences in Japan and the U.S. An American shop assistant and a Japanese customer engaged in task-directed dialogues, each using their native language. In addition to their direct audio/visual contact by ISDN video phone, each participant heard a translation of the remote speaker's utterances in a synthetic voice in real time.

