VERBMOBIL: The Use of Prosody in the Linguistic Components of a Speech Understanding System
, 2000
Cited by 25 (5 self)
In this paper, we show how prosody can be used in speech understanding systems. This is demonstrated with the VERBMOBIL speech-to-speech translation system which, to our knowledge, is the first complete system which successfully uses prosodic information in the linguistic analysis. Prosody is used by computing probabilities for clause boundaries, accentuation, and different types of sentence mood for each of the word hypotheses computed by the word recognizer. These probabilities guide the search of the linguistic analysis. Disambiguation is already achieved during the analysis and not by a prosodic verification of different linguistic hypotheses. So far, the most useful prosodic information is provided by clause boundaries. These are detected with a recognition rate of 94%. For the parsing of word hypotheses graphs, the use of clause boundary probabilities yields a speed-up of 92% and a 96% reduction of alternative readings.
Dynamic Bayesian Networks for Information Fusion with Applications to Human-Computer Interfaces
, 1999
Cited by 23 (1 self)
Recent advances in various display and virtual technologies coupled with an explosion in available computing power have given rise to a number of novel human-computer interaction (HCI) modalities -- speech, vision-based gesture recognition, eye tracking, EEG, etc. However, despite the abundance of novel interaction devices, the naturalness and efficiency of HCI have remained low. This is due in particular to the lack of robust sensory data interpretation techniques. To deal with the task of interpreting single and multiple interaction modalities this dissertation establishes a novel probabilistic approach based on dynamic Bayesian networks (DBNs). As a generalization of the successful hidden Markov models, DBNs are a natural basis for the general temporal action interpretation task. The problem of interpretation of single or multiple interacting modalities can then be viewed as a Bayesian inference task. In this work three complex DBN models are introduced: mixtures of DBNs, mixed-state DBNs, and coupled HMMs. In-depth study of these models yields efficient approximate inference and parameter learning techniques applicable to a wide variety of problems. Experimental validation of the proposed approaches in the domains of gesture and speech recognition confirms the model's applicability to both unimodal and multimodal interpretation tasks.
Modeling with structures in statistical machine translation
- In COLING-ACL
, 1998
Cited by 16 (2 self)
{yyw, waibel}@cs.cmu.edu
Most statistical machine translation systems employ a word-based alignment model. In this paper we demonstrate that word-based alignment is a major cause of translation errors. We propose a new alignment model based on shallow phrase structures, which can be automatically acquired from a parallel corpus. This new model achieved over 10% error reduction for our spoken language translation task.
Grammar Inference and Statistical Machine Translation
, 1998
Cited by 13 (0 self)
NLP researchers face a dilemma: on one side, it is generally accepted that languages have internal structure rather than being mere strings of words. On the other side, they find it very difficult and expensive to write grammars that have good coverage of language structures. Statistical machine translation tries to cope with this problem by ignoring language structures and using statistical models to depict the translation process. Most of the translation models are word-based. While the approach has achieved surprisingly good performance comparable to the best commercial systems, many questions remain in the machine translation community. Can statistical word-based translation still perform well on language pairs with radically different linguistic structures? How would it function with less training data or with spoken languages? The thesis work investigated these questions. In summary, the word-based alignment model is a major cause of errors in German-English statistical spoken language...
Word clustering with parallel spoken language corpora
- In Proc. of 4th International Conference on Spoken Language Processing, ICSLP 96
, 1996
Cited by 7 (3 self)
In this paper we introduce a word clustering algorithm which uses a bilingual, parallel corpus to group together words in the source and target language. Our method generalizes previous mutual information clustering algorithms for monolingual data by incorporating a statistical translation model. Preliminary experiments have shown that the algorithm can effectively employ the constraints implicit in bilingual data to extract classes which are well-suited to machine translation tasks.
Statistical Analysis of Dialogue Structure
- In Proceedings of EuroSpeech Conference. Rhodes
, 1997
Cited by 6 (2 self)
We introduce a statistical model for dialogues. We describe a dynamic programming algorithm that can be used to bracket a dialogue into segments and label each segment with its speech act. We evaluate the performance of the model. We also use this model for language modelling and obtain a perplexity reduction.
Connectionist transfer in machine translation
- In Proceedings of the International Conference on Recent Advances in Natural Language Processing
, 1995
Cited by 2 (1 self)
A traditional transfer system in machine translation maps between language structures and an intermediate representation. Our connectionist transfer system maps from f-structures of one language to f-structures of another language. It encodes the intermediate representation implicitly in neural networks' activation patterns. The system is learnable and therefore does not need any effort in hand-crafting the representation and mapping rules. Experiments show the system has good scalability and generalizability performance.

