Fast search for large vocabulary speech recognition
- in Verbmobil: Foundations of Speech-to-Speech Translation, W. Wahlster, Ed
, 2000
Cited by 11 (11 self)
Abstract. In this article we describe methods for improving the RWTH German speech recognizer used within the VERBMOBIL project. In particular, we present acceleration methods for the search based on both within-word and across-word phoneme models. We also study incremental methods to reduce the response time of the online speech recognizer. Finally, we present experimental offline results for the three VERBMOBIL scenarios. We report on word error rates and real-time factors for both speaker-independent and speaker-dependent recognition.
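The two figures of merit named in this abstract, word error rate and real-time factor, are conventionally defined as the word-level edit distance divided by the reference length, and the processing time divided by the audio duration. The sketch below follows those textbook definitions; the function names are hypothetical and it is not the scoring procedure used by the authors.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds, audio_seconds):
    """Values above 1.0 mean the recognizer is slower than real time."""
    return processing_seconds / audio_seconds
```

For example, `word_error_rate("a b c d", "a x c")` counts one substitution and one deletion against four reference words.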
The Philips/RWTH System for Transcription of Broadcast News
, 1999
Cited by 7 (0 self)
This paper describes the Philips/RWTH 1998 HUB4 system, which was built in a joint effort by Philips Research Laboratories Aachen and Aachen University of Technology. We focus on recent improvements over the original 1997 HUB4 system and evaluate them on the HUB4'97 evaluation data. The paper covers (1) a rough system overview including feature extraction, acoustic training, audio stream segmentation, and decoding, (2) log-linear interpolation of distance-language models, and (3) the integration of various acoustic and language models via Discriminative Model Combination (DMC).
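As a rough illustration of log-linear interpolation in general, the sketch below combines several word distributions for one fixed history by raising each probability to a weight, multiplying, and renormalizing. The function name, the dictionary representation, and the assumption that every model assigns a nonzero probability to every word are illustrative choices, not details taken from the paper.

```python
import math


def log_linear_interpolate(distributions, weights):
    """Combine distributions as p(w) proportional to prod_i p_i(w) ** lambda_i.

    distributions: list of dicts mapping word -> probability (all > 0),
                   all defined over the same vocabulary (an assumption here).
    weights:       one interpolation weight (lambda_i) per distribution.
    """
    vocabulary = distributions[0].keys()
    scores = {}
    for word in vocabulary:
        # Work in the log domain, then exponentiate the weighted sum.
        log_score = sum(
            lam * math.log(dist[word])
            for dist, lam in zip(distributions, weights)
        )
        scores[word] = math.exp(log_score)
    normalizer = sum(scores.values())  # Z for this history
    return {word: score / normalizer for word, score in scores.items()}
```

Unlike linear interpolation, the result must be renormalized explicitly, which is why the normalizer is summed over the whole vocabulary.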
Advances in Confidence Measures for Large Vocabulary
- in Proc. ICSLP
, 1999
Cited by 5 (0 self)
This paper addresses the correct choice and combination of confidence measures in large vocabulary speech recognition tasks. We classify single words within continuous as well as large vocabulary utterances into two categories: utterances within the vocabulary which are recognized correctly, and other utterances, namely misrecognized utterances or (less frequently) out-of-vocabulary (OOV) utterances.
Automatic Transcription Of English Broadcast News
- Proc. of the DARPA Broadcast News Transcription and Understanding Workshop
, 1998
Cited by 2 (1 self)
In this paper the Philips Broadcast News transcription system is described. The Broadcast News task aims at the recognition of "found" speech in radio and television broadcasts without any additional side information (e.g. speaking style, background conditions). The system was derived from the Philips continuous mixture density crossword HMM system, using MFCC features and Laplacian densities. A segmentation was performed to obtain sentence-like partitions of the broadcasts. Using data-driven clustering, the obtained segments were grouped into clusters with similar acoustic conditions for adaptation purposes. Gender-independent word-internal and crossword triphone models were trained on 70 hours of the HUB4 training data. No focus-condition-specific training was applied. Channel and speaker normalization was done by mean and variance normalization as well as VTN and MLLR. The transcription was produced by an adaptive multiple-pass decoder starting with phrase-bigram decoding using word-...
Acoustic Modeling in the Philips Hub-4 Continuous-Speech Recognition System
, 1998
Cited by 2 (0 self)
In this paper we describe some characteristics of the acoustic modeling used in the Philips continuous-speech recognition system for the DARPA Hub-4 1997 evaluation, which are related to robustness issues. We aimed at a conceptually simple system: we trained two model sets on 70 hours of the Hub-4 training data, one for within-word and one for cross-word decoding. These model sets were used for both genders and all environmental conditions. To make this possible, channel normalization (mean and variance normalization) and speaker normalization (vocal tract length normalization, realized by an appropriate shift of the center frequencies of the mel filter bank) were applied, as well as adaptation techniques. MLLR-based unsupervised batch adaptation on clusters of segments was conducted both after a first within-word decoding pass and after a cross-word decoding pass. The training strategy and the effects of the various normalization and adaptation techniques will be discussed in the paper. ...
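The mean and variance normalization mentioned in this abstract can be sketched generically as follows: each feature dimension is shifted to zero mean and scaled to unit variance over one segment. The function name, the list-of-lists frame layout, and the `eps` guard are assumptions made for illustration; the abstract does not say over which unit (segment, cluster, or speaker) the statistics were estimated.

```python
import math


def mean_variance_normalize(frames, eps=1e-8):
    """Normalize each feature dimension to zero mean and unit variance.

    frames: non-empty list of equal-length feature vectors (lists of floats)
            belonging to one segment (the normalization unit is an assumption).
    eps:    small constant guarding against division by zero.
    """
    count = len(frames)
    dims = range(len(frames[0]))
    means = [sum(frame[d] for frame in frames) / count for d in dims]
    stds = [
        math.sqrt(sum((frame[d] - means[d]) ** 2 for frame in frames) / count)
        for d in dims
    ]
    return [
        [(frame[d] - means[d]) / (stds[d] + eps) for d in dims]
        for frame in frames
    ]
```

Applying the same transform to every segment removes constant channel offsets and gain differences, which is one way a single model set can serve several acoustic conditions.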
Within-Word vs. Across-Word Decoding for Online Speech Recognition
- in Proc. Automatic Speech Recognition Workshop
, 2000
Cited by 1 (1 self)
In this paper we describe methods for improving the RWTH German speech recognizer used within the VERBMOBIL project. In particular, we present acceleration methods for the search based on both within-word and across-word phoneme models. The recognizer in the VERBMOBIL project is used in an online environment. We will discuss some incremental methods to reduce the response time of an online speech recognizer. We present experimental offline results for the VERBMOBIL task, a German spontaneous speech corpus, and report on word error rates and real-time performance of the search for both within-word and across-word phoneme models.
1. INTRODUCTION
The goal of the VERBMOBIL project is to develop a speaker-independent speech-to-speech translation system that performs close to real-time. In this system, speech recognition is followed by subsequent VERBMOBIL modules (like syntactic analysis and translation) which depend on the recognition result. Therefore, in this application it is partic...
Large vocabulary continuous
, 2002
Automatic speech recognition of real-life broadcast news (BN) data (Hub-4) has become a challenging research topic in recent years. This paper summarizes our key efforts to build a large vocabulary continuous speech recognition system for the heterogeneous BN task without inducing undue complexity and computational resources. These key efforts included ...

