The RWTH 2007 TC-STAR Evaluation System for European English and Spanish, 2007
Cited by 6 (2 self)
In this work, the RWTH automatic speech recognition systems developed for the third TC-STAR evaluation campaign 2007 are presented. The RWTH systems make systematic use of internal system combination, combining systems with differences in feature extraction, adaptation methods, and training data used. To take advantage of this, novel feature extraction methods were employed; this year saw the introduction of Gammatone features and MLP based phone posterior features. Further improvements were achieved using unsupervised training, and it is notable that these improvements were achieved using a fairly low amount of automatically transcribed data. Also contributing to the improvements over last year was the switch to MPE training, and the introduction of projecting SAT transforms.
Acoustic Feature Combination for Robust Speech Recognition
Proc. IEEE Intern. Conf. on Acoustics, Speech, and Signal Processing, 2005
Cited by 6 (2 self)
In this paper, we consider the use of multiple acoustic features of the speech signal for robust speech recognition. We investigate the combination of various auditory based (Mel Frequency Cepstrum Coefficients, Perceptual Linear Prediction, etc.) and articulatory based (voicedness) features. Features are combined by a Linear Discriminant Analysis based technique and by a log-linear model combination based technique. We describe the two feature combination techniques and compare the experimental results. Experiments performed on the large-vocabulary task VerbMobil II (German conversational speech) show that the accuracy of automatic speech recognition systems can be improved by the combination of different acoustic features.
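The two combination techniques named in this abstract can be written down in their generic textbook form. The notation below is an assumption for illustration and is not taken from the paper: $x_t^{(k)}$ is the feature vector of stream $k$ at frame $t$, $V$ is an LDA projection matrix, $g_i$ are log-scores (for example a language model score and one acoustic score per feature stream), and $\lambda_i$ are the combination weights.

```latex
% LDA-based combination (sketch): stack the K feature streams of frame t
% into one vector, then project it with an LDA matrix V.
\[
  y_t = V^{\top}
  \begin{bmatrix} x_t^{(1)} \\ \vdots \\ x_t^{(K)} \end{bmatrix}
\]

% Log-linear model combination (sketch): each knowledge source contributes
% a log-score g_i(W, X), weighted by lambda_i and renormalized over all
% competing word sequences W'.
\[
  p(W \mid X) =
  \frac{\exp\left( \sum_{i} \lambda_i \, g_i(W, X) \right)}
       {\sum_{W'} \exp\left( \sum_{i} \lambda_i \, g_i(W', X) \right)}
\]
```

In the first form the streams are merged before acoustic modeling, while in the second each stream keeps its own model and only the scores are merged.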
Signal Speech
- Further development by several PhD students at i6
- Today: standard system for all ASR research topics and projects
- Very flexible and extendable
- Framework also used for machine translation, video / image processing
This work was carried out under the supervision of
I would like to thank my advisor Prof. Naftaly Tishby for his ideas, guidance, support, and especially for not giving up on me in the long time it took me to finish this thesis. I wish to thank Orit Shoraga, my friend and colleague, for helping and working with me throughout this work. I also wish to thank Ron Hecht for his help and cooperation in
FRAME-BASED ACOUSTIC FEATURE INTEGRATION FOR SPEECH UNDERSTANDING
With the purpose of improving Spoken Language Understanding (SLU) performance, a combination of different acoustic speech recognition (ASR) systems is proposed. State a-posteriori probabilities obtained with systems using different acoustic feature sets are combined with log-linear interpolation. In order to perform a coherent combination of these probabilities, acoustic models must have the same topology (i.e. same set of states). For this purpose, a fast and efficient twin model training protocol is proposed. By a wise choice of acoustic feature sets and log-linear interpolation of their likelihood ratios, a substantial Concept Error Rate (CER) reduction has been observed on the test part of the French MEDIA corpus.
Index Terms: speech recognition, posterior probabilities combination, speech understanding, frame based combination
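As a rough illustration of frame-level log-linear interpolation, the sketch below combines the state posteriors that several systems assign to one frame. It is a minimal sketch under stated assumptions and not the authors' implementation: the function name `log_linear_combine`, the weight values, and the probability floor are all hypothetical, and the abstract's variant based on likelihood ratios is not reproduced here.

```python
import math


def log_linear_combine(posteriors, weights, floor=1e-12):
    """Log-linearly interpolate state posteriors for a single frame.

    posteriors: one probability list per system, all over the SAME states
                (the abstract requires a shared state set for this reason).
    weights:    one interpolation exponent per system (hypothetical values).
    floor:      hypothetical lower bound that avoids log(0).
    """
    if len(weights) != len(posteriors):
        raise ValueError("need exactly one weight per system")
    n_states = len(posteriors[0])
    if any(len(p) != n_states for p in posteriors):
        raise ValueError("all systems must share the same state set")

    # Weighted sum of log-posteriors for every state.
    log_scores = [
        sum(w * math.log(max(p[s], floor)) for p, w in zip(posteriors, weights))
        for s in range(n_states)
    ]

    # Renormalize; subtracting the maximum keeps exp() numerically stable.
    top = max(log_scores)
    unnormalized = [math.exp(v - top) for v in log_scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Two hypothetical systems scoring the same three states for one frame.
system_a = [0.7, 0.2, 0.1]
system_b = [0.5, 0.4, 0.1]
combined = log_linear_combine([system_a, system_b], weights=[0.5, 0.5])
```

The shared-state check mirrors the topology constraint stated in the abstract: the product of posteriors is only meaningful when index `s` refers to the same state in every system.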

