A COMBINED APPROACH FOR ESTIMATING A FEATURE-DOMAIN REVERBERATION MODEL IN NON-DIFFUSE ENVIRONMENTS
Abstract. Cited by 2 (2 self)
A combined approach is proposed for estimating a feature-domain reverberation model suitable for REMOS (REverberation MOdeling for Speech recognition) [1], a concept for robust distant-talking automatic speech recognition. Using a few calibration utterances recorded in the target environment, the approach combines maximum-likelihood (ML) estimation with blind estimation of the reverberation time to determine a two-slope reverberation model. Because room impulse responses no longer need to be measured, the training effort is greatly reduced compared to [1] and to training HMMs on artificially reverberated data. Connected digit recognition experiments show that the proposed reverberation models, used with the REMOS concept, significantly outperform HMM-based recognizers trained on reverberant data.
Index Terms: dereverberation, blind estimation, reverberation model, reverberation time, robust ASR.
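To make the idea of a two-slope decay concrete, here is a minimal sketch that builds a per-frame energy envelope from two reverberation times. It is not the paper's estimator: the abstract does not give the parameterization, so the function names, the `knee_frame` parameter, the 10 ms frame shift and the split into an "early" and a "late" reverberation time are all assumptions for illustration. The only fact relied on is the standard definition of the reverberation time as the time needed for a 60 dB energy decay.

```python
import math


def slope_db_per_frame(t60_s, frame_shift_s):
    """Energy decay per frame in dB for a given reverberation time.

    By definition the energy drops by 60 dB within t60_s seconds.
    """
    return 60.0 * frame_shift_s / t60_s


def two_slope_decay(num_frames, t60_early_s, t60_late_s, knee_frame,
                    frame_shift_s=0.010):
    """Hypothetical two-slope envelope (linear energy scale).

    Frames up to knee_frame decay at the early rate, later frames at the
    late rate. All parameters are illustrative assumptions.
    """
    early = slope_db_per_frame(t60_early_s, frame_shift_s)
    late = slope_db_per_frame(t60_late_s, frame_shift_s)
    weights = []
    for m in range(num_frames):
        if m <= knee_frame:
            level_db = -early * m
        else:
            level_db = -early * knee_frame - late * (m - knee_frame)
        # Convert the energy level from dB to a linear weight.
        weights.append(10.0 ** (level_db / 10.0))
    return weights


if __name__ == "__main__":
    # Fast early decay (T60 = 0.3 s), slower late decay (T60 = 0.6 s).
    for m, w in enumerate(two_slope_decay(6, 0.3, 0.6, knee_frame=2)):
        print(m, round(10.0 * math.log10(w), 2), "dB")
```

A single-slope model is the special case where both reverberation times are equal; the second slope lets the envelope follow a decay curve that bends, which is what one would expect in a non-diffuse environment.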
BLIND ESTIMATION OF A FEATURE-DOMAIN REVERBERATION MODEL IN NON-DIFFUSE ENVIRONMENTS WITH VARIANCE ADJUSTMENT
Abstract. Cited by 1 (1 self)
Blind estimation of a two-slope feature-domain reverberation model is proposed. The reverberation model is suitable for robust distant-talking automatic speech recognition approaches that use a convolution in the feature domain to characterize the reverberant feature vector sequence, e.g. [1, 2, 3]. Since the model describes the reverberation by a matrix-valued IID Gaussian random process, its statistical properties are completely captured by its mean and variance matrices. The suggested solution for estimating the model includes two novel features based on the study of simulated rooms: 1) a solution for blindly determining a two-slope decay model from a single-slope estimate; 2) a variance mask to improve the estimation of the variance matrix. With the proposed solution, the reverberation model can be estimated during recognition, without pre-training and without calibration utterances with known transcription. Connected digit recognition experiments using [3] show that the reverberation models estimated by the proposed approach significantly outperform HMM-based recognizers trained on reverberant data in most environments.
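The feature-domain convolution with a Gaussian reverberation model mentioned in this abstract can be illustrated with a small sketch. It rests on assumptions the abstract does not state: features are treated as per-channel energies (for example mel-spectral values), the convolution is applied channel by channel, and a fresh realization of the model is drawn independently for every output frame. The function name `reverberate` and its arguments are hypothetical.

```python
import random


def reverberate(clean, mean, var, seed=0):
    """Feature-domain convolution with a randomly drawn reverberation model.

    clean: list of frames, each a list of per-channel feature values.
    mean, var: matrices (taps x channels) holding the mean and variance of
               the Gaussian model; both are illustrative assumptions.
    """
    rng = random.Random(seed)
    num_taps = len(mean)
    num_channels = len(mean[0])
    out = []
    for n in range(len(clean)):
        frame = [0.0] * num_channels
        # Only taps that reach back to existing frames contribute.
        for m in range(min(num_taps, n + 1)):
            for k in range(num_channels):
                # Independent Gaussian draw for this frame, tap and channel.
                h = rng.gauss(mean[m][k], var[m][k] ** 0.5)
                frame[k] += h * clean[n - m][k]
        out.append(frame)
    return out


if __name__ == "__main__":
    mean = [[1.0], [0.5]]          # two taps, one channel
    var = [[0.0], [0.0]]           # zero variance: deterministic convolution
    clean = [[1.0], [0.0], [2.0]]
    print(reverberate(clean, mean, var))
```

With zero variance the sketch reduces to an ordinary per-channel convolution with the mean matrix; a nonzero variance matrix adds the frame-to-frame uncertainty that the mean alone cannot express.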

