## ADAPTING HMMS OF DISTANT-TALKING ASR SYSTEMS USING FEATURE-DOMAIN REVERBERATION MODELS

### BibTeX

@MISC{Sehr_adaptinghmms,
    author = {Armin Sehr and Markus Gardill and Walter Kellermann},
    title = {ADAPTING HMMS OF DISTANT-TALKING ASR SYSTEMS USING FEATURE-DOMAIN REVERBERATION MODELS},
    year = {}
}

### Abstract

To capture the dispersive effect of reverberation by Hidden Markov Model (HMM)-based distant-talking speech recognition systems, adapting the means of the current HMM state based on the means of the preceding states has been suggested in [1]. In this contribution, we propose to incorporate the reverberation models of [2] into the adaptation approach to describe the effect of reverberation with higher accuracy. Connected-digit recognition experiments in three different rooms confirm that the suggested more accurate reverberation representation leads to a significant performance increase in all investigated environments.

### Citations

4300 | A tutorial on hidden Markov models and selected applications in speech recognition
- Rabiner
- 1989
Citation Context ...re vector sequences in a certain room. An HMM is defined by the matrix of state transition probabilities, the vector of initial state occupation probabilities, and the output densities for each state [7]. Usually Gaussian mixture densities, completely described by a set of mean vectors, a set of diagonal covariance matrices, and a set of mixture weights, are used to model the output densities in the ...
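The parameter set named in this context (transition matrix, initial state probabilities, and per-state Gaussian mixtures with diagonal covariances) can be sketched as a small data structure. This is only an illustrative layout, not the recognizer used in the paper; the class names `DiagGMM` and `HMM` and all field names are assumptions introduced here.

```python
from dataclasses import dataclass
import numpy as np


@dataclass
class DiagGMM:
    """Gaussian mixture with diagonal covariances (hypothetical helper class)."""
    weights: np.ndarray    # shape (n_mix,), mixture weights summing to 1
    means: np.ndarray      # shape (n_mix, dim), one mean vector per component
    variances: np.ndarray  # shape (n_mix, dim), diagonal of each covariance

    def log_density(self, x):
        """Log of the mixture density evaluated at feature vector x."""
        x = np.asarray(x, dtype=float)
        diff = x - self.means
        # Per-component log density of a diagonal-covariance Gaussian.
        log_comp = -0.5 * np.sum(
            np.log(2.0 * np.pi * self.variances) + diff ** 2 / self.variances,
            axis=1,
        )
        a = np.log(self.weights) + log_comp
        a_max = np.max(a)
        # Log-sum-exp over components for numerical stability.
        return float(a_max + np.log(np.sum(np.exp(a - a_max))))


@dataclass
class HMM:
    """Container for the three parameter groups that define an HMM."""
    transitions: np.ndarray  # shape (n_states, n_states)
    initial: np.ndarray      # shape (n_states,)
    outputs: list            # one DiagGMM per state


# Toy two-state model with a single standard-normal component per state.
gmm = DiagGMM(np.array([1.0]), np.zeros((1, 2)), np.ones((1, 2)))
model = HMM(np.array([[0.9, 0.1], [0.0, 1.0]]), np.array([1.0, 0.0]), [gmm, gmm])
```

Keeping the means in their own array is convenient for mean-only adaptation, since the covariances and weights can be left untouched.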

168 | A database for speaker-independent digit recognition
- Leonard
Citation Context ...ree Gaussians serve as clean speech models. To get the reverberated test data (and the reverberated training data for the training of reverberant HMMs used for comparison), the clean speech TI digits [11] data are convolved with different RIRs measured at different loudspeaker and microphone positions in three rooms with the characteristics given in Table 1. Each test utterance is convolved with an RI...
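Generating reverberated data by convolving clean speech with a room impulse response (RIR) can be sketched in a few lines. The function name `reverberate` and the toy signals are assumptions for illustration; the actual TI digits data and measured RIRs are not reproduced here.

```python
import numpy as np


def reverberate(clean, rir):
    """Convolve a clean time-domain signal with a room impulse response.

    Returns the full convolution, which has len(clean) + len(rir) - 1 samples.
    """
    clean = np.asarray(clean, dtype=float)
    rir = np.asarray(rir, dtype=float)
    return np.convolve(clean, rir)


# Toy signals standing in for an utterance and a measured RIR.
clean = np.array([1.0, 0.0, 0.0, 0.5])
rir = np.array([1.0, 0.6, 0.3])
reverberant = reverberate(clean, rir)
```

For realistic RIR lengths an FFT-based convolution would usually be faster, but the direct form keeps the sketch dependency-free beyond NumPy.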

43 | New method for measuring reverberation time
- Schroeder
- 1965
Citation Context ...5 b). The exponential model depicted in subfigure a) does not capture this channel dependency. 2) Since real-world RIRs typically exhibit a two-sloped decay with a rapid initial and a slow late decay [9] as depicted in Figure 5 c), the strictly exponential decay can either capture the initial or the late decay with high accuracy. But it is not able to capture the two-sloped behavior. To tackle these ...
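The limitation described here can be illustrated numerically: a strictly exponential decay is a straight line on a dB scale, so fitting one line to a two-sloped decay leaves a sizeable residual. The sketch below uses synthetic envelopes with made-up decay constants (all numeric values and the function name `single_slope_fit_error_db` are assumptions, not taken from the paper).

```python
import numpy as np


def single_slope_fit_error_db(energy):
    """Maximum deviation (in dB) between an energy decay and its best
    least-squares straight-line fit in the dB domain, i.e. the best
    single-exponential approximation."""
    energy = np.asarray(energy, dtype=float)
    frames = np.arange(energy.size)
    level_db = 10.0 * np.log10(energy)
    slope, intercept = np.polyfit(frames, level_db, 1)
    return float(np.max(np.abs(level_db - (slope * frames + intercept))))


frames = np.arange(50)
# Strictly exponential decay: exactly linear in dB.
single = np.exp(-frames / 10.0)
# Two-sloped decay: rapid initial part plus a slow late part.
two_slope = 0.9 * np.exp(-frames / 3.0) + 0.1 * np.exp(-frames / 20.0)

err_single = single_slope_fit_error_db(single)
err_two_slope = single_slope_fit_error_db(two_slope)
```

With these synthetic values the single-exponential fit is essentially exact for `single` and off by several dB for `two_slope`, which mirrors the argument in the text.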

37 | Recognizing reverberant speech with RASTA-PLP
- Kingsbury, Morgan
- 1997

19 | Training of HMM with Filtered Speech Material for Hands-Free Speech Recognition
- Giuliani, Matassoni, et al.
- 1999
Citation Context ...forward way is to use reverberant training data to train HMMs. To reduce the effort for data collection, clean training data can be convolved with RIRs to obtain the reverberated data as suggested in [5]. Instead of performing a complete training on reverberated data, the mean vectors of clean HMMs can be adapted to the reverberation conditions of a certain room by taking the means of the preceding s...

10 | Distant-talking continuous speech recognition based on a novel reverberation model
- Sehr, Zeller, et al.
- 2006
Citation Context .... (See [8] for details on the start and end time calculation as well as on the adaptation procedure for dynamic features.) 2.2 Reverberation Model A statistical ReVerberation Model (RVM) η is used in [2] for robust distant-talking ASR. This RVM can be considered as a feature-domain representation of all possible RIRs for arbitrary speaker and microphone positions in a certain room. The RVM exhibits a...

8 | A new approach for the adaptation of HMMs to reverberation and background noise
- Hirsch, Finster
- 2008
Citation Context ...ies, completely described by a set of mean vectors, a set of diagonal covariance matrices, and a set of mixture weights, are used to model the output densities in the MFCC domain. Since, according to [8], the adaptation of the covariance matrices has only a minor effect on the recognition performance, only the mean vectors of the HMMs are adapted in [1]. Since the adaptation is performed in the melsp...

7 | A new HMM adaptation approach for the case of a hands-free speech input in reverberant rooms
- Hirsch, Finster
- 2006
Citation Context ...verberation by Hidden Markov Model (HMM)-based distant-talking speech recognition systems, adapting the means of the current HMM state based on the means of the preceding states has been suggested in [1]. In this contribution, we propose to incorporate the reverberation models of [2] into the adaptation approach to describe the effect of reverberation with higher accuracy. Connected-digit recognition...

6 | Model adaptation for long convolutional distortion by maximum likelihood state filtering approach
- Raut, Nishimoto, et al.
Citation Context ...ming a complete training on reverberated data, the mean vectors of clean HMMs can be adapted to the reverberation conditions of a certain room by taking the means of the preceding states into account [1, 6]. (© EURASIP, 2009) Since the feature-domain RIR representation used in [1] is based on a frequency-independent strictly exponential decay, only th...

5 | Towards robust distant-talking automatic speech recognition in reverberant environments
- Sehr, Kellermann
- 2008
Citation Context ...ted versions of the previous feature vectors. The effect of reverberation on speech feature sequences can be captured approximately by a convolution in the melspectral (melspec) domain (see Figure 1) [4] as given by $x_{\mathrm{mel}}(l,k) \approx \sum_{m=0}^{M-1} h_{\mathrm{mel}}(l,m)\, s_{\mathrm{mel}}(l,k-m)$ (1), where $x_{\mathrm{mel}}(l,k)$, $h_{\mathrm{mel}}(l,m)$, and $s_{\mathrm{mel}}(l,k-m)$ denote the melspec representations of channel $l$ and frame $k$ for the reverberant ...
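The melspec-domain convolution of Eq. (1) can be written out directly as a per-channel convolution along the frame axis. The following is a minimal sketch under stated assumptions: the function name `melspec_convolve` and the array layout (channels along rows, frames along columns) are chosen here for illustration, and clean-speech frames before the start of the utterance are treated as zero.

```python
import numpy as np


def melspec_convolve(s_mel, h_mel):
    """Approximate reverberant melspec features as in Eq. (1).

    s_mel: clean-speech melspec, shape (n_channels, n_frames)
    h_mel: melspec RIR representation, shape (n_channels, M)
    Returns x_mel with the same shape as s_mel, where
    x_mel[l, k] = sum over m of h_mel[l, m] * s_mel[l, k - m].
    """
    s_mel = np.asarray(s_mel, dtype=float)
    h_mel = np.asarray(h_mel, dtype=float)
    if s_mel.shape[0] != h_mel.shape[0]:
        raise ValueError("s_mel and h_mel must have the same number of channels")
    n_channels, n_frames = s_mel.shape
    x_mel = np.zeros_like(s_mel)
    for l in range(n_channels):
        # Full convolution along frames, truncated to the utterance length.
        x_mel[l] = np.convolve(s_mel[l], h_mel[l])[:n_frames]
    return x_mel


# Toy example: two mel channels, four frames, M = 3 taps per channel.
s_mel = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0, 0.0]])
h_mel = np.array([[1.0, 0.5, 0.25],
                  [1.0, 0.0, 0.5]])
x_mel = melspec_convolve(s_mel, h_mel)
```

Each channel has its own taps in `h_mel`, so the sketch can represent a channel-dependent decay rather than a single frequency-independent one.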

2 | A combined approach for estimating a feature-domain reverberation model in nondiffuse environments
- Sehr, Wen, et al.
- 2008
Citation Context ...gorithms. Furthermore, the RVMs used in the proposed approach can be estimated prior to the recognition either by measuring RIRs in the target environment [2] or by using a few calibration utterances [12]. Since the RVMs can be estimated completely independently of the HMMs and the complexity of the adaptation is very low, the proposed approach is extremely flexible. Moreover, the offline estimation o...