Results 1 - 6 of 6
Covariance Modelling for Noise-Robust Speech Recognition
Cited by 8 (5 self)
Model compensation is a standard way of improving speech recognisers’ robustness to noise. Most model compensation techniques produce diagonal covariances. However, this fails to handle any changes in the feature correlations due to the noise. This paper presents a scheme that allows full-covariance matrices to be estimated. One problem is that full covariance matrix estimation will be more sensitive to approximations, and those for the dynamic parameters are known to be crude. In this paper a linear transformation of a window of consecutive frames is used as the basis for dynamic parameter compensation. A second problem is that the resulting full covariance matrices slow down decoding. This is addressed by using predictive linear transforms that decorrelate the feature space, so that the decoder can then use diagonal covariance matrices. On a noise-corrupted Resource Management task, the proposed scheme outperformed the standard VTS compensation scheme.
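The idea of deriving dynamic parameters from a linear transformation of a window of consecutive frames can be illustrated with a small sketch. It assumes the common regression formula for delta coefficients and a Gaussian over the stacked window of static frames; the function names, the window size, and the regression weights are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def delta_projection(num_static: int, window: int = 2) -> np.ndarray:
    """Build the matrix mapping 2*window+1 stacked static frames to [static; delta].

    Assumes the usual regression deltas:
    delta_t = sum_i i * (x_{t+i} - x_{t-i}) / (2 * sum_i i^2).
    """
    num_frames = 2 * window + 1
    norm = 2.0 * sum(i * i for i in range(1, window + 1))
    eye = np.eye(num_static)
    proj = np.zeros((2 * num_static, num_frames * num_static))
    centre = window
    # Static block: copy the centre frame of the window.
    proj[:num_static, centre * num_static:(centre + 1) * num_static] = eye
    # Delta block: weight +i/norm on frame t+i and -i/norm on frame t-i.
    for i in range(1, window + 1):
        for sign in (+1, -1):
            frame = centre + sign * i
            cols = slice(frame * num_static, (frame + 1) * num_static)
            proj[num_static:, cols] = sign * i / norm * eye
    return proj

def project_gaussian(mean_ext, cov_ext, proj):
    """Push a Gaussian over the stacked window through the linear map."""
    return proj @ mean_ext, proj @ cov_ext @ proj.T

# Usage: a window whose frames follow a straight line has delta equal to the slope.
base = np.array([1.0, 2.0, 3.0])
slope = np.array([0.5, -1.0, 2.0])
stacked = np.concatenate([base + k * slope for k in range(-2, 3)])
proj = delta_projection(num_static=3, window=2)
mean, cov = project_gaussian(stacked, np.eye(15), proj)
```

Because the map is linear, a Gaussian over the window stays Gaussian after projection, and the projected covariance is in general a full matrix rather than a diagonal one.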
Model-Based Approaches to Handling Uncertainty
Cited by 7 (3 self)
A powerful approach for handling uncertainty in observations is to modify the statistical model of the data to appropriately reflect this uncertainty. For the task of noise-robust speech recognition, this requires modifying an underlying "clean" acoustic model to be representative of speech in a particular target acoustic environment. This chapter describes the underlying concepts of model-based noise compensation for robust speech recognition and how it can be applied to standard systems. The chapter will then consider important practical issues. These include: i) acoustic environment noise parameter estimation; ii) efficient acoustic model compensation and likelihood calculation; and iii) adaptive training to handle multi-style training data. The chapter will conclude by discussing the limitations of the current approaches and research options to address them.
noise-robust speech recognition
, 2010
Model compensation techniques for noise-robust speech recognition approximate the corrupted speech distribution. This work introduces a sampling method that, given speech and noise distributions and a mismatch function, in the limit calculates the corrupted speech likelihood exactly. For this, it transforms the integral in the likelihood expression, and then applies sequential importance resampling. Though it is too slow to compensate a speech recognition system, it enables a more fine-grained assessment of compensation techniques, based on the KL divergences to the ideal compensation for individual components. The KL divergence appears to predict the word error rate well. This technique also makes it possible to evaluate the impact of approximations that compensation schemes make. For example, this work examines the influence of the assumption that the corrupted speech distribution is Gaussian and diagonalising that Gaussian’s covariance. It also assesses the impact of a common approximation to the mismatch function for VTS compensation, namely setting the
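As a rough illustration of measuring the effect of diagonalising a Gaussian's covariance with a KL divergence, the sketch below uses the closed-form KL divergence between two Gaussians. This is a simplification: the work described above estimates divergences to the ideal (generally non-Gaussian) corrupted speech distribution by sampling, which this example does not attempt. The function names are illustrative assumptions.

```python
import numpy as np

def gaussian_kl(mean_p, cov_p, mean_q, cov_q) -> float:
    """Closed-form KL(p || q) for multivariate Gaussians p and q."""
    dim = mean_p.shape[0]
    diff = mean_q - mean_p
    cov_q_inv = np.linalg.inv(cov_q)
    # slogdet returns (sign, log|det|); covariances are positive definite.
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    trace_term = np.trace(cov_q_inv @ cov_p)
    mahalanobis = diff @ cov_q_inv @ diff
    return 0.5 * (trace_term + mahalanobis - dim + logdet_q - logdet_p)

def diagonalisation_cost(mean, cov) -> float:
    """KL from a full-covariance Gaussian to its diagonalised version."""
    return gaussian_kl(mean, cov, mean, np.diag(np.diag(cov)))

# Usage: two correlated dimensions with unit-scaled variances.
mean = np.zeros(2)
cov = np.array([[2.0, 1.0], [1.0, 2.0]])
cost = diagonalisation_cost(mean, cov)
```

With equal means, the trace term cancels against the dimension, so the cost reduces to half the difference between the log-determinants of the diagonal and full covariances; it is zero only when the features are already uncorrelated.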
unknown title
Exemplar-based sparse representations for noise robust automatic speech recognition. Jort F. Gemmeke, Student Member, IEEE, Tuomas Virtanen, Antti Hurmalainen. Abstract: This paper proposes to use exemplar-based sparse representations for noise robust automatic speech recognition. First, we describe how speech can be modelled as a linear combination of a small number of exemplars from a large speech exemplar dictionary. The exemplars are time-frequency patches of real speech, each spanning multiple time frames. We then propose to model speech corrupted by additive noise as a linear combination of noise and speech exemplars, and we derive an algorithm for recovering this sparse linear combination of exemplars from the observed noisy speech. We describe how the framework can be used for doing hybrid exemplar-based/HMM recognition by using the exemplar activations together with the phonetic information associated with the exemplars. As an alternative to hybrid recognition, the framework also allows us to take a source separation approach which enables exemplar-based feature enhancement as well as missing data mask estimation. We evaluate the performance of these exemplar-based methods in connected digit recognition on the AURORA-2 database. Our results show that the hybrid system performed substantially better than source separation or missing data mask estimation at lower SNRs, achieving up to 57.1% accuracy at SNR = -5 dB. Although not as effective as two baseline recognisers at higher SNRs, the novel approach offers a promising direction of future research on exemplar-based ASR. Index Terms: Speech recognition, exemplar-based, noise robustness, sparse representations, non-negative matrix factorisation
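The modelling of noisy speech as a sparse, non-negative linear combination of speech and noise exemplars can be sketched as follows. This is a minimal illustration that assumes each exemplar has been flattened into one column of a non-negative dictionary, and that activations are found with the standard multiplicative updates for a KL-divergence objective with an L1 sparsity penalty. The function name, the sparsity weight, and the iteration count are illustrative assumptions rather than the authors' settings.

```python
import numpy as np

def decompose(observation, speech_dict, noise_dict, sparsity=0.1, iterations=200):
    """Find non-negative activations so that [speech_dict, noise_dict] @ x
    approximates the observation, with an L1 penalty encouraging sparsity."""
    dictionary = np.hstack([speech_dict, noise_dict])
    activations = np.ones(dictionary.shape[1])
    column_sums = dictionary.sum(axis=0)
    tiny = 1e-12  # guards against division by zero
    for _ in range(iterations):
        approximation = dictionary @ activations + tiny
        ratio = observation / approximation
        # Multiplicative update: keeps activations non-negative by construction.
        activations *= (dictionary.T @ ratio) / (column_sums + sparsity)
    num_speech = speech_dict.shape[1]
    return activations[:num_speech], activations[num_speech:]

# Usage on synthetic data: one speech atom plus twice one noise atom.
rng = np.random.default_rng(0)
speech = rng.uniform(0.1, 1.0, size=(20, 6))
noise = rng.uniform(0.1, 1.0, size=(20, 4))
observed = speech[:, 0] + 2.0 * noise[:, 1]
speech_act, noise_act = decompose(observed, speech, noise)
enhanced = speech @ speech_act  # speech-only part of the reconstruction
```

The speech activations could then feed a hybrid recogniser, while the speech-only reconstruction corresponds to the source separation use described in the abstract.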
A Variational Perspective on Noise-Robust Speech Recognition
Model compensation methods for noise-robust speech recognition have shown good performance. Predictive linear transformations can approximate these methods to balance computational complexity and compensation accuracy. This paper examines both of these approaches from a variational perspective. Using a matched-pair approximation at the component level yields a number of standard forms of model compensation and predictive linear transformations. However, a tighter bound can be obtained by using variational approximations at the state level. Both model-based and predictive linear transform schemes can be implemented in this framework. Preliminary results show that the tighter bound obtained from the state-level variational approach can yield improved performance over standard schemes.
unknown title
Exemplar-based sparse representations for noise robust automatic speech recognition