Results 1 - 6 of 6
Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
1995
Flexible Speaker Adaptation Using Maximum Likelihood Linear Regression
Proc. ARPA Spoken Language Technology Workshop, 1995
Cited by 62 (2 self)
The maximum likelihood linear regression (MLLR) approach for speaker adaptation of continuous density mixture Gaussian HMMs is presented and its application to static and incremental adaptation for both supervised and unsupervised modes described. The approach involves computing a transformation for the mixture component means using linear regression. To allow adaptation to be performed with limited amounts of data, a small number of transformations are defined and each one is tied to a number of component mixtures. In previous work, the tyings were predetermined based on the amount of available data. Recently we have used dynamic regression class generation which chooses the appropriate number of classes and transform tying during the adaptation phase. This allows complete unsupervised operation with arbitrary adaptation data. Results are given for static supervised adaptation for non-native speakers and also unsupervised incremental adaptation. Both show the effectiveness and flexibi...
Frame-Discriminative And Confidence-Driven Adaptation For LVCSR
2000
Cited by 9 (1 self)
Maximum Likelihood Linear Regression (MLLR) has become the most popular approach for adapting speaker-independent Hidden Markov Models to a specific speaker's characteristics. However, it is well known that discriminative training objectives outperform Maximum Likelihood training approaches, especially when training data is very limited, as is always the case in adaptation tasks. Therefore, this paper explores the application of a frame-based discriminative training objective for adaptation. It presents evaluations of supervised as well as unsupervised adaptation on the 1993 WSJ adaptation tests of native and non-native speakers. Relative improvements in word error rate of up to 25% were measured compared to the MLLR-adapted recognition systems. Along with unsupervised adaptation, the paper also presents the improvements achieved by applying confidence measures, which provided an average relative improvement of 10% compared to ordinary unsupervised MLLR. 1. I...
Speaker Verification Over The Telephone
Speech Communication, 2000
Cited by 8 (3 self)
The aim of the research reported in this paper was to assess the capability of state-of-the-art methods for speaker verification in order to determine whether performance levels high enough to support the development of telecom applications could be obtained. This experimental study quantified speaker recognition performance outside the context of any specific application, as a function of factors more or less acknowledged to affect accuracy. Some of the issues investigated are: the speaker model (Gaussian mixture models are compared with phone-based models); the influence of the amount and content of training and test data on performance; performance degradation due to model ageing and how this can be counteracted by using adaptation techniques; and achievable performance levels using text-dependent and text-independent recognition modes. In particular, the effect of linguistic content on performance is shown for both read and spontaneous speech. These and other factors were addressed using a large corpus of read and spontaneous speech in French (over 2000 hours collected from 100 target speakers and 1000 impostors), designed and recorded for the purpose of this study. On this data, the lowest equal error rate is 1% for the text-dependent mode when 2 trials are allowed per attempt and with a minimum of 1.5s of speech per trial.
Scaled Likelihood Linear Regression for Hidden Markov Model Adaptation
In the context of continuous Hidden Markov Model (HMM) based speech recognition, linear regression approaches have become popular for adapting the acoustic models to a specific speaker's characteristics. The well-known Maximum Likelihood Linear Regression (MLLR) [1] and Maximum A Posteriori Linear Regression (MAPLR) [2] are just two of them, which differ primarily in the training objective they maximize.
Task Adaptation For Dialogues Via Telephone Lines
This paper describes our successful ongoing approaches toward better recognition accuracy for flexible interactive systems in automatic speech recognition. Degradation in the performance of speech recognition systems is observed whenever the current application differs from the conditions at training time. The main speaker-independent causes of this deterioration are changes in transmission channels and changes in the task to be fulfilled. We present the results of our research on changing tasks, more specifically on changing dictionaries. We propose an in-service adaptation technique that is speaker-independent, works under unsupervised conditions, and has a long-term memory. With 2000 adaptation words, an error-rate reduction of more than 40% is achieved at negligible computational cost.

