Results 1–10 of 17
On adaptive decision rules and decision parameter adaptation for automatic speech recognition
 Proc. IEEE
, 2000
Abstract
Cited by 35 (4 self)
Recent advances in automatic speech recognition are accomplished by designing a plug-in maximum a posteriori decision rule such that the forms of the acoustic and language model distributions are specified and the parameters of the assumed distributions are estimated from a collection of speech and language training corpora. Maximum-likelihood point estimation is by far the most prevalent training method. However, due to the problems of unknown speech distributions, sparse training data, high spectral and temporal variabilities in speech, and possible mismatch between training and testing conditions, a dynamic training strategy is needed. To cope with changing speakers and speaking conditions in real operational conditions for high-performance speech recognition, such paradigms incorporate a small amount of speaker- and environment-specific adaptation data into the training process. Bayesian adaptive learning is an optimal way to combine ...
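The Bayesian adaptive learning mentioned above amounts to interpolating a prior (speaker-independent) estimate with statistics from the adaptation data. A minimal sketch of a MAP update for a single Gaussian mean, assuming a scalar prior weight `tau`; function and parameter names are hypothetical, not from the paper:

```python
import numpy as np

def map_adapt_mean(prior_mean, tau, frames):
    """MAP (Bayesian) update of a Gaussian mean from adaptation frames.

    prior_mean : speaker-independent mean (prior mode)
    tau        : prior weight (pseudo-count controlling adaptation speed)
    frames     : (N, D) array of adaptation observations
    """
    n = len(frames)
    sample_mean = frames.mean(axis=0)
    # Posterior mode interpolates prior and data; more data pulls the
    # estimate toward the sample mean, little data keeps it near the prior.
    return (tau * prior_mean + n * sample_mean) / (tau + n)
```

With `tau = 0` this reduces to plain maximum-likelihood estimation; a large `tau` makes adaptation conservative, which is the usual trade-off with sparse adaptation data.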
Convergence theorems for generalized alternating minimization procedures
 Journal of Machine Learning Research
, 2005
Abstract
Cited by 30 (0 self)
The EM algorithm is widely used to develop iterative parameter estimation procedures for statistical models. In cases where these procedures strictly follow the EM formulation, the convergence properties of the estimation procedures are well understood. In some instances there are practical reasons to develop procedures that do not strictly fall within the EM framework. We study EM variants in which the E-step is not performed exactly, either to obtain improved rates of convergence, or due to approximations needed to compute statistics under a model family over which E-steps cannot be realized. Since these variants are not EM procedures, the standard (G)EM convergence results do not apply to them. We present an information-geometric framework for describing such algorithms and analyzing their convergence properties. We apply this framework to analyze the convergence properties of incremental EM and variational EM. For incremental EM, we discuss conditions under which these algorithms converge in likelihood. For variational EM, we show how the E-step approximation prevents convergence to local maxima in likelihood.
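Incremental EM, one of the variants analyzed, performs the E-step on one data point at a time and re-runs the M-step from cached responsibilities, so parameters move after every point rather than every full pass. A toy sketch for a two-component 1-D Gaussian mixture with fixed, equal variances and weights (a deliberate simplification, not the paper's setting):

```python
import numpy as np

def incremental_em_gmm(x, init_means, var=1.0, sweeps=5):
    """Incremental EM for a two-component 1-D Gaussian mixture.

    Variances and mixture weights are held fixed and equal; only the
    component means are re-estimated, to keep the sketch short.
    """
    n = len(x)
    r = np.full((n, 2), 0.5)             # cached per-point responsibilities
    means = np.array(init_means, float)
    for _ in range(sweeps):
        for i in range(n):
            # Partial E-step: refresh responsibilities for point i only.
            ll = np.exp(-0.5 * (x[i] - means) ** 2 / var)
            r[i] = ll / ll.sum()
            # M-step from all cached responsibilities (old ones included).
            means = (r * x[:, None]).sum(axis=0) / r.sum(axis=0)
    return means
```

Because stale responsibilities are reused rather than recomputed, this is exactly the kind of procedure that falls outside the standard (G)EM convergence guarantees.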
Online Bayesian tree-structured transformation of HMMs with optimal model selection for speaker adaptation
 IEEE Transactions on Speech and Audio Processing
Incremental Estimation of Discrete Hidden Markov Models Based on a New Backward Procedure
Abstract
Cited by 6 (0 self)
We address the problem of learning discrete hidden Markov models from very long sequences of observations. Incremental versions of the Baum-Welch algorithm that approximate the β-values used in the backward procedure are commonly used for this problem, since their memory complexity is independent of the sequence length. We introduce an improved incremental Baum-Welch algorithm with a new backward procedure that approximates the β-values based on a one-step lookahead in the training sequence. We justify the new approach analytically, and report empirical results that show it converges faster than previous incremental algorithms.
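The paper's exact backward procedure is not reproduced here; as a rough illustration of the idea, one can truncate the backward recursion to a single lookahead step so that memory stays independent of sequence length. All names below are hypothetical, and the formula is a generic one-step truncation, not necessarily the paper's approximation:

```python
import numpy as np

def forward_step(alpha, A, B, obs):
    """Standard (normalized) forward recursion step of Baum-Welch:
    alpha_{t+1}(j) proportional to (sum_i alpha_t(i) A[i,j]) * B[j, o]."""
    a = (alpha @ A) * B[:, obs]
    return a / a.sum()

def lookahead_beta(A, B, obs_next):
    """One-step-lookahead stand-in for the backward variables:
    beta_t(i) ~= sum_j A[i, j] * B[j, obs_next],
    instead of recursing over the whole remaining sequence. Combined with
    the forward pass, this gives approximate state posteriors
    gamma_t(i) proportional to alpha_t(i) * beta_t(i) from O(1) memory."""
    return A @ B[:, obs_next]
```

A longer lookahead horizon trades memory and latency for a better β approximation, which is the design axis such incremental algorithms move along.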
Robust Time-Synchronous Environmental Adaptation for Continuous ...
 INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING
, 2002
Abstract
Cited by 6 (2 self)
In this paper we describe system architectures for robust MLLR-based environmental adaptation of continuous speech recognition systems. Inspired by an existing broadcast news transcription system [1], we refined the identification of acoustic scenarios by using a combined GMM/HMM method. Thus environmental adaptation regarding arbitrary acoustic scenarios beyond speaker changes becomes possible. For deploying acoustic adaptation in interactive applications, such as human-machine interaction, a time-synchronous adaptation approach is proposed. For different corpora, the evaluation of our approaches shows significant improvements in recognition accuracy while satisfying the constraint of time-synchronous processing.
Combining Eigenvoices and Structural MLLR for Speaker Adaptation
, 2003
"... This paper considers the problem of speaker adaptation of acoustic models in speech recognition. We have investigated four different possible methods which integrate the concepts of both Structural Maximum Likelihood Linear Regression (SMLLR) and EigenVoices technique to adapt the Gaussian means ..."
Abstract

Cited by 3 (2 self)
 Add to MetaCart
(Show Context)
This paper considers the problem of speaker adaptation of acoustic models in speech recognition. We have investigated four different possible methods which integrate the concepts of both Structural Maximum Likelihood Linear Regression (SMLLR) and EigenVoices technique to adapt the Gaussian means of the speaker independant models for a new speaker. The experiments were evaluated using the speech recognition engine ESPERE on the data of the corpus Resource Management. They show that all of the proposed methods can improve the performances of an ASRS in supervised batch adaptation as efficiently as SMLLR and EigenVoicesbased techniques whatever the amount of adaptation data is available. For an unsupervised incremental adaptation, only the approaches SMLLR!EV and SMLLR!SEV seemed to give the best results. 1.
Adaptive learning and compensation of hidden Markov model for robust speech recognition," invited tutorial
 Proc. 1998 International Symposium on Chinese Spoken Language Processing (Singapore
, 1998
"... In this report, we start with a revisit to the statistical formulation of the automatic speech recognition (ASR) problem, and identify the factors which might in uence the performance of the conventional plugin MAP decision rule for ASR. We summarize our recent research e orts on a class of robust ..."
Abstract

Cited by 2 (2 self)
 Add to MetaCart
(Show Context)
In this report, we start with a revisit to the statistical formulation of the automatic speech recognition (ASR) problem, and identify the factors which might in uence the performance of the conventional plugin MAP decision rule for ASR. We summarize our recent research e orts on a class of robust speech recognition problems in which mismatches between training and testing conditions exist but an accurate knowledge of the mismatch mechanism is unknown. The only available information is the test data along with a set of pretrained speech models and the decision parameters. We focus on two types of Bayesian techniques, namely online Bayesian adaptation of hidden Markov model parameters and the Bayesian predictive classi cation approach. We conclude the report with a brief mention of our ongoing research e orts towards a robust and intelligent spoken dialogue system. 1.
FAST INCREMENTAL LEARNING OF STOCHASTIC CONTEXTFREE GRAMMARS IN RADAR ELECTRONIC SUPPORT
, 2006
"... Radar Electronic Support (ES) involves the passive search for, interception, location, analysis and identification ofradiated electromagnetic energy for military purposes. Although Stochastic ContextFree Grammars (SCFGs) appear promising for recognition of radar emitters, and for estimation of the ..."
Abstract
 Add to MetaCart
(Show Context)
Radar Electronic Support (ES) involves the passive search for, interception, location, analysis and identification ofradiated electromagnetic energy for military purposes. Although Stochastic ContextFree Grammars (SCFGs) appear promising for recognition of radar emitters, and for estimation of their respective level of threat in radar ES systems, the computations associated with wellknown techniques for learning their production rule probabilities are very computationally demanding. The most popular methods for this task are the InsideOutside (IO) algorithm, which maximizes the likelihood of a data set, and the Viterbi Score (VS) algorithm, which maximizes the likelihood of its best parse trees. For each iteration, their time complexity is cubic with the length of sequences in the training set and with the number of nonterminal symbols in the grammar. Since appli
Abstract ANOMALY DETECTION FROM PERSONAL USAGE PATTERNS IN WEB APPLICATIONS
, 2006
"... This is to certify that we have read this thesis and that in our opinion it is fully ..."
Abstract
 Add to MetaCart
(Show Context)
This is to certify that we have read this thesis and that in our opinion it is fully
Improving EigenVoicesbased techniques and SMLLR for Speaker Adaptation by combining EV and SMLLR techniques or using Genetic Algorithms
"... This paper constitutes a study of several classical and original methods for a speaker adaptation of the acoustic hidden Markov models of an automatic speech recognition system (ASRS). Most of today’s real applications require that the speaker adaptation process continuously improves the performance ..."
Abstract
 Add to MetaCart
(Show Context)
This paper constitutes a study of several classical and original methods for a speaker adaptation of the acoustic hidden Markov models of an automatic speech recognition system (ASRS). Most of today’s real applications require that the speaker adaptation process continuously improves the performance of the underlying ASRS, as more utterances are pronounced by a new speaker. The first part of this article is dedicated to this problem. We begin by introducing the Structural EigenVoices approach (SEV). Compared to EigenVoices (EV), SEV improves the performance of an ASRS with more sentences, well beyond the point where the EV system has reached its limit. We then describe four methods that combine the advantages of Structural Maximum Likelihood Linear Regression (SMLLR) and EigenVoicesbased techniques (EV or SEV). We show experimentally that one of them, SEV→SMLLR, can improve the performance of an ASRS at least as significantly as SMLLR, EV and SEV, irrespective of the amount of adaptation utterances used. The second part of our work is focused on the use of genetic algorithms for rapidly adapting acoustic models. Whereas all of the standard adaptation methods (e.g. SMLLR, SMAP, EV, etc.) are based on the E.M. procedure and thus provide a single local optimal solution, genetic algorithms are theoretically able to provide several global optimal solutions. We experimentally show that: (1) genetic algorithms and EV both equivalently improve the performance of an ASRS, and (2) combining genetic algorithms and EV further improves the performance of an ASRS.