On adaptive decision rules and decision parameter adaptation for automatic speech recognition
 Proc. IEEE, 2000
"... Recent advances in automatic speech recognition are accomplished by designing a plugin maximum a posteriori decision rule such that the forms of the acoustic and language model distributions are specified and the parameters of the assumed distributions are estimated from a collection of speech and ..."
Abstract

Cited by 31 (4 self)
Recent advances in automatic speech recognition are accomplished by designing a plug-in maximum a posteriori decision rule such that the forms of the acoustic and language model distributions are specified and the parameters of the assumed distributions are estimated from a collection of speech and language training corpora. Maximum-likelihood point estimation is by far the most prevailing training method. However, due to the problems of unknown speech distributions, sparse training data, high spectral and temporal variabilities in speech, and possible mismatch between training and testing conditions, a dynamic training strategy is needed. To cope with the changing speakers and speaking conditions in real operational conditions for high-performance speech recognition, such paradigms incorporate a small amount of speaker- and environment-specific adaptation data into the training process. Bayesian adaptive learning is an optimal way to combine ...
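The Bayesian adaptive learning the abstract alludes to can be sketched with the classic conjugate-prior MAP update of a Gaussian mean. This is a minimal illustration, not the paper's full formulation; the function name and the scalar prior weight `tau` are assumptions for the sketch.

```python
import numpy as np

def map_adapt_mean(prior_mean, tau, data):
    """MAP estimate of a Gaussian mean under a conjugate Gaussian prior.

    prior_mean: speaker-independent mean (the prior mode)
    tau: prior weight (pseudo-count controlling how fast we adapt)
    data: adaptation frames assigned to this Gaussian, shape (N, D)
    """
    n = len(data)
    sample_mean = data.mean(axis=0)
    # Interpolate between the prior mean and the sample mean: with little
    # adaptation data the prior dominates, with much data the ML estimate wins.
    return (tau * prior_mean + n * sample_mean) / (tau + n)

prior = np.zeros(2)
frames = np.array([[1.0, 2.0], [3.0, 2.0]])   # two adaptation frames
adapted = map_adapt_mean(prior, tau=2.0, data=frames)
```

With `tau` equal to the number of frames, the estimate lands halfway between the prior mean and the sample mean, which is exactly the small-data robustness the abstract motivates.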
Convergence theorems for generalized alternating minimization procedures
 Journal of Machine Learning Research, 2005
"... The EM algorithm is widely used to develop iterative parameter estimation procedures for statistical models. In cases where these procedures strictly follow the EM formulation, the convergence properties of the estimation procedures are well understood. In some instances there are practical reasons ..."
Abstract

Cited by 18 (0 self)
The EM algorithm is widely used to develop iterative parameter estimation procedures for statistical models. In cases where these procedures strictly follow the EM formulation, the convergence properties of the estimation procedures are well understood. In some instances there are practical reasons to develop procedures that do not strictly fall within the EM framework. We study EM variants in which the E-step is not performed exactly, either to obtain improved rates of convergence, or due to approximations needed to compute statistics under a model family over which E-steps cannot be realized. Since these variants are not EM procedures, the standard (G)EM convergence results do not apply to them. We present an information-geometric framework for describing such algorithms and analyzing their convergence properties. We apply this framework to analyze the convergence properties of incremental EM and variational EM. For incremental EM, we discuss conditions under which these algorithms converge in likelihood. For variational EM, we show how the E-step approximation prevents convergence to local maxima in likelihood.
Online Bayesian tree-structured transformation of HMMs with optimal model selection for speaker adaptation
 IEEE Transactions on Speech and Audio Processing
"... ..."
(Show Context)
Incremental Estimation of Discrete Hidden Markov Models Based on a New Backward Procedure
"... We address the problem of learning discrete hidden Markov models from very long sequences of observations. Incremental versions of the BaumWelch algorithm that approximate the βvalues used in the backward procedure are commonly used for this problem, since their memory complexity is independent of ..."
Abstract

Cited by 5 (0 self)
We address the problem of learning discrete hidden Markov models from very long sequences of observations. Incremental versions of the Baum-Welch algorithm that approximate the β-values used in the backward procedure are commonly used for this problem, since their memory complexity is independent of the sequence length. We introduce an improved incremental Baum-Welch algorithm with a new backward procedure that approximates the β-values based on a one-step look-ahead in the training sequence. We justify the new approach analytically, and report empirical results that show it converges faster than previous incremental algorithms.
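The β-values being approximated are defined by the standard backward recursion, which requires a full right-to-left pass over the sequence. A sketch of that exact procedure (generic textbook form, not the paper's incremental variant) makes clear why long sequences force an approximation:

```python
import numpy as np

def backward(A, B, obs):
    """Exact backward procedure for a discrete HMM.

    A: (S, S) transition matrix; B: (S, V) emission matrix;
    obs: observation indices. Returns beta[t, i] = P(o_{t+1..T} | q_t = i).
    The full pass needs the whole sequence in memory, which is what the
    incremental algorithms above avoid by approximating beta locally.
    """
    T, S = len(obs), A.shape[0]
    beta = np.zeros((T, S))
    beta[-1] = 1.0                      # termination: beta_T(i) = 1
    for t in range(T - 2, -1, -1):      # recursion, right to left
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta

A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.5], [0.1, 0.9]])
beta = backward(A, B, [0, 1, 0])
```

A one-step look-ahead approximation, as in the abstract, would replace the full recursion with an estimate of `beta[t]` computed from `obs[t + 1]` alone, keeping memory constant in the sequence length.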
Robust Time-Synchronous Environmental Adaptation for Continuous ...
 International Conference on Spoken Language Processing, 2002
"... In this paper we describe system architectures for robust MLLR based environmental adaptation of continuous speech recognition systems. Inspired by an existing broadcast news transcription system [1] we refined the identification of acoustic scenarios by using a combined GMM/HMM method. Thus environ ..."
Abstract

Cited by 5 (2 self)
In this paper we describe system architectures for robust MLLR-based environmental adaptation of continuous speech recognition systems. Inspired by an existing broadcast news transcription system [1], we refined the identification of acoustic scenarios by using a combined GMM/HMM method. Thus, environmental adaptation regarding arbitrary acoustic scenarios beyond speaker changes becomes possible. For deploying acoustic adaptation in interactive applications, such as human-machine interaction, a time-synchronous adaptation approach is proposed. For different corpora, the evaluation of our approaches shows significant improvements in recognition accuracy while satisfying the constraint of time-synchronous processing.
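MLLR's core operation, sketched below, is applying one shared affine transform μ' = Aμ + b to every Gaussian mean in a regression class. The hard part of the method, estimating W from adaptation-data likelihood, is deliberately omitted here, and the [A | b] column layout is an illustrative convention rather than a fixed standard.

```python
import numpy as np

def apply_mllr(means, W):
    """Apply a shared MLLR regression-class transform to Gaussian means.

    means: (G, D) stacked component means; W: (D, D+1) affine transform
    [A | b]. One transform, estimated from a small amount of adaptation
    data, moves every mean in the class at once.
    """
    extended = np.hstack([means, np.ones((len(means), 1))])  # [mu; 1]
    return extended @ W.T

means = np.array([[1.0, 2.0], [3.0, 4.0]])
W = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, -0.5]])       # identity rotation, bias (0.5, -0.5)
adapted_means = apply_mllr(means, W)   # every mean shifted by the bias
```

Because one W covers many Gaussians, a new acoustic scenario can be compensated from far less data than re-estimating each component, which is what makes the time-synchronous use in this paper feasible.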
Adaptive learning and compensation of hidden Markov model for robust speech recognition (invited tutorial)
 Proc. 1998 International Symposium on Chinese Spoken Language Processing (Singapore), 1998
"... In this report, we start with a revisit to the statistical formulation of the automatic speech recognition (ASR) problem, and identify the factors which might in uence the performance of the conventional plugin MAP decision rule for ASR. We summarize our recent research e orts on a class of robust ..."
Abstract

Cited by 2 (2 self)
In this report, we start with a revisit to the statistical formulation of the automatic speech recognition (ASR) problem, and identify the factors which might influence the performance of the conventional plug-in MAP decision rule for ASR. We summarize our recent research efforts on a class of robust speech recognition problems in which mismatches between training and testing conditions exist but an accurate knowledge of the mismatch mechanism is unknown. The only available information is the test data along with a set of pre-trained speech models and the decision parameters. We focus on two types of Bayesian techniques, namely online Bayesian adaptation of hidden Markov model parameters and the Bayesian predictive classification approach. We conclude the report with a brief mention of our ongoing research efforts towards a robust and intelligent spoken dialogue system.
A Comparative Study of Several Incremental Adaptation Algorithms for Speaker Adaptation
 ISCA Archive
"... We conduct a comparative study of five representative incremental HMM adaptation algorithms developed in the past few years. We report the experimental results of using these algorithms for online speaker adaptation in a continuous Mandarin Chinese speech recognition system. We identify the strengt ..."
Abstract
We conduct a comparative study of five representative incremental HMM adaptation algorithms developed in the past few years. We report the experimental results of using these algorithms for online speaker adaptation in a continuous Mandarin Chinese speech recognition system. We identify the strengths and weaknesses of the individual algorithms and offer recommendations for practitioners to make intelligent use of these adaptation algorithms for different purposes in different applications.
Contents lists available at SciVerse ScienceDirect, 2012
"... Keywords: output alphabet is finite, and continuous if the output alphabet is not necessarily finite, e.g., each state is governed by a parametric density function [32,34,80]. Theoretical and empirical results have shown that, given an adequate number of states and a sufficiently rich set of data, ..."
Abstract
... output alphabet is finite, and continuous if the output alphabet is not necessarily finite, e.g., each state is governed by a parametric density function [32,34,80]. Theoretical and empirical results have shown that, given an adequate number of states and a sufficiently rich set of data, HMMs are capable of representing probability distributions corresponding to complex real-world phenomena ...
Linear Transforms in Automatic Speech Recognition: Estimation Procedures and Integration of Diverse Acoustic Data
"... Linear transforms have been used extensively for both training and adaptation of Hidden Markov Model (HMM) based automatic speech recognition (ASR) systems. Two important applications of linear transforms in acoustic modeling are the decorrelation of the feature vector and the constrained adaptation ..."
Abstract
Linear transforms have been used extensively for both training and adaptation of Hidden Markov Model (HMM) based automatic speech recognition (ASR) systems. Two important applications of linear transforms in acoustic modeling are the decorrelation of the feature vector and the constrained adaptation of the acoustic models to the speaker, the channel, and the task. Our focus in the first part of this talk is the development of training methods based on the Maximum Mutual Information (MMI) and the Maximum A Posteriori (MAP) criterion that estimate the parameters of the linear transforms. We integrate the discriminative linear transforms into the MMI estimation of the HMM parameters in an attempt to capture the correlation between the feature vector components. The transforms obtained under the MMI criterion are termed Discriminative Likelihood Linear Transforms (DLLT). Experimental results show that DLLT provides a discriminative estimation framework for feature normalization in HMM training for large vocabulary continuous speech recognition tasks that outperforms its Maximum Likelihood counterpart. Then, we propose a structural MAP estimation framework ...
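The feature-decorrelation role of such transforms can be illustrated with a simple eigenvector rotation of the global covariance. This whitening-style sketch is only a stand-in for the ML/MMI estimation the abstract describes (MLLT/DLLT); it shows what the transform does, not the discriminative criterion used to train it.

```python
import numpy as np

def decorrelating_transform(features):
    """Estimate a global decorrelating linear transform for feature vectors.

    Rotating into the eigenbasis of the sample covariance makes the
    transformed dimensions uncorrelated, so diagonal-covariance Gaussians
    fit them better -- the same role MLLT plays inside HMM training.
    """
    cov = np.cov(features, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    return eigvecs.T

feats = np.random.default_rng(1).normal(size=(500, 3))
feats[:, 1] += 0.8 * feats[:, 0]          # introduce correlation
T = decorrelating_transform(feats)
decorrelated = feats @ T.T                # off-diagonal covariance vanishes
```

DLLT replaces this closed-form eigen-solution with an iterative transform estimate driven by the MMI objective, but the transformed features enter the HMM likelihood in the same way.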
OnLine Bayesian TreeStructured Transformation Of Hidden Markov Models For Speaker Adaptation
 IEEE Trans. Speech and Audio Proc., 2001
"... This paper presents a new recursive Bayesian learning approach for transformation parameter estimation in speaker adaptation. Our goal is to incrementally transform (or adapt) the entire set of HMM parameters for a new speaker or new acoustic enviroment from a small amount of adaptation data. By est ..."
Abstract
This paper presents a new recursive Bayesian learning approach for transformation parameter estimation in speaker adaptation. Our goal is to incrementally transform (or adapt) the entire set of HMM parameters for a new speaker or new acoustic environment from a small amount of adaptation data. By establishing a clustering tree of HMM Gaussian mixture components, the finest affine transformation parameters for individual HMM Gaussian mixture components can be dynamically searched. The online Bayesian learning technique proposed in our recent work is used for recursive maximum a posteriori estimation of the affine transformation parameters. Speaker adaptation experiments using a 26-letter English alphabet vocabulary are conducted, and the viability of the online learning framework is confirmed. 1. INTRODUCTION Adaptation techniques have been widely studied for practical speech recognition systems in the last decade. They can be classified into the following two major approaches: Bayesian appro...
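The recursive Bayesian learning idea can be sketched on a scalar-weight Gaussian mean: after each block of adaptation data the posterior becomes the prior for the next block, so the estimate is refined incrementally. This toy recursion (names and the scalar prior weight are illustrative assumptions) mirrors, in simplified form, the recursion the paper applies to affine transformation parameters.

```python
import numpy as np

def recursive_map_mean(prior_mean, tau, blocks):
    """Recursive (online) MAP re-estimation of a Gaussian mean.

    The posterior after block k serves as the prior for block k+1, so the
    result after all blocks equals the batch MAP estimate on the pooled
    data, while only one block is ever held in memory.
    """
    mean, weight = prior_mean, tau
    for block in blocks:
        n = len(block)
        mean = (weight * mean + n * block.mean(axis=0)) / (weight + n)
        weight += n   # posterior confidence grows with observed data
    return mean

blocks = [np.array([[0.0, 0.0], [2.0, 2.0]]), np.array([[4.0, 4.0]])]
mu = recursive_map_mean(np.zeros(2), tau=2.0, blocks=blocks)
```

The batch-equivalence property is what makes online adaptation attractive: incremental processing of adaptation data sacrifices nothing relative to waiting for all of it.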