Results 1–10 of 34
Landmark-based speech recognition: report of the 2004 Johns Hopkins Summer Workshop, 2005
Augmented statistical models for speech recognition, in Proc. ICASSP, 2006
Discriminative classifiers with adaptive kernels for noise robust speech recognition, in Comput. Speech Lang., 2010
Cited by 24 (18 self)
Abstract: Discriminative classifiers are a popular approach to solving classification problems. However, one of the problems with these approaches, in particular kernel-based classifiers such as Support Vector Machines (SVMs), is that they are hard to adapt to mismatches between the training and test data. This paper describes a scheme for overcoming this problem for speech recognition in noise by adapting the kernel rather than the SVM decision boundary. Generative kernels, defined using generative models, are one type of kernel that allows SVMs to handle sequence data. By compensating the parameters of the generative models for each noise condition, noise-specific generative kernels can be obtained. These can be used to train a noise-independent SVM on a range of noise conditions, which can then be used with a test-set noise kernel for classification. The noise-specific kernels used in this paper are based on Vector Taylor Series (VTS) model-based compensation. VTS allows all the model parameters to be compensated and the background noise to be estimated in a maximum likelihood fashion. A brief discussion of VTS, and the optimisation of the mismatch function representing the impact of noise on the clean speech, is also included. Experiments using these VTS-based test-set noise kernels were run on the AURORA 2 continuous digit task. The proposed SVM rescoring scheme yields large gains in performance over the VTS compensated models. Key words: speech recognition, noise robustness, support vector machines, generative kernels
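The first-order VTS compensation this abstract relies on can be illustrated with a minimal numpy sketch. All values here are hypothetical, and only the additive-noise, log-spectral-domain mismatch function y = log(e^s + e^n) is modelled (no channel term, no DCT, no delta parameters):

```python
import numpy as np

def mismatch(s, n):
    """Log-spectral mismatch function for additive noise: y = log(e^s + e^n)."""
    return np.logaddexp(s, n)

# Hypothetical clean-speech and noise Gaussians (diagonal covariances)
mu_s, var_s = np.array([1.0, 2.0]), np.array([0.2, 0.2])
mu_n, var_n = np.array([0.0, 0.5]), np.array([0.1, 0.1])

# Jacobian dy/ds of the mismatch function at the expansion point (mu_s, mu_n)
G = np.exp(mu_s) / (np.exp(mu_s) + np.exp(mu_n))

mu_y = mismatch(mu_s, mu_n)                # VTS-compensated mean
var_y = G**2 * var_s + (1 - G)**2 * var_n  # VTS-compensated diagonal variance
```

The compensated Gaussian (mu_y, var_y) replaces the clean model when building a noise-specific generative kernel; note mu_y is always at least max(mu_s, mu_n), since log-add dominates the larger term.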
Augmented Statistical Models for Classifying Sequence Data, 2006
Cited by 21 (0 self)
Declaration: This dissertation is the result of my own work and includes nothing that is the outcome of work done in collaboration. It has not been submitted in whole or in part for a degree at any other university. Some of the work has been published previously in conference proceedings [66, 69], two journal articles [36, 68], two workshop papers [35, 67] and a technical report [65]. The length of this thesis including appendices, bibliography, footnotes, tables and equations is approximately 60,000 words. This thesis contains 27 figures and 20 tables.
Discriminative models for speech recognition, in Information Theory and Applications Workshop, 1997
Cited by 20 (8 self)
Abstract: The vast majority of automatic speech recognition systems use Hidden Markov Models (HMMs) as the underlying acoustic model. Initially these models were trained based on the maximum likelihood criterion. Significant performance gains have been obtained by using discriminative training criteria, such as maximum mutual information and minimum phone error. However, the underlying acoustic model is still generative, with the associated constraints on the state and transition probability distributions, and classification is based on Bayes' decision rule. Recently, there has been interest in examining discriminative, or direct, models for speech recognition. This paper briefly reviews the forms of discriminative models that have been investigated. These include maximum entropy Markov models, hidden conditional random fields and conditional augmented models. The relationships between the various models and issues with applying them to large vocabulary continuous speech recognition will be discussed.
Derivative kernels for noise robust ASR, in Proc. of ASRU'11, 2011
Cited by 13 (7 self)
Abstract: Recently there has been interest in combined generative/discriminative classifiers. In these classifiers, features for the discriminative models are derived from generative kernels. One advantage of using generative kernels is that systematic approaches exist for introducing complex dependencies beyond conditional independence assumptions. Furthermore, by using generative kernels, model-based compensation/adaptation techniques can be applied to make discriminative models robust to noise/speaker conditions. This paper extends previous work with combined generative/discriminative classifiers in several directions. First, it introduces derivative kernels based on context-dependent generative models. Second, it describes how derivative kernels can be incorporated in continuous discriminative models. Third, it addresses the issues associated with the large number of classes and parameters when context-dependent models and high-dimensional features of derivative kernels are used. The approach is evaluated on two noise-corrupted tasks: the small vocabulary AURORA 2 task and the medium-to-large vocabulary AURORA 4 task.
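As a toy illustration of the derivative (score-space) features such kernels are built from, the sketch below differentiates the log-likelihood of a single diagonal Gaussian with respect to its mean. A real derivative kernel would use an HMM and accumulate these derivatives over frames with the forward-backward algorithm; every value here is hypothetical:

```python
import numpy as np

def score_features(x, mu, var):
    """Score-space feature vector for a diagonal Gaussian:
    [log p(x); d log p(x) / d mu]."""
    diff = x - mu
    loglik = -0.5 * np.sum(np.log(2 * np.pi * var) + diff**2 / var)
    d_mu = diff / var  # first-order derivative w.r.t. the mean
    return np.concatenate([[loglik], d_mu])

mu, var = np.zeros(3), np.ones(3)   # hypothetical generative model
x = np.array([0.6, -0.4, 0.5])      # one observation
phi = score_features(x, mu, var)
# phi feeds a discriminative classifier (e.g. a linear SVM);
# for this model the derivative part is simply (x - mu) / var.
```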
Acoustic modelling using continuous rational kernels, in MLSP, 2005
Cited by 12 (5 self)
Abstract: There has been significant interest in developing alternatives to hidden Markov models (HMMs) for speech recognition. In particular, interest has been focused upon models that allow additional dependencies to be incorporated. One such model is the Augmented Statistical Model. Here a local exponential approximation, based upon derivatives of a base distribution, is made about some distribution of the base model. Augmented statistical models can be trained using a maximum margin criterion, which may be implemented using an SVM with a generative kernel. Calculating derivatives of the base distribution, in particular higher-order derivatives, to form the generative kernel requires complex dynamic programming algorithms. In this paper a new form of rational kernel, the continuous rational kernel, is proposed. This allows elements of the generative kernel, including those based on higher-order derivatives, to be computed using standard forms of transducer within a rational kernel framework. In addition, the derivatives are shown to be a principled method of defining marginalised kernels. Continuous rational kernels are evaluated using a large vocabulary continuous speech recognition (LVCSR) task.
Minimum Bayes risk estimation and decoding in large vocabulary continuous speech recognition, 2004
Cited by 11 (1 self)
Abstract: Minimum risk estimation and decoding strategies based on lattice segmentation techniques can be used to refine large vocabulary continuous speech recognition systems, through the estimation of the parameters of the underlying hidden Markov models and through the identification of smaller recognition tasks, which provide the opportunity to incorporate novel modeling and decoding procedures in LVCSR. These techniques are discussed in the context of going ‘beyond HMMs’.
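A minimal sketch of minimum Bayes risk decoding over a hypothetical three-entry N-best list (a stand-in for the lattice segments the paper actually works with), picking the hypothesis with the lowest expected word edit distance under the posterior:

```python
def edit_distance(a, b):
    """Levenshtein distance between two word sequences (rolling-row DP)."""
    d = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        prev, d[0] = d[0], i
        for j, wb in enumerate(b, 1):
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1, prev + (wa != wb))
    return d[len(b)]

# Hypothetical N-best list with posterior probabilities
nbest = [("a b c".split(), 0.5), ("a b d".split(), 0.3), ("a x c".split(), 0.2)]

def expected_risk(hyp):
    """Expected edit distance of hyp against the posterior-weighted N-best list."""
    return sum(p * edit_distance(hyp, ref) for ref, p in nbest)

mbr_hyp = min((h for h, _ in nbest), key=expected_risk)
print(mbr_hyp)  # ['a', 'b', 'c']
```

Note the MBR hypothesis need not be the MAP (highest-posterior) one in general; here they coincide because the top hypothesis is also closest on average to its competitors.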
Recognition of dialogue acts in multiparty meetings using a switching DBN, in IEEE Transactions on Audio, Speech, and Language Processing
Lattice segmentation and support vector machines for large vocabulary continuous speech recognition, in Proc. ICASSP, 2005
Cited by 10 (4 self)
Abstract: Lattice segmentation procedures are used to spot possible recognition errors in first-pass recognition hypotheses produced by a large vocabulary continuous speech recognition system. This approach is analyzed in terms of its ability to reliably identify, and provide good alternatives for, incorrectly hypothesized words. A procedure is described to train and apply Support Vector Machines to strengthen the first-pass system where it was found to be weak, resulting in small but statistically significant recognition improvements on a large test set of conversational speech.
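The error-spotting step can be caricatured with a hypothetical confusion network: slots whose best word has a low posterior are the ones handed to a secondary classifier for rescoring. (The paper derives these regions by segmenting first-pass lattices, not from a hand-built network; the words, posteriors, and threshold below are all invented for illustration.)

```python
# Hypothetical confusion network: one dict of word -> posterior per slot
confnet = [
    {"the": 0.95, "a": 0.05},
    {"cat": 0.55, "cap": 0.45},  # ambiguous slot: a candidate for SVM rescoring
    {"sat": 0.90, "sad": 0.10},
]

CONF_THRESHOLD = 0.7  # assumed confidence cut-off

first_pass = [max(slot, key=slot.get) for slot in confnet]
weak_slots = [i for i, slot in enumerate(confnet)
              if max(slot.values()) < CONF_THRESHOLD]

print(first_pass)  # ['the', 'cat', 'sat']
print(weak_slots)  # [1] -- only slot 1 is uncertain enough to rescore
```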