Results 1–10 of 24
Support vector machines for segmental minimum Bayes risk decoding of continuous speech
 In IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
, 2003
Abstract

Cited by 34 (6 self)
Segmental Minimum Bayes Risk (SMBR) Decoding involves the refinement of the search space into sequences of small sets of confusable words. We describe the application of Support Vector Machines (SVMs) as discriminative models for the refined search spaces. We show that SVMs, which in their basic formulation are binary classifiers of fixed-dimensional observations, can be used for continuous speech recognition. We also study the use of GiniSVMs, a variant of the basic SVM. On a small vocabulary task, we show this two-pass scheme outperforms MMI-trained HMMs. Using system combination we also obtain further improvements over discriminatively trained HMMs.
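The segmental idea above — decoding as independent classification over each small confusion set — can be sketched in a few lines. The scoring function below is a hypothetical stand-in for the SVM/GiniSVM scores the paper describes, and the word lists and score values are invented for illustration:

```python
# Illustrative sketch of Segmental Minimum Bayes Risk decoding:
# after the search space is refined into segments, each segment
# holds a small confusion set of words, and a discriminative
# classifier picks one word per segment independently.

def smbr_decode(segments, score):
    """segments: list of confusion sets (lists of candidate words).
    score: callable (segment_index, word) -> real-valued score.
    Returns the word sequence maximising the per-segment score."""
    return [max(words, key=lambda w: score(i, w))
            for i, words in enumerate(segments)]

# Toy usage: hand-set scores standing in for classifier outputs.
toy_scores = {(0, "five"): 1.2, (0, "nine"): 0.4,
              (1, "oh"): 0.1, (1, "four"): 0.9}
hyp = smbr_decode([["five", "nine"], ["oh", "four"]],
                  lambda i, w: toy_scores[(i, w)])
```

Because each confusion set is small, any binary classifier can be lifted to this setting (e.g. by pairwise voting), which is what makes SVMs applicable here despite operating on fixed-dimensional inputs.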
Augmented statistical models for speech recognition
 in Proc. ICASSP
, 2006
Augmented Statistical Models for Classifying Sequence Data
, 2006
Abstract

Cited by 21 (0 self)
Declaration This dissertation is the result of my own work and includes nothing that is the outcome of work done in collaboration. It has not been submitted in whole or in part for a degree at any other university. Some of the work has been published previously in conference proceedings [66,69], two journal articles [36,68], two workshop papers [35,67] and a technical report [65]. The length of this thesis, including appendices, bibliography, footnotes, tables and equations, is approximately 60,000 words. This thesis contains 27 figures and 20 tables.
Discriminative models for speech recognition
 In Information Theory and Applications Workshop
, 1997
Abstract

Cited by 20 (8 self)
Abstract — The vast majority of automatic speech recognition systems use Hidden Markov Models (HMMs) as the underlying acoustic model. Initially these models were trained based on the maximum likelihood criterion. Significant performance gains have been obtained by using discriminative training criteria, such as maximum mutual information and minimum phone error. However, the underlying acoustic model is still generative, with the associated constraints on the state and transition probability distributions, and classification is based on Bayes' decision rule. Recently, there has been interest in examining discriminative, or direct, models for speech recognition. This paper briefly reviews the forms of discriminative models that have been investigated. These include maximum entropy Markov models, hidden conditional random fields and conditional augmented models. The relationships between the various models and issues with applying them to large vocabulary continuous speech recognition will be discussed.
Structured log linear models for noise robust speech recognition
 IEEE Signal Processing Letters
, 2010
Abstract

Cited by 18 (9 self)
The use of discriminative models for structured classification tasks, such as automatic speech recognition, is becoming increasingly popular. The major contribution of this work is a large-margin structured log-linear model for noise-robust continuous ASR. An important aspect of log-linear models is the form of the features. The features used in our structured log-linear model are derived from generative kernels. This provides an elegant way of combining generative and discriminative models to handle time-varying data. Additionally, since the features are based on the generative models, model-based compensation can be easily performed for noise robustness. Third, the designed joint feature space can be decomposed at the arc level. This allows efficient decoding and training with lattices, which is important for any larger vocabulary extensions. Previous work in this area is extended in two important directions. First, instead of using CML training, which is commonly used for discriminative models, this paper describes efficient large-margin training for sentence-level log-linear models based on lattices. Depending on the nature of the joint feature space and labels, we have proved that this form of model is closely related to structured SVMs and multiclass SVMs. Second, efficient lattice-based classification of continuous data is also performed incorporating a joint feature space. This novel model combines generative kernels, discriminative models, efficient lattice-based large-margin training and model-based noise compensation. It is evaluated on a noise-corrupted continuous digit task: AURORA 2.0. Results on AURORA 2 demonstrate that modelling the structure information yields significant improvements.
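A minimal sketch of the sentence-level log-linear form the abstract refers to, with assumed notation: a weight vector α, a joint feature map φ(O, w) over the observations and hypothesised word sequence, and a normaliser computed over the competing hypotheses (the feature values below are invented for illustration):

```python
import math

# Log-linear sentence model: P(w | O) = exp(alpha . phi(O, w)) / Z,
# where Z sums exp(alpha . phi(O, w')) over all hypotheses w'.
# In the paper the features phi are derived from generative models;
# here they are just plain vectors.

def loglinear_posteriors(alpha, phis):
    """alpha: weight vector; phis: {hypothesis: feature vector}.
    Returns normalised posteriors over the hypotheses."""
    logits = {w: sum(a * f for a, f in zip(alpha, phi))
              for w, phi in phis.items()}
    m = max(logits.values())          # log-sum-exp stabilisation
    z = sum(math.exp(v - m) for v in logits.values())
    return {w: math.exp(v - m) / z for w, v in logits.items()}

# Toy usage: two competing hypotheses with invented feature vectors.
post = loglinear_posteriors([1.0, 0.0],
                            {"a": [2.0, 5.0], "b": [1.0, 9.0]})
```

Training (CML or large-margin) then amounts to choosing α; the lattice-based decomposition the abstract mentions is what keeps the normaliser tractable when the hypothesis set is a whole lattice rather than a short list.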
Maximum margin training of generative kernels
, 2004
Abstract

Cited by 13 (4 self)
Generative kernels, a generalised form of Fisher kernels, are a powerful form of kernel that allows the kernel parameters to be tuned to a specific task. The standard approach to training these kernels is to use maximum likelihood estimation. This paper describes a novel approach based on maximum-margin training of both the kernel parameters and a Support Vector Machine (SVM) classifier. It combines standard SVM training with a gradient-descent-based kernel parameter optimisation scheme. This allows the kernel parameters to be explicitly trained for the data set and the SVM score-space. Initial results on an artificial task and the Deterding data show that such an approach can reduce classification error rates.
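To make the generative-kernel idea concrete, here is a minimal sketch assuming a single univariate Gaussian base model (an assumption for illustration; the base models in these papers are richer): the derivatives of the log-likelihood with respect to the model parameters map a variable-length sequence to a fixed-dimensional score-space vector, and a linear kernel between such vectors is a basic generative kernel.

```python
# Score-space features from a univariate Gaussian base model:
# derivatives of log N(x; mu, sigma^2) w.r.t. (mu, sigma),
# averaged over the frames of a variable-length sequence.

def gaussian_score_vector(seq, mu, sigma):
    """Map a variable-length sequence to a fixed-dimensional
    score-space vector (d/d_mu, d/d_sigma of the log-likelihood)."""
    n = len(seq)
    d_mu = sum((x - mu) / sigma**2 for x in seq) / n
    d_sigma = sum(((x - mu)**2 - sigma**2) / sigma**3 for x in seq) / n
    return (d_mu, d_sigma)

def linear_generative_kernel(seq_a, seq_b, mu, sigma):
    """Inner product in the score-space: a basic generative kernel
    between two sequences of possibly different lengths."""
    fa = gaussian_score_vector(seq_a, mu, sigma)
    fb = gaussian_score_vector(seq_b, mu, sigma)
    return sum(a * b for a, b in zip(fa, fb))
```

Because the score vector has a fixed dimension regardless of sequence length, a standard SVM can be trained on these features; tuning (mu, sigma) by gradient descent on the margin is the kernel-parameter optimisation the abstract describes.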
Derivative kernels for noise robust ASR
 in Proc. of ASRU’11, 2011
Abstract

Cited by 13 (7 self)
Abstract — Recently there has been interest in combined generative/discriminative classifiers. In these classifiers, features for the discriminative models are derived from generative kernels. One advantage of using generative kernels is that systematic approaches exist for introducing complex dependencies beyond conditional independence assumptions. Furthermore, by using generative kernels, model-based compensation/adaptation techniques can be applied to make discriminative models robust to noise/speaker conditions. This paper extends previous work with combined generative/discriminative classifiers in several directions. First, it introduces derivative kernels based on context-dependent generative models. Second, it describes how derivative kernels can be incorporated in continuous discriminative models. Third, it addresses the issues associated with the large number of classes and parameters when context-dependent models and high-dimensional features of derivative kernels are used. The approach is evaluated on two noise-corrupted tasks: the small vocabulary AURORA 2 task and the medium-to-large vocabulary AURORA 4 task.
Acoustic modelling using continuous rational kernels
 in MLSP
, 2005
Abstract

Cited by 12 (5 self)
There has been significant interest in developing alternatives to hidden Markov models (HMMs) for speech recognition. In particular, interest has been focused upon models that allow additional dependencies to be incorporated. One such model is the Augmented Statistical Model. Here a local exponential approximation, based upon derivatives of a base distribution, is made about some distribution of the base model. Augmented statistical models can be trained using a maximum margin criterion, which may be implemented using an SVM with a generative kernel. Calculating derivatives of the base distribution, in particular higher-order derivatives, to form the generative kernel requires complex dynamic programming algorithms. In this paper a new form of rational kernel, the continuous rational kernel, is proposed. This allows elements of the generative kernel, including those based on higher-order derivatives, to be computed using standard forms of transducer within a rational kernel framework. In addition, the derivatives are shown to be a principled method of defining marginalised kernels. Continuous rational kernels are evaluated using a large vocabulary continuous speech recognition (LVCSR) task.
Structured discriminative models for noise robust continuous speech recognition
 in Proc. ICASSP, Prague, Czech Republic
, 2011
Abstract

Cited by 11 (8 self)
Recently there has been interest in structured discriminative models for speech recognition. In these models sentence posteriors are directly modelled, given a set of features extracted from the observation sequence and the hypothesised word sequence. In previous work these discriminative models have been combined with features derived from generative models for noise-robust speech recognition on continuous digits. This paper extends this work to medium-to-large vocabulary tasks. The form of the score-space extracted using the generative models, and the parameter tying of the discriminative model, are both discussed. Update formulae for both conditional maximum likelihood and minimum Bayes' risk training are described. Experimental results are presented on small and medium-to-large vocabulary noise-corrupted speech recognition tasks.
SVMs, score-spaces and maximum margin statistical models
 in Beyond HMM workshop, ATR
, 2004
Abstract

Cited by 9 (5 self)
There has been significant interest in developing new forms of acoustic model, in particular models which allow additional dependencies to be represented beyond those within a standard hidden Markov model (HMM). This paper discusses one such class of models, augmented statistical models. Here a locally exponential approximation is made about some point on a base distribution. This allows additional dependencies within the data to be modelled beyond those represented in the base distribution. Augmented models based on Gaussian mixture models (GMMs) and HMMs are briefly described. These augmented models are then related to generative kernels, one approach used for allowing support vector machines (SVMs) to be applied to variable-length data. The training of augmented statistical models within an SVM, generative kernel, framework is then discussed. This may be viewed as using maximum margin training to estimate statistical models. Augmented Gaussian mixture models are then evaluated using rescoring on a large vocabulary speech recognition task.