Support vector machines for speech recognition
 Proceedings of the International Conference on Spoken Language Processing
, 1998
Cited by 114 (2 self)
Statistical techniques based on hidden Markov Models (HMMs) with Gaussian emission densities have dominated signal processing and pattern recognition literature for the past 20 years. However, HMMs trained using maximum likelihood techniques suffer from an inability to learn discriminative information and are prone to overfitting and overparameterization. Recent work in machine learning has focused on models, such as the support vector machine (SVM), that automatically control generalization and parameterization as part of the overall optimization process. In this paper, we show that SVMs provide a significant improvement in performance on a static pattern classification task based on the Deterding vowel data. We also describe an application of SVMs to large vocabulary speech recognition, and demonstrate an improvement in error rate on a continuous alphadigit task (OGI Alphadigits) and a large vocabulary conversational speech task (Switchboard). Issues related to the development and optimization of an SVM/HMM hybrid system are discussed.
Speech Recognition using SVMs
 Advances in Neural Information Processing Systems 14
, 2002
Cited by 81 (17 self)
An important issue in applying SVMs to speech recognition is the ability to classify variable-length sequences. This paper presents extensions to a standard scheme for handling this variable-length data, the Fisher score. A more useful mapping is introduced based on the likelihood ratio. The score-space defined by this mapping avoids some limitations of the Fisher score. Class-conditional generative models are directly incorporated into the definition of the score-space. The mapping, and appropriate normalisation schemes, are evaluated on a speaker-independent isolated letter task where the new mapping outperforms both the Fisher score and HMMs trained to maximise likelihood.
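As a rough illustration of the idea in this abstract, the sketch below maps variable-length sequences to fixed-length vectors. It is not the paper's method. A single one-dimensional Gaussian stands in for each HMM, and the per-frame length normalisation is an assumed choice. The layout of the likelihood-ratio vector is also assumed here as one common form: the log-likelihood ratio, then the derivatives for each class model, with the second model's derivatives negated. All function names are hypothetical.

```python
import math


def gaussian_loglik_and_grad(seq, mean, var):
    """Return (log-likelihood, d/d mean, d/d var) of a 1-D Gaussian summed over seq."""
    loglik = 0.0
    d_mean = 0.0
    d_var = 0.0
    for x in seq:
        diff = x - mean
        loglik += -0.5 * math.log(2.0 * math.pi * var) - diff * diff / (2.0 * var)
        d_mean += diff / var
        d_var += -0.5 / var + diff * diff / (2.0 * var * var)
    return loglik, d_mean, d_var


def fisher_score(seq, mean, var):
    """Fisher score: gradient of the log-likelihood w.r.t. the model parameters.

    Dividing by the sequence length is an assumed normalisation.
    """
    _, d_mean, d_var = gaussian_loglik_and_grad(seq, mean, var)
    n = len(seq)
    return [d_mean / n, d_var / n]


def likelihood_ratio_score(seq, model_a, model_b):
    """Likelihood-ratio score-space built from two class-conditional models.

    Each model is a (mean, var) tuple. The vector layout is an assumption:
    [log-likelihood ratio, grad of model A, negated grad of model B].
    """
    ll_a, dm_a, dv_a = gaussian_loglik_and_grad(seq, *model_a)
    ll_b, dm_b, dv_b = gaussian_loglik_and_grad(seq, *model_b)
    n = len(seq)
    return [(ll_a - ll_b) / n, dm_a / n, dv_a / n, -dm_b / n, -dv_b / n]
```

Sequences of any length yield a 2-element Fisher score or a 5-element likelihood-ratio score, so a standard fixed-dimension SVM could be trained on the resulting vectors.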
Speaker verification using sequence discriminant support vector machines
 IEEE Transactions on Speech and Audio Processing
, 2005
Support vector machines for segmental minimum bayes risk decoding of continuous speech
 In IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
, 2003
Cited by 34 (6 self)
Segmental Minimum Bayes Risk (SMBR) decoding involves the refinement of the search space into sequences of small sets of confusable words. We describe the application of Support Vector Machines (SVMs) as discriminative models for the refined search spaces. We show that SVMs, which in their basic formulation are binary classifiers of fixed-dimensional observations, can be used for continuous speech recognition. We also study the use of GiniSVMs, a variant of the basic SVM. On a small vocabulary task, we show that this two-pass scheme outperforms MMI-trained HMMs. Using system combination, we also obtain further improvements over discriminatively trained HMMs.
Evaluation of Kernel Methods for Speaker Verification and Identification
, 2002
Cited by 14 (1 self)
Support vector machines are evaluated on speaker verification and speaker identification tasks. We compare the polynomial kernel, the Fisher kernel, a likelihood ratio kernel and the pair hidden Markov model kernel with baseline systems based on a discriminative polynomial classifier and generative Gaussian mixture model classifiers. Simulations were carried out on the YOHO database and some promising results were obtained.
Clustering-Based Construction of Hidden Markov Models for Generative Kernels
Cited by 7 (5 self)
Generative kernels are theoretically grounded tools that increase the capabilities of generative classification through a discriminative setting. The Fisher kernel is the first and most widely used representative, and it rests on a widely investigated mathematical background. A generative kernel is built through a two-step serial pipeline. In the first, “generative” step, a generative model is trained, using either one model per class or a single model for all the data; then features or scores are extracted that encode the contribution of each data point to the generative process. In the second, “discriminative” step, the scores are evaluated by a discriminative machine via a kernel, exploiting the data separability. In this paper we contribute to the first step, proposing a novel way to fit the class data with the generative models, focusing specifically on Hidden Markov Models (HMMs). The idea is to perform model clustering on the unlabeled data in order to best discover the structure of the entire sample set. Then the label information is retrieved and generative scores are computed. Comparative experiments give a preliminary indication of the quality of the novel approach and motivate further development.
UBM-GMM DRIVEN DISCRIMINATIVE APPROACH FOR SPEAKER VERIFICATION
Cited by 5 (1 self)
In the past few years, discriminative approaches to speaker detection have shown good results and attracted increasing interest. Among these methods, SVM-based systems have many advantages, especially their ability to deal with a high-dimensional feature space. Generative systems such as UBM-GMM systems show the best performance among systems in speaker verification tasks. Combining generative and discriminative approaches is not a new idea and has been studied several times by mapping a whole speech utterance onto a fixed-length vector. This paper presents a straightforward, low-cost method to combine the two approaches, using only a UBM model to drive the experiment. We show that the TFLLR kernel, while closely related to a reduced form of the Fisher mapping, yields performance close to that of a standard GMM/UBM-based speaker detection system. Moreover, we show that a combination of both outperforms the systems taken independently.
Digit Recognition In Noisy Environments Via A Sequential GMM/SVM System
 In ICASSP (submitted)
, 2002
Cited by 5 (0 self)
This paper exploits the fact that when GMM and SVM classifiers with roughly the same level of performance exhibit uncorrelated errors they can be combined to produce a better classifier. The gain accrues from combining the descriptive strength of GMM models with the discriminative power of SVM classifiers. This idea, first exploited in the context of speaker recognition [1, 2], is applied to speech recognition, specifically to a digit recognition task in a noisy environment, with significant gains in performance.
DETAC: A Discriminative Criterion for Speaker Verification
, 2002
Cited by 4 (2 self)
This paper introduces a general criterion applicable to discriminative training of detection systems, and discusses its particular implementation in GMM-based text-independent speaker verification. Based on an analysis of the detection error tradeoff curve of a baseline system, we argue that the new criterion extends several conventional methods, such as the maximum posterior training by logistic regression and the linear discriminative analysis projection, by a second aspect: "reshaping" the Bayes error area in favor of a relevant operating range. Optimization results with relative error reduction of up to 16% are presented on the cellular task of the NIST-2001 speaker recognition evaluation.
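For readers unfamiliar with the detection error tradeoff curve mentioned above, the following minimal sketch computes its operating points from two lists of scores. The function name and the convention that a trial is accepted when its score is at or above the threshold are assumptions for illustration. The criterion proposed in the paper is not implemented here.

```python
def det_points(target_scores, impostor_scores):
    """Return (threshold, false_alarm_rate, miss_rate) for each candidate threshold.

    A trial is accepted when its score is >= threshold (assumed convention).
    Misses are rejected target trials; false alarms are accepted impostor trials.
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    points = []
    for thr in thresholds:
        miss = sum(1 for s in target_scores if s < thr) / len(target_scores)
        false_alarm = sum(1 for s in impostor_scores if s >= thr) / len(impostor_scores)
        points.append((thr, false_alarm, miss))
    return points
```

Plotting the false-alarm rate against the miss rate (conventionally on normal-deviate axes) gives the tradeoff curve; a "relevant operating range" corresponds to a segment of these points.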
Enhancing GMM Scores Using SVM "Hints"
, 2001
Cited by 4 (0 self)
This paper proposes a classification scheme that combines statistical models and support vector machines. It exploits the fact (observed in [1]) that GMM and SVM classifiers with roughly the same level of performance produce uncorrelated errors. We describe a novel scheme which employs an SVM classifier as an "advisor" to the GMM classifier in uncertain cases. The utility of the combined generative/discriminative approach is demonstrated on standard text-independent speaker verification and speaker identification tasks in matched and mismatched training and test conditions. Results indicate significant improvements in performance without much computational overhead.
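The abstract does not specify how the SVM "advisor" is consulted, so the following is only a hypothetical sketch of one way such a rule could look. The GMM log-likelihood ratio decides on its own unless it falls within an assumed uncertainty band around the threshold, in which case the sign of the SVM output is used. The function name, `threshold`, and `margin` are illustrative assumptions.

```python
def combined_decision(gmm_llr, svm_score, threshold=0.0, margin=0.5):
    """Accept or reject a trial using a GMM score with an SVM fallback.

    gmm_llr:   log-likelihood ratio from the GMM system (assumed input).
    svm_score: signed SVM decision value, positive meaning "accept".
    margin:    half-width of the band in which the GMM is treated as uncertain
               (an assumed tuning parameter).
    """
    if abs(gmm_llr - threshold) >= margin:
        # The GMM is confident: use its decision directly.
        return gmm_llr > threshold
    # The GMM is uncertain: defer to the SVM "advisor".
    return svm_score > 0.0
```

Because the SVM is evaluated only for trials inside the uncertainty band, a rule of this kind adds little computation to the GMM system.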