Speaker verification using Adapted Gaussian mixture models
- Digital Signal Processing
, 2000
Cited by 1010 (42 self)
In this paper we describe the major elements of MIT Lincoln Laboratory’s Gaussian mixture model (GMM)-based speaker verification system used successfully in several NIST Speaker Recognition Evaluations (SREs). The system is built around the likelihood ratio test for verification, using simple but effective GMMs for likelihood functions, a universal background model (UBM) for alternative speaker representation, and a form of Bayesian adaptation to derive speaker models from the UBM. The development and use of a handset detector and score normalization to greatly improve verification performance is also described and discussed. Finally, representative performance benchmarks and system behavior experiments on NIST SRE corpora are presented. © 2000 Academic Press.
Key Words: speaker recognition; Gaussian mixture models; likelihood ratio detector; universal background model; handset normalization; NIST evaluation.
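The likelihood ratio test named in this abstract can be illustrated with a minimal sketch. It is not the Lincoln Laboratory implementation: it assumes diagonal-covariance GMMs stored as hypothetical (weights, means, variances) tuples, scores an utterance by the average per-frame log-likelihood ratio between a speaker model and a UBM, and uses an arbitrary threshold of 0. The toy one-dimensional models at the end are invented for illustration only.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, weights, means, variances):
    """log p(x | GMM), computed with log-sum-exp for numerical stability."""
    terms = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def avg_llr(frames, speaker, ubm):
    """Average per-frame log-likelihood ratio: speaker model versus UBM."""
    total = sum(
        gmm_log_likelihood(x, *speaker) - gmm_log_likelihood(x, *ubm)
        for x in frames
    )
    return total / len(frames)

def verify(frames, speaker, ubm, threshold=0.0):
    """Accept the identity claim when the average LLR reaches the threshold."""
    return avg_llr(frames, speaker, ubm) >= threshold

# Toy models (assumed values): the speaker model differs from the UBM
# only in the mean of its first component.
toy_ubm = ([0.5, 0.5], [[0.0], [4.0]], [[1.0], [1.0]])
toy_speaker = ([0.5, 0.5], [[1.0], [4.0]], [[1.0], [1.0]])
target_frames = [[1.0], [1.2], [0.8]]
impostor_frames = [[-1.0], [-1.2]]
```

With these toy values, `verify(target_frames, toy_speaker, toy_ubm)` accepts and `verify(impostor_frames, toy_speaker, toy_ubm)` rejects. In a real system the speaker model would be derived from the UBM by adaptation and the threshold tuned on held-out data.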
Person identification using multiple cues
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1995
Cited by 217 (1 self)
This paper presents a person identification system based on acoustic and visual features. The system is organized as a set of non-homogeneous classifiers whose outputs are integrated after a normalization step. In particular, two classifiers based on acoustic features and three based on visual ones provide data for an integration module whose performance is evaluated. A novel technique for the integration of multiple classifiers at a hybrid rank/measurement level is introduced using HyperBF networks. Two different methods for the rejection of an unknown person are introduced. The performance of the integrated system is shown to be superior to that of the acoustic and visual subsystems. The resulting identification system can be used to log personal access and, with minor modifications, as an identity verification system.
Index Terms: Template matching, robust statistics, correlation, face recognition, speaker recognition, learning, classification.
A Tutorial on Text-Independent Speaker Verification
- EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING 2004:4, 430–451
, 2004
Cited by 138 (13 self)
This paper presents an overview of a state-of-the-art text-independent speaker verification system. First, an introduction proposes a modular scheme of the training and test phases of a speaker verification system. Then, the most commonly used speech parameterization in speaker verification, namely, cepstral analysis, is detailed. Gaussian mixture modeling, which is the speaker modeling technique used in most systems, is then explained. A few speaker modeling alternatives, namely, neural networks and support vector machines, are mentioned. Normalization of scores is then explained, as this is a very important step to deal with real-world data. The evaluation of a speaker verification system is then detailed, and the detection error trade-off (DET) curve is explained. Several extensions of speaker verification are then enumerated, including speaker tracking and segmentation by speakers. Then, some applications of speaker verification are proposed, including on-site applications, remote applications, applications relative to structuring audio information, and games. Issues concerning the forensic area are then recalled, as we believe it is very important to inform people about the actual performance and limitations of speaker verification systems. This paper concludes by giving a
Fifty years of progress in speech and speaker recognition
- Proc. 148th ASA Meeting, 2004
Cited by 36 (0 self)
Research in automatic speech and speaker recognition has now spanned five decades. This paper surveys the major themes and advances made in the past fifty years of research so as to provide a technological perspective and an appreciation of the fundamental progress that has been accomplished in this important area of speech communication. Although many techniques have been developed, many challenges have yet to be overcome before we can achieve the ultimate goal of creating machines that can communicate naturally with people. Such a machine needs to be able to deliver a satisfactory performance under a broad range of operating conditions. A much greater understanding of the human speech process is required before automatic speech and speaker recognition systems can approach human performance.
A FUZZY APPROACH TO SPEAKER VERIFICATION
, 2002
Cited by 4 (1 self)
This paper proposes a fuzzy approach to speaker verification. For an input utterance and a claimed identity, most of the current methods compute a claimed speaker’s score, which is the ratio of the claimed speaker’s and the impostors’ likelihood functions, and compare this score with a given threshold to accept or reject this speaker. Considering the speaker verification problem based on fuzzy set theory, the claimed speaker’s score is viewed as the fuzzy membership function of the input utterance in the claimed speaker’s fuzzy set of utterances. Fuzzy entropy and fuzzy c-means membership functions are proposed as fuzzy membership scores, which are the ratios of functions of the claimed speaker’s and impostors’ likelihood functions. A likelihood transformation is also considered to relate current likelihood and fuzzy membership scores. We also propose fuzzy scores using membership functions similar to those produced by the noise-clustering-based method. This noise clustering concept provides very effective modifications to several methods, which can overcome some of the problems of ratio-type scores and greatly reduce the false acceptance rate. Experiments were performed to evaluate the proposed normalization methods for speaker verification using the YOHO corpus. Experiments demonstrate that fuzzy methods and their noise clustering versions outperform conventional methods.
Voice Recognition with Neural Networks, Type-2 Fuzzy Logic and Genetic Algorithms
Cited by 3 (0 self)
We describe in this paper the use of neural networks, fuzzy logic and genetic algorithms for voice recognition. In particular, we consider the case of speaker recognition by analyzing the sound signals with the help of intelligent techniques, such as the neural networks and fuzzy systems. We use the neural networks for analyzing the sound signal of an unknown speaker, and after this first step, a set of type-2 fuzzy rules is used for decision making. We need to use fuzzy logic due to the uncertainty of the decision process. We also use genetic algorithms to optimize the architecture of the neural networks. We illustrate our approach with a sample of sound signals from real speakers in our institution.
Comparison of Different HMM-Based Methods for Speaker Verification
Cited by 2 (0 self)
Three different speaker verification methods are described. All of them are based on Hidden Markov Models (HMMs); the first one is text-independent, while the other two are text-prompted. The text-independent method uses a single-state continuous HMM to represent each customer in the system, while the text-prompted methods require speaker-dependent phoneme models. To model phonemes, both continuous HMMs and semi-continuous HMMs are used. Two different normalization methods for the likelihood values provided by the various HMMs are considered: one is based on the posterior probability, the other on the application of a mapping function.
1. INTRODUCTION
In the paper three different approaches, based on the Hidden Markov Model (HMM) technology, will be described and compared for a speaker verification task. The task consists in accepting or rejecting the identity of a claimed speaker according to the individual information contained in a given input uttera...
Speech and Speaker Recognition Evaluation
- In
, 2008
Cited by 1 (0 self)
This chapter overviews techniques for evaluating speech and speaker recognition systems. The chapter first describes principles of recognition methods, and specifies types of systems as well as their applications. The evaluation methods can be classified into subjective and objective methods, and the chapter focuses on the latter. In order to compare and normalize the performance of different speech recognition systems, test set perplexity is introduced as a measure of the difficulty of each task. Objective evaluation methods for spoken dialogue and transcription systems are then described in turn. Speaker recognition can be classified into speaker identification and verification, and most of the application systems fall into the speaker verification category. Since variation of speech features over time is a serious problem in speaker recognition, normalization and adaptation techniques are also described. Speaker verification performance is typically measured by equal error rate, detection error trade-off (DET) curves, and a weighted cost value. The chapter concludes by summarizing various issues for future research.
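As a rough illustration of the equal error rate mentioned in this abstract, the sketch below sweeps a decision threshold over a set of trial scores and reports the point where the false rejection and false acceptance rates are closest. It assumes that higher scores favor the claimed speaker and that a trial is accepted when its score is at or above the threshold. The function names and the sample scores are hypothetical, and the result is only an approximation of the EER on a finite score set.

```python
def error_rates(genuine, impostor, threshold):
    """Return (false rejection rate, false acceptance rate) at a threshold."""
    frr = sum(s < threshold for s in genuine) / len(genuine)
    far = sum(s >= threshold for s in impostor) / len(impostor)
    return frr, far

def equal_error_rate(genuine, impostor):
    """Approximate the EER by trying every observed score as a threshold.

    Returns (eer, threshold), where eer is the mean of FRR and FAR at the
    threshold that minimizes their absolute difference.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        frr, far = error_rates(genuine, impostor, t)
        gap = abs(frr - far)
        if best is None or gap < best[0]:
            best = (gap, (frr + far) / 2.0, t)
    return best[1], best[2]

# Invented scores for illustration only.
genuine_scores = [2.0, 3.0, 4.0, 5.0]
impostor_scores = [0.0, 1.0, 2.5, 1.5]
eer, eer_threshold = equal_error_rate(genuine_scores, impostor_scores)
```

Collecting the (FAR, FRR) pair from `error_rates` at every threshold gives the points of a DET curve, which is conventionally plotted on normal-deviate axes.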
SPEAKER RECOGNITION USING AN INTELLIGENT APPROACH
Cited by 1 (0 self)
This paper describes the analysis of sound signals with the help of intelligent techniques, such as neural networks and fuzzy systems, for specific speaker recognition. In the first step, we use neural networks to analyze the sound signal of an unknown speaker, and then a set of type-2 fuzzy rules is used for decision making. Here we use fuzzy logic due to the uncertainty of the decision process. To optimize the architecture of the neural networks, we make use of genetic algorithms. In this study, we illustrate our approach with a sample of sound signals from real speakers in our institution.
Keywords: Recognition, Fuzzy systems, Cepstral coefficients, Speaker identification, Linear predictive coding.
by
, 2012
This dissertation focuses on determining specific vowel phonemes which work best for speaker identification and speaker verification, and also developing new algorithms to improve speaker identification accuracy. Results from the first part of our research indicate that the vowels /i/, /E/ and /u/ were the ones having the highest recognition scores for both the Gaussian mixture model (GMM) and vector quantization (VQ) methods (at most one classification error). For VQ, /i/, /I/, /e/, /E/ and /@/ had no classification errors. Persons speaking /E/, /o/ and /u/ have been verified well by both GMM and VQ methods in our experiments. For VQ, the verification results are consistent with the identification results since the same five phonemes performed the best and had less than one verification error. After determining several ideal vowel phonemes, we developed new algorithms for improved speaker identification accuracy. Phoneme weighting methods (which performed classification based on the ideal phonemes we found from the previous experiments) and