Classifiers for Synthetic Speech Detection: A Comparison
Cited by 1 (1 self)
Automatic speaker verification (ASV) systems are highly vulnerable to spoofing attacks, also known as imposture. With recent developments in speech synthesis and voice conversion technology, it has become important to detect synthesized or voice-converted speech for the security of ASV systems. In this paper, we compare five different classifiers used in speaker recognition to detect synthetic speech. Experimental results on the ASVspoof 2015 dataset show that support vector machines with generalized linear discriminant kernel (GLDS-SVM) yield the best performance on the development set, with an equal error rate (EER) of 0.12%, whereas a Gaussian mixture model (GMM) trained using the maximum likelihood (ML) criterion is superior on the evaluation set, with an EER of 3.01%. Index Terms: spoof detection, countermeasures, speaker recognition
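The abstract reports results as equal error rates. As a rough illustration only (not the authors' evaluation code), the sketch below estimates an EER from two sets of detector scores. The function name `equal_error_rate`, the convention that a higher score means "more likely genuine", and the threshold sweep over observed scores are all assumptions made for this example.

```python
import numpy as np

def equal_error_rate(genuine_scores, spoof_scores):
    """Estimate the EER, assuming higher scores indicate genuine speech.

    Returns (eer, threshold) at the point where the false acceptance
    and false rejection rates are closest to each other.
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)

    # Candidate thresholds: every distinct observed score, sorted.
    thresholds = np.unique(np.concatenate([genuine, spoof]))

    # False rejection rate: genuine trials scored below the threshold.
    frr = np.array([(genuine < t).mean() for t in thresholds])
    # False acceptance rate: spoofed trials scored at or above it.
    far = np.array([(spoof >= t).mean() for t in thresholds])

    # Pick the threshold where the two error rates are closest.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]

if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    eer, thr = equal_error_rate([2.0, 3.0, 4.0, 5.0], [0.0, 1.0, 2.5, 1.5])
    print(f"EER = {eer:.2%} at threshold {thr}")
```

With finite score sets the two error curves rarely cross exactly, so this sketch averages them at the closest point; evaluation toolkits may interpolate differently.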
A comparison of features for synthetic speech detection
in INTERSPEECH, 2015
Cited by 1 (1 self)
The performance of biometric systems based on automatic speaker recognition technology is severely degraded by spoofing attacks with synthetic speech generated using different voice conversion (VC) and speech synthesis (SS) techniques. Various countermeasures have been proposed to detect this type of attack, and in this context, choosing an appropriate feature extraction technique for capturing relevant information from speech is an important issue. This paper presents a concise experimental review of different features for the synthetic speech detection task. The wide variety of features considered in this study includes previously investigated features as well as some other potentially useful features for characterizing real and synthetic speech. The experiments are conducted on the recently released ASVspoof 2015 corpus, which contains speech data from a large number of VC and SS techniques. Comparative results using two different classifiers indicate that features representing spectral information in the high-frequency region, dynamic information of speech, and detailed information related to subband characteristics are considerably more useful in detecting synthetic speech. Index Terms: anti-spoofing, ASVspoof 2015, feature extraction, countermeasures
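The abstract does not say how "dynamic information of speech" is computed. One common way to capture it is with regression-based delta coefficients over a sequence of frame-level features, and the sketch below shows that approach as an assumption, not as the paper's method. The function name `delta`, the window half-width of 2, and the edge padding are illustrative choices.

```python
import numpy as np

def delta(features, width=2):
    """Regression-based delta coefficients along the time axis.

    features: array of shape (frames, dims), e.g. cepstral coefficients.
    width: half-width of the regression window (assumed value).
    """
    feats = np.asarray(features, dtype=float)
    n = feats.shape[0]

    # Repeat the first and last frames so every frame has full context.
    padded = np.pad(feats, ((width, width), (0, 0)), mode="edge")

    # Normalizer of the standard delta regression formula.
    denom = 2.0 * sum(k * k for k in range(1, width + 1))

    out = np.zeros_like(feats)
    for k in range(1, width + 1):
        ahead = padded[width + k : width + k + n]    # frame t + k
        behind = padded[width - k : width - k + n]   # frame t - k
        out += k * (ahead - behind)
    return out / denom

if __name__ == "__main__":
    # A linear ramp has slope 1, so interior deltas should equal 1.
    ramp = np.arange(6, dtype=float).reshape(-1, 1)
    print(delta(ramp).ravel())
```

Applying `delta` to its own output gives second-order (acceleration) coefficients, which are often appended to the static features alongside the first-order deltas.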