Results 1 - 4 of 4
AuthLoop: End-to-End Cryptographic Authentication for Telephony over Voice Channels
, 2016
"... Abstract Telephones remain a trusted platform for conducting some of our most sensitive exchanges. From banking to taxes, wide swathes of industry and government rely on telephony as a secure fall-back when attempting to confirm the veracity of a transaction. In spite of this, authentication is poo ..."
Abstract
- Add to MetaCart
(Show Context)
Abstract Telephones remain a trusted platform for conducting some of our most sensitive exchanges. From banking to taxes, wide swathes of industry and government rely on telephony as a secure fall-back when attempting to confirm the veracity of a transaction. In spite of this, authentication is poorly managed between these systems, and in the general case it is impossible to be certain of the identity (i.e., Caller ID) of the entity at the other end of a call. We address this problem with AuthLoop, the first system to provide cryptographic authentication solely within the voice channel. We design, implement and characterize the performance of an in-band modem for executing a TLS-inspired authentication protocol, and demonstrate its abilities to ensure that the explicit single-sided authentication procedures pervading the web are also possible on all phones. We show experimentally that this protocol can be executed with minimal computational overhead and only a few seconds of user time (≈ 9 instead of ≈ 97 seconds for a naïve implementation of TLS 1.2) over heterogeneous networks. In so doing, we demonstrate that strong end-to-end validation of Caller ID is indeed practical for all telephony networks.
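The single-sided authentication AuthLoop targets follows the familiar challenge-response shape: the caller sends a fresh nonce, the claimed party answers with a keyed proof over it, and the caller checks the answer. A toy sketch of that shape in Python, using an HMAC over a shared key as a deliberately simplified stand-in for the certificate-based, TLS-inspired handshake the paper actually runs over the voice channel (all names here are illustrative):

```python
import hashlib
import hmac
import os

def prover_respond(shared_key: bytes, challenge: bytes) -> bytes:
    """Answer the verifier's challenge. An HMAC stands in for the
    certificate-based signature used by the real protocol."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()

def verifier_check(shared_key: bytes, challenge: bytes, response: bytes) -> bool:
    """Recompute the expected answer and compare in constant time."""
    expected = hmac.new(shared_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)

key = os.urandom(32)     # stand-in for the prover's credential
nonce = os.urandom(16)   # caller's fresh challenge, never reused
tag = prover_respond(key, nonce)
print(verifier_check(key, nonce, tag))  # True
```

The fresh nonce is what prevents a recorded answer from being replayed on a later call; the paper's contribution is making such an exchange practical over a low-bitrate in-band voice modem.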
I Know That Voice: Identifying the Voice Actor Behind the Voice
"... Intentional voice modifications by electronic or non-electronic means challenge automatic speaker recognition systems. Previous work focused on detecting the act of disguise or identifying everyday speakers disguising their voices. Here, we propose a benchmark for the study of voice disguise, by stu ..."
Abstract
- Add to MetaCart
(Show Context)
Intentional voice modifications by electronic or non-electronic means challenge automatic speaker recognition systems. Previous work focused on detecting the act of disguise or identifying everyday speakers disguising their voices. Here, we propose a benchmark for the study of voice disguise by studying the voice variability of professional voice actors. A dataset of 114 actors playing 647 characters is created. It contains 19 hours of captured speech, divided into 29,733 utterances tagged by character and actor names, which is then further sampled. Text-independent speaker identification of the actors, training on a subset of the characters they play while testing on new, unseen characters, shows an EER of 17.1%, an HTER of 15.9%, and a rank-1 recognition rate of 63.5% per utterance when training a Convolutional Neural Network on spectrograms generated from the utterances. An i-vector based system trained and tested on the same data results in 39.7% EER, 39.4% HTER, and a rank-1 recognition rate of 13.6%.
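The EER (equal error rate) and HTER (half total error rate) figures these abstracts report are derived from the score distributions of genuine and impostor trials. A minimal NumPy sketch of both metrics, with an illustrative threshold sweep (the score arrays are toy examples, not data from either paper):

```python
import numpy as np

def far_frr(genuine, impostor, threshold):
    """False-accept and false-reject rates at a given decision threshold."""
    far = float(np.mean(impostor >= threshold))  # impostors wrongly accepted
    frr = float(np.mean(genuine < threshold))    # genuine trials wrongly rejected
    return far, frr

def eer(genuine, impostor):
    """Equal error rate: the operating point where FAR and FRR coincide,
    approximated by sweeping thresholds over the observed scores."""
    thresholds = np.unique(np.concatenate([genuine, impostor]))
    rates = [far_frr(genuine, impostor, t) for t in thresholds]
    far, frr = min(rates, key=lambda r: abs(r[0] - r[1]))
    return (far + frr) / 2  # midpoint at the closest crossing

def hter(genuine, impostor, threshold):
    """Half total error rate: mean of FAR and FRR at a fixed threshold."""
    far, frr = far_frr(genuine, impostor, threshold)
    return (far + frr) / 2

# Well-separated scores give an EER of 0; overlapping scores push it up.
clean_genuine = np.array([0.9, 0.8, 0.7])
clean_impostor = np.array([0.1, 0.2, 0.3])
print(eer(clean_genuine, clean_impostor))  # 0.0
```

EER summarizes a system with its threshold chosen after the fact, while HTER is reported at a threshold fixed in advance, which is why the two numbers in the abstracts differ slightly.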
Available online at www.sciencedirect.com
, 2015
"... technology is vulnerability of the recognizers to intentional circumvention (Wu et al., 2015). In the first case, authenti-cation, this refers to dedicated effort to manipulate one’s speech so that an ASV system would misclassify the attack-er’s sample to originate from the target (client). There ar ..."
Abstract
- Add to MetaCart
technology is the vulnerability of the recognizers to intentional circumvention (Wu et al., 2015). In the first case, authentication, this refers to a dedicated effort to manipulate one's speech so that an ASV system would misclassify the attacker's sample as originating from the target (client).
Automatic versus Human Speaker Verification: The Case of Voice Mimicry
"... In this work, we compare the performance of three modern speaker verification systems and non-expert human listeners in the presence of voice mimicry. Our goal is to gain insights on how vulnerable speaker verification systems are to mimicry attack and compare it to the performance of human listener ..."
Abstract
- Add to MetaCart
(Show Context)
In this work, we compare the performance of three modern speaker verification systems and non-expert human listeners in the presence of voice mimicry. Our goal is to gain insight into how vulnerable speaker verification systems are to mimicry attacks and to compare them against the performance of human listeners. We study both a traditional Gaussian mixture model-universal background model (GMM-UBM) and an i-vector based classifier with cosine scoring and probabilistic linear discriminant analysis (PLDA) scoring. For the studied material in the Finnish language, the mimicry attack slightly decreased the equal error rate (EER) for the GMM-UBM from 10.83% to 10.31%, while for the i-vector systems the EER increased from 6.80% to 13.76% and from 4.36% to 7.38%. The performance of the human listening panel shows that imitated speech increases the difficulty of the speaker verification task. It is even more difficult to recognize a person who is intentionally concealing his or her identity. For Impersonator A, the average listener made 8 errors out of 34 trials, while the automatic systems made 6 errors on the same set. For Impersonator B, the average listener made 7 errors out of 28 trials, while the automatic systems made 7 to 9 errors. A statistical analysis of the listener performance was also conducted. We found a statistically significant association (p = 0.00019, R2 = 0.59) between listener accuracy and self-reported factors only when familiar voices were present in the test.
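With cosine scoring, the i-vector verification decision mentioned above reduces to a length-normalized dot product between an enrollment i-vector and a test i-vector, compared against a threshold. A minimal sketch (the vectors, dimensionality, and threshold are illustrative placeholders, not values from the paper):

```python
import numpy as np

def cosine_score(enroll, test):
    """Cosine similarity between an enrollment and a test i-vector."""
    enroll = enroll / np.linalg.norm(enroll)  # length-normalize both vectors
    test = test / np.linalg.norm(test)
    return float(enroll @ test)

def verify(enroll, test, threshold=0.5):
    """Accept the trial if the cosine score clears the decision threshold."""
    return cosine_score(enroll, test) >= threshold

# Toy 4-dimensional "i-vectors"; real systems use a few hundred dimensions.
speaker = np.array([1.0, 0.2, -0.3, 0.5])
same = speaker + np.array([0.05, -0.02, 0.01, 0.0])  # near-identical trial
different = np.array([-0.8, 0.6, 0.1, -0.4])         # mismatched trial
print(verify(speaker, same))       # True
print(verify(speaker, different))  # False
```

A successful mimicry attack, in these terms, is an impostor utterance whose i-vector lands close enough to the target's enrollment vector to clear the threshold, which is exactly the EER increase the abstract reports for the i-vector systems.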