General phrase speaker verification using sub-word background models and likelihood-ratio scoring (1996)

by S Parthasarathy, A E Rosenberg
Venue: In Spoken Language

Results 1 - 9 of 9

Robust Endpoint Detection and Energy Normalization for Real-Time Speech and Speaker Recognition

by Q Li, J Zheng, A Tsai, Q Zhou - IEEE Transactions on Speech and Audio Processing, 2002
Cited by 54 (0 self)

Citation Context

...idual EER is 2.8%. The accuracy is in the same level as the speaker verification system where HMMs were applied to endpoint detection [27], [28]. The proposed algorithm has also been implemented in a real speech controller with embedded speaker verification. Readers are referred to [3] for detail. V. CONCLUSIONS In this paper, we propos...

Automatic verbal information verification for user authentication

by Qi Li, Biing-hwang Juang, Qiru Zhou, Chin-hui Lee - IEEE Transactions on Speech and Audio Processing
Cited by 21 (3 self)
Abstract: Traditional speaker authentication focuses on speaker verification (SV) and speaker identification, which is accomplished by matching the speaker’s voice with his or her registered speech patterns. In this paper, we propose a new technique, verbal information verification (VIV), in which spoken utterances of a claimed speaker are verified against the key (usually confidential) information in the speaker’s registered profile automatically to decide whether the claimed identity should be accepted or rejected. Using the proposed sequential procedure involving three question-response turns, we achieved an error-free result in a telephone speaker authentication experiment with 100 speakers. We further propose a speaker authentication system by combining VIV with SV. In the system, a user is verified by VIV in the first four to five accesses, usually from different acoustic environments. During these uses, one of the key questions pertains to a pass-phrase for SV. The VIV system collects and verifies the pass-phrase utterance for use as training data for speaker model construction. After a speaker-dependent model is constructed, the system then migrates to SV. This approach avoids the inconvenience of a formal enrollment procedure, ensures the quality of the training data for SV, and mitigates the mismatch caused by different acoustic environments between training and testing. Experiments showed that the proposed system improved the SV performance by over 40% in equal-error rate compared to a conventional SV system.
Index Terms: Speaker authentication, speaker recognition, speaker verification, utterance verification, verbal information verification.
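The abstract above reports its gain as a change in equal-error rate (EER), the operating point where the false-acceptance and false-rejection rates coincide. A minimal sketch of estimating an EER from verification scores follows, assuming higher scores mean "more likely the claimed speaker". The function name and the threshold sweep are illustrative assumptions, not the procedure used in the cited paper.

```python
from typing import Sequence


def equal_error_rate(genuine: Sequence[float], impostor: Sequence[float]) -> float:
    """Estimate the EER by sweeping every observed score as a threshold.

    A trial is accepted when its score is >= the threshold (an assumption
    of this sketch). Returns the mean of FAR and FRR at the threshold
    where the two rates are closest.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")

    best_gap = float("inf")
    eer = 1.0
    for threshold in sorted(set(genuine) | set(impostor)):
        # False rejection: a true-speaker trial scored below the threshold.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        # False acceptance: an impostor trial scored at or above the threshold.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2
    return eer
```

With only a handful of trials the estimate is coarse, since the two rates can only meet at observed score values. Evaluations on larger trial sets usually interpolate between neighbouring thresholds.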

Guidelines for experiments on the POLYCOST database

by Håkan Melin, Johan Lindberg - in Proceedings of a COST 250 workshop on Application of Speaker Recognition Techniques in Telephony , 1997
Cited by 14 (3 self)
The purpose of this document is to define a common ground for speaker recognition experiments on the POLYCOST database. It is done by defining a set of baseline experiments for which results should always be included when presenting evaluations made on this database. By including these results and by presenting the differences introduced in new experiments, a comparison between systems tested on different sites is made possible. Four baseline experiments are defined: text-dependent speaker verification (SV) on a fixed password sentence, text-prompted SV on a digit sequence, text-independent SV on free speech in the subject's mother tongue, and finally text-independent speaker identification on the same free speech. The definition of the baseline experiment includes the definition of client and impostor speakers and speakers for training a world model; sessions for enrollment and test; which speech items to use; and how to compute and present results.

Citation Context

...e a world-model can not be trained for each existing password phrase in a system. The closest alternative would perhaps be to build a world model from subword components. This is done for instance in [2] where Parthasarathy & Rosenberg concludes that it is important that the world model captures the text contents of the spoken utterance. Choosing SEN01 for world-model training can be seen as the idea...

Intelligent virtual agents for contact center automation

by Mazin Gilbert, Jay G. Wilpon, Benjamin Stern, Giuseppe Di Fabbrizio - in IEEE Signal Processing Magazine, Volume 22, Number 5, 2005
Cited by 14 (1 self)
[A human-machine communication system for next-generation contact centers]

Citation Context

...ologies can be either text dependent or text independent. In text-dependent mode, the user is limited to saying a predefined set of known words. This is modeled using an HMM designed for each speaker [19]. In text-independent mode, the user is provided with greater flexibility to speak from an unrestricted set. Text-independent systems are modeled using a Gaussian mixture model (GMM) [18]. For both mo...

Background model design for flexible and portable speaker verification systems

by O Siohan, C H Lee, A C Surendran, Q Li - In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing , 1999
Cited by 5 (0 self)

Citation Context

...ated utterances of a pass-phrase spoken by the customer during enrollment. This model is usually created either by concatenating phone-based customer HMMs or by directly estimating a whole-phrase HMM [3]. The background model is usually an HMM that reduces the need for a ... [Figure residue: block diagram with labels Input Speech, SI Phone Recognition System, Password Transcription, Claimed Identity, Cepstral Coefficients, Decoded String, Phone Bo...]
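The citation context above pairs a customer (claimed-speaker) model with a background model, which is the setting for likelihood-ratio scoring named in the title of the cited 1996 paper. A minimal sketch follows, assuming per-frame log-likelihoods under both models have already been computed elsewhere. The function names, the frame-averaged normalization, and the default threshold are hypothetical choices for illustration, not details taken from any of the papers listed here.

```python
from typing import Sequence


def log_likelihood_ratio(customer_ll: Sequence[float],
                         background_ll: Sequence[float]) -> float:
    """Frame-averaged log-likelihood ratio for one utterance.

    customer_ll[i] and background_ll[i] are the log-likelihoods of frame i
    under the claimed-speaker model and the background model respectively.
    """
    if len(customer_ll) != len(background_ll) or len(customer_ll) == 0:
        raise ValueError("need two equal-length, non-empty score sequences")
    total = sum(c - b for c, b in zip(customer_ll, background_ll))
    return total / len(customer_ll)


def accept_claim(customer_ll: Sequence[float],
                 background_ll: Sequence[float],
                 threshold: float = 0.0) -> bool:
    """Accept the identity claim when the ratio exceeds the threshold.

    The threshold value is a placeholder; in practice it would be tuned
    on held-out data.
    """
    return log_likelihood_ratio(customer_ll, background_ll) > threshold
```

Dividing by the frame count keeps scores comparable across utterances of different lengths, so a single threshold can be applied to short and long pass-phrases alike.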

Combining Speech Recognition and Speaker Verification

by Aanchan K Mohan, Professor Lawrence R. Rabiner, 2008

Citation Context

...eristics of a speaker’s voice in a text-independent speaker identification task. Probabilistic models such as the Hidden Markov Model (HMM) [23, 24] have also been used for speaker recognition tasks in [25, 26, 27, 28]. Probability density functions consisting of a mixture of Gaussians, also called a Gaussian Mixture Model (GMM), have been studied extensively in [7, 8] for the purpose of modeling a speaker’s voice i...

Automatic Verbal Information Verification for User Authentication

by Qi Li, Biing-hwang Juang, Qiru Zhou, Chin-hui Lee, 2016

Stockholm

by Johan Olsson, Kungliga Tekniska Högskolan
The aim of this report was to implement a text-dependent speaker verification system using speaker-adapted neural networks and to evaluate the system. The idea was to use a hybrid HMM/ANN approach, i.e. Artificial Neural Networks were used to estimate Hidden Markov Model emission posterior probabilities from speech data, and the system was implemented in C++ as a module for GIVES. The report also contains an overview of speaker verification. Methods and algorithms for network training and adaptation are explained, and the performance of the system is tested. Both multi-layer perceptrons and single-layer perceptrons are tested and compared to other speaker verification systems. The test results show that the hybrid HMM/ANN system does not perform as well as other speaker verification systems, but if the system parameters are optimised further, performance might increase. Along with an analysis and summary of the project, possible improvements of the system are suggested.

Citation Context

...the CNET system is not a hybrid system but is based on HMM’s only and uses a fixed password for enrolment and verification. A similar ASV system was constructed by Parthasarathy and Rosenberg at AT&T [21]. The AT&T system also uses passwords and is based on HMM’s. Hybrid HMM/ANN systems have also been used for automatic speech recognition (ASR). The ASR task is different from the ASV task as explained...

Speech and Language Processing over the Web

by Mazin Gilbert, Junlan Feng - IEEE Signal Processing Magazine, May 2008, Digital Object Identifier 10.1109/MSP.2008.918410
[Changing the way people communicate and access information]
Over the past decade, the World Wide Web (WWW) has been evolving into a central communication hub for consumers and businesses to efficiently access and deliver multimedia information containing text, speech, graphics, audio, or video. In this booming era of the Internet, communication is evolving at an extraordinary pace, changing from voice over traditional landline phones to multimedia data across multiple mobile devices, services, and networks. Technological breakthroughs, which are making people communicate more seamlessly and acquire information more efficiently, are revolutionizing the fields of speech and language processing and providing new research challenges and lucrative business opportunities in areas of communication, entertainment, and marketing. Figure 1 shows a sample of Web-based applications that are benefiting from the Internet revolution as well as from advances made in mobile devices. The use of multimodal user interfaces and multimedia outputs continues to play a role in the evolution of the Internet, transforming traditional business applications such as customer care and security, and promoting newer applications such as information search and mining. As the content and usage of the Web continue to grow, the need for accurate systems to locate or extract meaningful and actionable information will continue to rise. Three classes of systems have been evolving over the past decade. The first class includes systems capable of searching through documents using keywords. These systems, more commonly known as search engines, such as Google Search and Yahoo Search, apply advanced language processing and probabilistic methods to index words and phrases to enable rapid retrieval of documents. Search engine performance is rather impressive in terms of efficiency and accuracy of retrieval for top-ten candidates. Google is now able to re-index the Web daily and provide search responses in a fraction of a second. Search engines have also been applied for voice search but at a smaller scale. The most commonly used ones today are automated directory assistance

Citation Context

...er experience. Web-based customer care includes search, question/answering, live chat, and e-mails. Integrating the Web with language processing can provide several new opportunities for customer care. Both AT&T (http://ask.att.com/esh/main/chainAction.do) and Ikea (http://www.ikea.com/us/en) have virtual online chat agents that provide customers with a na...


Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University