Error-Responsive Feedback Mechanisms for Speech Recognizers (1997)

by L L Chase

Results 11 - 20 of 34

Integrating Multiple Knowledge Sources for Utterance-Level Confidence Annotation

by Dan Bohus, Alex Rudnicky - Carnegie Mellon University, 2002
Cited by 6 (1 self)
In recent years, automated speech recognition has been the main drive behind the advent of spoken language interfaces, but at the same time a severe limiting factor in the development of these systems. We believe that increased robustness in the face of recognition errors can be achieved by making the systems aware of their own misunderstandings, and employing appropriate recovery techniques when breakdowns in interaction occur. In this paper we address the first problem: the development of an utterance-level confidence annotator for a spoken dialog system. After a brief introduction to the CMU Communicator spoken dialog system (which provided the target platform for the developed annotator), we cast the confidence annotation problem as a machine learning classification task, and focus on selecting relevant features and on empirically identifying the best classification techniques for this task. The results indicate that significant reductions in classification error rate can be obtained using several different classifiers. Furthermore, we propose a data-driven approach to assessing the impact of the errors committed by the confidence annotator on dialog performance, with a view to optimally fine-tuning the annotator. Several

Using chunk based partial parsing of spontaneous speech in unrestricted domains for reducing word error rate in speech recognition

by Klaus Zechner, Alex Waibel - In Proceedings of COLING-ACL 98, 1998
Cited by 6 (1 self)
In this paper, we present a chunk-based partial parsing system for spontaneous, conversational speech in unrestricted domains. We show that the chunk parses produced by this parsing system can be usefully applied to the task of reranking N-best lists from a speech recognizer, using a combination of chunk-based n-gram model scores and chunk coverage scores. The input for the system is N-best lists generated from speech recognizer lattices. The hypotheses from the N-best lists are tagged for part of speech, "cleaned up" by a preprocessing pipe, parsed by a part-of-speech-based chunk parser, and rescored using a backpropagation neural net trained on the chunk-based scores. Finally, the reranked N-best lists are generated. The results of a system evaluation are promising in that a chunk accuracy of 87.4% is achieved and the best performance on a randomly selected test set is a decrease in word error rate of 0.3 percent (absolute), measured on the new first hypotheses in the reranked N-best lists.

Semantic Confidence Measurement for Spoken Dialogue Systems

by Ruhi Sarikaya, Yuqing Gao, Michael Picheny, Hakan Erdogan - IEEE Trans. on SAP, 2005
Cited by 5 (0 self)
This paper proposes two methods to incorporate semantic information into word and concept level confidence measurement. The first method uses tag and extension probabilities obtained from a statistical classer and parser. The second method uses a maximum entropy based semantic structured language model to assign probabilities to each word. Incorporation of semantic features into a lattice posterior probability based confidence measure provides significant improvements compared to posterior probability when used together in an air travel reservation task. At a 5% False Alarm (FA) rate, relative improvements of 28% and 61% in Correct Acceptance (CA) rate are achieved for word level and concept level confidence measurements, respectively.

Confidence measures for dialogue management in the CU communicator system

by Rubén San-Segundo, Jose M. Pardo - ICSLP, 2000
Cited by 4 (1 self)
This paper provides improved confidence assessment for detection of word-level speech recognition errors and out-of-domain user requests using language model features. We consider a combined measure of confidence that utilizes the language model back-off sequence, language model score, and phonetic length of recognized words as indicators of speech recognition confidence. The paper investigates the ability of each feature to detect speech recognition errors and out-of-domain utterances as well as two methods for combining the features contextually: a multi-layer perceptron and a statistical decision tree. We illustrate the effectiveness of the algorithm by considering utterances from the ATIS airline information task as either in-domain or out-of-domain for the DARPA Communicator task. Using this hand-labeled data, it is shown that 27.9% of incorrectly recognized words and 36.4% of out-of-domain phrases are detected at a 2.5% false alarm rate.

Efficient Use of the Grammar Scale Factor to Classify Incorrect Words in Speech Recognition Verification

by Alberto Sanchis, Víctor Jimenez, Enrique Vidal, 2000
Cited by 3 (1 self)
The goal of verification in speech recognition systems is to detect words in the hypothesized sentence that are likely to have been misrecognized. This decision can be based on the persistence of the different words in the output of the speech recognizer when some recognition parameter is varied. To this end, a parameter that proves particularly adequate is the so-called Grammar Scale Factor (which balances acoustic and language model scores). The main disadvantage of this method is that it requires repeating the recognition process many times. In this paper, after formulating it as a Statistical Pattern Classification problem, we show how to speed up this method, so that on average fewer than two repetitions of the recognition process are enough to achieve essentially the same verification performance as with the many more repetitions needed by the original proposal.

Confidence estimation, OOV detection and language ID using phone-to-word transduction and phone-level alignments

by Christopher White, Geoffrey Zweig, Lukas Burget, Petr Schwarz, Hynek Hermansky - in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing ICASSP 2008
Cited by 3 (0 self)
Automatic Speech Recognition (ASR) systems continue to make errors during search when handling various phenomena including noise, pronunciation variation, and out of vocabulary (OOV) words. Predicting the probability that a word is incorrect can prevent the error from propagating and perhaps allow the system to recover. This paper addresses the problem of detecting errors and OOVs for read Wall Street Journal speech when the word error rate (WER) is very low. It augments a traditional confidence estimate by introducing two novel methods: phone-level comparison using Multi-String Alignment (MSA) and word-level comparison using phone-to-word transduction. We show that features from phone and word string comparisons can be added to a standard maximum entropy framework thereby substantially improving performance in detecting both errors and OOVs. Additionally we show an extension to detecting English and accented English for the Language Identification (LID) task.

Fluency Constraints for Minimum Bayes-Risk Decoding of Statistical Machine Translation Lattices

by Graeme Blackwood, Adrià De Gispert, William Byrne
Cited by 3 (2 self)
A novel and robust approach to improving statistical machine translation fluency is developed within a minimum Bayes-risk decoding framework. By segmenting translation lattices according to confidence measures over the maximum likelihood translation hypothesis we are able to focus on regions with potential translation errors. Hypothesis space constraints based on monolingual coverage are applied to the low confidence regions to improve overall translation fluency.

Error Detection Using Linguistic Features

by Yongmei Shi
Cited by 2 (0 self)
Recognition errors hinder the proliferation of speech recognition (SR) systems. Based on the observation that recognition errors may result in ungrammatical sentences, especially in dictation applications where an acceptable level of accuracy of generated documents is indispensable, we propose to incorporate two kinds of linguistic features into error detection: lexical features of words, and syntactic features from a robust lexicalized parser. Transformation-based learning is chosen to predict recognition errors by integrating word confidence scores with linguistic features. The experimental results on a dictation data corpus show that linguistic features alone are not as useful as word confidence scores in detecting errors. However, linguistic features provide complementary information when combined with word confidence scores, which collectively reduce the classification error rate by 12.30% and improve the F measure by 53.62%.

A Senone Based Confidence Measure For Speech Recognition

by Z. Bergen, W. Ward - In Proc. of Eurospeech, Rhodes, 1997
Cited by 2 (0 self)
This paper describes three experiments in using frame level observation probabilities as the basis for word confidence annotation in an HMM speech recognition system. One experiment is at the word level, one uses word classes, and the other uses phone classes. In each experiment we categorize hypotheses into correct and incorrect categories by aligning a best recognition hypothesis with the known transcript. The confidence of error prediction for each class is a measure of the resolvability between the correct and incorrect histograms. 1. INTRODUCTION Speech recognition systems generally rank order hypotheses by computing scores for utterance hypotheses. These scores are useful for preference ordering the hypotheses, but do not give a good indication of the quality of the recognition or how confident the system is that the decoding is correct. For applications to act on speech input, they must be able to assess the confidence that the input has been decoded correctly. This work combi...

A Second Opinion Approach For Speech Recognition Verification

by Gustavo Hernández Ábrego, José B. Mariño
Cited by 1 (1 self)
In order to improve the reliability of speech recognition results, a verification system that takes advantage of the information given by an alternative recognition step is proposed. The alternative results are considered as a second opinion about the nature of the speech recognition process. Some features are extracted from both opinion sources and compiled, through a fuzzy inference system, into a more discriminant confidence measure able to verify correct results and disregard wrong ones. This approach is tested in a keyword spotting task taken from the Spanish SpeechDat database. Results show a considerable reduction of false rejections at a fixed false alarm rate compared to baseline systems.