Confidence Measures for Large Vocabulary Continuous Speech Recognition
- IEEE Transactions on Speech and Audio Processing
, 2001
Cited by 70 (7 self)
In this paper, we present several confidence measures for large vocabulary continuous speech recognition. We propose to estimate the confidence of a hypothesized word directly as its posterior probability, given all acoustic observations of the utterance. These probabilities are computed on word graphs using a forward-backward algorithm. We also study the estimation of posterior probabilities on N-best lists instead of word graphs and compare both algorithms in detail. In addition, we compare the posterior probabilities with two alternative confidence measures, i.e., the acoustic stability and the hypothesis density. We present experimental results on five different corpora: the Dutch ARISE 1k evaluation corpus, the German Verbmobil '98 7k evaluation corpus, the English North American Business '94 20k and 64k development corpora, and the English Broadcast News '96 65k evaluation corpus. We show that the posterior probabilities computed on word graphs outperform all other confidence measures. The relative reduction in confidence error rate ranges between 19% and 35% compared to the baseline confidence error rate.
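The forward-backward computation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the edge format `(from_node, to_node, word, log_score)`, the toy graph, and the assumption that node numbers are already in topological order are all hypothetical, and the sketch stops at per-edge posteriors without any further combination over overlapping hypotheses of the same word.

```python
import math

NEG_INF = float("-inf")


def log_add(a, b):
    """Return log(exp(a) + exp(b)) without overflow."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def edge_posteriors(edges, start, end):
    """Posterior probability of each edge in an acyclic word graph.

    Assumes integer node ids numbered in topological order, so that
    every edge goes from a smaller id to a larger one.
    """
    # Forward pass: log of the summed score of all partial paths into a node.
    alpha = {start: 0.0}
    for u, v, _word, s in sorted(edges, key=lambda e: e[0]):
        alpha[v] = log_add(alpha.get(v, NEG_INF), alpha.get(u, NEG_INF) + s)

    # Backward pass: log of the summed score of all partial paths out of a node.
    beta = {end: 0.0}
    for u, v, _word, s in sorted(edges, key=lambda e: e[1], reverse=True):
        beta[u] = log_add(beta.get(u, NEG_INF), s + beta.get(v, NEG_INF))

    total = alpha[end]  # log of the summed score of all complete paths
    return [
        math.exp(alpha.get(u, NEG_INF) + s + beta.get(v, NEG_INF) - total)
        for u, v, _word, s in edges
    ]


# Hypothetical toy graph with three competing paths from node 0 to node 2.
graph = [
    (0, 1, "hello", math.log(0.5)),
    (0, 1, "yellow", math.log(0.3)),
    (1, 2, "world", math.log(0.4)),
    (0, 2, "helloworld", math.log(0.1)),
]
posteriors = edge_posteriors(graph, start=0, end=2)
```

Each returned value is the share of the total path score that passes through the corresponding edge, so the posteriors of edges that compete for the same stretch of the utterance sum to one.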
Confidence Estimation for Machine Translation
- IN M. ROLLINS (ED.), MENTAL IMAGERY
, 2004
Connectionist speech recognition of Broadcast News
, 2002
Cited by 28 (10 self)
This paper describes connectionist techniques for recognition of Broadcast News. The fundamental difference between connectionist systems and more conventional mixture-of-Gaussian systems is that connectionist models directly estimate posterior probabilities as opposed to likelihoods. Access to posterior probabilities has enabled us to develop a number of novel approaches to confidence estimation, pronunciation modelling and search. In addition we have investigated a new feature extraction technique based on the modulation-filtered spectrogram (MSG), and methods for combining multiple information sources. We have incorporated all of these techniques into a system for the transcription
Confidence Measures for Hybrid HMM/ANN Speech Recognition
- In Proceedings of EuroSpeech
, 1997
Cited by 26 (6 self)
In this paper we introduce four acoustic confidence measures which are derived from the output of a hybrid HMM/ANN large vocabulary continuous speech recognition system. These confidence measures, based on local posterior probability estimates computed by an ANN, are evaluated at both phone and word levels, using the North American Business News corpus.
1. INTRODUCTION
A reliable measure of the confidence of a speech recogniser's output is useful in many circumstances. A word may be hypothesised with low confidence when an out-of-vocabulary (OOV) word is encountered or when the word model is matched against unclear acoustics caused by disfluencies or noise. Both OOV words and unclear acoustics are a major source of recogniser error. A confidence measure based on can be used to reject those hypotheses which are likely to be erroneous (i.e., have a low confidence) in a hypothesis test. Additionally, a reliable confidence measure may be of practical use in recognition search (confidence ...
Integration Of Utterance Verification With Statistical Language Modeling And Spoken Language Understanding
, 1998
Cited by 8 (1 self)
Methods for utterance verification (UV) and their integration into statistical language modeling and spoken language understanding formalisms for a large vocabulary spoken understanding system are presented. The paper consists of three parts. First, a set of acoustic likelihood ratio based utterance verification techniques are described and applied to the problem of rejecting portions of a hypothesized word string that may have been incorrectly decoded by a large vocabulary continuous speech recognizer. Second, a procedure for integrating the acoustic level confidence measures with the statistical language model is described. Finally, the effect of integrating acoustic level confidence into the spoken language understanding unit (SLU) in a call-type classification task is discussed. These techniques were evaluated on utterances collected from a highly unconstrained call routing task performed over the telephone network. They have been evaluated in terms of their ability to classify u...
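A likelihood ratio test of the general kind named in this abstract can be sketched as follows. The abstract does not give the exact formulation, so the frame normalization, the default threshold, and the function names below are assumptions chosen for illustration only.

```python
def frame_normalized_llr(target_loglik, alternative_loglik, num_frames):
    """Log-likelihood ratio between the hypothesized word's model and an
    alternative (filler or anti-model) score, averaged per frame.

    Dividing by the segment length is an assumption of this sketch; it
    keeps long and short segments on a comparable scale.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    return (target_loglik - alternative_loglik) / num_frames


def accept_word(target_loglik, alternative_loglik, num_frames, threshold=0.0):
    """Accept the hypothesized word when the ratio reaches the threshold."""
    llr = frame_normalized_llr(target_loglik, alternative_loglik, num_frames)
    return llr >= threshold
```

Raising the hypothetical `threshold` rejects more hypothesized words, trading false acceptances for false rejections.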
Use of Acoustic Prior Information for Confidence Measure in ASR Applications
, 2001
Cited by 6 (2 self)
In this paper, we propose a new acoustic confidence measure of ASR hypothesis and compare it to approaches proposed in the literature. This approach takes into account prior information on the acoustic model performance specific to each phoneme. The new method is tested on two types of recognition errors: the out-of-vocabulary words and the errors due to additive noise. We then propose an efficient way to interpret the raw confidence measure as a correctness prior probability.
Semantic Confidence Measurement for Spoken Dialogue Systems
- IEEE Trans. on SAP
, 2005
Cited by 5 (0 self)
This paper proposes two methods to incorporate semantic information into word and concept level confidence measurement. The first method uses tag and extension probabilities obtained from a statistical classer and parser. The second method uses a maximum entropy based semantic structured language model to assign probabilities to each word. Incorporation of semantic features into a lattice posterior probability based confidence measure provides significant improvements compared to posterior probability when used together in an air travel reservation task. At 5% False Alarm (FA) rate relative improvements of 28% and 61% in Correct Acceptance (CA) rate are achieved for word level and concept level confidence measurements, respectively.
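The operating point quoted above (CA rate at a fixed FA rate) can be computed from scored hypotheses roughly as sketched here. The definitions are assumptions, since the abstract does not spell them out: FA rate is taken as the fraction of incorrect items that are accepted, CA rate as the fraction of correct items that are accepted, and an item is accepted when its score is at or above the threshold.

```python
def ca_rate_at_fa(scores, labels, max_fa_rate):
    """Best correct-acceptance rate among thresholds whose false-alarm
    rate does not exceed max_fa_rate.

    scores: confidence scores, higher meaning more confident.
    labels: True where the hypothesis was actually correct.
    """
    correct = [s for s, ok in zip(scores, labels) if ok]
    incorrect = [s for s, ok in zip(scores, labels) if not ok]
    if not correct or not incorrect:
        raise ValueError("need at least one correct and one incorrect item")

    best_ca = 0.0
    # Only thresholds equal to an observed score can change the decisions.
    for threshold in sorted(set(scores)):
        fa = sum(s >= threshold for s in incorrect) / len(incorrect)
        if fa <= max_fa_rate:
            ca = sum(s >= threshold for s in correct) / len(correct)
            best_ca = max(best_ca, ca)
    return best_ca


# Hypothetical scores and correctness labels for six hypotheses.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
example_labels = [True, True, False, True, False, False]
ca_at_zero_fa = ca_rate_at_fa(example_scores, example_labels, 0.0)
```

A relative improvement such as the ones reported would then compare this CA value between two confidence measures at the same FA limit.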
Real-time word confidence scoring using local posterior probabilities on tree trellis search
- In Proc. ICASSP, volume 1
, 2004
Cited by 2 (1 self)
Confidence scoring based on word posterior probability is usually performed as a post-process of speech recognition decoding, and it needs a large number of word hypotheses to reach adequate confidence quality. We propose a simple way of computing the word confidence using estimated posterior probability while decoding. At the word expansion of stack decoding search, the local sentence likelihoods, which contain heuristic scores of the unreached segment, are directly used to compute the posterior probabilities. Experimental results showed that, although the likelihoods are not optimal, they can provide slightly better confidence measures compared with N-best lists, while the computation is faster than the 100-best method because no N-best decoding is required.
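For orientation, the N-best baseline that this abstract compares against can be sketched as below. This is a simplified illustration, not the paper's method: it assumes each hypothesis is a tuple of words with a log score, and it ignores word positions and time alignment, so a word counts once per hypothesis that contains it anywhere.

```python
import math


def nbest_word_posteriors(nbest):
    """Approximate word posteriors from an N-best list.

    nbest: list of (words, log_score) pairs, where words is a tuple of
    strings. Returns a dict mapping each word to the normalized score
    mass of the hypotheses that contain it.
    """
    if not nbest:
        raise ValueError("nbest must not be empty")

    # Subtract the best log score before exponentiating to avoid underflow.
    best = max(log_score for _words, log_score in nbest)
    weights = [math.exp(log_score - best) for _words, log_score in nbest]
    total = sum(weights)

    posteriors = {}
    for (words, _log_score), weight in zip(nbest, weights):
        for word in set(words):  # count a word once per hypothesis
            posteriors[word] = posteriors.get(word, 0.0) + weight / total
    return posteriors


# Hypothetical 3-best list with made-up scores.
example_nbest = [
    (("a", "b"), math.log(0.5)),
    (("a", "c"), math.log(0.3)),
    (("d", "c"), math.log(0.2)),
]
word_post = nbest_word_posteriors(example_nbest)
```

The quality of such estimates depends on how many hypotheses the list holds, which is the cost the abstract's decoding-time approach aims to avoid.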
Confidence Estimation for Machine Translation
Cited by 1 (0 self)
We present a detailed study of confidence estimation for machine translation. Various methods for determining whether MT output is correct are investigated, for both whole sentences and words. Since the notion of correctness is not intuitively clear in this context, different ways of defining it are proposed. We present results on data from the NIST 2003 Chinese-to-English MT evaluation.
CLASSIFIER DESIGN FOR VERIFICATION OF MULTI-CLASS RECOGNITION DECISION
This paper investigates a 2-class classifier approach with the aim of improving the word verification performance. The classifier operates on a discriminant function which is a linear combination of the smoothed likelihood ratios for the N-best candidates and the background (BG) and out-of-vocabulary (OOV) filler models, and is optimized using discriminative training to minimize the classification error. This paper discusses several strategies involving the likelihood ratio based formulation and the use of N-best candidates and the BG and OOV models in the classifier. In word verification experiments using a connected-digit database containing utterances recorded in a moving car with a hands-free microphone, the likelihood ratio based formulation achieved a relative error reduction of 35% in comparison with a likelihood based formulation. In addition, we observed that the use of N-best candidates and the BG and OOV models improved the performance with a relative error reduction of roughly 10%.

