Error-responsive feedback mechanisms for speech recognizers
, 1997
Cited by 37 (4 self)
This thesis is about modeling, analyzing, and predicting errorful behavior in large vocabulary continuous speech recognition systems. Because today's state-of-the-art recognizers are not designed to be situated naturally in an error feedback loop, they are ill-positioned for inclusion in multi-modal interfaces, multi-media databases, and other interesting applications. I make improvements to the current approach to predicting and analyzing error behaviors, which is currently based only on the measurement of word error rate. The speech recognizer's functionality is extended to include confidence annotations, which are "meta-level" markings that indicate how certain the recognizer is that it has decoded its input correctly. This is accomplished by feeding externally defined error conditions back to the recognizer. Error feedback enables the construction of statistical models that map measurements of the recognizer's internal states and behaviors to externally defined error conditions.
Word And Acoustic Confidence Annotation For Large Vocabulary Speech Recognition
Cited by 23 (0 self)
We present improvements in confidence annotation of automatic speech recognizer output for large vocabulary, speaker-independent systems. Several strong additions to the set of predictor variables used for this purpose are discussed. Extensions which allow prediction of separate types of errors, as opposed to the simple presence of an error, are presented. A new development, acoustic confidence annotation, is explored, in which a predictor is built that indicates the likely successes and failures of the acoustic models alone. Four separate learning mechanisms are compared in terms of their ability to provide good confidence annotations from the same set of predictor variables. Performance figures are reported on both read news (the North American Business news corpus) and conversational telephone speech (the Switchboard corpus), both in American English. The Sphinx-II system [1] is used for the NAB tests. The Janus system [2] is used for the Switchboard tests. 1. Annotation of Read Spe...

