Speech Recognition in Noisy Environments
- Ph. D. Dissertation, ECE Department, CMU
, 1996
Cited by 72 (3 self)
Contents (excerpt):
Acknowledgments
Chapter 1 Introduction
  1.1. Thesis goals
  1.2. Dissertation Outline
Chapter 2 The SPHINX-II Recognition System
  2.1. An Overview of the SPHINX-II System
    2.1.1. Signal Processing
    2.1.2. Hidden Markov Models
    2.1.3. Recognition Unit
    2.1.4. Training
    2.1.5. Recognition
  2.2. Experimental Tasks and Corpora ...
Error-responsive feedback mechanisms for speech recognizers
, 1997
Cited by 37 (4 self)
This thesis is about modeling, analyzing, and predicting errorful behavior in large vocabulary continuous speech recognition systems. Because today's state-of-the-art recognizers are not designed to be situated naturally in an error feedback loop, they are ill-positioned for inclusion in multi-modal interfaces, multi-media databases, and other interesting applications. I make improvements to the current approach to predicting and analyzing error behaviors, which is currently based only on the measurement of word error rate. The speech recognizer's functionality is extended to include confidence annotations, which are "meta-level" markings that indicate how certain the recognizer is that it has decoded its input correctly. This is accomplished by feeding externally defined error conditions back to the recognizer. Error feedback enables the construction of statistical models that map measurements of the recognizer's internal states and behaviors to externally defined error conditions.
Environmental Adaptation for Robust Speech Recognition
, 1994
Cited by 17 (0 self)
Contents (excerpt):
Acknowledgments
Chapter 1 Introduction
  1.1. Approaches to Overcoming Environmental Variability
    1.1.1. Re-Training
    1.1.2. Multi-Style Training
    1.1.3. Environmental Compensation Using Dynamic Adaptation
  1.2. Towards Environment-Independent Recognition
    1.2.1. Sources of Environmental Variability
    1.2.2. Performance Evaluation
  1.3. Dissertation Outline
Chapter 2 Overview of Environmental Robustness in Speech Recognition
  2.1. Sources of Degradation...
Some Results on Search Complexity vs Accuracy
- in DARPA Speech Recognition Workshop
, 1997
Cited by 8 (4 self)
This paper presents three different techniques applied in or developed during the 1996 Hub-4 broadcast news transcription task. First, an efficient shortest path graph search algorithm is applied to the word lattice created by Viterbi search, producing a globally optimum result. This reduces the word error rate by about 3-10% (relative), depending on the test set. The execution time is at or close to real time for most utterances. Second, a segmented N-best list generation algorithm is described for producing compact N-best lists for very long utterances. Finally, a temporal smoothing technique is compared to deleted interpolation. On one test set, temporal smoothing reduces the error rate by 3% for an 8% increase in search cost, while the latter improves by 6% for a 50% increase in search cost. 1. Introduction In this paper we describe the results of a number of search experiments on the 1996 Hub-4 development and evaluation test sets. We have also attempted to document issues that a...
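The shortest-path idea in the abstract above can be illustrated with a minimal sketch. It assumes the word lattice is a directed acyclic graph whose node ids increase with time and whose edges carry a word and a cost (for example, a negative log score). The lattice format, the function name `best_path`, and the toy costs are hypothetical and are not taken from the paper, and the dynamic-programming pass shown here is a generic DAG shortest-path method, not necessarily the authors' algorithm.

```python
from collections import defaultdict

def best_path(edges, start, end):
    """Return (total_cost, words) for the cheapest start->end path, or None.

    edges: iterable of (src, dst, word, cost) with src < dst, so that
    sorting node ids yields a topological order (an assumption of this sketch).
    """
    outgoing = defaultdict(list)
    nodes = {start, end}
    for src, dst, word, cost in edges:
        outgoing[src].append((dst, word, cost))
        nodes.update((src, dst))

    # best[node] = (cheapest cost so far, predecessor node, word on last edge)
    best = {start: (0.0, None, None)}
    for node in sorted(nodes):
        if node not in best:
            continue  # not reachable from start
        base = best[node][0]
        for dst, word, cost in outgoing[node]:
            if dst not in best or base + cost < best[dst][0]:
                best[dst] = (base + cost, node, word)

    if end not in best:
        return None

    # Walk predecessors back from the end node to recover the word sequence.
    words = []
    node = end
    while node != start:
        _, prev, word = best[node]
        words.append(word)
        node = prev
    return best[end][0], words[::-1]

# Hypothetical toy lattice: two competing segmentations and two final words.
LATTICE = [
    (0, 1, "broad", 0.8),
    (1, 2, "cast", 0.9),
    (0, 2, "broadcast", 1.2),
    (2, 3, "news", 0.5),
    (2, 3, "new", 0.7),
]

if __name__ == "__main__":
    print(best_path(LATTICE, 0, 3))
```

Because every edge is relaxed exactly once in topological order, the returned path is optimal over the whole lattice under the stated costs, which is the sense in which a lattice search can be "globally optimum" with respect to the lattice it is given.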
Acoustic-Feature-Based Frequency Warping for Speaker Normalization
, 1998
Cited by 2 (0 self)
Contents (excerpt):
Acknowledgments
Chapter 1 ...
Some Results on Search Complexity vs Accuracy
This paper presents three different techniques applied in or developed during the 1996 Hub-4 broadcast news transcription task. First, an efficient shortest path graph search algorithm is applied to the word lattice created by Viterbi search, producing a globally optimum result. This reduces the word error rate by about 3-10% (relative), depending on the test set. The execution time is at or close to real time for most utterances. Second, a segmented N-best list generation algorithm is described for producing compact N-best lists for very long utterances. Finally, a temporal smoothing technique is compared to deleted interpolation. On one test set, temporal smoothing reduces the error rate by 3% for an 8% increase in search cost, while the latter improves by 6% for a 50% increase in search cost.
Some Results on Search Complexity vs Accuracy
, 1997
This paper presents three different techniques applied in or developed during the 1996 Hub-4 broadcast news transcription task. First, an efficient shortest path graph search algorithm is applied to the word lattice created by Viterbi search, producing a globally optimum result. This reduces the word error rate by about 3-10% (relative), depending on the test set. The execution time is at or close to real time for most utterances. Second, a segmented N-best list generation algorithm is described for producing compact N-best lists for very long utterances. Finally, a temporal smoothing technique is compared to deleted interpolation. On one test set, temporal smoothing reduces the error rate by 3% for an 8% increase in search cost, while the latter improves by 6% for a 50% increase in search cost. 1. Introduction In this paper we describe the results of a number of search experiments on the 1996 Hub-4 development and evaluation test sets. We have also attempted to document issues that a...

