ESTIMATING CONFIDENCE USING WORD LATTICES
Cited by 52 (3 self)
For many practical applications of speech recognition systems, it is desirable to have an estimate of confidence for each hypothesized word, i.e. to have an estimate of which words of the speech recognizer's output are likely to be correct and which are not reliable. Many of today's speech recognition systems use word lattices as a compact representation of a set of alternative hypotheses. We exploit the use of such word lattices as information sources for the measure-of-confidence tagger JANKA [1]. In experiments on spontaneous human-to-human speech data the use of word lattice related information significantly improves the tagging accuracy.
Modeling Out-Of-Vocabulary Words For Robust Speech Recognition
, 2000
Cited by 43 (5 self)
This thesis concerns the problem of unknown or out-of-vocabulary (OOV) words in continuous speech recognition. Most of today's state-of-the-art speech recognition systems can recognize only words that belong to some predefined finite word vocabulary. When encountering an OOV word, a speech recognizer erroneously substitutes the OOV word with a similarly sounding word from its vocabulary. Furthermore, a recognition error due to an OOV word tends to spread errors into neighboring words, dramatically degrading overall recognition performance.
Recognition confidence scoring and its use in speech understanding systems
- Computer Speech and Language
, 2002
Cited by 42 (4 self)
In this paper we present an approach to recognition confidence scoring and a method for integrating confidence scores into the understanding and dialogue components of a speech understanding system. The system uses a multi-tiered approach where confidence scores are computed at the phonetic, word, and utterance levels. The scores are produced by extracting confidence features from the computation of the recognition hypotheses and processing these features using an accept/reject classifier for word and utterance hypotheses. The output of the confidence classifiers can then be incorporated into the parsing mechanism of the language understanding component. To evaluate the system, experiments were conducted using the JUPITER weather information system. Evaluation was performed at the understanding level using key-value pair concept error rate as the evaluation metric. When confidence scores were integrated into the understanding component of the system, the concept error rate was reduced by over 35%.
Confidence Measures for Hybrid HMM/ANN Speech Recognition
- In Proceedings of EuroSpeech
, 1997
Cited by 26 (6 self)
In this paper we introduce four acoustic confidence measures which are derived from the output of a hybrid HMM/ANN large vocabulary continuous speech recognition system. These confidence measures, based on local posterior probability estimates computed by an ANN, are evaluated at both phone and word levels, using the North American Business News corpus. 1. INTRODUCTION A reliable measure of the confidence of a speech recogniser's output is useful in many circumstances. A word may be hypothesised with low confidence when an out-of-vocabulary (OOV) word is encountered or when the word model is matched against unclear acoustics caused by disfluencies or noise. Both OOV words and unclear acoustics are a major source of recogniser error. A confidence measure based on can be used to reject those hypotheses which are likely to be erroneous (i.e., have a low confidence) in a hypothesis test. Additionally, a reliable confidence measure may be of practical use in recognition search (confidence ...
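The abstract above does not give the formulas for its four measures, so the following is only a minimal sketch of one plausible posterior-based measure: the duration-normalised log posterior of a phone segment, averaged again to obtain a word-level score. The function names, the probability floor, and the rejection threshold are illustrative assumptions, not the authors' definitions.

```python
import math


def segment_confidence(frame_posteriors, floor=1e-10):
    """Mean log posterior over the frames of one phone segment.

    frame_posteriors: per-frame posterior probabilities of the hypothesised
    phone (assumed to come from the acoustic model). `floor` is an assumed
    guard against log(0).
    """
    if not frame_posteriors:
        raise ValueError("segment must contain at least one frame")
    total = sum(math.log(max(p, floor)) for p in frame_posteriors)
    return total / len(frame_posteriors)


def word_confidence(phone_segments):
    """Word-level score as the mean of its phone-level scores (assumption)."""
    if not phone_segments:
        raise ValueError("word must contain at least one phone segment")
    scores = [segment_confidence(seg) for seg in phone_segments]
    return sum(scores) / len(scores)


def reject(confidence, threshold):
    """Hypothesis test: reject the hypothesis when confidence is too low."""
    return confidence < threshold
```

Normalising by segment duration keeps long and short phones comparable; the threshold would have to be tuned on held-out data.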
Confidence Measures From Local Posterior Probability Estimates
- Computer Speech and Language
, 1999
Cited by 18 (6 self)
In this paper we introduce a set of related confidence measures for large vocabulary continuous speech recognition (LVCSR) based on local phone posterior probability estimates output by an acceptor HMM acoustic model. In addition to their computational efficiency, these confidence measures are attractive as they may be applied at the state-, phone-, word- or utterance-levels, potentially enabling discrimination between different causes of low confidence recognizer output, such as unclear acoustics or mismatched pronunciation models. We have evaluated these confidence measures for utterance verification using a number of different metrics. Experiments reveal several trends in `profitability of rejection', as measured by the unconditional error rate of a hypothesis test. These trends suggest that crude pronunciation models can mask the relatively subtle reductions in confidence caused by out-of-vocabulary (OOV) words and disfluencies, but not the gross model mismatches elicited by non-sp...
Integration of Continuous Speech Recognition and Information Retrieval for Mutually Optimal Performance
- COMPUTER SCIENCE DEPARTMENT, CARNEGIE MELLON UNIVERSITY. HTTP://WWW.CS.CMU.EDU/~MSIEGLER/PUBLISH/PHD/THESIS.PS.GZ SINGHAL
, 1999
Cited by 15 (1 self)
Traditionally, indexing and searching of speech content in multimedia databases have been achieved through a combination of separately constructed speech recognition and information retrieval engines. Although each technology has a legacy of research, only recently have efforts been made to study the potential suboptimality of this strategy, and none of these efforts specifically addresses the presence of uncertainty in automatically generated transcriptions. This research develops a refinement of the most common information retrieval relevance formula, TFIDF, to incorporate uncertainty as a retrieval feature, along with a set of techniques to acquire this uncertainty from multiple hypotheses produced by existing speech recognition data structures. In the process a greater amount of evidence is extracted than is available in the most likely transcription hypothesis, and overall retrieval precision and recall are improved. The term weighting scheme known as the inverse document frequenc...
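One way to read "incorporating uncertainty as a retrieval feature" is to replace the raw term count in TFIDF with an expected count taken over several recognition hypotheses. The sketch below assumes an n-best list with one posterior weight per hypothesis; the data layout and function names are hypothetical, and the thesis itself may use a different formulation.

```python
import math
from collections import Counter, defaultdict


def expected_term_counts(hypotheses):
    """Expected term frequency over weighted hypotheses.

    hypotheses: list of (weight, words) pairs, where weight is an assumed
    posterior-like score for that transcription. Weights are renormalised
    so that they sum to one.
    """
    total = sum(weight for weight, _ in hypotheses)
    if total <= 0:
        raise ValueError("hypothesis weights must sum to a positive value")
    expected = defaultdict(float)
    for weight, words in hypotheses:
        for term, count in Counter(words).items():
            expected[term] += (weight / total) * count
    return dict(expected)


def tfidf(expected_tf, doc_freq, num_docs):
    """Weight each term's expected frequency by log inverse document frequency."""
    return {
        term: tf * math.log(num_docs / doc_freq[term])
        for term, tf in expected_tf.items()
        if doc_freq.get(term)  # skip terms unseen in the collection
    }
```

A term that appears only in a low-weight hypothesis still contributes evidence, but less than a term shared by every hypothesis.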
A Boosting Approach for Confidence Scoring
, 2001
Cited by 14 (0 self)
In this paper we present the application of a boosting classification algorithm to confidence scoring. We derive feature vectors from speech recognition lattices and feed them into a boosting classifier. This classifier combines hundreds of very simple `weak learners' and derives classification rules that can reduce the confidence error rate by up to 34%. We compare our results to those obtained using two other standard classification techniques, Support Vector Machines (SVMs) and Classification and Regression Trees (CART), and show significant improvements. Furthermore, the nature of the boosting algorithm allows us to combine the best single classifier and improve its performance.
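The abstract does not name the exact boosting variant, so the following is a generic AdaBoost-style sketch with threshold "stumps" as the weak learners. The feature vectors stand in for lattice-derived features and are purely hypothetical; labels are +1 (word correct) and -1 (word incorrect).

```python
import math


def train_adaboost(X, y, rounds):
    """Train AdaBoost with one-feature threshold stumps.

    X: list of feature vectors, y: labels in {-1, +1}.
    Returns a list of (alpha, feature_index, threshold, polarity) tuples.
    """
    n = len(X)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        best = None
        # Exhaustively search for the stump with the lowest weighted error.
        for f in range(len(X[0])):
            for thr in sorted({x[f] for x in X}):
                for pol in (1, -1):
                    preds = [pol if x[f] >= thr else -pol for x in X]
                    err = sum(w for w, p, t in zip(weights, preds, y) if p != t)
                    if best is None or err < best[0]:
                        best = (err, f, thr, pol, preds)
        err, f, thr, pol, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, f, thr, pol))
        # Increase the weight of misclassified examples, then renormalise.
        weights = [w * math.exp(-alpha * t * p) for w, t, p in zip(weights, y, preds)]
        z = sum(weights)
        weights = [w / z for w in weights]
    return model


def predict(model, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (pol if x[f] >= thr else -pol) for a, f, thr, pol in model)
    return 1 if score >= 0 else -1
```

The weighted vote in `predict` can also be read as an unnormalised confidence score, which is one reason boosting fits the confidence-scoring task.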
Désilets A: Semantic similarity for detecting recognition errors in automatic speech transcripts
- In: Proceedings of EMNLP. Association for Computational Linguistics
, 2005
Cited by 6 (2 self)
Browsing through large volumes of spoken audio is known to be a challenging task for end users. One way to alleviate this problem is to allow users to gist a spoken audio document by glancing over a transcript generated through Automatic Speech Recognition. Unfortunately, such transcripts typically contain many recognition errors which are highly distracting and make gisting more difficult. In this paper we present an approach that detects recognition errors by identifying words which are semantic outliers with respect to other words in the transcript. We describe several variants of this approach. We investigate a wide range of evaluation measures and we show that we can significantly reduce the number of errors in content words, with the trade-off of losing some good content words.
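The idea of a semantic outlier can be illustrated with a small sketch: score each word by its average similarity to the other words in the transcript and flag the words that fall below a threshold. The abstract does not say which similarity measure is used, so cosine similarity over toy word vectors is an assumption here, as are the function names and the threshold.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def semantic_outliers(words, vectors, threshold):
    """Return the words whose mean similarity to the other words is low.

    words: content words of one transcript.
    vectors: mapping from word to a (hypothetical) semantic vector.
    threshold: assumed cut-off on the mean similarity.
    """
    flagged = []
    for i, word in enumerate(words):
        others = [o for j, o in enumerate(words) if j != i]
        if not others:
            continue
        coherence = sum(cosine(vectors[word], vectors[o]) for o in others) / len(others)
        if coherence < threshold:
            flagged.append(word)
    return flagged
```

Raising the threshold removes more errors but also more correct words, which is the trade-off the abstract describes.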
Confidence and Rejection in Automatic Speech Recognition
, 1997
Cited by 1 (0 self)
1 Introduction; 1.1 Research Goals; 1.2 Male/Female Versus Last Names; 1.3 Scaling Up: 58 Phrases; 1.4 Vocabulary Independence; 1.5 Thesis Overview; 1.6 Tutorial on Automatic Speech Recognition; 1.6.1 A Setting for Automatic Speech Recognition; 1.6.2 Overview of Speech Recognition; 1.6.3 Artificial Neural Network; 1.6.4 Context-Dependent Modeling ...
A Probabilistic Model for Fast and Confident Categorisation of Textual Documents
- Survey of Text Mining: Clustering, Classification, and Retrieval, Second Edition
, 2007
Cited by 1 (0 self)
Detection/Text Mining competition organised at the Text Mining Workshop 2007. This entry relies on a straightforward implementation of a probabilistic categoriser described earlier [4]. This categoriser is adapted to handle multiple labelling and a piecewise-linear confidence estimation layer is added to provide an estimate of the labelling confidence. This technique achieves a score of 1.689 on the test data. This model has potentially useful features and extensions such as the use of a category-specific
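A piecewise-linear confidence layer of the kind mentioned above can be sketched as linear interpolation between a few (score, confidence) knots. The knot values below are invented for illustration, and the abstract does not say how the knots are placed or fitted.

```python
from bisect import bisect_right


def piecewise_linear(score, knots):
    """Map a raw categoriser score to a confidence value.

    knots: list of (score, confidence) pairs with strictly increasing
    scores (hypothetical calibration points). Scores outside the knot
    range are clamped to the first or last confidence value.
    """
    xs = [x for x, _ in knots]
    if score <= xs[0]:
        return knots[0][1]
    if score >= xs[-1]:
        return knots[-1][1]
    i = bisect_right(xs, score)  # index of the first knot above `score`
    (x0, y0), (x1, y1) = knots[i - 1], knots[i]
    return y0 + (y1 - y0) * (score - x0) / (x1 - x0)
```

For example, with knots `[(0.0, 0.0), (0.5, 0.2), (1.0, 1.0)]`, a raw score of 0.75 lies halfway along the second piece and maps to a value halfway between 0.2 and 1.0.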

