Results 1 - 8 of 8
A Maximum Entropy Approach to Adaptive Statistical Language Modeling
- Computer, Speech and Language, 1996
Cited by 201 (11 self)
An adaptive statistical language model is described, which successfully integrates long-distance linguistic information with other knowledge sources. Most existing statistical language models exploit only the immediate history of a text. To extract information from further back in the document's history, we propose and use trigger pairs as the basic information bearing elements. This allows the model to adapt its expectations to the topic of discourse. Next, statistical evidence from multiple sources must be combined. Traditionally, linear interpolation and its variants have been used, but these are shown here to be seriously deficient. Instead, we apply the principle of Maximum Entropy (ME). Each information source gives rise to a set of constraints, to be imposed on the combined estimate. The intersection of these constraints is the set of probability functions which are consistent with all the information sources. The function with the highest entropy within that set is the ME solution...
Optimizing Lexical and N-gram Coverage Via Judicious Use of Linguistic Data
- In Proc. European Conf. on Speech Technology
Cited by 31 (4 self)
I study the effect of various types and amounts of North American Business language data on the quality of the derived vocabulary, and use my findings to derive an improved ranking of the words, using only 19% of the NAB corpus. I then study the conflicting effects of increased vocabulary size on a speech recognizer's accuracy, and use the result to pick an optimal vocabulary size. A similar analysis of N-gram coverage yields a very different outcome, with the best system being the one based on the most data. 1. Vocabulary Optimization 1.1. OOV curve minimization Since Out-Of-Vocabulary (OOV) rate directly affects Word Error Rate, with every OOV word in the test data resulting in at least one (and often more) recognition errors, I set out to minimize the expected OOV rate of the test data. More generally, my goal was to understand how availability of various types and amounts of training data, from various time periods, affects the quality of the derived vocabulary. Given a colle...
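The OOV rate this abstract refers to can be illustrated with a minimal sketch. It assumes the vocabulary is simply the most frequent training words; the improved word ranking mentioned in the abstract is not described here and is not reproduced. The function names and toy sentences are hypothetical.

```python
from collections import Counter


def build_vocabulary(training_tokens, size):
    """Return the `size` most frequent training words as a set.

    Assumption: plain frequency ranking, not the ranking derived in the paper.
    """
    counts = Counter(training_tokens)
    return {word for word, _ in counts.most_common(size)}


def oov_rate(test_tokens, vocabulary):
    """Fraction of test tokens that fall outside the vocabulary."""
    if not test_tokens:
        return 0.0
    misses = sum(1 for token in test_tokens if token not in vocabulary)
    return misses / len(test_tokens)


# Toy data, invented for illustration only.
train = "the market rose as the bank cut rates and the market cheered".split()
test = "the bank cut rates as stocks rose".split()
```

Calling `oov_rate(test, build_vocabulary(train, size))` for increasing `size` traces a curve of OOV rate against vocabulary size, which never rises as the frequency-ranked vocabulary grows.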
Improving Trigram Language Modeling with The World Wide Web
- Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP’01), 2001
Cited by 28 (0 self)
We propose a novel method for using the World Wide Web to acquire trigram estimates for statistical language modeling. We submit an N-gram as a phrase query to web search engines. The search engines return the number of web pages containing the phrase, from which the N-gram count is estimated. The N-gram counts are then used to form web-based trigram probability estimates. We discuss the properties of such estimates, and methods to interpolate them with traditional corpus-based trigram estimates. We show that the interpolated models significantly improve speech recognition word error rate on a small test set.
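As a rough illustration of the idea in this abstract, the sketch below forms a trigram estimate as a ratio of counts and mixes it with a corpus estimate. It makes several assumptions: the counts are already estimated (the abstract does not say how page counts are converted to N-gram counts), the mixing is plain linear interpolation with a fixed weight, and all names and numbers are hypothetical. It is not the paper's method.

```python
def web_trigram_estimate(trigram_count, bigram_count):
    """Relative-frequency estimate P(w3 | w1 w2) from estimated web counts.

    Returns None when the history was never seen, so the caller can
    fall back to the corpus estimate alone.
    """
    if bigram_count <= 0:
        return None
    # Page-derived counts can be inconsistent, so clip to a valid probability.
    return min(1.0, trigram_count / bigram_count)


def interpolate(p_corpus, p_web, weight=0.5):
    """Linear interpolation; `weight` is the share given to the corpus model.

    Assumption: a fixed weight. In practice it would be tuned on held-out data.
    """
    if p_web is None:
        return p_corpus
    return weight * p_corpus + (1.0 - weight) * p_web
```

For example, with hypothetical counts of 30 for the trigram and 120 for its two-word history, `web_trigram_estimate(30, 120)` gives 0.25, which `interpolate` then blends with the corpus probability.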
A Senone Based Confidence Measure For Speech Recognition
- In Proc. of Eurospeech, Rhodes, 1997
Cited by 2 (0 self)
This paper describes three experiments in using frame level observation probabilities as the basis for word confidence annotation in an HMM speech recognition system. One experiment is at the word level, one uses word classes, and the other uses phone classes. In each experiment we categorize hypotheses into correct and incorrect categories by aligning a best recognition hypothesis with the known transcript. The confidence of error prediction for each class is a measure of the resolvability between the correct and incorrect histograms. 1. INTRODUCTION Speech recognition systems generally rank order hypotheses by computing scores for utterance hypotheses. These scores are useful for preference ordering the hypotheses, but do not give a good indication of the quality of the recognition or how confident the system is that the decoding is correct. For applications to act on speech input, they must be able to assess the confidence that the input has been decoded correctly. This work combi...
Confidence and Rejection in Automatic Speech Recognition
- 1997
Cited by 1 (0 self)
Table of contents (excerpt):
1 Introduction
1.1 Research Goals
1.2 Male/Female Versus Last Names
1.3 Scaling Up: 58 Phrases
1.4 Vocabulary Independence
1.5 Thesis Overview
1.6 Tutorial on Automatic Speech Recognition
1.6.1 A Setting for Automatic Speech Recognition
1.6.2 Overview of Speech Recognition
1.6.3 Artificial Neural Network
1.6.4 Context-Dependent Modeling ...
Confidence Metrics Based On N-Gram Language Model Backoff Behaviors
- In Proc. EUROSPEECH, 1997
We report results from using language model confidence measures based on the degree of backoff used in a trigram language model. Both utterance-level and word-level confidence metrics proved useful for a dialog manager to identify out-of-domain utterances. The metric assigns successively lower confidence as the language model estimate is backed off to a bigram or unigram. It also bases its estimates on sequences of backoff degree. Experimental results with utterances from the domain of medical records management showed that the distributions of the confidence metric for in-domain and out-of-domain utterances are separated. Use of the corresponding word-level confidence metric shows similar encouraging results. 1. INTRODUCTION Speech recognition systems typically produce a rank-ordered set of hypotheses using acoustic and language models. When designing a Spoken Language System, it is important to be able to estimate confidence that an utterance has been understood correctly by the syst...
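The backoff idea in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's metric: the score assigned to each backoff degree is hypothetical, the utterance score is a plain average, and the use of sequences of backoff degree mentioned in the abstract is not modeled.

```python
# Hypothetical scores: full trigram match is most trusted, unigram least.
ORDER_SCORE = {3: 1.0, 2: 0.5, 1: 0.0}


def backoff_order(history, word, trigrams, bigrams):
    """Return 3, 2 or 1: the highest N-gram order the model has seen for `word`."""
    if len(history) >= 2 and (history[-2], history[-1], word) in trigrams:
        return 3
    if len(history) >= 1 and (history[-1], word) in bigrams:
        return 2
    return 1


def word_confidences(words, trigrams, bigrams):
    """Word-level confidence: one score per word, based on its backoff degree."""
    scores = []
    for i, word in enumerate(words):
        order = backoff_order(words[:i], word, trigrams, bigrams)
        scores.append(ORDER_SCORE[order])
    return scores


def utterance_confidence(words, trigrams, bigrams):
    """Utterance-level confidence: mean of the word-level scores."""
    scores = word_confidences(words, trigrams, bigrams)
    return sum(scores) / len(scores) if scores else 0.0
```

An utterance whose words are mostly covered by seen trigrams scores near 1.0, while one that forces the model down to unigrams scores near 0.0, which is the separation a dialog manager could threshold on.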
Modeling Systematic Variations in Pronunciation via a Language-Dependent Hidden Speaking Mode
- Hopkins University, 1997
This paper describes the research efforts of the "Hidden Speaking Mode" group participating in the 1996 summer workshop on speech recognition. The goal of this project is to model pronunciation variations that occur in conversational speech in general and, more specifically, to investigate the use of a hidden speaking mode to represent systematic variations that are correlated with the word sequence (e.g. predictable from syntactic structure). This paper describes the theoretical formulation of hidden mode modeling, as well as some results in error analysis, language modeling and pronunciation.
Optimizing Lexical and N-gram Coverage
I study the effect of various types and amounts of North American Business language data on the quality of the derived vocabulary, and use my findings to derive an improved ranking of the words, using only 19% of the NAB corpus. I then study the conflicting effects of increased vocabulary size on a speech recognizer's accuracy, and use the result to pick an optimal vocabulary size. A similar analysis of N-gram coverage yields a very different outcome, with the best system being the one based on the most data.

