Towards Better Integration Of Semantic Predictors In Statistical Language Modeling
- In Proceedings of ICSLP-98, 1998
Cited by 37 (0 self)
We introduce a number of techniques designed to help integrate semantic knowledge with N-gram language models for automatic speech recognition. Our techniques allow us to integrate Latent Semantic Analysis (LSA), a word-similarity algorithm based on word co-occurrence information, with N-gram models. While LSA is good at predicting content words that are coherent with the rest of a text, it is a poor predictor of frequent words, has a low dynamic range, and is inaccurate when combined linearly with N-grams. We show that modifying the dynamic range, applying a per-word confidence metric, and using geometric rather than linear combinations with N-grams produce a more robust language model, which has a lower perplexity on a Wall Street Journal test set than a baseline N-gram model. 1. INTRODUCTION There has been a lot of recent work on augmenting n-gram language models with other information sources such as longer-distance syntactic and semantic constraints (e.g. [8], [6]). In previous ...
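The contrast between linear and geometric combination mentioned in this abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the fixed weight `lam`, the function names, and the toy probabilities are all assumptions made for illustration, and the per-word confidence metric and dynamic-range adjustment are omitted.

```python
def linear_interpolate(p_ngram, p_lsa, lam):
    """Weighted arithmetic mean of two distributions over the same vocabulary."""
    return {w: lam * p_ngram[w] + (1.0 - lam) * p_lsa[w] for w in p_ngram}


def geometric_interpolate(p_ngram, p_lsa, lam):
    """Weighted geometric mean, renormalized so the result sums to 1."""
    unnormalized = {
        w: (p_ngram[w] ** lam) * (p_lsa[w] ** (1.0 - lam)) for w in p_ngram
    }
    z = sum(unnormalized.values())
    return {w: v / z for w, v in unnormalized.items()}


# Toy next-word distributions (invented numbers, not from the paper).
p_ngram = {"the": 0.6, "stock": 0.3, "banana": 0.1}
p_lsa = {"the": 0.2, "stock": 0.7, "banana": 0.1}

linear = linear_interpolate(p_ngram, p_lsa, lam=0.5)
geometric = geometric_interpolate(p_ngram, p_lsa, lam=0.5)
```

Unlike the linear form, the geometric form needs an explicit renormalization over the vocabulary, because a product of powers of two distributions does not sum to 1 on its own.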
Rapid language model development for new task domains
- Proc. First International Conference on Language Resources and Evaluation (LREC), 1998
Cited by 16 (6 self)
Data sparseness has been regularly indicted as the primary problem in statistical language modelling. We go one step further to consider the situation when no text data is available for the target domain. We present two techniques for building efficient language models quickly for new domains. The first technique is based on using a context-free grammar to generate a corpus of word collocations. The second is an adaptation technique based on using out-of-domain corpora to estimate target domain language models. We report results of successfully using these two techniques individually and in combination to build efficient models for a spontaneous speech recognition task in a medium-sized vocabulary domain.
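The first technique, generating text from a context-free grammar when no in-domain data exists, might look roughly like the sketch below. The grammar, its rules, and the function names are hypothetical and were chosen only to illustrate the idea; the abstract does not specify the grammar or the sampling procedure the authors used.

```python
import random

# Hypothetical toy grammar: keys are non-terminals, values are lists of
# alternative expansions. Any symbol that is not a key is a terminal word.
GRAMMAR = {
    "S": [["show", "FLIGHTS"], ["list", "FLIGHTS"]],
    "FLIGHTS": [
        ["flights", "to", "CITY"],
        ["flights", "from", "CITY", "to", "CITY"],
    ],
    "CITY": [["boston"], ["denver"]],
}


def generate(symbol, rng):
    """Expand a symbol into a list of terminal words by random rule choice."""
    if symbol not in GRAMMAR:
        return [symbol]
    words = []
    for child in rng.choice(GRAMMAR[symbol]):
        words.extend(generate(child, rng))
    return words


def generate_corpus(n_sentences, seed=0):
    """Sample sentences from the grammar to form an artificial corpus."""
    rng = random.Random(seed)
    return [" ".join(generate("S", rng)) for _ in range(n_sentences)]


corpus = generate_corpus(5)
```

A corpus sampled this way could then be fed to an ordinary N-gram estimator in place of real target-domain text.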
Techniques for modelling Phonological Processes in Automatic Speech Recognition
- 2001
Cited by 6 (0 self)
Declaration This dissertation is the result of my own work and includes nothing which is the outcome of work done in collaboration, except where stated. It has not been submitted in whole or part for a degree at any other university. The length of this thesis including footnotes and appendices does not exceed 29,500 words and includes no more than 40 figures. Systems which automatically transcribe carefully dictated speech are now commercially available, but their performance degrades dramatically when the speaking style of users becomes more relaxed or conversational. This dissertation focuses on techniques that aim to improve the robustness of statistical speech transcription systems to conversational speaking styles. The dissertation shows first that the performance degradation occurring as speech becomes more conversational is severe and is partially attributable to differences in the acoustic realizations of sentences. Hypothesizing that the quantifiably wider range of
Discourse Mixture Language Modeling
- 2000
Conversational speech recognition is a very challenging task due to the large amount of variability compared to read speech and the corresponding lack of training data. Where sources of variability are systematic, however, recognition performance can be improved by modifying the structure of the language model and/or the acoustic model, the two main components of a speech recognizer. The focus of this thesis is on incorporating the discourse structure of conversational speech into a language model using mixture distributions. We extend previous work in this area with improved estimation techniques that use clustering to reduce model order, class-based smoothing techniques, and a new strategy for unsupervised training to use additional unlabeled data. In addition, we introduce unsupervised dynamic cache adaptation in order to capture topic changes as well as discourse dynamics. Experimental results on the Switchboard corpus show that discourse mixtures give better results than topic mixtures, with the best discourse mixture model giving a 1.9% reduction in word error rate over a trigram language model. Further gains are achieved by adding a dynamic cache.
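As a rough illustration of a mixture language model combined with a dynamic cache, the sketch below uses unigram components for brevity. It is not the thesis's model: the component names, mixture weights, cache weight `mu`, and all probabilities are assumptions, and the thesis works with trigram-based components and its own estimation and adaptation procedures.

```python
from collections import Counter, deque


def mixture_prob(word, components, weights):
    """P(word) as a weighted sum over component distributions."""
    return sum(wt * comp.get(word, 0.0) for comp, wt in zip(components, weights))


class UnigramCache:
    """Relative frequencies over a sliding window of recently seen words."""

    def __init__(self, size):
        self.words = deque(maxlen=size)

    def add(self, word):
        self.words.append(word)

    def prob(self, word):
        if not self.words:
            return 0.0
        return Counter(self.words)[word] / len(self.words)


def cached_mixture_prob(word, components, weights, cache, mu):
    """Interpolate the mixture with the cache; fall back if the cache is empty."""
    base = mixture_prob(word, components, weights)
    if not cache.words:
        return base
    return (1.0 - mu) * base + mu * cache.prob(word)


# Hypothetical components for two discourse types (invented numbers).
statement = {"yeah": 0.1, "i": 0.4, "think": 0.5}
backchannel = {"yeah": 0.8, "i": 0.1, "think": 0.1}
components, weights = [statement, backchannel], [0.7, 0.3]

cache = UnigramCache(size=100)
for w in ["yeah", "yeah", "think"]:
    cache.add(w)

p_base = mixture_prob("yeah", components, weights)
p_cached = cached_mixture_prob("yeah", components, weights, cache, mu=0.2)
```

In this toy setup the cache raises the probability of "yeah" because it occurred recently, which is the intuition behind using a cache to follow topic and discourse shifts.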

