Results 1 - 2 of 2
Language Modeling With Sentence-Level Mixtures, 1994
Abstract - Cited by 23 (1 self)
Language models play an important role in improving the accuracy of a continuous speech recognizer. In this thesis, we introduce a new statistical language model which captures long-term topic dependencies of words within and across sentences. The model includes two main contributions. First, we develop a topic-dependent sentence-level mixture language model which takes advantage of the topic constraints in a sentence or a paragraph. Since this language model is not Markov and has a large search space, it is used only in the last stage of a multi-pass search strategy in the recognizer. Second, we introduce topic-dependent dynamic adaptation techniques in the framework of the mixture model. During the course of this thesis, we also investigate robust parameter estimation techniques, which are extremely important in light of the sparse data problems in language modeling. The model is implemented in the BU speech recognition system and provides a significant improvement in recognition accuracy. An important advantage of the framework of our model is that it is a simple extension of existing language modeling techniques that can easily be integrated with other language modeling advances.
Language Modeling with Sentence-Level Mixtures
Abstract
This paper introduces a simple mixture language model that attempts to capture long distance constraints in a sentence or paragraph. The model is an m-component mixture of trigram models. The models were constructed using a 5K vocabulary and trained using a 76 million word Wall Street Journal text corpus. Using the BU recognition system, experiments show a 7% improvement in recognition accuracy with the mixture trigram models as compared to using a trigram model.
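The abstract does not give the model's formula. As a rough illustration only, an m-component sentence-level mixture of trigram models is commonly written as P(w_1..w_T) = sum_k lambda_k * prod_i P_k(w_i | w_{i-2}, w_{i-1}), with the weights applied once per sentence rather than once per word. The Python sketch below scores a sentence under that assumed form. The function name, the `<s>`/`</s>` padding symbols, and the toy component models are hypothetical and are not taken from the paper or from the BU recognition system.

```python
import math

def sentence_logprob_mixture(sentence, weights, component_models):
    """Log-probability of a sentence under a sentence-level mixture of trigram models.

    Assumed form: P(sentence) = sum_k weights[k] * prod_i P_k(w_i | w_{i-2}, w_{i-1}).
    Each entry of component_models is a callable (w2, w1, w) -> probability > 0.
    Weights are assumed to be strictly positive and to sum to 1.
    """
    # Hypothetical boundary symbols so the first words have a two-word history.
    padded = ["<s>", "<s>"] + list(sentence) + ["</s>"]
    per_component = []
    for lam, trigram_prob in zip(weights, component_models):
        logp = math.log(lam)
        for i in range(2, len(padded)):
            logp += math.log(trigram_prob(padded[i - 2], padded[i - 1], padded[i]))
        per_component.append(logp)
    # Log-sum-exp over components for numerical stability.
    peak = max(per_component)
    return peak + math.log(sum(math.exp(x - peak) for x in per_component))

# Toy stand-ins for topic-dependent trigram models (they ignore the history).
def finance_model(w2, w1, w):
    return 0.5 if w == "stocks" else 0.1

def sports_model(w2, w1, w):
    return 0.5 if w == "goal" else 0.1

logp = sentence_logprob_mixture(["stocks"], [0.5, 0.5], [finance_model, sports_model])
```

Because the sum over components sits outside the product over words, the score does not factor into per-word terms with a fixed-length history, which is consistent with the thesis abstract's remark that the model is not Markov and is applied in the last pass of a multi-pass search.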

