Shrinking exponential language models (2009)
| Venue: | In Proc. of HLT-NAACL |
| Citations: | 8 - 2 self |
BibTeX
@INPROCEEDINGS{Chen09shrinkingexponential,
author = {Stanley F. Chen},
title = {Shrinking exponential language models},
booktitle = {Proc. of HLT-NAACL},
year = {2009}
}
Abstract
In (Chen, 2009), we show that for a variety of language models belonging to the exponential family, the test set cross-entropy of a model can be accurately predicted from its training set cross-entropy and its parameter values. In this work, we show how this relationship can be used to motivate two heuristics for “shrinking” the size of a language model to improve its performance. We use the first heuristic to develop a novel class-based language model that outperforms a baseline word trigram model by 28% in perplexity and 1.9% absolute in speech recognition word-error rate on Wall Street Journal data. We use the second heuristic to motivate a regularized version of minimum discrimination information models and show that this method outperforms other techniques for domain adaptation.







