@MISC{Post_languagemodeling,
    author = {Matt Post and Daniel Gildea},
    title = {Language Modeling with Tree Substitution Grammars},
    year = {}
}
Abstract
We show that a tree substitution grammar (TSG) induced with a collapsed Gibbs sampler results in lower perplexity on test data than both a standard context-free grammar and other heuristically trained TSGs, suggesting that it is better suited to language modeling. Training a more complicated bilexical parsing model across TSG derivations shows further (though nuanced) improvement. We conduct analysis and point to future areas of research using TSGs as language models.