## Selection Criteria for Word Trigger Pairs in Language Modeling (1996)

Venue: ICGI’96

Citations: 5 (1 self)

### Abstract

In this paper, we study selection criteria for the use of word trigger pairs in statistical language modeling. A word trigger pair is defined as a long-distance word pair. Selecting the most significant trigger pairs requires suitable criteria, which are the topic of this paper. We extend a baseline language model by a single word trigger pair and use the perplexity of this extended language model as the selection criterion. This extension is applied to all possible trigger pairs, the number of which is the square of the vocabulary size. When a unigram language model is used as the baseline, this approach produces the mutual information criterion used in [7, 11]. The more interesting case is to apply this criterion to a more powerful model, such as a bigram/trigram model with a cache. We study different variants for including word trigger pairs in such a language model. This approach produced better word trigger pairs than the usual mutual information criterion. When used on...
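As a rough illustration of the mutual information criterion mentioned in the abstract, the following Python sketch scores a candidate trigger pair (a, b) from two binary events: "a occurs in the recent history" and "the current word is b". The abstract does not give the exact formula used in the paper, so this uses the standard mutual information between two binary events. The fixed-size history window and the function names `count_trigger_events` and `mutual_information` are assumptions made for this example, and the perplexity-based criterion for bigram/trigram baselines is not reproduced here.

```python
import math
from collections import Counter


def count_trigger_events(sentences, window=5):
    """Collect the counts needed to score trigger pairs.

    For every word position, the history is the set of distinct words in
    the preceding `window` tokens (an assumed, simplified history).
    """
    n_positions = 0
    hist_count = Counter()   # positions where word a is in the history
    word_count = Counter()   # positions where the current word is b
    pair_count = Counter()   # positions where both events hold
    for tokens in sentences:
        for i, b in enumerate(tokens):
            history = set(tokens[max(0, i - window):i])
            n_positions += 1
            word_count[b] += 1
            for a in history:
                hist_count[a] += 1
                pair_count[(a, b)] += 1
    return n_positions, hist_count, word_count, pair_count


def mutual_information(a, b, n, hist_count, word_count, pair_count):
    """Mutual information (in nats) between 'a in history' and 'word is b'."""
    n_ab = pair_count[(a, b)]
    n_a = hist_count[a]
    n_b = word_count[b]
    # Each cell: (joint count, marginal count of history event,
    #             marginal count of current-word event).
    cells = [
        (n_ab, n_a, n_b),                              # a seen,     word is b
        (n_a - n_ab, n_a, n - n_b),                    # a seen,     word not b
        (n_b - n_ab, n - n_a, n_b),                    # a not seen, word is b
        (n - n_a - n_b + n_ab, n - n_a, n - n_b),      # a not seen, word not b
    ]
    mi = 0.0
    for joint, row, col in cells:
        if joint > 0:  # empty cells contribute nothing
            mi += (joint / n) * math.log(joint * n / (row * col))
    return mi
```

Ranking all candidate pairs by this score and keeping the top entries would give a simple trigger-pair selection; with a vocabulary of size V there are V² candidates, so in practice only pairs with a nonzero co-occurrence count would be scored.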

