A Hierarchical Dirichlet Language Model
Natural Language Engineering, 1994
Cited by 66 (3 self)
Abstract
We discuss a hierarchical probabilistic model whose predictions are similar to those of the popular language modelling procedure known as 'smoothing'. A number of interesting differences from smoothing emerge. The insights gained from a probabilistic view of this problem point towards new directions for language modelling. The ideas of this paper are also applicable to other problems such as the modelling of triphones in speech, and DNA and protein sequences in molecular biology. The new algorithm is compared with smoothing on a two million word corpus. The methods prove to be about equally accurate, with the hierarchical model using fewer computational resources.

Contents
1 Introduction
1.1 The bigram language model with smoothing
1.2 Any rational predictive procedure can be made Bayesian
2 An explicit model using Dirichlet priors
2.1 The inferences we will make
2.2 The likelihood function
2.3 What prior?
2.4 A convenient family of priors: Dirichlet distributions
2.5 ...
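
The abstract's claim that the hierarchical model predicts much like smoothing can be illustrated with a minimal sketch. It is not the paper's algorithm. It only contrasts a fixed-weight interpolated bigram estimate with the standard predictive distribution of a multinomial under a Dirichlet prior. All function names, the parameters `lam`, `alpha` and `m`, and the choice of unigram frequencies for `m` are assumptions made for illustration.

```python
from collections import Counter


def bigram_counts(tokens):
    """Count bigrams (prev, word) and how often each token occurs as a context."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    context_counts = Counter(tokens[:-1])
    return pair_counts, context_counts


def smoothed_prob(word, prev, pair_counts, context_counts, unigram, lam):
    """Interpolated estimate: lam * unigram + (1 - lam) * bigram frequency."""
    total = context_counts[prev]
    f_bigram = pair_counts[(prev, word)] / total if total else 0.0
    return lam * unigram.get(word, 0.0) + (1.0 - lam) * f_bigram


def dirichlet_prob(word, prev, pair_counts, context_counts, m, alpha):
    """Predictive probability under a Dirichlet prior with mean m and strength alpha.

    Assumption: m is a probability vector over the vocabulary and alpha > 0.
    """
    return (pair_counts[(prev, word)] + alpha * m.get(word, 0.0)) / (
        context_counts[prev] + alpha
    )


# Toy corpus (hypothetical data, for illustration only).
tokens = "a b a b a c".split()
pair_counts, context_counts = bigram_counts(tokens)
unigram = {w: c / len(tokens) for w, c in Counter(tokens).items()}

alpha = 2.0
p_dirichlet = dirichlet_prob("b", "a", pair_counts, context_counts, unigram, alpha)

# The Dirichlet predictive equals interpolation with a context-dependent weight
# lam = alpha / (count(prev) + alpha), so rarely seen contexts lean on m more.
lam_a = alpha / (context_counts["a"] + alpha)
p_smoothed = smoothed_prob("b", "a", pair_counts, context_counts, unigram, lam_a)
```

In this sketch the two estimates agree once the interpolation weight is allowed to depend on how often the context was seen, which is one way a Dirichlet-based predictor can differ from smoothing with a single fixed weight.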

