Integrating history-length interpolation and classes in language modeling
Abstract
Building on earlier work that integrates different factors in language modeling, we view (i) backing off to a shorter history and (ii) class-based generalization as two complementary mechanisms for using a larger equivalence class for prediction when the default equivalence class is too small for reliable estimation. This view entails that the classes in a language model should be learned from rare events only and should preferably be applied to rare events. We construct such a model and show that both training on rare events and preferential application to rare events improve perplexity when compared to a simple direct interpolation of class-based with standard language models.
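As a point of reference, the "simple direct interpolation" baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration and not the authors' model: the add-alpha smoothing, the hand-written `word2class` mapping, the fixed weight `lam`, and all function names are assumptions made for the example.

```python
from collections import Counter

def make_models(tokens, word2class, alpha=1.0):
    """Build a word bigram model and a class bigram model from one token list.

    Both use add-alpha smoothing (an assumption for this sketch).
    """
    vocab = sorted(set(tokens))
    n_classes = len({word2class[w] for w in vocab})
    class_tokens = [word2class[w] for w in tokens]

    hist = Counter(tokens[:-1])                         # word history counts
    bigrams = Counter(zip(tokens[:-1], tokens[1:]))     # word bigram counts
    class_hist = Counter(class_tokens[:-1])             # class history counts
    class_bigrams = Counter(zip(class_tokens[:-1], class_tokens[1:]))
    word_count = Counter(tokens)
    class_count = Counter(class_tokens)

    def p_word(w, prev):
        # P(w | prev) estimated directly from word bigram counts.
        return (bigrams[(prev, w)] + alpha) / (hist[prev] + alpha * len(vocab))

    def p_class(w, prev):
        # P(w | prev) = P(c(w) | c(prev)) * P(w | c(w)).
        c, c_prev = word2class[w], word2class[prev]
        p_trans = (class_bigrams[(c_prev, c)] + alpha) / (
            class_hist[c_prev] + alpha * n_classes
        )
        p_emit = word_count[w] / class_count[c]
        return p_trans * p_emit

    return vocab, p_word, p_class

def interpolate(p_word, p_class, lam):
    """Linear interpolation with a single fixed weight lam in [0, 1]."""
    def p(w, prev):
        return lam * p_word(w, prev) + (1.0 - lam) * p_class(w, prev)
    return p

tokens = "the cat sat on the mat the dog sat on the rug".split()
word2class = {"the": "DET", "cat": "N", "dog": "N", "mat": "N",
              "rug": "N", "sat": "V", "on": "P"}
vocab, p_word, p_class = make_models(tokens, word2class)
p = interpolate(p_word, p_class, lam=0.7)
print(p("dog", "the"))
```

Because both component models are normalized over the vocabulary, the interpolated estimate is as well. The abstract's proposal differs from this baseline in that classes are learned from rare events and applied preferentially to them, which a single global weight does not capture.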
Half-Context Language Models
Abstract
This article investigates the effects of different degrees of contextual granularity on language model performance. It presents a new language model that combines clustering and half-contextualization, a novel representation of contexts. Half-contextualization is based on the half-context hypothesis, which states that the distributional characteristics of a word or bigram are best represented by treating its context distributions to the left and to the right separately, and that only directionally relevant distributional information should be used. Clustering is achieved using a new clustering algorithm for class-based language models that compares favorably to the exchange algorithm. When interpolated with a Kneser-Ney model, half-context models are shown to have better perplexity than commonly used interpolated n-gram models and traditional class-based approaches. A novel, fine-grained, context-specific analysis highlights those contexts in which the model performs well and those which are better treated by existing non-class-based models.
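The idea of treating left and right contexts separately can be illustrated with a small sketch. It only collects the two context distributions per word from adjacent tokens; it is not the article's clustering algorithm, and the function names `half_contexts` and `normalize` are hypothetical.

```python
from collections import Counter, defaultdict

def half_contexts(tokens):
    """Collect left-neighbor and right-neighbor counts for each word separately."""
    left = defaultdict(Counter)    # left[w][v]: v occurred immediately before w
    right = defaultdict(Counter)   # right[w][v]: v occurred immediately after w
    for prev, nxt in zip(tokens[:-1], tokens[1:]):
        right[prev][nxt] += 1
        left[nxt][prev] += 1
    return left, right

def normalize(counter):
    """Turn raw neighbor counts into a relative-frequency distribution."""
    total = sum(counter.values())
    return {k: v / total for k, v in counter.items()}

left, right = half_contexts("a b a c".split())
print(normalize(left["a"]), normalize(right["a"]))
```

Keeping the two dictionaries apart means a word can be compared with others on its left behavior alone or its right behavior alone, rather than on one merged context vector.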
Stochastic K-TSS bi-languages for Machine Translation
Abstract
One of the approaches to statistical machine translation is based on joint probability distributions over source and target languages. In this work we propose to model the joint probability distribution by stochastic regular bi-languages. Specifically, we introduce stochastic k-testable in the strict sense (k-TSS) bi-languages to represent the joint probability distribution of source and target languages. On this basis we present a reformulation of the GIATI methodology to infer stochastic regular bi-languages for machine translation purposes.

