Results 1 - 4 of 4
Lexical Decoding Based on the Combination of Category-Based Stochastic Models and Word-Category Distribution Models
, 2001
Cited by 1 (0 self)
Lexical decoding consists of finding the most probable sequence of categories associated with a sequence of words. This paper describes two combined lexical decoding models, each based on a stochastic category-based model and a probabilistic model of the distribution of words into linguistic categories. In the first combined model, the stochastic category-based model is a Stochastic Context-Free Grammar; in the second, it is an n-gram model. The estimation processes of the models are described in detail. Finally, experiments on the Wall Street Journal corpus are reported.
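As an illustration of the second combined model, the decoding step can be sketched with a standard Viterbi search. This is a minimal sketch, not the paper's implementation: it assumes a bigram category model, a non-empty input sentence, and a crude probability floor in place of proper smoothing. The names `viterbi_decode`, `trans`, `emit`, and `floor`, and all toy probabilities, are hypothetical.

```python
import math

def viterbi_decode(words, categories, trans, emit, start="<s>", floor=1e-12):
    """Return the most probable category sequence for a non-empty `words`.

    trans[(prev, cat)] : P(cat | prev), a bigram category model (assumption)
    emit[(word, cat)]  : P(word | cat), the word-category distribution
    Missing entries fall back to `floor`, a stand-in for real smoothing.
    """
    def logp(table, key):
        return math.log(table.get(key, floor))

    # best[c] = (log-probability of the best path ending in c, that path)
    best = {c: (logp(trans, (start, c)) + logp(emit, (words[0], c)), [c])
            for c in categories}
    for word in words[1:]:
        step = {}
        for c in categories:
            # Pick the best predecessor category for c.
            score, path = max(
                ((best[p][0] + logp(trans, (p, c)), best[p][1])
                 for p in categories),
                key=lambda item: item[0],
            )
            step[c] = (score + logp(emit, (word, c)), path + [c])
        best = step
    return max(best.values(), key=lambda item: item[0])[1]

# Toy models (invented numbers, for illustration only).
categories = ["DT", "NN", "VB"]
trans = {("<s>", "DT"): 0.8, ("<s>", "NN"): 0.1, ("<s>", "VB"): 0.1,
         ("DT", "NN"): 0.9, ("DT", "VB"): 0.05, ("DT", "DT"): 0.05,
         ("NN", "VB"): 0.6, ("NN", "NN"): 0.3, ("NN", "DT"): 0.1,
         ("VB", "DT"): 0.5, ("VB", "NN"): 0.3, ("VB", "VB"): 0.2}
emit = {("the", "DT"): 0.9, ("dog", "NN"): 0.5,
        ("barks", "VB"): 0.4, ("barks", "NN"): 0.1}

print(viterbi_decode(["the", "dog", "barks"], categories, trans, emit))
```

Working in log space avoids numerical underflow on long sentences; a model of order higher than two would need the search state extended to category histories.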
Using Perfect Sampling in Parameter Estimation of a Whole Sentence Maximum Entropy Language Model
, 2000
Cited by 1 (0 self)
The Maximum Entropy principle (ME) is an appropriate framework for combining information of a diverse nature from several sources into the same language model. In order to incorporate long-distance information into the ME framework in a language model, a Whole Sentence Maximum Entropy Language Model (WSME) can be used. Until now, Markov Chain Monte Carlo (MCMC) sampling techniques have been used to estimate the parameters of the WSME model. In this paper, we propose the application of another sampling technique: Perfect Sampling (PS). The experiment has shown a reduction of 30% in the perplexity of the WSME model over the trigram model and a reduction of 2% over the WSME model trained with MCMC.
A Hybrid Language Model based on Stochastic Context-free Grammars
This paper explores the use of initial Stochastic Context-Free Grammars (SCFG) obtained from a treebank corpus for the learning of SCFG by means of estimation algorithms. A hybrid language model is defined as a combination of a word-based n-gram, which is used to capture the local relations between words, and a category-based SCFG with a word distribution into categories, which is defined to represent the long-term relations between these categories. Experiments on the UPenn Treebank corpus are reported. These experiments have been carried out in terms of the test set perplexity and the word error rate in a speech recognition experiment.
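The abstract does not say how the two components are combined. One common choice, assumed here purely for illustration, is linear interpolation of the per-word probabilities, evaluated by test set perplexity. The function names, the weight `alpha`, and the probabilities below are hypothetical.

```python
import math

def interpolate(p_ngram, p_scfg, alpha):
    """Linear interpolation of two conditional word probabilities.

    alpha weights the word n-gram; (1 - alpha) weights the
    category-based SCFG component. Assumed combination rule.
    """
    return alpha * p_ngram + (1.0 - alpha) * p_scfg

def perplexity(word_probs):
    """Perplexity of a test set given the probability assigned to each word."""
    log_sum = sum(math.log2(p) for p in word_probs)
    return 2.0 ** (-log_sum / len(word_probs))

# Invented per-word probabilities from each component on a 3-word test set.
ngram_probs = [0.10, 0.02, 0.30]
scfg_probs = [0.05, 0.08, 0.20]
hybrid_probs = [interpolate(n, s, alpha=0.7)
                for n, s in zip(ngram_probs, scfg_probs)]

print(perplexity(ngram_probs), perplexity(hybrid_probs))
```

In practice the weight would be tuned on held-out data rather than fixed by hand.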

