Combination of n-grams and stochastic context-free grammars for language modeling

by J. M. Benedí, J. A. Sánchez
Venue: In Proceedings of COLING

Results 1 - 4 of 4

Lexical Decoding Based on the Combination of Category-Based Stochastic Models and Word-Category Distribution Models

by Francisco Nevado, José-Miguel Benedí, Joan Andreu Sánchez, 2001
Cited by 1 (0 self)
Lexical decoding is the task of obtaining the most probable sequence of categories associated with a sequence of words. This paper describes two combined lexical decoding models, each based on a stochastic category-based model and a probabilistic model of the distribution of words into linguistic categories. In the first combined model, the stochastic category-based model is a Stochastic Context-Free Grammar; in the second, it is an n-gram model. The estimation processes of the models are described in detail. Finally, experiments on the Wall Street Journal corpus are reported.
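As a rough illustration of the second combined model (an n-gram over categories plus a word-category distribution), the sketch below runs Viterbi decoding with a bigram category model. It is not the authors' implementation: the function name `viterbi_decode`, the probability floor, and all toy probabilities are assumptions made up for this example, and the SCFG-based variant is not shown.

```python
import math

def viterbi_decode(words, categories, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most probable category sequence for `words`.

    start_p[c]      : P(c) for the first category (hypothetical values)
    trans_p[c1][c2] : P(c2 | c1), a bigram category model
    emit_p[c][w]    : P(w | c), the word-category distribution
    floor           : assumed small probability for unseen events
    """
    if not words:
        return []

    def lp(p):
        # Log-probability with a floor so unseen events do not give log(0).
        return math.log(max(p, floor))

    # best[i][c]: best log-probability of a path ending in category c at word i
    best = [{c: lp(start_p.get(c, 0.0)) + lp(emit_p[c].get(words[0], 0.0))
             for c in categories}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for c in categories:
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p[p].get(c, 0.0)))
                 for p in categories),
                key=lambda t: t[1],
            )
            best[i][c] = score + lp(emit_p[c].get(words[i], 0.0))
            back[i][c] = prev

    # Follow the back-pointers from the best final category.
    path = [max(categories, key=lambda c: best[-1][c])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path

# Toy model; every number here is invented for illustration.
CATEGORIES = ["DET", "NOUN", "VERB"]
START_P = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS_P = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT_P = {
    "DET": {"the": 0.9, "a": 0.1},
    "NOUN": {"dog": 0.5, "barks": 0.1, "walk": 0.4},
    "VERB": {"barks": 0.6, "walk": 0.4},
}

tags = viterbi_decode(["the", "dog", "barks"], CATEGORIES, START_P, TRANS_P, EMIT_P)
```

Working in log space avoids underflow on long sentences; the floor is a crude stand-in for whatever smoothing a real system would use.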

Using Perfect Sampling in Parameter Estimation of a Whole Sentence Maximum Entropy Language Model

by F. Amaya, J. M. Benedi, 2000
Cited by 1 (0 self)
The Maximum Entropy (ME) principle is an appropriate framework for combining information of a diverse nature from several sources into the same language model. To incorporate long-distance information into the ME framework, a Whole Sentence Maximum Entropy Language Model (WSME) can be used. Until now, Markov Chain Monte Carlo (MCMC) sampling techniques have been used to estimate the parameters of the WSME model. In this paper, we propose the application of another sampling technique: Perfect Sampling (PS). The experiment shows a 30% reduction in perplexity for the WSME model over the trigram model and a 2% reduction over the WSME model trained with MCMC.
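For orientation, the whole-sentence model named above is usually written in the following form in the WSME literature. The abstract itself gives no formula, so the symbols $p_0$, $f_i$, $\lambda_i$ and $Z$ are assumptions introduced here for illustration.

```latex
p(s) = \frac{1}{Z}\, p_0(s)\, \exp\Big( \sum_{i} \lambda_i f_i(s) \Big),
\qquad
Z = \sum_{s'} p_0(s')\, \exp\Big( \sum_{i} \lambda_i f_i(s') \Big)
```

Here $s$ is a whole sentence, $p_0$ is a baseline model (for example an n-gram), and the $f_i$ are sentence-level features with weights $\lambda_i$. Because $Z$ and the feature expectations sum over all possible sentences, they cannot be computed exactly, which is why parameter estimation relies on sampling techniques such as MCMC or Perfect Sampling.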

corpora and

by Diego Linares, José-Miguel Benedí, Joan-Andreu Sánchez
stochastic context-free grammar estimation from bracketed

A Hybrid Language Model based on Stochastic Context-free Grammars

by Diego Linares, José-Miguel Benedí, Joan-Andreu Sánchez, Javeriana Cali
This paper explores the use of initial Stochastic Context-Free Grammars (SCFGs) obtained from a treebank corpus for learning SCFGs by means of estimation algorithms. A hybrid language model is defined as a combination of a word-based n-gram, which captures the local relations between words, and a category-based SCFG with a word distribution into categories, which represents the long-term relations between these categories. Experiments on the UPenn Treebank corpus are reported. The models are evaluated in terms of test set perplexity and of word error rate in a speech recognition experiment.
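The abstract does not say how the two components are combined. A minimal sketch, assuming a simple linear interpolation with a hypothetical weight `alpha`, shows how such a hybrid could be scored by test set perplexity. The function names and all per-word probabilities below are invented for illustration and are not results from the paper.

```python
import math

def interpolate(p_ngram, p_scfg, alpha):
    """Linear combination of two conditional word probabilities.

    alpha is an assumed interpolation weight in [0, 1]; the abstract
    does not state the combination scheme actually used.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * p_ngram + (1.0 - alpha) * p_scfg

def perplexity(word_probs):
    """Test set perplexity: exp of the mean negative log-probability per word."""
    if not word_probs:
        raise ValueError("need at least one word probability")
    return math.exp(-sum(math.log(p) for p in word_probs) / len(word_probs))

# Hypothetical per-word probabilities assigned by each component model.
ngram_probs = [0.20, 0.05, 0.10, 0.02]
scfg_probs = [0.10, 0.15, 0.05, 0.08]

hybrid_probs = [interpolate(a, b, 0.7) for a, b in zip(ngram_probs, scfg_probs)]
ngram_ppl = perplexity(ngram_probs)
hybrid_ppl = perplexity(hybrid_probs)
```

In practice the weight would be tuned on held-out data; whether the hybrid beats the n-gram alone depends on the data and cannot be read off this toy example.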