Efficient Sampling and Feature Selection in Whole Sentence Maximum Entropy Language Models

by Stanley F. Chen, Ronald Rosenfeld
Results 11 - 16 of 16

Just how good is maximum entropy? An empirical investigation using ensembles of MEMD models for attribute-value grammars

by Miles Osborne
Cited by 3 (0 self)
Maximum entropy has been theoretically argued as being the principled way to estimate models that are only partially determined by some set of empirically observed constraints. However, such arguments hinge upon large sample behaviour, and it is unclear how well maximum entropy performs when this assumption is violated by small samples. Within the maximum entropy / minimum divergence (MEMD) framework, and when operating in the domain of parse selection, we estimate lower and upper bounds on the performance of such models. Maximum entropy, even when samples are small, is shown to produce models near the upper bound. In addition to prediction using single models, we also investigate how well maximum entropy compares with ensembles of MEMD models. Maximum entropy is found to be competitive with such ensembles. Since ensemble learning requires substantially more computational resources than single model learning, yet delivers similar results to maximum entropy, this is a useful finding.

A Web Recommendation System Based on Maximum Entropy

by Xin Jin, Bamshad Mobasher, Yanzan Zhou - In: Proc. IEEE International Conference on Information Technology Coding and Computing, Las Vegas, 2005
Cited by 2 (0 self)
We propose a Web recommendation system based on a maximum entropy model. Under the maximum entropy principle, we can combine multiple levels of knowledge about users' navigational behavior in order to automatically generate the most effective recommendations for new users with similar profiles. The knowledge includes the page-level statistics about users' historically visited pages and the aggregate usage patterns discovered through Web usage mining. In particular, we use a Web mining framework based on Probabilistic Latent Semantic Analysis to discover the underlying interests of Web users as well as temporal changes in these interests. Our experiments show that our recommendation system can achieve better accuracy when compared to standard approaches, while providing a better interpretation of Web users' diverse navigational behavior.

Using Perfect Sampling in Parameter Estimation of a Whole Sentence Maximum Entropy Language Model

by F. Amaya, J. M. Benedi, 2000
Cited by 1 (0 self)
The Maximum Entropy (ME) principle is an appropriate framework for combining information of a diverse nature from several sources into the same language model. In order to incorporate long-distance information into the ME framework in a language model, a Whole Sentence Maximum Entropy Language Model (WSME) can be used. Until now, Monte Carlo Markov Chain (MCMC) sampling techniques have been used to estimate the parameters of the WSME model. In this paper, we propose the application of another sampling technique: Perfect Sampling (PS). The experiment has shown a reduction of 30% in the perplexity of the WSME model over the trigram model and a reduction of 2% over the WSME model trained with MCMC.
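As a rough illustration of the MCMC baseline this abstract refers to (not the Perfect Sampling method it proposes), the sketch below runs an independence Metropolis-Hastings sampler for a model of the usual WSME form, p(s) proportional to p0(s) * exp(sum_i lambda_i * f_i(s)). The vocabulary, the baseline model p0, the two features, and all function names are assumptions made up for this example; none come from the paper.

```python
import math
import random

# Hypothetical toy setup: none of these names or values come from the paper.
VOCAB = ["the", "cat", "sat", "mat", "on"]


def sample_p0(rng, length=4):
    """Draw a sentence from the baseline model p0 (here: uniform words, fixed length)."""
    return tuple(rng.choice(VOCAB) for _ in range(length))


def features(sentence):
    """Whole-sentence features f_i(s); both are illustrative assumptions."""
    return [
        1.0 if sentence[0] == "the" else 0.0,                 # starts with "the"
        1.0 if len(set(sentence)) == len(sentence) else 0.0,  # no repeated word
    ]


def log_weight(sentence, lambdas):
    """sum_i lambda_i * f_i(s): log of the unnormalized correction applied to p0."""
    return sum(lam * f for lam, f in zip(lambdas, features(sentence)))


def independence_mh(lambdas, n_samples, seed=0):
    """Sample from p(s) proportional to p0(s) * exp(sum_i lambda_i * f_i(s)).

    Proposals are drawn from p0 itself, so p0 cancels in the acceptance ratio
    and only the exponential feature weights remain.
    """
    rng = random.Random(seed)
    current = sample_p0(rng)
    samples = []
    for _ in range(n_samples):
        proposal = sample_p0(rng)
        log_ratio = log_weight(proposal, lambdas) - log_weight(current, lambdas)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
        samples.append(current)
    return samples


def estimate_expectations(samples):
    """Monte Carlo estimate of E_p[f_i], the quantity parameter estimation needs."""
    totals = [0.0, 0.0]
    for s in samples:
        for i, f in enumerate(features(s)):
            totals[i] += f
    return [t / len(samples) for t in totals]
```

Because the normalization constant never has to be computed, feature expectations under the model can be approximated from samples alone, which is why sampling quality matters for training such models.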

Improvement of a Whole Sentence Maximum Entropy Language Model Using Grammatical Features

by Fredy Amaya, José Miguel Benedí, 2001
Cited by 1 (0 self)
In this paper, we propose adding long-term grammatical information to a Whole Sentence Maximum Entropy Language Model (WSME) in order to improve the performance of the model. The grammatical information was obtained from a Stochastic Context-Free grammar and added to the WSME model as features. Finally, experiments using a part of the Penn Treebank corpus were carried out, and significant improvements were achieved.

Improving PinYin to Chinese Conversion With a Whole Sentence . . .

by Le Zhang, Tianshun Yao
We address the problem of statistical language modeling in the context of PinYin to Chinese (PTC) conversion, a problem similar to speech recognition but without the acoustic recognition step. Input phonetic syllables are first segmented and converted into a word lattice, which is then scored within a Source-Channel framework in order to find the most probable Chinese sentence. In particular, we discuss the use of a Whole Sentence Maximum Entropy (WSME) model, an expressive framework for constructing language models with diverse features. Experiments showed that a WSME model trained with d2-ngrams and word triggers achieved a 20% reduction in perplexity and an 11.05% reduction in character conversion error over a baseline trigram.
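The Source-Channel scoring step described in this abstract can be sketched as picking the sentence s that maximizes log P(s) + log P(pinyin | s). The snippet below is a minimal illustration under that assumption; the function names, the candidate list, and the toy scores are all hypothetical, and a real system would read candidates off the word lattice and score them with the WSME language model.

```python
def best_sentence(pinyin, candidates, lm_logprob, channel_logprob):
    """Return the candidate s maximizing log P(s) + log P(pinyin | s)."""
    return max(candidates, key=lambda s: lm_logprob(s) + channel_logprob(pinyin, s))


# Hypothetical language-model log-probabilities for three homophone candidates.
TOY_LM = {"事实": -2.0, "实施": -3.5, "试试": -2.5}


def toy_lm_logprob(sentence):
    """Stand-in for the language model score log P(s)."""
    return TOY_LM[sentence]


def toy_channel_logprob(pinyin, sentence):
    """Stand-in for log P(pinyin | s); every toy candidate reads "shi shi", so it is 0."""
    return 0.0


result = best_sentence("shi shi", list(TOY_LM), toy_lm_logprob, toy_channel_logprob)
```

When all candidates share the same pronunciation, as here, the channel term is constant and the language model alone decides the ranking.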

A Fast Algorithm for Feature Selection in Conditional Maximum Entropy Modeling

by Yaqian Zhou, Lide Wu, Fuliang Weng, Hauke Schmidt - In: ... Natural Language Processing, pp. 153-159
This paper describes a fast algorithm that selects features for conditional maximum entropy modeling. Berger et al. (1996) present an incremental feature selection (IFS) algorithm, which computes the approximate gains for all candidate features at each selection stage and is very time-consuming for problems with large feature spaces. In this new algorithm, instead, we only compute the approximate gains for the top-ranked features based on the models obtained from previous stages. Experiments on WSJ data in Penn Treebank are conducted to show that the new algorithm greatly speeds up the feature selection process while maintaining the same quality of selected features. One variant of this new algorithm with look-ahead functionality is also tested to further confirm the good quality of the selected features. The new algorithm is easy to implement, and given a feature space of size F, it only uses O(F) more space than the original IFS algorithm.
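One way to read "only compute the approximate gains for the top-ranked features" is as a lazy greedy selection over a priority queue of possibly stale gains. The sketch below illustrates that reading only; it is not the authors' implementation. The `gain` callback is hypothetical, and the code assumes a feature's gain never increases as the model grows, so a stale gain is an upper bound on the current one. The extra storage is one heap entry per candidate feature.

```python
import heapq


def select_features(candidates, gain, k):
    """Greedily pick k features, recomputing gains only for top-ranked candidates.

    `gain(feature, selected)` is a hypothetical callback returning the approximate
    gain of adding `feature` to a model that already contains `selected`.
    Returns the selected features and the number of gain evaluations performed.
    """
    selected = []
    # Max-heap via negated gains; each entry stores the stage its gain was computed at.
    heap = [(-gain(f, selected), 0, f) for f in candidates]
    heapq.heapify(heap)
    evaluations = len(heap)
    while heap and len(selected) < k:
        _, stage, feature = heapq.heappop(heap)
        if stage == len(selected):
            # The gain is current for this model and still ranks first: select it.
            selected.append(feature)
        else:
            # The gain is stale: recompute it against the current model and re-rank.
            heapq.heappush(heap, (-gain(feature, selected), len(selected), feature))
            evaluations += 1
    return selected, evaluations
```

Under the stated assumption, this returns the same features as recomputing every gain at every stage, while features that rank low in the queue are never re-evaluated.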

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2010 The Pennsylvania State University