Do CFG-Based Language Models Need Agreement Constraints?
- in Proceedings of 2nd NAACL
, 2000
Cited by 2 (2 self)
this technically counts as a reduction of the semantic error rate, it is obviously of little practical importance. After eliminating all examples of the above type, we were left with a residue of 47 utterances where one grammar was right and the other wrong; of these, the tight grammar was correct in 37 cases and the loose one in the remaining 10. A more realistic estimate of the absolute reduction in semantic error rate for the OOH system as a result of correctly modelling agreement would thus be (37-10)/3511, or 0.7%, giving a relative reduction of about 5%. Although undramatic, this margin is significant at the 0.05% level according to the McNemar sign test (McNemar, 1947). The following examples show typical instances of the tight grammar (T) outscoring the loose one (L)
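The arithmetic in the abstract above can be checked with a short sketch. It uses only the counts quoted there (37 tight-grammar wins, 10 loose-grammar wins, 3511 utterances). The function names are hypothetical, and the exact two-sided binomial form of the sign test is an assumption about how the test was applied, not something the abstract states.

```python
from math import comb


def absolute_reduction(tight_wins, loose_wins, total):
    """Net number of utterances gained by the tight grammar, as a fraction of all utterances."""
    return (tight_wins - loose_wins) / total


def sign_test_p_value(tight_wins, loose_wins):
    """Exact two-sided sign test on the discordant pairs (assumed form of the McNemar test).

    Under the null hypothesis each discordant utterance is equally likely
    to favour either grammar, so the smaller count follows Binomial(n, 0.5).
    """
    n = tight_wins + loose_wins
    k = min(tight_wins, loose_wins)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


reduction = absolute_reduction(37, 10, 3511)
p_value = sign_test_p_value(37, 10)
print(f"absolute reduction: {reduction:.4%}")
print(f"two-sided p-value: {p_value:.2e}")
```

Only the 47 utterances on which the grammars disagree enter the test; utterances where both are right or both are wrong carry no information about which grammar is better.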
unknown title
, 1994
This document was created with FrameMaker 4.0.4 Adaptive Statistical Language Modelling 6
Adaptive Statistical Language Modelling
, 1994
The trigram statistical language model is remarkably successful when used in such applications as speech recognition. However, the trigram model is static in that it only considers the previous two words when making a prediction about a future word. The work presented here attempts to improve upon the trigram model by considering additional contextual and longer distance information. This is frequently referred to in the literature as adaptive statistical language modelling because the model is thought of as adapting to the longer term information. This work considers the creation of topic specific models, statistical evidence from the presence or absence of triggers, or related words, in the document history (document triggers) and in the current sentence (in-sentence triggers), and the incorporation of the document cache, which predicts the probability of a word by considering its frequency in the document history. An important result of this work is that the presence of self-triggers, that is, whether or not the word itself occurred in the document history, is an extremely important piece of information. A maximum entropy (ME) approach will be used in many instances to incorporate information from different sources. Maximum entropy considers a model which maximizes entropy while satisfying the constraints presented by the information we wish to incorporate. The generalized iterative scaling (GIS) algorithm can be used to compute the maximum entropy solution. This work also considers various methods of smoothing the information in a maximum entropy model. An important result is that smoothing improves performance noticeably and that Good-Turing discounting is an effective method of smoothing. Thesis Supervisor: Victor Zue Title: Principal Research Scientist, Departme...
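The document cache described in the abstract above can be illustrated with a minimal sketch. It is not the thesis's method: the thesis combines sources with maximum entropy, whereas this sketch swaps in plain linear interpolation to keep the example short. The names `cache_probability`, `interpolated_probability`, `static_prob`, and `cache_weight` are hypothetical, and `static_prob` simply stands in for whatever probability a static trigram model would assign.

```python
from collections import Counter


def cache_probability(word, history):
    """Relative frequency of `word` in the document history seen so far."""
    if not history:
        return 0.0
    return Counter(history)[word] / len(history)


def interpolated_probability(word, history, static_prob, cache_weight=0.1):
    """Mix a static model probability with the cache estimate.

    Linear interpolation is an assumption made for this sketch; the
    weight 0.1 is arbitrary and would normally be tuned on held-out data.
    """
    cache_prob = cache_probability(word, history)
    return (1 - cache_weight) * static_prob + cache_weight * cache_prob


history = "the bank raised rates and the bank cut staff".split()
p_seen = interpolated_probability("bank", history, static_prob=0.001)
p_unseen = interpolated_probability("river", history, static_prob=0.001)
print(p_seen, p_unseen)
```

A word that has already occurred in the history ("bank") receives a boost over an equally rare word that has not ("river"), which is the self-trigger effect the abstract highlights.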
Less is More: Significance-Based N-gram Selection for Smaller, Better Language Models
The recent availability of large corpora for training N-gram language models has shown the utility of models of higher order than just trigrams. In this paper, we investigate methods to control the increase in model size resulting from applying standard methods at higher orders. We introduce significance-based N-gram selection, which not only reduces model size, but also improves perplexity for several smoothing methods, including Katz backoff and absolute discounting. We also show that, when combined with a new smoothing method and a novel variant of weighted-difference pruning, our selection method performs better in the trade-off between model size and perplexity than the best pruning method we found for modified Kneser-Ney smoothing.
Collaborative Filtering inspired from Language Modeling
- in ... Technologies (ICADIWT 2008), Workshop on Recommender Systems and Personalized Retrieval (RSPR) (2008)
, 2008
Recommender systems filter resources for a given user by predicting the most pertinent item given a specific context. This paper describes a new approach to generating suitable recommendations based on the active user’s navigation stream. The underlying hypothesis is that the order of items in the stream results from the intrinsic logic of the user’s behavior. We show similarities between natural language and Internet navigation and put forward the specificities of navigation. We then design a new model that integrates advantages of statistical language models such as n-grams and triggers to compute recommendations. The resulting Sequence Based Recommender has been tested on artificial Internet navigation corpora.
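The idea of treating a navigation stream like a sentence can be sketched with a bigram model over visited items. This is only an illustration under stated assumptions, not the paper's Sequence Based Recommender: it covers the n-gram part alone and omits triggers, and the function names and toy sessions are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigrams(sessions):
    """Count, for each item, which items immediately follow it in the sessions."""
    follows = defaultdict(Counter)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            follows[current][nxt] += 1
    return follows


def recommend(follows, current_item, k=2):
    """Return up to k items that most often followed `current_item`."""
    counts = follows.get(current_item, Counter())
    return [item for item, _ in counts.most_common(k)]


# Toy navigation streams (invented for this sketch).
sessions = [
    ["home", "news", "sports"],
    ["home", "news", "weather"],
    ["home", "news", "sports"],
    ["home", "shop"],
]
follows = train_bigrams(sessions)
print(recommend(follows, "news", k=1))
print(recommend(follows, "home", k=2))
```

Because the counts depend on the order of visits, two users who view the same items in different orders produce different recommendations, which matches the abstract's hypothesis that item order carries information.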

