Learning Random Walk Models for Inducing Word Dependency Distributions
- In ICML, 2004
Cited by 39 (0 self)
Many NLP tasks rely on accurately estimating word dependency probabilities P(w1 | w2), where the words w1 and w2 have a particular relationship (such as verb-object). Because of the sparseness of counts of such dependencies, smoothing and the ability to use multiple sources of knowledge are important challenges. For example, if the probability P(N | V) of noun N being the subject of verb V is high, and V takes similar objects to V', and V' is synonymous to V'', then we want to conclude that P(N | V'') should also be reasonably high, even when those words did not co-occur in the training data. To capture ...
Combining Naive Bayes and n-Gram Language Models for Text Classification
- In 25th European Conference on Information Retrieval Research (ECIR), 2003
Cited by 26 (2 self)
We augment the naive Bayes model with an n-gram language model to address two shortcomings of naive Bayes text classifiers.
Incorporating Query Term Dependencies in Language Models for Document Retrieval
- 2003
Cited by 4 (1 self)
Recent advances in Information Retrieval are based on using Statistical Language Models (SLM) for representing documents and evaluating their relevance to user queries [6, 3, 4]. Language Modeling (LM) has been explored in many natural language tasks, including machine translation and speech recognition [1]. In the LM approach to document retrieval, each document D is viewed as having its own language model, M_D. Given a query Q, documents are ranked by the probability P(Q | M_D) of their language model generating the query. While the LM approach to information retrieval has been motivated from different perspectives [3, 4], most experiments have used smoothed unigram language models that assume term independence when estimating document language models. N-gram language models, specifically bigram models, which capture the context provided by the previous word(s), perform better than unigram models [7]. Biterm language models [8] that ignore the word order constraint in bigram langu...
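The ranking idea in this abstract, scoring each document by P(Q | M_D) under a smoothed unigram model, can be illustrated with a minimal sketch. The abstract does not say which smoothing method is used; the linear interpolation with a collection model (Jelinek-Mercer style) below is an assumption, as are the function names `query_log_likelihood` and `rank`, the weight `lam`, and the toy documents.

```python
import math
from collections import Counter


def query_log_likelihood(query, doc, collection_counts, collection_len, lam=0.5):
    """Return log P(Q | M_D) under a unigram model with term independence.

    Smoothing is an assumption of this sketch: the document model is
    linearly interpolated with the collection model using weight lam.
    """
    doc_counts = Counter(doc)
    doc_len = len(doc)
    score = 0.0
    for term in query:
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        p_coll = collection_counts[term] / collection_len
        p = (1 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # The term occurs nowhere in the collection.
            return float("-inf")
        score += math.log(p)
    return score


def rank(query, docs, lam=0.5):
    """Return document indices ordered by decreasing query likelihood."""
    collection_counts = Counter(term for doc in docs for term in doc)
    collection_len = sum(collection_counts.values())
    scored = [
        (query_log_likelihood(query, doc, collection_counts, collection_len, lam), i)
        for i, doc in enumerate(docs)
    ]
    return [i for _, i in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Hypothetical toy collection, already tokenized.
docs = [
    ["language", "model", "retrieval"],
    ["speech", "recognition", "system"],
    ["language", "speech"],
]
print(rank(["language", "model"], docs))
```

Because each query term is scored independently, word order in the query has no effect on the ranking; that is the term-independence assumption the abstract contrasts with bigram and biterm models.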
Session Boundary Detection for Association Rule Learning Using n-Gram Language Models
Cited by 3 (1 self)
We present a statistical method using n-gram language models to identify session boundaries in a large collection of Livelink log data.
Combining Statistical Language Models via the Latent Maximum Entropy Principle
- Machine Learning, 2005
In this paper, we present a unified probabilistic framework for statistical language modeling which can simultaneously incorporate various aspects of natural language, such as local word interaction, syntactic structure and semantic document information. Our approach is based on a recent statistical inference principle we have proposed, the latent maximum entropy principle, which allows relationships over hidden features to be effectively captured in a unified model. Our work extends previous research on maximum entropy methods for language modeling, which only allow observed features to be modeled. The ability to conveniently incorporate hidden variables allows us to extend the expressiveness of language models while alleviating the necessity of pre-processing the data to obtain explicitly observed features. We describe efficient algorithms for marginalization, inference and normalization in our extended models. We then use these techniques to combine two standard forms of language models: local lexical models (Markov N-gram models) and global document-level semantic models (probabilistic latent semantic analysis). Our experimental results on the Wall Street Journal corpus show that we obtain a 21.9% reduction in perplexity compared to the baseline trigram model with Good-Turing smoothing.

