## A Study of Smoothing Methods for Language Models Applied to Ad Hoc Information Retrieval (2001)

Citations: 698 (37 self)

### BibTeX

@INPROCEEDINGS{Zhai01astudy,
    author = {Chengxiang Zhai and John Lafferty},
    title = {A Study of Smoothing Methods for Language Models Applied to Ad Hoc Information Retrieval},
    booktitle = {},
    year = {2001},
    pages = {334--342}
}

### Abstract

Language modeling approaches to information retrieval are attractive and promising because they connect the problem of retrieval with that of language model estimation, which has been studied extensively in other application areas such as speech recognition. The basic idea of these approaches is to estimate a language model for each document, and then rank documents by the likelihood of the query according to the estimated language model. A core problem in language model estimation is smoothing, which adjusts the maximum likelihood estimator so as to correct the inaccuracy due to data sparseness. In this paper, we study the problem of language model smoothing and its influence on retrieval performance. We examine the sensitivity of retrieval performance to the smoothing parameters and compare several popular smoothing methods on different test collections.
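
The ranking idea in the abstract can be illustrated with a minimal sketch. It assumes a unigram model and Jelinek-Mercer interpolation with the collection model, p(w | d) = (1 - λ) p_ml(w | d) + λ p(w | C), which is one common smoothing choice and not necessarily the configuration the paper recommends. The function names, the toy documents, and the default `lam=0.5` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def collection_model(docs):
    """Maximum likelihood unigram model estimated over all documents."""
    counts = Counter(w for doc in docs for w in doc)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def query_log_likelihood(query, doc, p_coll, lam=0.5):
    """log p(q | d) under a Jelinek-Mercer smoothed unigram document model.

    lam is the (assumed) interpolation weight given to the collection model.
    """
    tf = Counter(doc)
    n = len(doc)
    score = 0.0
    for w in query:
        p_ml = tf[w] / n if n else 0.0          # unsmoothed estimate
        p = (1.0 - lam) * p_ml + lam * p_coll.get(w, 0.0)
        if p == 0.0:
            return float("-inf")                # no probability mass for w
        score += math.log(p)
    return score


def rank(query, docs, lam=0.5):
    """Return document indices ordered by decreasing query likelihood."""
    p_coll = collection_model(docs)
    scored = [(query_log_likelihood(query, d, p_coll, lam), i)
              for i, d in enumerate(docs)]
    return [i for _, i in sorted(scored, key=lambda t: (-t[0], t[1]))]


docs = [
    ["language", "model", "smoothing"],
    ["vector", "space", "model"],
    ["speech", "recognition"],
]
query = ["language", "model"]
print(rank(query, docs))
```

Setting `lam=0.0` in this sketch disables smoothing, so any document missing a query word receives a likelihood of zero. That is the data sparseness problem smoothing is meant to correct.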

### Citations

1521 | Term-weighting approaches in automatic text retrieval - Salton, Buckley - 1988
Citation Context ...s kinds of logic models and probabilistic models (e.g., [14, 3, 15, 22]). On the other hand, there have been many empirical studies of models, including many variants of the vector space model (e.g., [17, 18, 19]). In some cases, there have been theoretically motivated models that also perform well empirically; for example, the BM25 retrieval function, motivated by the 2-Poisson probabilistic retrieval model,... |

880 | A language modeling approach to information retrieval - Ponte, Croft, et al. - 1998
Citation Context ...s how well the document "fits" the particular query q. In the simplest case, p(d) is assumed to be uniform, and so does not affect document ranking. This assumption has been taken in most existing work [1, 13, 12, 5, 20]. In other cases, p(d) can be used to capture non-textual information, e.g., the length of a document or links in a web page, as well as other format/style features of a document. In our study, we ass... |

850 | An empirical study of smoothing techniques for language modeling - Chen, Goodman - 1999
Citation Context ...ity to the unseen words and improve the accuracy of word probability estimation in general. There are many smoothing methods that have been proposed, mostly in the context of speech recognition tasks [2]. In general, all smoothing methods are trying to discount the probabilities of the words seen in the text, and to then assign the extra probability mass to the unseen words according to some "fallbac... |

824 | A vector space model for automatic indexing - Salton, Wong, et al. - 1975 |

663 | Estimation of Probabilities from Sparse Data for the Language Model Component of a Speech Recognizer - Katz - 1987
Citation Context ...ained by the efficiency of the smoothing method. We selected three representative methods that are popular and relatively efficient to implement. We excluded some well-known methods, such as Katz smoothing [7] and Good-Turing estimation [4], because of the efficiency constraint. Although the methods we evaluated are simple, the issues that they bring to light are relevant to more advanced methods. The thre... |

599 | Improving retrieval performance by relevance feedback - Salton, Buckley - 1990
Citation Context ...s kinds of logic models and probabilistic models (e.g., [14, 3, 15, 22]). On the other hand, there have been many empirical studies of models, including many variants of the vector space model (e.g., [17, 18, 19]). In some cases, there have been theoretically motivated models that also perform well empirically; for example, the BM25 retrieval function, motivated by the 2-Poisson probabilistic retrieval model,... |

448 | Okapi at TREC-3 - Robertson, Walker, et al. - 1995 |

364 | Pivoted document length normalization - Singhal, Buckley, et al. - 1996
Citation Context ...s kinds of logic models and probabilistic models (e.g., [14, 3, 15, 22]). On the other hand, there have been many empirical studies of models, including many variants of the vector space model (e.g., [17, 18, 19]). In some cases, there have been theoretically motivated models that also perform well empirically; for example, the BM25 retrieval function, motivated by the 2-Poisson probabilistic retrieval model,... |

352 | The population frequencies of species and the estimation of population parameters - Good - 1953
Citation Context ...othing method. We selected three representative methods that are popular and relatively efficient to implement. We excluded some well-known methods, such as Katz smoothing [7] and Good-Turing estimation [4], because of the efficiency constraint. Although the methods we evaluated are simple, the issues that they bring to light are relevant to more advanced methods. The three methods are described below. ... |

336 | Interpolated estimation of markov source parameters from sparse data - Jelinek, Mercer - 1980 |

320 | Relevance-based language models - Lavrenko, Croft - 2001 |

304 | Document language models, query models, and risk minimization for information retrieval - LAFFERTY, C |

271 | Information Retrieval as statistical translation - Berger, Lafferty - 1999
Citation Context ...rs or to redistribute to lists, requires prior specific permission and/or a fee. SIGIR'01, September 9-12, 2001, New Orleans, Louisiana, USA Copyright 2001 ACM 1-58113-331-6/01/0009 ...$5.00. trieval [13, 1, 10, 5]. The basic idea behind the new approach is extremely simple: estimate a language model for each document, and rank documents by the likelihood of the query according to the language model. Yet this ne... |

271 | Improved backing-off for m-gram language modeling - Kneser, Ney - 1995 |

193 | A general language model for information retrieval - Song, Croft - 1999
Citation Context ...s how well the document "fits" the particular query q. In the simplest case, p(d) is assumed to be uniform, and so does not affect document ranking. This assumption has been taken in most existing work [1, 13, 12, 5, 20]. In other cases, p(d) can be used to capture non-textual information, e.g., the length of a document or links in a web page, as well as other format/style features of a document. In our study, we ass... |

189 | A hidden markov model information retrieval system - Miller, Leek, et al. - 1999
Citation Context ...trieval [13, 1, 10, 5]. The basic idea behind the new approach is extremely simple: estimate a language model for each document, and rank documents by the likelihood of the query according to the language model. Yet this ne... |

176 | A non-classical logic for information retrieval - Rijsbergen - 1986 |

169 | On structuring probabilistic dependences in stochastic language modeling - Ney, Essen, et al. - 1994
Citation Context ...ace method is a special case of this technique. Absolute discounting. The idea of the absolute discounting method is to lower the probability of seen words by subtracting a constant from their counts [11]. It is similar to the Jelinek-Mercer method, but differs in that it discounts the seen word probability by subtracting a constant instead of multiplying it by (1-λ). The model is given by p_s(w | d) = ... |

128 | The importance of prior probabilities for entry page search - Kraaij, Westerveld, et al. - 2002 |

107 | Twenty-One at TREC-7: Ad-hoc and cross-language track - Hiemstra, Kraaij - 1998
Citation Context ...trieval [13, 1, 10, 5]. The basic idea behind the new approach is extremely simple: estimate a language model for each document, and rank documents by the likelihood of the query according to the language model. Yet this ne... |

103 | Probabilistic models in information retrieval - Fuhr - 1992
Citation Context ...ways. On the one hand, theoretical studies of an underlying model have been developed; this direction is, for example, represented by the various kinds of logic models and probabilistic models (e.g., [14, 3, 15, 22]). On the other hand, there have been many empirical studies of models, including many variants of the vector space model (e.g., [17, 18, 19]). In some cases, there have been theoretically motivated m... |

100 | On modeling information retrieval with probabilistic inference - Wong, Yao - 1995
Citation Context ...ways. On the one hand, theoretical studies of an underlying model have been developed; this direction is, for example, represented by the various kinds of logic models and probabilistic models (e.g., [14, 3, 15, 22]). On the other hand, there have been many empirical studies of models, including many variants of the vector space model (e.g., [17, 18, 19]). In some cases, there have been theoretically motivated m... |

79 | A hierarchical Dirichlet language model - MacKay, Peto - 1994 |

60 | Model-based feedback in the kl-divergence retrieval model - Zhai, Lafferty |

51 | On the Estimation of ‘Small’ Probabilities by Leaving-One-Out - Ney, Essen, et al. - 1995 |

49 | Probabilistic models of indexing and searching - Robertson, Rijsbergen, et al. - 1981
Citation Context ...ways. On the one hand, theoretical studies of an underlying model have been developed; this direction is, for example, represented by the various kinds of logic models and probabilistic models (e.g., [14, 3, 15, 22]). On the other hand, there have been many empirical studies of models, including many variants of the vector space model (e.g., [17, 18, 19]). In some cases, there have been theoretically motivated m... |

41 | Improving two-stage ad-hoc retrieval for short queries - Kwok, Chan - 1998 |

13 | van Rijsbergen - 1979 |

8 | Improved smoothing for m-gram language modeling - Kneser, Ney - 1995
Citation Context ...mplement the role of query modeling. Finally, there are many other effective smoothing algorithms that we have not yet tested (e.g., Good-Turing smoothing [4], Katz smoothing [7], Kneser-Ney smoothing [8]); evaluation of them would be a natural further research direction. It is also very important to study how to exploit the past relevance judgments, the current query, and the current database to trai... |

8 | Interpolated estimation of markov source parameters from sparse data - Jelinek, Mercer - 1980 |

3 | A hierarchical Dirichlet language - MACKAY, L - 1995 |

2 | Okapi at TREC-3, The Third Text REtrieval - Robertson, Walker, et al. - 1995
Citation Context ...ly motivated models that also perform well empirically; for example, the BM25 retrieval function, motivated by the 2-Poisson probabilistic retrieval model, has proven to be quite effective in practice [16]. Recently, a new approach based on language modeling has been successfully applied to the problem of ad hoc re... |

1 | A Study of Smoothing Methods for Language Models 33 - LAVRENKO, CROFT - 2001 |

1 | Okapi at TREC-3, The Third Text REtrieval - Robertson, Walker, et al. - 1995
Citation Context ...y motivated models that also perform well empirically; for example, the BM25 retrieval function, motivated by the 2-Poisson probabilistic retrieval model, has proven to be quite effective in practice [16]. Recently, a new approach based on language modeling has been successfully applied to the problem of ad hoc re... |