Two-Stage language models for information retrieval (2002)

by Chengxiang Zhai, John Lafferty
Venue: Proc. of the 25th ACM SIGIR Conf.
Citations: 265 (20 self)

BibTeX

@INPROCEEDINGS{Zhai02two-stagelanguage,
    author = {Chengxiang Zhai and John Lafferty},
    title = {Two-Stage language models for information retrieval},
    booktitle = {Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '02)},
    year = {2002},
    pages = {49--56},
    publisher = {ACM}
}


Abstract

The optimal settings of retrieval parameters often depend on both the document collection and the query, and are usually found through empirical tuning. In this paper, we propose a family of two-stage language models for information retrieval that explicitly captures the different influences of the query and document collection on the optimal settings of retrieval parameters. As a special case, we present a two-stage smoothing method that allows us to estimate the smoothing parameters completely automatically. In the first stage, the document language model is smoothed using a Dirichlet prior with the collection language model as the reference model. In the second stage, the smoothed document language model is further interpolated with a query background language model. We propose a leave-one-out method for estimating the Dirichlet parameter of the first stage, and the use of document mixture models for estimating the interpolation parameter of the second stage. Evaluation on five different databases and four types of queries indicates that the two-stage smoothing method with the proposed parameter estimation methods consistently gives retrieval performance that is close to, or better than, the best results achieved using a single smoothing method and exhaustive parameter search on the test data.
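The two-stage scoring idea described above can be sketched in a few lines. This is an illustrative toy implementation, not the authors' code: stage one applies Dirichlet-prior smoothing with the collection model p(w|C) as reference, and stage two interpolates the result with a query background model, here approximated by the collection model. The parameters `mu` and `lam` are the two quantities the paper estimates automatically; the values below are arbitrary placeholders.

```python
from collections import Counter
from math import log

def two_stage_score(query, doc, collection_model, mu=2000.0, lam=0.5):
    """Query log-likelihood of `doc` under a two-stage smoothed model.

    Stage 1 (Dirichlet prior):  p_mu(w|d) = (c(w,d) + mu * p(w|C)) / (|d| + mu)
    Stage 2 (interpolation):    p(w|d)    = (1 - lam) * p_mu(w|d) + lam * p(w|U)
    with the query background p(w|U) approximated here by p(w|C).
    """
    counts = Counter(doc)
    dlen = len(doc)
    score = 0.0
    for w in query:
        p_c = collection_model.get(w, 1e-9)            # p(w|C), tiny floor for unseen words
        p_dir = (counts[w] + mu * p_c) / (dlen + mu)   # stage 1: Dirichlet smoothing
        p_w = (1.0 - lam) * p_dir + lam * p_c          # stage 2: background interpolation
        score += log(p_w)
    return score

# Toy collection: build a maximum-likelihood collection model from two documents.
docs = [["language", "model", "retrieval"],
        ["smoothing", "model", "estimation"]]
totals = Counter(w for d in docs for w in d)
n = sum(totals.values())
coll = {w: c / n for w, c in totals.items()}

query = ["language", "model"]
ranked = sorted(range(len(docs)),
                key=lambda i: two_stage_score(query, docs[i], coll),
                reverse=True)
```

Ranking by this score favors the first document, which matches both query terms, over the second, which matches only one; tuning `mu` and `lam` per collection and query is exactly what the leave-one-out and mixture-model estimators in the paper automate.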

Keyphrases

two-stage language model, information retrieval, document collection, optimal setting, second stage, first stage, retrieval parameter, leave-one-out method, test data, document language model, query background language model, retrieval performance, document mixture model, special case, parameter estimation method, Dirichlet prior, Dirichlet parameter, smoothed document language model, different database, exhaustive parameter search, single smoothing method, interpolation parameter, reference model, collection language model, different influence, empirical tuning
