Optimizing Search Engines using Clickthrough Data (2002)

by Thorsten Joachims
Citations: 1311 (23 self)

BibTeX

@MISC{Joachims02optimizingsearch,
    author = {Thorsten Joachims},
    title = {Optimizing Search Engines using Clickthrough Data},
    year = {2002}
}

Abstract

This paper presents an approach to automatically optimizing the retrieval quality of search engines using clickthrough data. Intuitively, a good information retrieval system should present relevant documents high in the ranking, with less relevant documents following below. While previous approaches to learning retrieval functions from examples exist, they typically require training data generated from relevance judgments by experts. This makes them difficult and expensive to apply. The goal of this paper is to develop a method that utilizes clickthrough data for training, namely the query-log of the search engine in connection with the log of links the users clicked on in the presented ranking. Such clickthrough data is available in abundance and can be recorded at very low cost. Taking a Support Vector Machine (SVM) approach, this paper presents a method for learning retrieval functions. From a theoretical perspective, this method is shown to be well-founded in a risk minimization framework. Furthermore, it is shown to be feasible even for large sets of queries and features. The theoretical results are verified in a controlled experiment. It shows that the method can effectively adapt the retrieval function of a meta-search engine to a particular group of users, outperforming Google in terms of retrieval quality after only a couple of hundred training examples.

Keyphrases

clickthrough data, search engine, retrieval function, retrieval quality, relevant document, meta-search engine, presented ranking, support vector machine, particular group, low cost, theoretical perspective, controlled experiment, good information retrieval system, large set, previous approach, theoretical result, hundred training example, risk minimization framework, relevance judgment
