From RankNet to LambdaRank to LambdaMART: An Overview
Cited by 11 (1 self)
LambdaMART is the boosted tree version of LambdaRank, which is based on RankNet. RankNet, LambdaRank, and LambdaMART have proven to be very successful algorithms for solving real-world ranking problems: for example, an ensemble of LambdaMART rankers won Track 1 of the 2010 Yahoo! Learning To Rank Challenge. The details of these algorithms are spread across several papers and reports, and so here we give a self-contained, detailed and complete description of them.
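As a rough illustration of the starting point of this family, the sketch below computes the pairwise cross-entropy cost with which RankNet is commonly described, for a pair of documents where the first should rank above the second. The function name and the `sigma` default are assumptions made for illustration; the abstract itself does not state the formula.

```python
import math


def ranknet_pair_loss(s_i: float, s_j: float, sigma: float = 1.0) -> float:
    """Pairwise cost log(1 + exp(-sigma * (s_i - s_j))).

    Assumes document i is labeled as more relevant than document j.
    s_i and s_j are the model scores; sigma scales the sigmoid
    (hypothetical default of 1.0).
    """
    x = -sigma * (s_i - s_j)
    # Numerically stable form of log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    # Correctly ordered pair: small cost. Mis-ordered pair: large cost.
    print(ranknet_pair_loss(2.0, 0.0))
    print(ranknet_pair_loss(0.0, 2.0))
```

The cost is small when the preferred document already has the higher score and grows roughly linearly with the score gap when the pair is mis-ordered.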
Classification-enhanced ranking
, 2010
Cited by 10 (6 self)
Many have speculated that classifying web pages can improve a search engine’s ranking of results. Intuitively, results should be more relevant when they match the class of a query. We present a simple framework for classification-enhanced ranking that uses clicks in combination with the classification of web pages to derive a class distribution for the query. We then define a variety of features that capture the match between the class distributions of a web page and a query, the ambiguity of a query, and the coverage of a retrieved result relative to a query’s set of classes. Experimental results demonstrate that a ranker learned with these features significantly improves ranking over a competitive baseline. Furthermore, our methodology is agnostic with respect to the classification space and can be used to derive query classes for a variety of different taxonomies.
A Machine Learning Approach for Improved BM25 Retrieval
Cited by 5 (1 self)
Despite the widespread use of BM25, there have been few studies examining its effectiveness on a document description over single and multiple field combinations. We determine the effectiveness of BM25 on various document fields. We find that BM25 models relevance on popularity fields such as anchor text and query click information no better than a linear function of the field attributes. We also find query click information to be the single most important field for retrieval. In response, we develop a machine learning approach to BM25-style retrieval that learns, using LambdaRank, from the input attributes of BM25. Our model significantly improves retrieval effectiveness over BM25 and BM25F. Our data-driven approach is fast, effective, avoids the problem of parameter tuning, and can directly optimize for several common information retrieval measures. We demonstrate the advantages of our model on a very large real-world Web data collection.
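For readers unfamiliar with the baseline, here is a minimal sketch of a single-field Okapi BM25 score in a common textbook form. It is not the learned model from the paper. The function name, argument names, the `k1` and `b` defaults, and the particular IDF variant are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_score(query_terms, doc_terms, doc_freqs, num_docs, avg_doc_len,
               k1=1.2, b=0.75):
    """Score one document against a query with single-field BM25.

    doc_freqs maps a term to the number of documents containing it.
    k1 and b are the usual free parameters (assumed defaults).
    """
    term_counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        tf = term_counts.get(term, 0)
        if tf == 0:
            continue  # terms absent from the document contribute nothing
        n = doc_freqs.get(term, 0)
        # One common non-negative IDF variant (an assumption).
        idf = math.log(1.0 + (num_docs - n + 0.5) / (n + 0.5))
        # Term-frequency saturation with document-length normalization.
        length_norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
        score += idf * tf * (k1 + 1.0) / (tf + length_norm)
    return score


if __name__ == "__main__":
    df = {"rank": 2}
    print(bm25_score(["rank"], ["rank", "a", "b", "c"], df, 10, 4.0))
```

The inputs to this formula (term frequency, document frequency, document length) are the kind of attributes the abstract describes feeding to a learned ranker in place of the fixed functional form.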
Extending Average Precision to Graded Relevance Judgments
Cited by 4 (0 self)
Evaluation metrics play a critical role both in the context of comparative evaluation of the performance of retrieval systems and in the context of learning-to-rank (LTR) as objective functions to be optimized. Many different evaluation metrics have been proposed in the IR literature, with average precision (AP) being the dominant one due to a number of desirable properties it possesses. However, most of these measures, including average precision, do not incorporate graded relevance. In this work, we propose a new measure of retrieval effectiveness, the Graded Average Precision (GAP). GAP generalizes average precision to the case of multi-graded relevance and inherits all the desirable characteristics of AP: it has a nice probabilistic interpretation, it approximates the area
Beyond Position Bias: Examining Result Attractiveness as a Source of Presentation Bias in Clickthrough Data
Cited by 3 (0 self)
Leveraging clickthrough data has become a popular approach for evaluating and optimizing information retrieval systems. Although data is plentiful, one must take care when interpreting clicks, since user behavior can be affected by various sources of presentation bias. While the issue of position bias in clickthrough data has been the topic of much study, other presentation bias effects have received comparatively little attention. For instance, since users must decide whether to click on a result based on its summary (e.g., the title, URL and abstract), one might expect clicks to favor “more attractive” results. In this paper, we examine result summary attractiveness as a potential source of presentation bias. This study distinguishes itself from prior work by aiming to detect systematic biases in click behavior due to attractive summaries inflating perceived relevance. Our experiments conducted on the Google web search engine show substantial evidence of presentation bias in clicks towards results with more attractive titles.
How good is a span of terms? Exploiting proximity to improve web retrieval
- In Proceedings of SIGIR 2010
Cited by 2 (0 self)
Ranking search results is a fundamental problem in information retrieval. In this paper we explore whether the use of proximity and phrase information can improve web retrieval accuracy. We build on existing research by incorporating novel ranking features based on flexible proximity terms with recent state-of-the-art machine learning ranking models. We introduce a method of determining the goodness of a set of proximity terms that takes advantage of the structured nature of web documents, document metadata, and phrasal information from search engine user query logs. We perform experiments on a large real-world Web data collection and show that using the goodness score of flexible proximity terms can improve ranking accuracy over state-of-the-art ranking methods by as much as 13%. We also show that we can improve accuracy on the hardest queries by as much as 9% relative to state-of-the-art approaches.
Ranking, Boosting, and Model Adaptation
, 2008
We present a new ranking algorithm that combines the strengths of two previous methods: boosted tree classification, and LambdaRank, which has been shown to be empirically optimal for a widely used information retrieval measure. The algorithm is based on boosted regression trees, although the ideas apply to any weak learners, and it is significantly faster in both train and test phases than the state of the art, for comparable accuracy. We also show how to find the optimal linear combination for any two rankers, and we use this method to solve the line search problem exactly during boosting. In addition, we show that starting with a previously trained model, and boosting using its residuals, furnishes an effective technique for model adaptation, and we give results for a particularly pressing problem in Web Search: training rankers for markets for which only small amounts of labeled data are available, given a ranker trained on much more data from a larger market.
NEW LEARNING FRAMEWORKS FOR INFORMATION RETRIEVAL
, 2011
Recent advances in machine learning have enabled the training of increasingly complex information retrieval models. This dissertation proposes principled approaches to formalize the learning problems for information retrieval, with an eye towards developing a unified learning framework. This will conceptually simplify the overall development process, making it easier to reason about higher level goals and properties of the retrieval system. This dissertation advocates two complementary approaches, structured prediction and interactive learning, to learn feature-rich retrieval models that can perform well in practice.
Predicting Query Performance on the Web
Predicting the performance of queries has many useful applications, such as automatic query reformulation and automatic spell correction. However, accurate and effective performance prediction on the Web is a challenge. In particular, measures such as Clarity that work well on homogeneous TREC-like collections are not as effective on the Web. In this paper, we develop an effective and efficient approach for online performance prediction on the Web. We propose the use of retrieval scores and aggregates of the rank-time features used by the document-ranking algorithm to train regressors for query performance prediction. For a set of more than 12,000 queries sampled from the query logs of a major search engine, our approach achieves a linear correlation of 0.78 with DCG, and 0.52 with NDCG. Analysis of the prediction effectiveness shows that (i) hard queries are easier to identify while easy queries are harder to identify, (ii) NDCG, a non-linear effectiveness measure, is much harder to predict than DCG, and (iii) performance prediction is easier for long queries than for short queries.
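To make the contrast between the two target measures concrete, the sketch below computes DCG and NDCG for one ranked list of graded relevance labels. It assumes the exponential gain `2^rel - 1` and a `log2` position discount, which is one common convention and may not match the exact definition used in the paper. The function names are hypothetical.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain of a ranked list of relevance grades.

    Uses gain 2^rel - 1 and discount log2(rank + 1), with rank starting
    at 1 (an assumed convention).
    """
    rels = relevances if k is None else relevances[:k]
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(rels))


def ndcg(relevances, k=None):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    ranking = [2, 3, 0, 1]  # relevance grades in ranked order
    print(dcg(ranking), ndcg(ranking))
```

Because NDCG divides by a per-query ideal DCG, it depends on the full set of labels for the query and not only on the gains accumulated down the list. That may be part of why the abstract reports it as harder to predict.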

