Results 11–20 of 78
Directly optimizing evaluation measures in learning to rank
In SIGIR ’08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
, 2008
Abstract

Cited by 17 (5 self)
One of the central issues in learning to rank for information retrieval is to develop algorithms that construct ranking models by directly optimizing evaluation measures used in information retrieval, such as Mean Average Precision (MAP) and Normalized Discounted Cumulative Gain (NDCG). Several such algorithms, including SVMmap and AdaRank, have been proposed and their effectiveness has been verified. However, the relationships between the algorithms are not clear, and no comparisons have been conducted between them. In this paper, we conduct a study on the approach of directly optimizing evaluation measures in learning to rank for Information Retrieval (IR). We focus on methods that minimize loss functions upper bounding the basic loss function defined on the IR measures. We first provide a general framework for the study and analyze the existing algorithms SVMmap and AdaRank within the framework. The framework is based on upper bound analysis, and two types of upper bounds are discussed. Moreover, we show that we can derive new algorithms on the basis of this analysis, and create one example algorithm called PermuRank. We have also conducted comparisons between SVMmap, AdaRank, PermuRank, and the conventional methods Ranking SVM and RankBoost, using benchmark datasets. Experimental results show that methods based on direct optimization of evaluation measures consistently outperform the conventional methods Ranking SVM and RankBoost. However, no significant difference exists among the performances of the direct optimization methods themselves.
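As a quick reference for the two measures this abstract names, here is a minimal per-query sketch of MAP and NDCG; the function names and the `rels` input convention (relevance labels listed in ranked order) are illustrative, not taken from the paper:

```python
import math

def average_precision(rels):
    """AP for one query: rels is a list of 0/1 labels in ranked order."""
    hits, total = 0, 0.0
    for i, r in enumerate(rels, start=1):
        if r:
            hits += 1
            total += hits / i  # precision at each relevant position
    return total / max(1, sum(rels))

def ndcg(rels, k=None):
    """NDCG for one query: rels are graded relevance labels in ranked order."""
    k = k or len(rels)
    dcg = lambda rs: sum((2 ** r - 1) / math.log2(i + 1)
                         for i, r in enumerate(rs[:k], start=1))
    ideal = dcg(sorted(rels, reverse=True))  # DCG of the perfect ordering
    return dcg(rels) / ideal if ideal > 0 else 0.0
```

MAP averages `average_precision` over queries; both measures depend only on the ranks of documents, which is what makes them hard to optimize directly.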
Early Exit Optimizations for Additive Machine Learned Ranking Systems
Abstract

Cited by 15 (3 self)
Some commercial web search engines rely on sophisticated machine learning systems for ranking web documents. Due to very large collection sizes and tight constraints on query response times, the online efficiency of these learning systems forms a bottleneck. An important problem in such systems is to speed up the ranking process without sacrificing much result quality. In this paper, we propose optimization strategies that allow short-circuiting score computations in additive learning systems. The strategies are evaluated over a state-of-the-art machine learning system and a large, real-life query log obtained from Yahoo!. With the proposed strategies, we are able to speed up the score computations by more than four times with almost no loss in result quality.
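One common way to realize such short-circuiting in an additive ensemble is to precompute, for each stage, an upper bound on the score the remaining stages can still contribute, and stop scoring a document once even that best case cannot reach a pruning threshold. The sketch below assumes this interface; it is an illustration of the idea, not the paper's actual implementation:

```python
def early_exit_score(doc_features, trees, max_remaining, threshold):
    """Score a document with an additive ensemble, exiting early when even
    the best-case remaining contribution cannot reach `threshold`.

    trees         -- list of scoring functions, each mapping features -> float
    max_remaining -- max_remaining[i] = upper bound on the total score of
                     stages i..end (precomputed offline); last entry is 0.0
    """
    score = 0.0
    for i, tree in enumerate(trees):
        score += tree(doc_features)
        if score + max_remaining[i + 1] < threshold:
            return None  # cannot make the top results; stop scoring early
    return score
```

The threshold would typically be the score of the current k-th best candidate, so documents that cannot enter the top k are abandoned after only a few stages.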
Ranking with Ordered Weighted Pairwise Classification
Abstract

Cited by 14 (3 self)
In ranking with the pairwise classification approach, the loss associated with a predicted ranked list is the mean of the pairwise classification losses. This loss is inadequate for tasks like information retrieval, where we prefer ranked lists with high precision at the top of the list. We propose to optimize a larger class of loss functions for ranking, based on an ordered weighted average (OWA) (Yager, 1988) of the classification losses. Convex OWA aggregation operators range from the max to the mean depending on their weights, and can be used to focus on the top-ranked elements, as they give more weight to the largest losses. When aggregating hinge losses, the optimization problem is similar to the SVM for interdependent output spaces. Moreover, we show that OWA aggregates of margin-based classification losses have good generalization properties. Experiments on the LETOR 3.0 benchmark dataset for information retrieval validate our approach.
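The OWA operator (Yager, 1988) the abstract refers to is simple to state: sort the values in decreasing order, then take a weighted sum. A minimal sketch; weights `[1, 0, ..., 0]` recover the max, uniform weights recover the mean, and weights concentrated at the front emphasize the largest losses:

```python
def owa(values, weights):
    """Ordered weighted average: sort values in decreasing order, then take
    the weighted sum.  weights should be non-negative and sum to 1."""
    return sum(w * v for w, v in zip(weights, sorted(values, reverse=True)))
```

Applied to per-pair classification losses, front-loaded weights make the aggregate focus on the worst-ranked (largest-loss) pairs, i.e., on precision at the top of the list.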
Trada: tree based ranking function adaptation
 In CIKM’08
, 2008
Abstract

Cited by 14 (3 self)
Machine-learned ranking approaches have shown success in web search engines. With the increasing demand for effective ranking functions across different search domains, a major bottleneck has emerged: insufficient training data, which has significantly limited the fast development and deployment of machine-learned ranking functions for different web search domains. In this paper, we propose a new approach called tree-based ranking function adaptation (“tree adaptation”) to address this problem. Tree adaptation assumes that ranking functions are trained with regression-tree-based modeling methods, such as Gradient Boosting Trees. It takes such a ranking function from one domain and tunes its tree-based structure with a small amount of training data from the target domain. Its unique features are that (1) it can automatically identify the part of the model that needs adjustment for the new domain, and (2) it can appropriately weight training examples by considering both local and global distributions. Experiments show that tree adaptation can provide better-quality ranking functions for a new domain compared to other modeling methods.
Ranking Refinement and Its Application to Information Retrieval
Abstract

Cited by 14 (4 self)
We consider the problem of ranking refinement, i.e., improving the accuracy of an existing ranking function with a small set of labeled instances. We are particularly interested in learning a better ranking function using two complementary sources of information: ranking information given by the existing ranking function (i.e., the base ranker) and that obtained from users' feedback. This problem is very important in information retrieval, where feedback is collected gradually. The key challenge in combining the two sources of information arises from the fact that the ranking information presented by the base ranker tends to be imperfect, while the ranking information obtained from users' feedback tends to be noisy. We present a novel boosting algorithm for ranking refinement that can effectively leverage both sources of information. Our empirical study shows that the proposed algorithm is effective for ranking refinement and, furthermore, significantly outperforms baseline algorithms that incorporate the outputs of the base ranker as an additional feature.
Learning to Rank by Optimizing NDCG Measure
Abstract

Cited by 14 (3 self)
Learning to rank is a relatively new field of study, aiming to learn a ranking function from a set of training data with relevance labels. Ranking algorithms are often evaluated using information retrieval measures, such as Normalized Discounted Cumulative Gain (NDCG) [1] and Mean Average Precision (MAP) [2]. Until recently, most learning to rank algorithms did not use a loss function related to the above-mentioned evaluation measures. The main difficulty in directly optimizing these measures is that they depend on the ranks of documents, not the numerical values output by the ranking function. We propose a probabilistic framework that addresses this challenge by optimizing the expectation of NDCG over all possible permutations of documents. A relaxation strategy is used to approximate the average of NDCG over the space of permutations, and a bound optimization approach is proposed to make the computation efficient. Extensive experiments show that the proposed algorithm outperforms state-of-the-art ranking algorithms on several benchmark data sets.
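The idea of taking an expectation of NDCG over permutations can be illustrated with a Plackett-Luce sampling model, where higher-scored documents are more likely to be drawn early. This Monte Carlo sketch is only an illustration of the expectation, not the paper's relaxation or bound optimization; it assumes at least one relevant document per query:

```python
import math
import random

def sample_permutation(scores, rng):
    """Draw a ranking from the Plackett-Luce distribution induced by scores."""
    items = list(range(len(scores)))
    perm = []
    while items:
        weights = [math.exp(scores[i]) for i in items]
        pick = rng.choices(items, weights=weights, k=1)[0]
        perm.append(pick)
        items.remove(pick)
    return perm

def expected_ndcg(scores, rels, n_samples=2000, rng=None):
    """Monte Carlo estimate of E[NDCG] over sampled rankings."""
    rng = rng or random.Random(0)
    ideal = sum((2 ** r - 1) / math.log2(p + 1)
                for p, r in enumerate(sorted(rels, reverse=True), start=1))
    total = 0.0
    for _ in range(n_samples):
        perm = sample_permutation(scores, rng)
        dcg = sum((2 ** rels[i] - 1) / math.log2(p + 1)
                  for p, i in enumerate(perm, start=1))
        total += dcg / ideal
    return total / n_samples
```

Unlike NDCG itself, this expectation is a smooth function of the scores, which is what makes gradient-based optimization possible.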
Large scale learning to rank
 In NIPS 2009 Workshop on Advances in Ranking
, 2009
Abstract

Cited by 13 (2 self)
Pairwise learning to rank methods such as RankSVM give good performance, but suffer from the computational burden of optimizing an objective defined over O(n^2) possible pairs for data sets with n examples. In this paper, we remove this superlinear dependence on training set size by sampling pairs from an implicit pairwise expansion and applying efficient stochastic gradient descent learners for approximate SVMs. Results show orders-of-magnitude reductions in training time with no observable loss in ranking performance. Source code is freely available at:
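The pair-sampling idea can be sketched as plain stochastic gradient descent on a regularized pairwise hinge loss: instead of materializing all O(n^2) pairs, each step draws one pair from a sampler. The `pairs_sampler` interface and hyperparameter values below are illustrative assumptions, not the paper's code:

```python
def rank_sgd(pairs_sampler, n_features, n_steps=10000, lr=0.01, lam=1e-4):
    """SGD on the pairwise hinge loss with L2 regularization.

    pairs_sampler() must return (x_pos, x_neg): feature vectors of a pair
    where x_pos should be ranked above x_neg.
    """
    w = [0.0] * n_features
    for _ in range(n_steps):
        x_pos, x_neg = pairs_sampler()
        diff = [a - b for a, b in zip(x_pos, x_neg)]
        margin = sum(wi * di for wi, di in zip(w, diff))
        for j in range(n_features):
            grad = lam * w[j]        # L2 regularization term
            if margin < 1.0:         # hinge loss is active for this pair
                grad -= diff[j]
            w[j] -= lr * grad
    return w
```

Each step touches only one sampled pair, so the cost per step is independent of the O(n^2) size of the implicit pairwise expansion.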
On Using Simultaneous Perturbation Stochastic Approximation for Learning to Rank, and the Empirical Optimality of LambdaRank
Abstract

Cited by 10 (4 self)
One shortfall of existing machine learning (ML) methods when applied to information retrieval (IR) is the inability to directly optimize for typical IR performance measures. This is in part due to the discrete nature, and thus non-differentiability, of these measures. When cast as an optimization problem, many methods require computing the gradient. In this paper, we explore conditions under which the gradient can be numerically estimated. We use Simultaneous Perturbation Stochastic Approximation as our gradient approximation method. We also examine the empirical optimality of LambdaRank, which has performed very well in practice.
Learning to Rank for Information Retrieval Using Genetic Programming
Abstract

Cited by 9 (0 self)
One central problem of information retrieval (IR) is to determine which documents are relevant to the user's information need and which are not. This problem is practically handled by a ranking function, which defines an ordering among documents according to their degree of relevance to the user query. This paper discusses work on using machine learning to automatically generate an effective ranking function for IR, a task referred to in the field as “learning to rank for IR”. In this paper, a learning method, RankGP, is presented to address this task. RankGP employs genetic programming to learn a ranking function by combining various types of evidence in IR, including content features, structure features, and query-independent features. The proposed method is evaluated using the LETOR benchmark datasets and found to be competitive with Ranking SVM and RankBoost.
Semi-Supervised Ensemble Ranking
Abstract

Cited by 9 (1 self)
Ranking plays a central role in many Web search and information retrieval applications. Ensemble ranking, sometimes called meta-search, aims to improve retrieval performance by combining the outputs of multiple ranking algorithms. Many ensemble ranking approaches employ supervised learning techniques to learn appropriate weights for combining multiple rankers. The main shortcoming of these approaches is that the learned weights for the ranking algorithms are query-independent. This is suboptimal, since a ranking algorithm may perform well for certain queries but poorly for others. In this paper, we propose a novel semi-supervised ensemble ranking (SSER) algorithm that learns query-dependent weights when combining multiple rankers in document retrieval. The proposed SSER algorithm is formulated as an SVM-like quadratic program (QP), and therefore can be solved efficiently by taking advantage of optimization techniques widely used in existing SVM solvers. We evaluated the proposed technique on a standard document retrieval testbed and observed encouraging results in comparison with a number of state-of-the-art techniques.