Results 11-20 of 91
Early Exit Optimizations for Additive Machine Learned Ranking Systems
Cited by 18 (4 self)
Some commercial web search engines rely on sophisticated machine learning systems for ranking web documents. Due to very large collection sizes and tight constraints on query response times, the online efficiency of these learning systems forms a bottleneck. An important problem in such systems is to speed up the ranking process without sacrificing much of the result quality. In this paper, we propose optimization strategies that allow short-circuiting score computations in additive learning systems. The strategies are evaluated over a state-of-the-art machine learning system and a large, real-life query log obtained from Yahoo!. With the proposed strategies, we are able to speed up the score computations by more than four times with almost no loss in result quality.
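The abstract does not spell out the short-circuiting strategies, so the following is only a minimal sketch of the general idea, under stated assumptions: the final score is a sum of per-stage scores, and a document is abandoned once its partial score falls below a threshold at a checkpoint. The function name, the `checkpoints` mapping, and the toy scorers are hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, Sequence, Tuple

Scorer = Callable[[Sequence[float]], float]


def score_with_early_exit(
    features: Sequence[float],
    scorers: Sequence[Scorer],
    checkpoints: Dict[int, float],
) -> Tuple[float, int]:
    """Sum the scorers in order and stop early at a failed checkpoint.

    `checkpoints` maps "number of scorers evaluated so far" to the minimum
    partial score required to keep going (an assumed exit rule).
    Returns (partial or full score, number of scorers evaluated).
    """
    total = 0.0
    for evaluated, scorer in enumerate(scorers, start=1):
        total += scorer(features)
        threshold = checkpoints.get(evaluated)
        if threshold is not None and total < threshold:
            return total, evaluated  # short-circuit: skip remaining scorers
    return total, len(scorers)


# Hypothetical three-stage additive model with one checkpoint after stage 1.
scorers = [lambda f: f[0], lambda f: f[1], lambda f: f[0] * f[1]]
checkpoints = {1: 0.5}

weak = score_with_early_exit([0.2, 0.9], scorers, checkpoints)    # exits after stage 1
strong = score_with_early_exit([0.8, 0.5], scorers, checkpoints)  # evaluates all stages
```

The saving comes from the stages that are never evaluated for weak candidates; the risk is that a document with a low partial score would have recovered later, which is why thresholds have to be tuned against result quality.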
Ranking with Ordered Weighted Pairwise Classification
Cited by 16 (3 self)
In ranking with the pairwise classification approach, the loss associated with a predicted ranked list is the mean of the pairwise classification losses. This loss is inadequate for tasks like information retrieval, where we prefer ranked lists with high precision at the top of the list. We propose to optimize a larger class of loss functions for ranking, based on an ordered weighted average (OWA) (Yager, 1988) of the classification losses. Convex OWA aggregation operators range from the max to the mean depending on their weights, and can be used to focus on the top-ranked elements as they give more weight to the largest losses. When aggregating hinge losses, the optimization problem is similar to the SVM for interdependent output spaces. Moreover, we show that OWA aggregates of margin-based classification losses have good generalization properties. Experiments on the Letor 3.0 benchmark dataset for information retrieval validate our approach.
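As a small illustration of how an OWA aggregate interpolates between the mean and the max, the sketch below applies weights to losses sorted in decreasing order. It assumes non-negative, non-increasing weights that sum to one and a standard hinge loss with margin 1; the function names and the example margins are illustrative, not the paper's notation.

```python
from typing import Sequence


def hinge(margin: float) -> float:
    """Standard hinge loss for a pairwise margin."""
    return max(0.0, 1.0 - margin)


def owa(losses: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average: weights[0] multiplies the largest loss."""
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    ordered = sorted(losses, reverse=True)
    return sum(w * l for w, l in zip(weights, ordered))


# Hypothetical pairwise margins for one ranked list.
losses = [hinge(m) for m in (2.0, 0.5, -1.0)]  # [0.0, 0.5, 2.0]

mean_loss = owa(losses, [1 / 3, 1 / 3, 1 / 3])  # uniform weights give the mean
max_loss = owa(losses, [1.0, 0.0, 0.0])         # all weight on the largest loss
between = owa(losses, [0.5, 0.3, 0.2])          # emphasizes the largest losses
```

With decreasing weights the aggregate always lies between the mean and the max of the losses, which is the sense in which it focuses on the worst pairs.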
RANKING AND EMPIRICAL MINIMIZATION OF U-STATISTICS
Cited by 16 (2 self)
The problem of ranking/ordering instances, instead of simply classifying them, has recently gained much attention in machine learning. In this paper we formulate the ranking problem in a rigorous statistical framework. The goal is to learn a ranking rule for deciding, among two instances, which one is “better,” with minimum ranking risk. Since the natural estimates of the risk are of the form of a U-statistic, results of the theory of U-processes are required for investigating the consistency of empirical risk minimizers. We establish, in particular, a tail inequality for degenerate U-processes, and apply it to show that fast rates of convergence may be achieved under specific noise assumptions, just like in classification. Convex risk minimization methods are also studied.
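To make the U-statistic form concrete, here is a minimal sketch of an empirical ranking risk computed as an average over all pairs of observations. It assumes the risk of a rule is the probability that it mis-orders a pair, that `rule(a, b)` returns a positive value when `a` is judged better than `b`, and that the rule is antisymmetric, so averaging over pairs with i < j suffices. The names are hypothetical.

```python
from itertools import combinations
from typing import Callable, Sequence


def empirical_ranking_risk(
    rule: Callable[[float, float], float],
    xs: Sequence[float],
    ys: Sequence[float],
) -> float:
    """Fraction of pairs (i < j) that the rule orders against the labels.

    This is an order-2 U-statistic: an average of a pairwise kernel over
    all distinct pairs of observations. Pairs with tied labels never count
    as errors under this (assumed) definition.
    """
    pairs = list(combinations(range(len(xs)), 2))
    if not pairs:
        raise ValueError("need at least two observations")
    errors = sum(
        1 for i, j in pairs if (ys[i] - ys[j]) * rule(xs[i], xs[j]) < 0
    )
    return errors / len(pairs)


xs = [1.0, 2.0, 3.0]
ys = [0, 1, 1]
good_risk = empirical_ranking_risk(lambda a, b: a - b, xs, ys)  # agrees with labels
bad_risk = empirical_ranking_risk(lambda a, b: b - a, xs, ys)   # reversed ordering
```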
Ranking Refinement and Its Application to Information Retrieval
Cited by 15 (4 self)
We consider the problem of ranking refinement, i.e., to improve the accuracy of an existing ranking function with a small set of labeled instances. We are particularly interested in learning a better ranking function using two complementary sources of information: ranking information given by the existing ranking function (i.e., the base ranker) and that obtained from users’ feedback. This problem is very important in information retrieval, where feedback is gradually collected. The key challenge in combining the two sources of information arises from the fact that the ranking information presented by the base ranker tends to be imperfect and the ranking information obtained from users’ feedback tends to be noisy. We present a novel boosting algorithm for ranking refinement that can effectively leverage both sources of information. Our empirical study shows that the proposed algorithm is effective for ranking refinement, and furthermore that it significantly outperforms the baseline algorithms that incorporate the outputs from the base ranker as an additional feature.
Learning to Rank by Optimizing NDCG Measure
Cited by 14 (3 self)
Learning to rank is a relatively new field of study, aiming to learn a ranking function from a set of training data with relevancy labels. The ranking algorithms are often evaluated using information retrieval measures, such as Normalized Discounted Cumulative Gain (NDCG) [1] and Mean Average Precision (MAP) [2]. Until recently, most learning to rank algorithms were not using a loss function related to the above-mentioned evaluation measures. The main difficulty in direct optimization of these measures is that they depend on the ranks of documents, not the numerical values output by the ranking function. We propose a probabilistic framework that addresses this challenge by optimizing the expectation of NDCG over all the possible permutations of documents. A relaxation strategy is used to approximate the average of NDCG over the space of permutations, and a bound optimization approach is proposed to make the computation efficient. Extensive experiments show that the proposed algorithm outperforms state-of-the-art ranking algorithms on several benchmark data sets.
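The dependence on ranks rather than scores is easy to see in a direct computation of NDCG. The sketch below assumes one common variant (gain 2^rel - 1 and a log2 position discount); the abstract does not say which variant the paper uses, and the function names are illustrative.

```python
import math
from typing import Optional, Sequence


def dcg(relevances: Sequence[int], k: Optional[int] = None) -> float:
    """Discounted cumulative gain of relevance labels listed in ranked order."""
    top = relevances[:k] if k is not None else relevances
    return sum((2 ** rel - 1) / math.log2(pos + 2) for pos, rel in enumerate(top))


def ndcg(relevances: Sequence[int], k: Optional[int] = None) -> float:
    """DCG normalized by the DCG of the ideal (label-sorted) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


perfect = ndcg([3, 2, 1, 0])   # labels already in the ideal order
reversed_ = ndcg([0, 1, 2, 3]) # same documents, worst possible order
```

Only the order of the labels enters the computation, so any two score vectors that induce the same ranking get the same NDCG. This makes the measure piecewise constant in the scores, which is the difficulty the abstract refers to.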
Trada: tree based ranking function adaptation
In CIKM’08, 2008
Cited by 14 (3 self)
Machine Learned Ranking approaches have shown successes in web search engines. With the increasing demands on developing effective ranking functions for different search domains, we have seen a big bottleneck, i.e., the problem of insufficient training data, which has significantly limited the fast development and deployment of machine learned ranking functions for different web search domains. In this paper, we propose a new approach called tree based ranking function adaptation (“tree adaptation”) to address this problem. Tree adaptation assumes that ranking functions are trained with regression-tree-based modeling methods, such as Gradient Boosting Trees. It takes such a ranking function from one domain and tunes its tree-based structure with a small amount of training data from the target domain. Its unique features are that (1) it can automatically identify the part of the model that needs adjustment for the new domain, and (2) it can appropriately weight training examples considering both local and global distributions. Experiments are performed to show that tree adaptation can provide better-quality ranking functions for a new domain, compared to other modeling methods.
Large scale learning to rank
In NIPS 2009 Workshop on Advances in Ranking, 2009
Cited by 13 (2 self)
Pairwise learning to rank methods such as RankSVM give good performance, but suffer from the computational burden of optimizing an objective defined over O(n^2) possible pairs for data sets with n examples. In this paper, we remove this superlinear dependence on training set size by sampling pairs from an implicit pairwise expansion and applying efficient stochastic gradient descent learners for approximate SVMs. Results show orders-of-magnitude reduction in training time with no observable loss in ranking performance. Source code is freely available at:
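The idea of sampling pairs instead of enumerating all of them can be sketched as follows. This is a generic illustration, not the paper's released code: it assumes a linear scoring function, uniform sampling of index pairs with rejection of ties, and a plain subgradient step on an L2-regularized pairwise hinge loss. All names and hyperparameter values are hypothetical.

```python
import random
from typing import List, Sequence


def train_pairwise_sgd(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    steps: int = 2000,
    lr: float = 0.1,
    lam: float = 0.01,
    seed: int = 0,
) -> List[float]:
    """Learn linear ranking weights from randomly sampled preference pairs."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        i, j = rng.randrange(n), rng.randrange(n)
        if y[i] == y[j]:
            continue  # tied labels carry no preference; draw again next step
        if y[i] < y[j]:
            i, j = j, i  # make i the preferred example
        diff = [a - b for a, b in zip(X[i], X[j])]
        margin = sum(wk * dk for wk, dk in zip(w, diff))
        violated = margin < 1.0
        # Subgradient of lam/2 * ||w||^2 + max(0, 1 - w . (x_i - x_j)).
        for k in range(d):
            grad = lam * w[k] - (diff[k] if violated else 0.0)
            w[k] -= lr * grad
    return w


# Toy data: a single feature that increases with the relevance label.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 1, 2, 3]
w = train_pairwise_sgd(X, y)
```

Each step touches one pair, so the cost is governed by the number of steps rather than by the O(n^2) size of the full pairwise expansion.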
On Using Simultaneous Perturbation Stochastic Approximation for Learning to Rank, and the Empirical Optimality of LambdaRank
Cited by 10 (4 self)
One shortfall of existing machine learning (ML) methods when applied to information retrieval (IR) is the inability to directly optimize for typical IR performance measures. This is in part due to the discrete nature, and thus non-differentiability, of these measures. When cast as an optimization problem, many methods require computing the gradient. In this paper, we explore conditions where the gradient might be numerically estimated. We use Simultaneous Perturbation Stochastic Approximation as our gradient approximation method. We also examine the empirical optimality of LambdaRank, which has performed very well in practice.
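For readers unfamiliar with Simultaneous Perturbation Stochastic Approximation, the sketch below shows the standard two-evaluation gradient estimate: all coordinates are perturbed at once by a random ±1 vector, and the same function difference is reused for every coordinate. The objective here is a smooth toy function chosen so the estimate can be checked; how the paper applies the estimate to IR measures is not described in the abstract. The function name and the step size `c` are assumptions.

```python
import random
from typing import Callable, List, Sequence


def spsa_gradient(
    f: Callable[[Sequence[float]], float],
    theta: Sequence[float],
    c: float = 0.1,
    rng=random,
) -> List[float]:
    """One SPSA gradient estimate using two evaluations of f."""
    delta = [rng.choice((-1.0, 1.0)) for _ in theta]  # random +/-1 directions
    plus = [t + c * d for t, d in zip(theta, delta)]
    minus = [t - c * d for t, d in zip(theta, delta)]
    diff = f(plus) - f(minus)
    return [diff / (2.0 * c * d) for d in delta]


def quadratic(v: Sequence[float]) -> float:
    return sum(x * x for x in v)


# Averaging many noisy estimates approaches the true gradient [2, -4].
rng = random.Random(0)
theta = [1.0, -2.0]
draws = [spsa_gradient(quadratic, theta, rng=rng) for _ in range(4000)]
avg = [sum(g[k] for g in draws) / len(draws) for k in range(len(theta))]
```

A single estimate is noisy in more than one dimension, but its cost does not grow with the number of parameters, unlike coordinate-wise finite differences.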
Optimizing Estimated Loss Reduction for Active Sampling in Rank Learning
Cited by 10 (3 self)
Learning to rank is becoming an increasingly popular research area in machine learning. The ranking problem aims to induce an ordering or preference relations among a set of instances in the input space. However, collecting labeled data is growing into a burden in many rank applications since labeling requires eliciting the relative ordering over the set of alternatives. In this paper, we propose a novel active learning framework for SVM-based and boosting-based rank learning. Our approach suggests sampling based on maximizing the estimated loss differential over unlabeled data. Experimental results on two benchmark corpora show that the proposed model substantially reduces the labeling effort, and achieves superior performance rapidly with as much as 30% relative improvement over the margin-based sampling baseline.
Learning to Rank for Information Retrieval Using Genetic Programming
Cited by 9 (0 self)
One central problem of information retrieval (IR) is to determine which documents are relevant to the user's information need and which are not. This problem is handled in practice by a ranking function, which defines an ordering among documents according to their degree of relevance to the user query. This paper discusses work on using machine learning to automatically generate an effective ranking function for IR. This task is referred to as “learning to rank for IR” in the field. In this paper, a learning method, RankGP, is presented to address this task. RankGP employs genetic programming to learn a ranking function by combining various types of evidence in IR, including content features, structure features, and query-independent features. The proposed method is evaluated using the LETOR benchmark datasets and found to be competitive with Ranking SVM and RankBoost.