Ranking on graph data
- In ICML, 2006
Cited by 22 (0 self)
In ranking, one is given examples of order relationships among objects, and the goal is to learn from these examples a real-valued ranking function that induces a ranking or ordering over the object space. We consider the problem of learning such a ranking function when the data is represented as a graph, in which vertices correspond to objects and edges encode similarities between objects. Building on recent developments in regularization theory for graphs and corresponding Laplacian-based methods for classification, we develop an algorithmic framework for learning ranking functions on graph data. We provide generalization guarantees for our algorithms via recent results based on the notion of algorithmic stability, and give experimental evidence of the potential benefits of our framework.
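As a rough illustration of what a Laplacian-regularized ranking objective can look like, the sketch below evaluates a pairwise hinge loss plus a graph smoothness penalty. The hinge loss, the unnormalized Laplacian, and the names `laplacian_penalty`, `graph_ranking_objective`, and `lam` are assumptions made for this example; the abstract does not specify them.

```python
def laplacian_penalty(f, edges):
    """Smoothness penalty f^T L f for the unnormalized graph Laplacian L.

    `edges` lists each undirected edge once as (i, j, weight), so the
    penalty equals the weighted sum of squared score differences.
    """
    return sum(w * (f[i] - f[j]) ** 2 for i, j, w in edges)


def graph_ranking_objective(f, edges, preferences, lam):
    """Average pairwise hinge loss plus lam times the Laplacian penalty.

    `preferences` is a list of (i, j) pairs meaning vertex i should be
    ranked above vertex j. The hinge loss is an assumed choice.
    """
    if not preferences:
        raise ValueError("need at least one preference pair")
    hinge = sum(max(0.0, 1.0 - (f[i] - f[j])) for i, j in preferences)
    return hinge / len(preferences) + lam * laplacian_penalty(f, edges)


# Path graph 0 - 1 - 2 with unit weights; vertex 0 should outrank vertex 2.
scores = [2.0, 1.0, 0.0]
path_edges = [(0, 1, 1.0), (1, 2, 1.0)]
objective = graph_ranking_objective(scores, path_edges, [(0, 2)], lam=0.5)
```

The penalty term is what ties the scores of similar (adjacent) objects together, so preference information on a few vertices can propagate to the rest of the graph.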
Name reference resolution in organizational email archives
- In SIAM, 2006
Cited by 18 (5 self)
Online communications provide a rich resource for understanding social networks. Information about the actors, and their dynamic roles and relationships, can be inferred from both the communication content and traffic structure. A key component in the analysis of online communications such as email is the resolution of name references within the body of the message. Name reference resolution relies on the context of the message; both the content of the message and the sender and recipients' relationships can help to resolve a reference. Here we investigate a variety of approaches which make use of the email traffic network to disambiguate email name references. The email traffic network serves as a proxy for inferring relationships. These relationships in turn help us infer likely candidates for the name references. Our initial findings suggest that simple temporal models can help us effectively resolve name references. For the class of models proposed, performance is maximized by exploiting long-term traffic statistics to rank candidates.
Estimating Class Membership Probabilities using Classifier Learners
Cited by 15 (4 self)
We present an algorithm, "Probing", which reduces learning an estimator of class probability membership to learning binary classifiers. The reduction comes with a theoretical guarantee: a small error rate for binary classification implies accurate estimation of class membership probabilities. We tested our reduction on several datasets with several classifier learning algorithms. The results show strong performance as compared to other common methods for obtaining class membership probability estimates from classifiers.
Stability and generalization of bipartite ranking algorithms
- Proceedings of the Eighteenth Annual Conference on Computational Learning Theory (COLT), 2005
Cited by 14 (2 self)
The problem of ranking, in which the goal is to learn a real-valued ranking function that induces a ranking or ordering over an instance space, has recently gained attention in machine learning. We study generalization properties of ranking algorithms, in a particular setting of the ranking problem known as the bipartite ranking problem, using the notion of algorithmic stability. In particular, we derive generalization bounds for bipartite ranking algorithms that have good stability properties. We show that kernel-based ranking algorithms that perform regularization in a reproducing kernel Hilbert space have such stability properties, and therefore our bounds can be applied to these algorithms; this is in contrast with previous generalization bounds for ranking, which are based on uniform convergence and in many cases cannot be applied to these algorithms. A comparison of the bounds we obtain with corresponding bounds for classification algorithms yields some interesting insights into the difference in generalization behaviour between ranking and classification.
Ranking with a p-norm push
- 2005
Cited by 11 (0 self)
We are interested in supervised ranking with the following twist: our goal is to design algorithms that perform especially well near the top of the ranked list, and are only required to perform sufficiently well on the rest of the list. Towards this goal, we provide a general form of convex objective that gives high-scoring examples more importance. This “push” near the top of the list can be chosen arbitrarily large or small. We choose ℓp-norms to provide a specific type of push; as p becomes large, the algorithm concentrates harder near the top of the list. We derive a generalization bound based on the p-norm objective. We then derive a corresponding boosting-style algorithm, and illustrate the usefulness of the algorithm through experiments on UCI data. We also prove that the minimizer of the objective is unique in a specific sense.
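To make the idea of a push concrete, here is a minimal sketch of one convex objective of this general type: for each negative example, sum an exponential loss over all positives, then raise that sum to the power p before adding across negatives. The exponential loss and the function name `p_norm_push_objective` are assumptions for illustration; the abstract itself only states that ℓp-norms supply the push.

```python
import math


def p_norm_push_objective(pos_scores, neg_scores, p):
    """Sum over negatives of (sum over positives of exp(-(s_pos - s_neg)))**p.

    A negative that scores high relative to the positives has a large
    inner sum, and raising it to the power p makes it dominate the total.
    """
    total = 0.0
    for s_neg in neg_scores:
        inner = sum(math.exp(-(s_pos - s_neg)) for s_pos in pos_scores)
        total += inner ** p
    return total


# Share of the objective due to the highest-scoring negative, for small and large p.
pos = [1.0, 1.0]
neg = [1.0, 0.0]
share_p1 = p_norm_push_objective(pos, [1.0], 1) / p_norm_push_objective(pos, neg, 1)
share_p4 = p_norm_push_objective(pos, [1.0], 4) / p_norm_push_objective(pos, neg, 4)
```

In this toy setting the top-scoring negative accounts for a larger share of the objective at p = 4 than at p = 1, which is the sense in which a larger p concentrates effort near the top of the list.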
Margin-based Ranking and an Equivalence between AdaBoost and RankBoost
- 2009
Cited by 9 (6 self)
We study boosting algorithms for learning to rank. We give a general margin-based bound for ranking based on covering numbers for the hypothesis space. Our bound suggests that algorithms that maximize the ranking margin will generalize well. We then describe a new algorithm, smooth margin ranking, that precisely converges to a maximum ranking-margin solution. The algorithm is a modification of RankBoost, analogous to “approximate coordinate ascent boosting.” Finally, we prove that AdaBoost and RankBoost are equally good for the problems of bipartite ranking and classification in terms of their asymptotic behavior on the training set. Under natural conditions, AdaBoost achieves an area under the ROC curve that is as good as RankBoost’s; furthermore, RankBoost, when given a specific intercept, achieves a misclassification error that is as good as AdaBoost’s. This may help to explain the empirical observations made by Cortes and Mohri, and Caruana and Niculescu-Mizil, about the excellent performance of AdaBoost as a bipartite ranking algorithm, as measured by the area under the ROC curve.
An Efficient Projection for l1,∞ Regularization
Cited by 7 (0 self)
In recent years the l1,∞ norm has been proposed for joint regularization. In essence, this type of regularization aims at extending the l1 framework for learning sparse models to a setting where the goal is to learn a set of jointly sparse models. In this paper we derive a simple and effective projected gradient method for optimization of l1,∞ regularized problems. The main challenge in developing such a method lies in computing projections onto the l1,∞ ball efficiently. We present an algorithm that works in O(n log n) time and O(n) memory, where n is the number of parameters. We test our algorithm in a multi-task image annotation problem. Our results show that l1,∞ leads to better performance than both l2 and l1 regularization and that it is effective in discovering jointly sparse solutions.
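For readers unfamiliar with the norm, the sketch below computes it for a small parameter matrix. It assumes the common convention that rows index features and columns index tasks, so the norm is the sum over rows of the largest absolute entry in each row; this layout and the function names are assumptions, and the projection algorithm itself is not reproduced here.

```python
def l1_inf_norm(W):
    """Sum over rows (features) of the maximum absolute value in the row."""
    return sum(max(abs(v) for v in row) for row in W)


def l1_norm(W):
    """Plain l1 norm: sum of absolute values of every entry."""
    return sum(abs(v) for row in W for v in row)


# Three features shared across two tasks; the middle feature is unused by both.
W = [
    [1.0, -3.0],
    [0.0, 0.0],
    [2.0, 0.5],
]
joint = l1_inf_norm(W)   # 3.0 + 0.0 + 2.0
plain = l1_norm(W)       # 1.0 + 3.0 + 0.0 + 0.0 + 2.0 + 0.5
```

Because only the largest entry in a row is charged, a feature already used by one task costs nothing extra for other tasks up to that magnitude, while an all-zero row costs nothing at all. That is the intuition behind joint sparsity.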
An Efficient Reduction of Ranking to Classification
- 2007
Cited by 7 (1 self)
This paper describes an efficient reduction of the learning problem of ranking to binary classification. The reduction is randomized and guarantees a pairwise misranking regret bounded by that of the binary classifier, improving on a recent result of Balcan et al. (2007) which ensures only twice that upper bound. Moreover, our reduction applies to a broader class of ranking loss functions, admits a simple proof, and the expected time complexity of our algorithm in terms of number of calls to a classifier or preference function is also improved from Ω(n²) to O(n log n). In addition, when only the top k ranked elements are required (k ≪ n), as in many applications in information extraction or search engine design, the time complexity of our algorithm can be further reduced to O(k log k + n). Our reduction and algorithm are thus practical for realistic applications where the number of points to rank exceeds several thousand. Many of our results also extend beyond the bipartite case previously studied. To further complement them, we also derive lower bounds for any deterministic reduction of ranking to binary classification, proving that randomization is necessary to achieve our reduction guarantees.
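One randomized procedure consistent with an expected O(n log n) number of preference calls is a quicksort that uses the learned pairwise preference function as its comparator. The abstract does not spell out the algorithm, so the sketch below is an illustrative assumption, and `rank_with_preference` and `prefers` are hypothetical names.

```python
import random


def rank_with_preference(items, prefers, rng=None):
    """Order items with a randomized quicksort driven by a pairwise classifier.

    `prefers(a, b)` returns True when a should be ranked above b. It need
    not be transitive; each element is compared once against each pivot.
    """
    rng = rng or random.Random()
    items = list(items)
    if len(items) <= 1:
        return items
    pivot_index = rng.randrange(len(items))
    pivot = items[pivot_index]
    above, below = [], []
    for k, item in enumerate(items):
        if k == pivot_index:
            continue
        # One call to the preference function per non-pivot element.
        (above if prefers(item, pivot) else below).append(item)
    return (rank_with_preference(above, prefers, rng)
            + [pivot]
            + rank_with_preference(below, prefers, rng))


# With a consistent preference (larger is better) the result is a full sort.
ranking = rank_with_preference([3, 1, 4, 1, 5, 9, 2, 6],
                               lambda a, b: a > b,
                               rng=random.Random(0))
```

When the preference function is a noisy classifier rather than a true order, the output depends on the random pivots, which is where randomization enters the procedure.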
Ranking and Empirical Minimization of U-Statistics
Cited by 5 (0 self)
The problem of ranking/ordering instances, instead of simply classifying them, has recently gained much attention in machine learning. In this paper we formulate the ranking problem in a rigorous statistical framework. The goal is to learn a ranking rule for deciding, among two instances, which one is “better,” with minimum ranking risk. Since the natural estimates of the risk are of the form of a U-statistic, results of the theory of U-processes are required for investigating the consistency of empirical risk minimizers. We establish, in particular, a tail inequality for degenerate U-processes, and apply it for showing that fast rates of convergence may be achieved under specific noise assumptions, just like in classification. Convex risk minimization methods are also studied.
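The U-statistic form of the empirical risk can be seen in a few lines: it is an average, over all ordered pairs of distinct instances, of an indicator that the rule orders the pair against their labels. The sketch assumes a rule that returns +1 when its first argument is judged better and -1 otherwise; the function name and this encoding are choices made for the example.

```python
from itertools import permutations


def empirical_ranking_risk(xs, ys, rule):
    """Fraction of ordered pairs (i, j), i != j, that the rule misorders.

    A pair is misordered when (ys[i] - ys[j]) * rule(xs[i], xs[j]) < 0.
    Averaging a pair-indexed kernel over all distinct pairs is what makes
    this a U-statistic of order two.
    """
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two instances")
    errors = sum(
        1
        for i, j in permutations(range(n), 2)
        if (ys[i] - ys[j]) * rule(xs[i], xs[j]) < 0
    )
    return errors / (n * (n - 1))


def by_value(a, b):
    """Toy rule: the larger input is declared better."""
    return 1 if a > b else -1


risk = empirical_ranking_risk([1, 2, 3], [1, 3, 2], by_value)
```

Here the rule misorders only the pair formed by the second and third instances, in both orientations, so two of the six ordered pairs count as errors.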
A large deviation bound for the area under the ROC curve
- In Advances in Neural Information Processing Systems 17, 2005
Cited by 4 (2 self)
The area under an ROC curve (AUC) has been advocated as an evaluation criterion for bipartite ranking problems. In this paper, we study large deviation properties of the AUC; in particular, we derive a distribution-free large deviation bound for the AUC which serves to bound the expected accuracy of a ranking function in terms of its empirical AUC on an independent test sequence. A comparison of our result with a corresponding large deviation result for the classification error rate suggests that the test sample size required to obtain an ɛ-accurate estimate of the expected accuracy of a ranking function with δ-confidence is larger than that required to obtain an ɛ-accurate estimate of the expected error rate of a classification function with the same confidence. A simple application of the union bound allows the large deviation bound to be extended to learned ranking functions chosen from finite function classes.
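The empirical AUC referred to here can be computed directly as the fraction of positive-negative pairs that a scoring function orders correctly. The sketch below uses the standard convention of counting ties as one half; the function name and the 0/1 label encoding are assumptions for the example.

```python
def empirical_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2.

    `labels` uses 1 for positive and 0 for negative instances.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    total = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    return total / (len(pos) * len(neg))


# Two positives (0.9, 0.3) and two negatives (0.8, 0.1): three of four pairs are correct.
auc = empirical_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

Note that the estimate averages over pairs rather than over individual instances, so its terms are not independent, which is one reason its deviation behaviour differs from that of a classification error rate.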

