Results 1-10 of 42
Iterative ranking from pairwise comparisons
Advances in Neural Information Processing Systems 25 (NIPS), 2012
Cited by 31 (3 self)
The question of aggregating pairwise comparisons to obtain a global ranking over a collection of objects has been of interest for a very long time: be it ranking of online gamers (e.g. MSR’s TrueSkill system) and chess players, aggregating social opinions, or deciding which product to sell based on transactions. In most settings, in addition to obtaining a ranking, finding ‘scores’ for each object (e.g. player’s rating) is of interest to understanding the intensity of the preferences. In this paper, we propose a novel iterative rank aggregation algorithm for discovering scores for objects from pairwise comparisons. The algorithm has a natural random walk interpretation over the graph of objects with edges present between two objects if they are compared; the scores turn out to be the stationary probability of this random walk. The algorithm is model independent. To establish the efficacy of our method, however, we consider the popular Bradley-Terry-Luce (BTL) model in which each object has an associated score which determines the probabilistic outcomes of pairwise comparisons between objects. We bound the finite sample error rates between the scores assumed by the BTL model and those estimated by our algorithm. This, in essence, leads to order-optimal dependence on the number of samples required to learn the scores well by our algorithm. Indeed, the experimental evaluation shows that our (model independent) algorithm performs as well as the Maximum Likelihood Estimator of the BTL model and outperforms a recently proposed algorithm by Ammar and Shah [1].
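The random walk interpretation in this abstract can be illustrated with a small sketch. It is not taken from the paper: the function name `random_walk_scores`, the `(winner, loser)` input format, and the normalization by the maximum number of compared neighbours are assumptions made here to obtain a valid transition matrix. The walk moves from item i to item j with probability proportional to the fraction of i-versus-j comparisons that j won, and the scores are read off as the stationary distribution.

```python
import numpy as np

def random_walk_scores(n, comparisons, iters=1000):
    """Score items 0..n-1 from a list of (winner, loser) pairs.

    Illustrative sketch only: builds a random walk on the comparison
    graph and returns an approximation of its stationary distribution.
    """
    wins = np.zeros((n, n))  # wins[i, j] = number of times i beat j
    for winner, loser in comparisons:
        wins[winner, loser] += 1
    total = wins + wins.T  # total[i, j] = number of i-vs-j comparisons
    # frac[i, j] = fraction of i-vs-j comparisons won by j (0 if never compared)
    frac = np.divide(wins.T, total, out=np.zeros_like(wins), where=total > 0)
    # Dividing by the largest number of compared neighbours keeps row sums <= 1.
    d_max = max(1, int((total > 0).sum(axis=1).max()))
    P = frac / d_max
    P[np.diag_indices(n)] = 1.0 - P.sum(axis=1)  # self-loops absorb the remainder
    pi = np.full(n, 1.0 / n)
    for _ in range(iters):  # power iteration: pi <- pi P
        pi = pi @ P
    return pi / pi.sum()
```

Under these assumptions, an item that wins most of its comparisons accumulates more stationary mass, so sorting by the returned vector gives a ranking and the values themselves play the role of scores.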
Randomized Shellsort: A simple oblivious sorting algorithm
In Proceedings of the 21st ACM-SIAM Symposium on Discrete Algorithms (SODA), 2010
Cited by 28 (8 self)
In this paper, we describe a randomized Shellsort algorithm. This algorithm is a simple, randomized, data-oblivious version of the Shellsort algorithm that always runs in O(n log n) time and succeeds in sorting any given input permutation with very high probability. Taken together, these properties imply applications in the design of new efficient privacy-preserving computations based on the secure multi-party computation (SMC) paradigm. In addition, by a trivial conversion of this Monte Carlo algorithm to its Las Vegas equivalent, one gets the first version of Shellsort with a running time that is provably O(n log n) with very high probability.
Correlation Clustering with Noisy Input
Cited by 23 (1 self)
Correlation clustering is a type of clustering that uses a basic form of input data: For every pair of data items, the input specifies whether they are similar (belonging to the same cluster) or dissimilar (belonging to different clusters). This information may be inconsistent, and the goal is to find a clustering (partition of the vertices) that disagrees with as few pieces of information as possible. Correlation clustering is APX-hard for worst-case inputs. We study the following semi-random noisy model to generate the input: start from an arbitrary partition of the vertices into clusters. Then, for each pair of vertices, the similarity information is corrupted (noisy) independently with probability p. Finally, an adversary generates the input by choosing similarity/dissimilarity information arbitrarily for each corrupted pair of vertices. In this model, our algorithm produces a clustering with cost at most $1 + O(n^{-1/6})$ times the cost of the optimal clustering, as long as $p \le 1/2 - n^{-1/3}$. Moreover, if all clusters have size at least $c_1 \sqrt{n}$ then we can exactly reconstruct the planted clustering. If the noise p is small, that is, $p \le n^{-\delta}/60$, then we can exactly reconstruct all clusters of the planted clustering that have size at least $3150/\delta$, and provide a certificate (witness) proving that those clusters are in any optimal clustering. Among other techniques, we use the natural semidefinite programming relaxation followed by an interesting rounding phase. The analysis uses SDP duality and spectral properties of random matrices.
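The objective described in this abstract, the number of pairwise labels a clustering disagrees with, can be written down directly. The sketch below is illustrative only: the function name `disagreements` and the dictionary-based input format are assumptions made here, and it evaluates the cost of a given clustering rather than implementing the paper's SDP-based algorithm.

```python
from itertools import combinations

def disagreements(labels, similar):
    """Count the pairs on which a clustering contradicts the input.

    labels:  dict mapping each item to its cluster id.
    similar: dict mapping frozenset({u, v}) to True (similar) or False (dissimilar).
    """
    cost = 0
    for u, v in combinations(sorted(labels), 2):
        same_cluster = labels[u] == labels[v]
        # A disagreement is a similar pair that is split,
        # or a dissimilar pair that is placed together.
        if similar[frozenset((u, v))] != same_cluster:
            cost += 1
    return cost
```

With inconsistent input (a similar to b, b similar to c, but a dissimilar to c), no clustering reaches cost zero, which is the situation the abstract refers to.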
When Do Noisy Votes Reveal the Truth?
2013
Cited by 20 (6 self)
A well-studied approach to the design of voting rules views them as maximum likelihood estimators; given votes that are seen as noisy estimates of a true ranking of the alternatives, the rule must reconstruct the most likely true ranking. We argue that this is too stringent a requirement, and instead ask: How many votes does a voting rule need to reconstruct the true ranking? We define the family of pairwise-majority consistent rules, and show that for all rules in this family the number of samples required from the Mallows noise model is logarithmic in the number of alternatives, and that no rule can do asymptotically better (while some rules like plurality do much worse). Taking a more normative point of view, we consider voting rules that surely return the true ranking as the number of samples tends to infinity (we call this property accuracy in the limit); this allows us to move to a higher level of abstraction. We study families of noise models that are parametrized by distance functions, and find voting rules that are accurate in the limit for all noise models in such general families. We characterize the distance functions that induce noise models for which pairwise-majority consistent rules are accurate in the limit, and provide a similar result for another novel family of position-dominance consistent rules. These characterizations capture three well-known distance functions.
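As a rough illustration of the pairwise-majority idea, the sketch below computes the majority relation from a profile of votes and returns a ranking only when that relation is a strict total order. The function name and the choice to return `None` on ties or cycles are assumptions made here; the abstract does not specify what a pairwise-majority consistent rule does in those cases.

```python
from itertools import combinations

def pairwise_majority_ranking(votes):
    """votes: list of rankings over the same alternatives, best first.

    Returns the ranking that agrees with every pairwise majority,
    or None if the majority relation has a tie or a cycle.
    """
    alts = votes[0]
    positions = [{a: i for i, a in enumerate(v)} for v in votes]
    beats = {a: 0 for a in alts}  # number of alternatives each one beats by majority
    for a, b in combinations(alts, 2):
        a_pref = sum(1 for p in positions if p[a] < p[b])
        b_pref = len(votes) - a_pref
        if a_pref == b_pref:
            return None  # tied pair: no strict majority
        beats[a if a_pref > b_pref else b] += 1
    ranking = sorted(alts, key=lambda a: -beats[a])
    # A tournament is transitive exactly when the win counts are n-1, n-2, ..., 0.
    if [beats[a] for a in ranking] != list(range(len(alts) - 1, -1, -1)):
        return None
    return ranking
```

The Condorcet cycle in three votes (a, b, c), (b, c, a), (c, a, b) is the standard case in which no such ranking exists.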
A maximum likelihood approach for selecting sets of alternatives
In Proceedings of the 28th Annual Conference on Uncertainty in Artificial Intelligence (UAI), 2012
Cited by 20 (9 self)
We consider the problem of selecting a subset of alternatives given noisy evaluations of the relative strength of different alternatives. We wish to select a k-subset (for a given k) that provides a maximum likelihood estimate for one of several objectives, e.g., containing the strongest alternative. Although this problem is NP-hard, we show that when the noise level is sufficiently high, intuitive methods provide the optimal solution. We thus generalize classical results about singling out one alternative and identifying the hidden ranking of alternatives by strength. Extensive experiments show that our methods perform well in practical settings.
Efficient Ranking from Pairwise Comparisons
Cited by 14 (1 self)
The ranking of n objects based on pairwise comparisons is a core machine learning problem, arising in recommender systems, ad placement, player ranking, biological applications and others. In many practical situations the true pairwise comparisons cannot be actively measured, but a subset of all n(n−1)/2 comparisons is passively and noisily observed. Optimization algorithms (e.g., the SVM) could be used to predict a ranking with fixed expected Kendall tau distance, while achieving an Ω(n) lower bound on the corresponding sample complexity. However, due to their centralized structure they are difficult to extend to online or distributed settings. In this paper we show that much simpler algorithms can match the same Ω(n) lower bound in expectation. Furthermore, if an average of O(n log(n)) binary comparisons are measured, then one algorithm recovers the true ranking in a uniform sense, while the other predicts the ranking more accurately near the top than the bottom. We discuss extensions to online and distributed ranking, with benefits over traditional alternatives.
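The Kendall tau distance used as the error measure in this abstract counts the pairs of objects that two rankings order differently. A minimal sketch follows, with the function name and the list-based ranking format chosen here for illustration; it uses the quadratic pair enumeration rather than a faster merge-sort count.

```python
from itertools import combinations

def kendall_tau_distance(r1, r2):
    """Number of pairs ordered differently by rankings r1 and r2.

    Both arguments are lists of the same objects, best first.
    """
    pos2 = {x: i for i, x in enumerate(r2)}
    # combinations(r1, 2) yields (a, b) with a ranked above b in r1;
    # the pair is discordant when r2 ranks b above a.
    return sum(1 for a, b in combinations(r1, 2) if pos2[a] > pos2[b])
```

The distance ranges from 0 for identical rankings to n(n−1)/2 for a fully reversed ranking, matching the total number of pairs mentioned in the abstract.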
Faster Algorithms for Feedback Arc Set Tournament, Kemeny Rank Aggregation and Betweenness Tournament
2010
Fixing a tournament
In Proceedings of AAAI’10, 2010
Cited by 13 (5 self)
We consider a very natural problem concerned with game manipulation. Let G be a directed graph where the nodes represent players of a game, and an edge from u to v means that u can beat v in the game. (If an edge (u, v) is not present, one cannot match u and v.) Given G and a “favorite” node A, is it possible to set up the bracket of a balanced single-elimination tournament so that A is guaranteed to win, if matches occur as predicted by G? We show that the problem is NP-complete for general graphs. For the case when G is a tournament graph we give several interesting conditions on the desired winner A for which there exists a balanced single-elimination tournament which A wins, and it can be found in polynomial time.
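The decision problem in this abstract can be made concrete with a small sketch. The helper names `bracket_winner` and `can_win`, and the representation of G as a set of (u, v) pairs, are assumptions made here. The search simply tries every seeding, so it takes factorial time and is only meant for tiny instances; it is not the polynomial-time construction the abstract mentions for tournament graphs.

```python
from itertools import permutations

def bracket_winner(bracket, beats):
    """Play out a balanced single-elimination bracket.

    bracket: sequence of players whose length is a power of two, in seeding order.
    beats:   set of (u, v) pairs meaning u beats v.
    """
    players = list(bracket)
    while len(players) > 1:
        next_round = []
        for u, v in zip(players[0::2], players[1::2]):
            if (u, v) in beats:
                next_round.append(u)
            elif (v, u) in beats:
                next_round.append(v)
            else:
                # No edge in either direction: this pair cannot be matched.
                raise ValueError(f"{u} and {v} cannot be matched")
        players = next_round
    return players[0]

def can_win(players, beats, favorite):
    """Brute-force check: does some seeding make `favorite` the winner?"""
    for bracket in permutations(players):
        try:
            if bracket_winner(bracket, beats) == favorite:
                return True
        except ValueError:
            continue  # this seeding requires an unmatchable pair
    return False
```

For example, a player who beats only one opponent, and whose single victim cannot eliminate everyone else, has no winning bracket in a four-player tournament.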
Active learning using smooth relative regret approximations with applications
2000
Cited by 10 (3 self)
The disagreement coefficient of Hanneke has become a central data-independent invariant in proving active learning rates. It has been shown in various ways that a concept class with low complexity together with a bound on the disagreement coefficient at an optimal solution allows active learning rates that are superior to passive learning ones. We present a different tool for pool-based active learning which follows from the existence of a certain uniform version of low disagreement coefficient, but is not equivalent to it. In fact, we present two fundamental active learning problems of significant interest for which our approach allows nontrivial active learning bounds. However, any general purpose method relying on the disagreement coefficient bounds only fails to guarantee any useful bounds for these problems. The applications of interest are: Learning to rank from pairwise preferences, and clustering with side information (a.k.a. semi-supervised clustering). The tool we use is based on the learner’s ability to compute an estimator of the difference between the loss of any hypothesis and some fixed “pivotal” hypothesis to within an absolute error of at most ε times the disagreement measure (ℓ1 distance) between the two hypotheses. We prove that such an estimator implies the existence of a learning algorithm which, at each iteration, reduces its in-class excess risk to within a constant factor. Each iteration replaces the current pivotal hypothesis with the minimizer of the estimated loss difference function with respect to the previous pivotal hypothesis. The label complexity essentially becomes that of computing this estimator.
Active Learning Ranking from Pairwise Preferences with Almost Optimal Query Complexity
Cited by 10 (0 self)
Given a set V of n elements we wish to linearly order them using pairwise preference labels which may be non-transitive (due to irrationality or arbitrary noise). The goal is to linearly order the elements while disagreeing with as few pairwise preference labels as possible. Our performance is measured by two parameters: The number of disagreements (loss) and the query complexity (number of pairwise preference labels). Our algorithm adaptively queries at most $O(n \,\mathrm{poly}(\log n, \varepsilon^{-1}))$ preference labels for a regret of ε times the optimal loss. This is strictly better, and often significantly better than what non-adaptive sampling could achieve. Our main result helps settle an open problem posed by learning-to-rank (from pairwise information) theoreticians and practitioners: What is a provably correct way to sample preference labels?