Results 1–10 of 19
How to rank with few errors
, 2007
"... We present a polynomial time approximation scheme (PTAS) for the minimum feedback arc set problem on tournaments. A simple weighted generalization gives a PTAS for KemenyYoung rank aggregation. ..."
Abstract

Cited by 50 (2 self)
 Add to MetaCart
We present a polynomial time approximation scheme (PTAS) for the minimum feedback arc set problem on tournaments. A simple weighted generalization gives a PTAS for Kemeny-Young rank aggregation.
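The objective the PTAS approximates can be stated via vertex orderings: deleting all "backward" arcs of an ordering makes a tournament acyclic, so the minimum feedback arc set is the minimum back-arc count over orderings. A small sketch of that objective (function names `back_arcs` and `min_fas` are mine; the exhaustive search is for illustration only, not the paper's algorithm):

```python
from itertools import permutations

def back_arcs(order, arcs):
    # Arcs pointing from a later vertex to an earlier one in the ordering;
    # deleting them leaves an acyclic digraph.
    pos = {v: i for i, v in enumerate(order)}
    return sum(1 for u, v in arcs if pos[u] > pos[v])

def min_fas(vertices, arcs):
    # Exhaustive search, feasible only for tiny tournaments.
    return min(back_arcs(p, arcs) for p in permutations(vertices))

# A cyclic triangle: every ordering leaves exactly one backward arc.
cyclic = [(0, 1), (1, 2), (2, 0)]
```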
Deterministic algorithms for rank aggregation and other ranking and clustering problems
 In Proceedings of the Fifth International Workshop on Approximation and Online Algorithms
, 2007
"... Abstract. We consider ranking and clustering problems related to the aggregation of inconsistent information. Ailon, Charikar, and Newman [1] proposed randomized constant factor approximation algorithms for these problems. Together with Hegde and Jain, we recently proposed deterministic versions of ..."
Abstract

Cited by 23 (2 self)
 Add to MetaCart
We consider ranking and clustering problems related to the aggregation of inconsistent information. Ailon, Charikar, and Newman [1] proposed randomized constant-factor approximation algorithms for these problems. Together with Hegde and Jain, we recently proposed deterministic versions of some of these randomized algorithms [2]. With one exception, these algorithms required the solution of a linear programming relaxation. In this paper, we introduce a purely combinatorial deterministic pivoting algorithm for weighted ranking problems with weights that satisfy the triangle inequality; our analysis is quite simple. We then show how to use this algorithm to get the first deterministic combinatorial approximation algorithm for the partial rank aggregation problem with a performance guarantee better than 2. In addition, we extend our approach to the linear-programming-based algorithms in Ailon et al. [1] and Ailon [3]. Finally, we show that constrained rank aggregation is not harder than unconstrained rank aggregation.
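The pivoting scheme underlying such algorithms is quicksort-like: pick a pivot element, place every element that should precede it on the left and the rest on the right, and recurse. A minimal sketch (taking the first element as pivot, a simplification; the paper's deterministic variant chooses pivots more carefully, and `prefer` here is a hypothetical majority rule):

```python
def pivot_rank(items, prefer):
    # prefer(u, v) is True when u should come before v.
    if not items:
        return []
    pivot, rest = items[0], items[1:]
    left = [v for v in rest if prefer(v, pivot)]
    right = [v for v in rest if not prefer(v, pivot)]
    return pivot_rank(left, prefer) + [pivot] + pivot_rank(right, prefer)

# Majority preference over three toy input rankings.
rankings = [("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c")]

def majority(u, v):
    wins = sum(r.index(u) < r.index(v) for r in rankings)
    return wins * 2 > len(rankings)

order = pivot_rank(["b", "a", "c"], majority)
```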
Fast FAST
"... Abstract. We present a randomized subexponential time, polynomial space parameterized algorithm for the kWeighted Feedback Arc Set in Tournaments (kFAST) problem. We also show that our algorithm can be derandomized by slightly increasing the running time. To derandomize our algorithm we construct ..."
Abstract

Cited by 11 (3 self)
 Add to MetaCart
We present a randomized subexponential-time, polynomial-space parameterized algorithm for the k-Weighted Feedback Arc Set in Tournaments (k-FAST) problem. We also show that our algorithm can be derandomized by slightly increasing the running time. To derandomize our algorithm we construct a new kind of universal hash functions, which we coin universal coloring families. For integers m, k and r, a family F of functions from [m] to [r] is called a universal (m, k, r)-coloring family if for any graph G on the set of vertices [m] with at most k edges, there exists an f ∈ F which is a proper vertex coloring of G. Our algorithm is the first nontrivial subexponential-time parameterized algorithm outside the framework of bidimensionality.
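The defining property of a universal coloring family can be checked directly for any single graph: some member of the family must be a proper vertex coloring of it. A small sketch under that definition (the toy family and graph below are my own illustration, not from the paper):

```python
def is_proper_coloring(f, edges):
    # f maps vertices to colors; proper iff no edge is monochromatic.
    return all(f[u] != f[v] for u, v in edges)

def family_covers(family, edges):
    # The universal-coloring-family property, checked for one graph G:
    # some f in the family properly colors G.
    return any(is_proper_coloring(f, edges) for f in family)

# A toy family of functions [3] -> [2] and a path on 3 vertices (2 edges).
family = [{0: 0, 1: 0, 2: 0}, {0: 0, 1: 1, 2: 0}]
path = [(0, 1), (1, 2)]
```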
Average Parameterization and Partial Kernelization for Computing Medians
 In Proc. 9th LATIN
, 2010
"... We propose an effective polynomialtime preprocessing strategy for intractable median problems. Developing a new methodological framework, we show that if the input instances of generally intractable problems exhibit a sufficiently high degree of similarity between each other on average, then there ..."
Abstract

Cited by 9 (7 self)
 Add to MetaCart
We propose an effective polynomial-time preprocessing strategy for intractable median problems. Developing a new methodological framework, we show that if the input instances of generally intractable problems exhibit a sufficiently high degree of similarity to each other on average, then there are efficient exact solving algorithms. In other words, we show that the median problems Swap Median Permutation, Consensus Clustering, Kemeny Score, and Kemeny Tie Score are all fixed-parameter tractable with respect to the parameter “average distance between input objects”. To this end, we develop the new concept of “partial kernelization” and identify interesting polynomial-time solvable special cases for the considered problems.
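The parameter "average distance between input objects" is concrete once a distance is fixed. A sketch instantiating it for Consensus Clustering, where the distance between two clusterings is the symmetric difference of their co-clustered pairs (function names and the toy input are my own):

```python
def coclustered_pairs(partition):
    # Unordered pairs of items that share a cluster.
    pairs = set()
    for cluster in partition:
        members = sorted(cluster)
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                pairs.add((members[i], members[j]))
    return pairs

def average_pairwise_distance(partitions):
    # Average symmetric-difference distance between the input clusterings.
    dists = [
        len(coclustered_pairs(p) ^ coclustered_pairs(q))
        for i, p in enumerate(partitions)
        for q in partitions[i + 1:]
    ]
    return sum(dists) / len(dists)

inputs = [[{1, 2}, {3}], [{1, 2}, {3}], [{1}, {2, 3}]]
```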
Rank aggregation: Together we’re strong
 In Proc. of 11th ALENEX
, 2009
"... We consider the problem of finding a ranking of a set of elements that is “closest to ” a given set of input rankings of the elements; more precisely, we want to find a permutation that minimizes the Kendalltau distance to the input rankings, where the Kendalltau distance is defined as the sum ove ..."
Abstract

Cited by 6 (0 self)
 Add to MetaCart
We consider the problem of finding a ranking of a set of elements that is “closest to” a given set of input rankings of the elements; more precisely, we want to find a permutation that minimizes the Kendall-tau distance to the input rankings, where the Kendall-tau distance is defined as the sum over all input rankings of the number of pairs of elements that are in a different order in the input ranking than in the output ranking. If the input rankings are permutations, this problem is known as the Kemeny rank aggregation problem. This problem arises, for example, in building meta-search engines for Web search, aggregating viewers’ rankings of movies, or giving recommendations to a user based on several different criteria, where we can think of having one ranking of the ...
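The distance and objective defined above can be written down directly: count the element pairs ordered differently by two rankings, then sum over all inputs (function names `kendall_tau` and `kemeny_cost` are my own labels for these definitions):

```python
def kendall_tau(r1, r2):
    # Number of element pairs ordered differently by the two rankings.
    p1 = {x: i for i, x in enumerate(r1)}
    p2 = {x: i for i, x in enumerate(r2)}
    items = list(r1)
    return sum(
        1
        for a in range(len(items))
        for b in range(a + 1, len(items))
        if (p1[items[a]] < p1[items[b]]) != (p2[items[a]] < p2[items[b]])
    )

def kemeny_cost(candidate, input_rankings):
    # The objective being minimized: total Kendall-tau distance.
    return sum(kendall_tau(candidate, r) for r in input_rankings)
```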
Correlation Clustering with Noisy Input
"... Correlation clustering is a type of clustering that uses a basic form of input data: For every pair of data items, the input specifies whether they are similar (belonging to the same cluster) or dissimilar (belonging to different clusters). This information may be inconsistent, and the goal is to fi ..."
Abstract

Cited by 6 (1 self)
 Add to MetaCart
Correlation clustering is a type of clustering that uses a basic form of input data: for every pair of data items, the input specifies whether they are similar (belonging to the same cluster) or dissimilar (belonging to different clusters). This information may be inconsistent, and the goal is to find a clustering (partition of the vertices) that disagrees with as few pieces of information as possible. Correlation clustering is APX-hard for worst-case inputs. We study the following semi-random noisy model to generate the input: start from an arbitrary partition of the vertices into clusters. Then, for each pair of vertices, the similarity information is corrupted (noisy) independently with probability p. Finally, an adversary generates the input by choosing similarity/dissimilarity information arbitrarily for each corrupted pair of vertices. In this model, our algorithm produces a clustering with cost at most 1 + O(n^(−1/6)) times the cost of the optimal clustering, as long as p ≤ 1/2 − n^(−1/3). Moreover, if all clusters have size at least c1 √n, then we can exactly reconstruct the planted clustering. If the noise p is small, that is, p ≤ n^(−δ)/60, then we can exactly reconstruct all clusters of the planted clustering that have size at least 3150/δ, and provide a certificate (witness) proving that those clusters are in any optimal clustering. Among other techniques, we use the natural semidefinite programming relaxation followed by an interesting rounding phase. The analysis uses SDP duality and spectral properties of random matrices.
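The disagreement cost defined above is simple to state in code: similar pairs that were split plus dissimilar pairs kept together (the function name and toy input are mine):

```python
def disagreements(cluster_of, similar, dissimilar):
    # cluster_of maps each vertex to its cluster id.
    cost = sum(1 for u, v in similar if cluster_of[u] != cluster_of[v])
    cost += sum(1 for u, v in dissimilar if cluster_of[u] == cluster_of[v])
    return cost

# Inconsistent input: 1~2 and 2~3 are similar, but 1 and 3 are dissimilar,
# so no clustering has cost 0.
similar, dissimilar = [(1, 2), (2, 3)], [(1, 3)]
one_cluster = {1: 0, 2: 0, 3: 0}
```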
Discovering bucket orders from full rankings
 In SIGMOD
, 2008
"... Discovering a bucket order B from a collection of possibly noisy full rankings is a fundamental problem that relates to various applications involving rankings. Informally, a bucket order is a total order that allows “ties ” between items in a bucket. A bucket order B can be viewed as a“representati ..."
Abstract

Cited by 5 (1 self)
 Add to MetaCart
Discovering a bucket order B from a collection of possibly noisy full rankings is a fundamental problem that relates to various applications involving rankings. Informally, a bucket order is a total order that allows “ties” between items in a bucket. A bucket order B can be viewed as a “representative” that summarizes a given set of full rankings {T1, T2, ..., Tm}, or conversely B can be an “approximation” of some “ground truth” G where the rankings {T1, T2, ..., Tm} are the “linear extensions” of G. Current work on finding bucket orders, such as the dynamic programming algorithm, is mainly developed from the “representative” perspective, which maximizes items’ intra-bucket similarity when forming a bucket. The underlying idea of maximizing intra-bucket similarity is realized via minimizing the sum of the deviations of median ranks within a bucket. In contrast, from the “approximation” perspective, since each observed full ranking Ti is simply a linear extension of the given “ground truth” bucket order G, items in a big bucket b in G are forced to have different median ranks, and as a result b will have a big sum of deviations. Thus, minimizing the sum of deviations may result in an undesirable scenario in which big buckets are mostly decomposed into small ones. In this paper, we propose a novel heuristic called Abnormal Rank Gap to capture the inter-bucket dissimilarity for better bucket forming. In addition, we propose to use the “closeness” of multiple quantile ranks to determine if two items should be put into the same bucket. We develop a novel bucket order discovering method called the Bucket Gap algorithm. Our extensive experiments demonstrate that the Bucket Gap algorithm significantly outperforms the major related work, i.e., the Bucket Pivot algorithm. In particular, the error distance of the generated bucket order can be reduced by about 30% on a real paleontological dataset ...
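The "linear extension" relation between a full ranking and a bucket order can be checked directly: bucket indices must be non-decreasing along the ranking, with ties inside a bucket resolvable in any order. A minimal sketch of that check (the function name and example are mine):

```python
def is_linear_extension(ranking, buckets):
    # buckets: list of sets; earlier buckets precede later ones, and items
    # within a bucket are tied.
    level = {x: i for i, bucket in enumerate(buckets) for x in bucket}
    return all(
        level[ranking[i]] <= level[ranking[i + 1]]
        for i in range(len(ranking) - 1)
    )

# "a" and "b" are tied in the first bucket; "c" comes last.
ground_truth = [{"a", "b"}, {"c"}]
```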
Editing Graphs into Disjoint Unions of Dense Clusters
"... Abstract. In the ΠCluster Editing problem, one is given an undirected graph G, a density measure Π, and an integer k ≥ 0, and needs to decide whether it is possible to transform G by editing (deleting and inserting) at most k edges into a dense cluster graph. Herein, a dense cluster graph is a grap ..."
Abstract

Cited by 4 (1 self)
 Add to MetaCart
In the Π-Cluster Editing problem, one is given an undirected graph G, a density measure Π, and an integer k ≥ 0, and needs to decide whether it is possible to transform G by editing (deleting and inserting) at most k edges into a dense cluster graph. Herein, a dense cluster graph is a graph in which every connected component K = (V_K, E_K) satisfies Π. The well-studied Cluster Editing problem is a special case of this problem with Π := “being a clique”. In this work, we consider three other density measures that generalize cliques: 1) having at most s missing edges (s-defective cliques), 2) having average degree at least |V_K| − s (average-s-plexes), and 3) having average degree at least µ · (|V_K| − 1) (µ-cliques), where s and µ are a fixed integer and a fixed rational number, respectively. We first show that the Π-Cluster Editing problem is NP-complete for all three density measures. Then, we study the fixed-parameter tractability of the three clustering problems, showing that the first two problems are fixed-parameter tractable with respect to the parameter (s, k) and that the third problem is W[1]-hard with respect to the parameter k for 0 < µ < 1.
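The three density measures are direct edge-count and average-degree conditions on a connected component. A sketch of those checks (function names and the example component are my own; the average-degree test covers both average-s-plexes, with threshold |V_K| − s, and µ-cliques, with threshold µ · (|V_K| − 1)):

```python
def is_s_defective_clique(n, edges, s):
    # At most s edges missing compared with a complete graph on n vertices.
    return n * (n - 1) // 2 - len(edges) <= s

def meets_average_degree(n, edges, threshold):
    # Average degree of the component is 2|E| / |V|.
    return 2 * len(edges) / n >= threshold

# A triangle with one edge missing.
component = [(0, 1), (1, 2)]
```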
Improved Algorithms for Bicluster Editing
"... Abstract. The NPhard Bicluster Editing is to add or remove at most k edges to make a bipartite graph G = (V, E) a vertexdisjoint union of complete bipartite subgraphs. It has applications in the analysis of gene expression data. We show that by polynomialtime preprocessing, one can shrink a probl ..."
Abstract

Cited by 1 (1 self)
 Add to MetaCart
The NP-hard Bicluster Editing problem is to add or remove at most k edges to make a bipartite graph G = (V, E) a vertex-disjoint union of complete bipartite subgraphs. It has applications in the analysis of gene expression data. We show that by polynomial-time preprocessing, one can shrink a problem instance to one with 4k vertices, thus proving that the problem has a linear kernel and improving a quadratic kernel result. We further give a search tree algorithm that improves the running time bound from the trivial O(4^k + |E|) to O(3.24^k + |E|). Finally, we give a randomized 4-approximation, improving a known approximation with factor 11.
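The target structure of Bicluster Editing can be verified directly: in a vertex-disjoint union of complete bipartite subgraphs, every left vertex of a connected component is adjacent to all right vertices of that component. A sketch of that check (function name and toy graphs are mine):

```python
def is_biclique_union(left, right, edges):
    # True iff every connected component is a complete bipartite subgraph.
    adj = {v: set() for v in left | right}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen = set()
    for start in left | right:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:  # collect the connected component of `start`
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(adj[v])
        seen |= comp
        comp_left, comp_right = comp & left, comp & right
        if any(adj[u] != comp_right for u in comp_left):
            return False
    return True

L, R = {"l1", "l2"}, {"r1", "r2"}
star = [("l1", "r1"), ("l2", "r1")]                  # a biclique plus an isolated vertex
path = [("l1", "r1"), ("l2", "r1"), ("l2", "r2")]    # a path: not a biclique union
```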
Kernels for Feedback Arc Set In Tournaments
, 2009
"... A tournament T = (V, A) is a directed graph in which there is exactly one arc between every pair of distinct vertices. Given a digraph on n vertices and an integer parameter k, the Feedback Arc Set problem asks whether the given digraph has a set of k arcs whose removal results in an acyclic digraph ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
A tournament T = (V, A) is a directed graph in which there is exactly one arc between every pair of distinct vertices. Given a digraph on n vertices and an integer parameter k, the Feedback Arc Set problem asks whether the given digraph has a set of k arcs whose removal results in an acyclic digraph. The Feedback Arc Set problem restricted to tournaments is known as the k-Feedback Arc Set in Tournaments (k-FAST) problem. In this paper we obtain a linear vertex kernel for k-FAST. That is, we give a polynomial-time algorithm which, given an input instance T of k-FAST, obtains an equivalent instance T′ on O(k) vertices. In fact, given any fixed ɛ > 0, the kernelized instance has at most (2 + ɛ)k vertices. Our result improves the previously known bound of O(k²) on the kernel size for k-FAST. Our kernelization algorithm solves the problem on a subclass of tournaments in polynomial time and uses a known polynomial-time approximation scheme for k-FAST.
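The decision question behind k-FAST, whether deleting a given arc set leaves the digraph acyclic, reduces to a topological-sort check. A minimal sketch (Kahn-style; the function name and example are mine, and the edge scan is kept naive for brevity):

```python
def acyclic_after_removal(vertices, arcs, removed):
    # True iff deleting `removed` leaves the digraph acyclic: repeatedly
    # peel off vertices of in-degree zero and see whether all are reached.
    remaining = [a for a in arcs if a not in removed]
    indeg = {v: 0 for v in vertices}
    for _, v in remaining:
        indeg[v] += 1
    queue = [v for v in vertices if indeg[v] == 0]
    seen = 0
    while queue:
        u = queue.pop()
        seen += 1
        for x, y in remaining:
            if x == u:
                indeg[y] -= 1
                if indeg[y] == 0:
                    queue.append(y)
    return seen == len(vertices)

cyclic_triangle = [(0, 1), (1, 2), (2, 0)]
```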