A Threshold of ln n for Approximating Set Cover
- JOURNAL OF THE ACM
, 1998
Cited by 518 (6 self)
Given a collection $F$ of subsets of $S = \{1, \ldots, n\}$, set cover is the problem of selecting as few as possible subsets from $F$ such that their union covers $S$, and max k-cover is the problem of selecting $k$ subsets from $F$ such that their union has maximum cardinality. Both these problems are NP-hard. We prove that $(1 - o(1)) \ln n$ is a threshold below which set cover cannot be approximated efficiently, unless NP has slightly superpolynomial time algorithms. This closes the gap (up to low order terms) between the ratio of approximation achievable by the greedy algorithm (which is $(1 - o(1)) \ln n$), and previous results of Lund and Yannakakis, that showed hardness of approximation within a ratio of $(\log_2 n)/2 \approx 0.72 \ln n$. For max k-cover we show an approximation threshold of $(1 - 1/e)$ (up to low order terms), under the assumption that $P \neq NP$.
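The greedy algorithm mentioned in this abstract can be illustrated with a minimal Python sketch. It is only an illustration of the standard greedy rule (repeatedly pick the subset that covers the most still-uncovered elements), not code from the paper; the function name `greedy_set_cover` and the small instance `S`, `F` are assumptions made for the example.

```python
def greedy_set_cover(universe, subsets):
    """Return indices of subsets picked by the greedy rule.

    universe: iterable of elements to cover.
    subsets:  list of Python sets (the collection F).
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the subset that covers the most still-uncovered elements.
        best = max(range(len(subsets)), key=lambda i: len(uncovered & subsets[i]))
        gain = uncovered & subsets[best]
        if not gain:
            raise ValueError("the given subsets do not cover the universe")
        chosen.append(best)
        uncovered -= gain
    return chosen


# Hypothetical instance: S = {1, ..., 6} and five candidate subsets.
S = range(1, 7)
F = [{1, 2, 3}, {4, 5, 6}, {1, 4}, {2, 5}, {3, 6}]
cover = greedy_set_cover(S, F)
print(cover)
```

On this instance the greedy rule happens to find an optimal cover of size two; in general it is only guaranteed to be within a logarithmic factor of optimal, which is the ratio the abstract refers to.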
The Capacitated K-Center Problem
- In Proceedings of the 4th Annual European Symposium on Algorithms, Lecture Notes in Computer Science 1136
, 1996
Cited by 31 (4 self)
The capacitated K-center problem is a fundamental facility location problem, where we are asked to locate K facilities in a graph, and to assign vertices to facilities, so as to minimize the maximum distance from a vertex to the facility to which it is assigned. Moreover, each facility may be assigned at most L vertices. This problem is known to be NP-hard. We give polynomial time approximation algorithms for two different versions of this problem that achieve approximation factors of 5 and 6. We also study some generalizations of this problem.

1. Introduction The basic K-center problem is a fundamental facility location problem [17] and is defined as follows: given an edge-weighted graph $G = (V, E)$, find a subset $S \subseteq V$ of size at most $K$ such that each vertex in $V$ is "close" to some vertex in $S$. More formally, the objective function is defined as follows: $\min_{S \subseteq V} \max_{u \in V} \min_{v \in S} d(u, v)$, where $d$ is the distance function. For example, one may wish to install K fire stations and mi...
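To make the basic (uncapacitated) K-center objective above concrete, here is a small Python sketch that evaluates the max-min distance for a given set of centers and finds an optimum by exhaustive search. It is an illustration only: the names `k_center_cost` and `brute_force_k_center` and the four-vertex path metric are assumptions for the example, the capacity bound L is not modelled, and the exhaustive search is exponential in K, so it is usable only on tiny instances.

```python
from itertools import combinations


def k_center_cost(dist, centers):
    """Largest distance from any vertex to its nearest chosen center."""
    return max(min(dist[u][v] for v in centers) for u in range(len(dist)))


def brute_force_k_center(dist, k):
    """Try every set of exactly k centers; return (best cost, best centers)."""
    n = len(dist)
    return min((k_center_cost(dist, c), c) for c in combinations(range(n), k))


# Hypothetical instance: a path 0-1-2-3 with unit edge lengths,
# so the shortest-path distance between i and j is |i - j|.
dist = [[abs(i - j) for j in range(4)] for i in range(4)]

cost_one_center = k_center_cost(dist, [1])
best_cost, best_centers = brute_force_k_center(dist, 2)
print(cost_one_center, best_cost, best_centers)
```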
Computing Near-Optimal Solutions to Combinatorial Optimization Problems
- IN COMBINATORIAL OPTIMIZATION, DIMACS SERIES IN DISCRETE MATHEMATICS AND THEORETICAL COMPUTER SCIENCE
, 1995
Cited by 29 (0 self)
In the past few years, there has been significant progress in our understanding of the extent to which near-optimal solutions can be efficiently computed for NP-hard combinatorial optimization problems. This paper surveys these recent developments, while concentrating on the advances made in the design and analysis of approximation algorithms, and in particular, on those results that rely on linear programming and its generalizations.
Conquering the divide: Continuous clustering of distributed data streams
- In Intl. Conf. on Data Engineering
, 2007
Cited by 14 (3 self)
Data is often collected over a distributed network, but in many cases is so voluminous that it is impractical and undesirable to collect it in a central location. Instead, we must perform distributed computations over the data, guaranteeing high quality answers even as new data arrives. In this paper, we formalize and study the problem of maintaining a clustering of such distributed data that is continuously evolving. In particular, our goal is to minimize the communication and computational cost while still providing guaranteed accuracy of the clustering. We focus on the k-center clustering, and provide a suite of algorithms that vary based on which centralized algorithm they derive from, and whether they maintain a single global clustering or many local clusterings that can be merged together. We show that these algorithms can be designed to give accuracy guarantees that are close to the best possible even in the centralized case. In our experiments, we see clear trends among these algorithms, showing that the choice of algorithm is crucial, and that we can achieve a clustering that is as good as the best centralized clustering, with only a small fraction of the communication required to collect all the data in a single location.
Asymmetric k-Center is log* n-hard to Approximate
- J. ACM
, 2003
Cited by 13 (2 self)
We show that the asymmetric k-center problem is $\Omega(\log^* n)$-hard to approximate unless $NP \subseteq DTIME(n^{\log \log n})$. Since an $O(\log^* n)$-approximation algorithm is known for this problem, this essentially resolves the approximability of this problem. This is the first natural problem whose approximability threshold does not polynomially relate to the known approximation classes. Our techniques also resolve the approximability threshold of the weighted metric k-center problem. We show that it is hard to approximate to within a factor of $3 - \epsilon$ for any $\epsilon > 0$.
Distance-based Representative Skyline
Cited by 12 (1 self)
Given an integer k, a representative skyline contains the k skyline points that best describe the tradeoffs among different dimensions offered by the full skyline. Although this topic has been previously studied, the existing solution may sometimes produce k points that appear in an arbitrarily tiny cluster, and therefore fail to be representative. Motivated by this, we propose a new definition of representative skyline that minimizes the distance between a non-representative skyline point and its nearest representative. We also study algorithms for computing distance-based representative skylines. In 2D space, there is a dynamic programming algorithm that guarantees the optimal solution. For dimensionality at least 3, we prove that the problem is NP-hard, and give a 2-approximate polynomial time algorithm. Using a multidimensional access method, our algorithm can directly report the representative skyline, without retrieving the full skyline. We show that our representative skyline not only better captures the contour of the entire skyline than the previous method, but also can be computed much faster.
Fault Tolerant K-Center Problems
, 1997
Cited by 9 (1 self)
The basic K-center problem is a fundamental facility location problem, where we are asked to locate K facilities in a graph, and to assign vertices to facilities, so as to minimize the maximum distance from a vertex to the facility to which it is assigned. This problem is known to be NP-hard, and several optimal approximation algorithms that achieve a factor of 2 have been developed for it. We focus our attention on a generalization of this problem, where each vertex is required to have a set of $\alpha$ ($\alpha \le K$) centers close to it. In particular, we study two different versions of this problem. In the first version, each vertex is required to have at least $\alpha$ centers close to it. In the second version, each vertex that does not have a center placed on it is required to have at least $\alpha$ centers close to it. For both these versions we are able to provide polynomial time approximation algorithms that achieve constant approximation factors for any $\alpha$. For the first version we give an algorithm ...
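One well-known way to reach the factor of 2 for the basic K-center problem is the farthest-first traversal usually attributed to Gonzalez. The Python sketch below illustrates that heuristic only; it is not the fault-tolerant algorithm of this paper, and the function name `farthest_first`, the `start` parameter, and the six-point instance are assumptions for the example. The factor-2 guarantee relies on `dist` satisfying the triangle inequality.

```python
def farthest_first(dist, k, start=0):
    """Pick k centers greedily; return (centers, covering radius).

    dist:  symmetric matrix of pairwise distances (assumed to be a metric).
    start: arbitrary first center (an assumption of this sketch).
    """
    n = len(dist)
    centers = [start]
    # nearest[u] = distance from u to the closest center chosen so far.
    nearest = list(dist[start])
    while len(centers) < k:
        # The next center is the vertex farthest from all current centers.
        nxt = max(range(n), key=lambda u: nearest[u])
        centers.append(nxt)
        nearest = [min(nearest[u], dist[nxt][u]) for u in range(n)]
    return centers, max(nearest)


# Hypothetical instance: six points on a line forming two groups.
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in positions] for a in positions]

centers, radius = farthest_first(dist, 2)
print(centers, radius)
```

Here the heuristic returns radius 2, while placing centers at the middle of each group (positions 1 and 11) gives radius 1, so the result is within the factor of 2 but not optimal.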
Facility Location with Dynamic Distance Functions
Cited by 5 (1 self)
Facility location problems have always been studied with the assumption that the edge lengths in the network are static and do not change over time. The underlying network could be used to model a city street network for emergency facility location/hospitals, or an electronic network for locating information centers. In any case, it is clear that due to traffic congestion the traversal time on links changes with time. Very often, we have some estimates as to how the edge lengths change over time, and our objective is to choose a set of locations (vertices) as centers, such that at every time instant each vertex has a center close to it (clearly, the center close to a vertex may change over time). We also provide approximation algorithms as well as hardness results for the K-center problem under this model. This is the first comprehensive study regarding approximation algorithms for facility location for good time-invariant solutions.

1. Introduction Previous theoretical work on fac...
Private approximation of clustering and vertex
, 2007
Cited by 1 (1 self)
Private approximation of search problems deals with finding approximate solutions to search problems while disclosing as little information as possible. The focus of this work is on private approximation of the vertex cover problem and two well studied clustering problems: k-center and k-median. Vertex cover was considered in [Beimel, Carmi, Nissim, and Weinreb, STOC, 2006] and we improve their infeasibility results. Clustering algorithms are frequently applied to sensitive data, and hence are of interest in the contexts of secure computation and private approximation. We show that these problems do not admit private approximations, or even approximation algorithms that leak a significant number of bits. For the vertex cover problem we show a tight infeasibility result: every algorithm that ρ(n)-approximates vertex cover must leak Ω(n/ρ(n)) bits (where n is the number of vertices in the graph). For the clustering problems we prove that even approximation algorithms with a poor approximation ratio must leak Ω(n) bits (where n is the number of points in the instance). For these results we develop new proof techniques, which are simpler and more intuitive than those in Beimel et al., and yet allow stronger infeasibility results. Our proofs rely on the hardness of the promise problem where a unique optimal solution exists [Valiant and Vazirani, Theoretical Computer Science, 1986], on the hardness of approximating witnesses for NP-hard problems ([Kumar and Sivakumar,
Sensor Selection for Minimizing Worst-Case Prediction Error
Cited by 1 (0 self)
In this paper, we study the problem of choosing the "best" subset of k sensors to sample from among a sensor deployment of n > k sensors, in order to predict aggregate functions over all the sensor values. The sensor data being measured are assumed to be spatially correlated, in the sense that the values at two sensors can differ by at most a monotonically increasing, concave function of their distance. The goal in our work is then to select sensors so as to minimize the error, assuming that the actual values at unsampled sensors are worst-case subject to the constraints imposed by their distances from sampled sensors. Even for the mean, maximum, and minimum, the problem is NP-hard; we present approximation algorithms to select near-optimal subsets of k sensors that minimize the worst-case prediction error. In general, we show that for any aggregate function with certain concavity, symmetry and monotonicity conditions, the sensor selection problem can be modeled as a k-median clustering problem, and solved using efficient approximation algorithms designed for k-median clustering. Our theoretical results are complemented by experiments on two real-world sensor data sets; our experiments confirm that our algorithms lead to prediction errors that are usually less than the (normalized) standard deviation of the test data, using only around 10% of the sensors.

