Results 1-10 of 38
A Threshold of ln n for Approximating Set Cover
Journal of the ACM, 1998
Cited by 778 (5 self)
Given a collection F of subsets of S = {1, ..., n}, set cover is the problem of selecting as few as possible subsets from F such that their union covers S, and max k-cover is the problem of selecting k subsets from F such that their union has maximum cardinality. Both these problems are NP-hard. We prove that (1 − o(1)) ln n is a threshold below which set cover cannot be approximated efficiently, unless NP has slightly superpolynomial time algorithms. This closes the gap (up to low order terms) between the ratio of approximation achievable by the greedy algorithm (which is (1 − o(1)) ln n), and previous results of Lund and Yannakakis, that showed hardness of approximation within a ratio of (log₂ n)/2 ≈ 0.72 ln n. For max k-cover we show an approximation threshold of (1 − 1/e) (up to low order terms), under the assumption that P ≠ NP.
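The greedy algorithm mentioned in this abstract can be sketched as follows. This is a minimal illustration of the standard greedy rule (repeatedly take the subset that covers the most still-uncovered elements), not code from the paper; the function name `greedy_set_cover` and the small instance are assumptions made for the example.

```python
def greedy_set_cover(universe, subsets):
    """Return indices of subsets chosen by the standard greedy rule.

    Hypothetical helper for illustration: at each step, pick the subset
    that covers the largest number of still-uncovered elements.
    """
    subsets = [set(s) for s in subsets]
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Index of the subset with the largest overlap with `uncovered`.
        best = max(range(len(subsets)), key=lambda i: len(subsets[i] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("the given subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


# Illustrative instance (made up for this sketch).
S = {1, 2, 3, 4, 5}
F = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
cover = greedy_set_cover(S, F)
```

On this instance the greedy rule first takes {1, 2, 3} and then {4, 5}, so `cover` holds the indices of those two subsets.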
Distance-based Representative Skyline
Cited by 40 (1 self)
Given an integer k, a representative skyline contains the k skyline points that best describe the tradeoffs among different dimensions offered by the full skyline. Although this topic has been previously studied, the existing solution may sometimes produce k points that appear in an arbitrarily tiny cluster, and therefore fail to be representative. Motivated by this, we propose a new definition of representative skyline that minimizes the distance between a non-representative skyline point and its nearest representative. We also study algorithms for computing distance-based representative skylines. In 2D space, there is a dynamic programming algorithm that guarantees the optimal solution. For dimensionality at least 3, we prove that the problem is NP-hard, and give a 2-approximate polynomial-time algorithm. Using a multidimensional access method, our algorithm can directly report the representative skyline, without retrieving the full skyline. We show that our representative skyline not only better captures the contour of the entire skyline than the previous method, but also can be computed much faster.
The Capacitated K-Center Problem
In Proceedings of the 4th Annual European Symposium on Algorithms, Lecture Notes in Computer Science 1136, 1996
Cited by 40 (6 self)
The capacitated K-center problem is a fundamental facility location problem, where we are asked to locate K facilities in a graph, and to assign vertices to facilities, so as to minimize the maximum distance from a vertex to the facility to which it is assigned. Moreover, each facility may be assigned at most L vertices. This problem is known to be NP-hard. We give polynomial-time approximation algorithms for two different versions of this problem that achieve approximation factors of 5 and 6. We also study some generalizations of this problem. 1. Introduction The basic K-center problem is a fundamental facility location problem [17] and is defined as follows: given an edge-weighted graph G = (V, E), find a subset S ⊆ V of size at most K such that each vertex in V is "close" to some vertex in S. More formally, the objective function is defined as follows: min_{S ⊆ V} max_{u ∈ V} min_{v ∈ S} d(u, v), where d is the distance function. For example, one may wish to install K fire stations and mi...
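The basic (uncapacitated) K-center objective defined above can be evaluated directly on a tiny instance. The sketch below is an illustration only: it assumes the distances are given as a full matrix `dist`, and the names `k_center_cost` and `brute_force_k_center` are hypothetical. The exhaustive search is exponential in general and is shown only to make the min-max-min structure concrete; it does not handle the capacity bound L.

```python
from itertools import combinations


def k_center_cost(dist, centers):
    """max over vertices u of the distance from u to its nearest center."""
    return max(min(dist[u][v] for v in centers) for u in range(len(dist)))


def brute_force_k_center(dist, k):
    """Try every set of exactly k centers and keep the cheapest one.

    Using exactly k centers is enough, since adding a center never
    increases the cost. Only feasible for very small inputs.
    """
    n = len(dist)
    return min(combinations(range(n), k), key=lambda c: k_center_cost(dist, c))


# Illustrative instance: four vertices on a unit-weight path 0-1-2-3,
# with shortest-path distances |i - j|.
dist = [[abs(i - j) for j in range(4)] for i in range(4)]
best = brute_force_k_center(dist, 2)
best_cost = k_center_cost(dist, best)
```

Here no choice of two centers covers all four vertices at distance 0, so the optimal cost on this path is 1.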
Asymmetric k-center is log* n-hard to Approximate
In Proc. SODA, 2005
Cited by 34 (4 self)
In the Asymmetric k-Center problem, the input is an integer k and a complete digraph over n points together with a distance function obeying the directed triangle inequality. The goal is to choose a set of k points to serve as centers and to assign all the points to the centers, so that the maximum distance of any point to its center is as small as possible. We show that the Asymmetric k-Center problem is hard to approximate up to a factor of log* n − Θ(1) unless NP ⊆ DTIME(n^(log log n)). Since an O(log* n)-approximation algorithm is known for this problem, this essentially resolves the approximability of this problem. This is the first natural problem whose approximability threshold does not polynomially relate to the known approximation classes. We also resolve the approximability threshold of the metric k-Center problem with costs.
Computing Near-Optimal Solutions to Combinatorial Optimization Problems
In Combinatorial Optimization, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 1995
Cited by 32 (0 self)
In the past few years, there has been significant progress in our understanding of the extent to which near-optimal solutions can be efficiently computed for NP-hard combinatorial optimization problems. This paper surveys these recent developments, while concentrating on the advances made in the design and analysis of approximation algorithms, and in particular, on those results that rely on linear programming and its generalizations.
Conquering the divide: Continuous clustering of distributed data streams
In Intl. Conf. on Data Engineering, 2007
Cited by 31 (4 self)
Data is often collected over a distributed network, but in many cases, is so voluminous that it is impractical and undesirable to collect it in a central location. Instead, we must perform distributed computations over the data, guaranteeing high-quality answers even as new data arrives. In this paper, we formalize and study the problem of maintaining a clustering of such distributed data that is continuously evolving. In particular, our goal is to minimize the communication and computational cost, while still providing guaranteed accuracy of the clustering. We focus on the k-center clustering, and provide a suite of algorithms that vary based on which centralized algorithm they derive from, and whether they maintain a single global clustering or many local clusterings that can be merged together. We show that these algorithms can be designed to give accuracy guarantees that are close to the best possible even in the centralized case. In our experiments, we see clear trends among these algorithms, showing that the choice of algorithm is crucial, and that we can achieve a clustering that is as good as the best centralized clustering, with only a small fraction of the communication required to collect all the data in a single location.
Fault Tolerant K-Center Problems
1997
Cited by 21 (1 self)
The basic K-center problem is a fundamental facility location problem, where we are asked to locate K facilities in a graph, and to assign vertices to facilities, so as to minimize the maximum distance from a vertex to the facility to which it is assigned. This problem is known to be NP-hard, and several optimal approximation algorithms that achieve a factor of 2 have been developed for it. We focus our attention on a generalization of this problem, where each vertex is required to have a set of α (α ≤ K) centers close to it. In particular, we study two different versions of this problem. In the first version, each vertex is required to have at least α centers close to it. In the second version, each vertex that does not have a center placed on it is required to have at least α centers close to it. For both these versions we are able to provide polynomial-time approximation algorithms that achieve constant approximation factors for any α. For the first version we give an algorithm ...
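As background for the factor-2 algorithms mentioned for the basic K-center problem, here is a minimal sketch of one well-known such method, the farthest-first traversal usually attributed to Gonzalez. It is not the fault-tolerant algorithm of this paper and does not enforce the α-centers requirement. The sketch assumes a symmetric distance matrix `dist` satisfying the triangle inequality; the name `farthest_first` and the example instance are hypothetical.

```python
def farthest_first(dist, k, start=0):
    """Farthest-first traversal for basic k-center on a symmetric metric.

    Returns the chosen centers and the resulting covering radius.
    On metric inputs this radius is at most twice the optimum.
    """
    n = len(dist)
    centers = [start]
    # nearest[u] = distance from u to its closest chosen center so far.
    nearest = list(dist[start])
    while len(centers) < k:
        # Next center: the vertex currently farthest from all chosen centers.
        nxt = max(range(n), key=lambda u: nearest[u])
        centers.append(nxt)
        nearest = [min(nearest[u], dist[nxt][u]) for u in range(n)]
    return centers, max(nearest)


# Illustrative instance: five points on a line at positions 0..4.
line = [[abs(i - j) for j in range(5)] for i in range(5)]
centers, radius = farthest_first(line, 2)
```

On this line the traversal picks the two endpoints and gets radius 2, while placing centers at positions 1 and 3 would give radius 1, so the factor of 2 is attained here.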
Sensor Selection for Minimizing Worst-Case Prediction Error
Cited by 8 (2 self)
In this paper, we study the problem of choosing the "best" subset of k sensors to sample from among a sensor deployment of n > k sensors, in order to predict aggregate functions over all the sensor values. The sensor data being measured are assumed to be spatially correlated, in the sense that the values at two sensors can differ by at most a monotonically increasing, concave function of their distance. The goal in our work is then to select sensors so as to minimize the error, assuming that the actual values at unsampled sensors are worst-case subject to the constraints imposed by their distances from sampled sensors. Even for the mean, maximum, and minimum, the problem is NP-hard; we present approximation algorithms to select near-optimal subsets of k sensors that minimize the worst-case prediction error. In general, we show that for any aggregate function with certain concavity, symmetry and monotonicity conditions, the sensor selection problem can be modeled as a k-median clustering problem, and solved using efficient approximation algorithms designed for k-median clustering. Our theoretical results are complemented by experiments on two real-world sensor data sets; our experiments confirm that our algorithms lead to prediction errors that are usually less than the (normalized) standard deviation of the test data, using only around 10% of the sensors.
Facility Location with Dynamic Distance Functions
Cited by 6 (1 self)
Facility location problems have always been studied with the assumption that the edge lengths in the network are static and do not change over time. The underlying network could be used to model a city street network for emergency facility location/hospitals, or an electronic network for locating information centers. In any case, it is clear that due to traffic congestion the traversal time on links changes with time. Very often, we have some estimates as to how the edge lengths change over time, and our objective is to choose a set of locations (vertices) as centers, such that at every time instant each vertex has a center close to it (clearly, the center close to a vertex may change over time). We also provide approximation algorithms as well as hardness results for the K-center problem under this model. This is the first comprehensive study regarding approximation algorithms for facility location for good time-invariant solutions. 1. Introduction Previous theoretical work on fac...