On Approximating Arbitrary Metrics by Tree Metrics
- In Proceedings of the 30th Annual ACM Symposium on Theory of Computing, 1998
Cited by 222 (13 self)
This paper is concerned with probabilistic approximation of metric spaces. In previous work we introduced the method of efficient approximation of metrics by simpler families of metrics in a probabilistic fashion. In particular, we study probabilistic approximations of arbitrary metric spaces by "hierarchically well-separated tree" metric spaces. This has proved to be a useful technique for simplifying the solutions to various problems.
Improved Combinatorial Algorithms for the Facility Location and k-Median Problems
- In Proceedings of the 40th Annual IEEE Symposium on Foundations of Computer Science, 1999
Cited by 187 (12 self)
We present improved combinatorial approximation algorithms for the uncapacitated facility location and k-median problems. Two central ideas in most of our results are cost scaling and greedy improvement. We present a simple greedy local search algorithm which achieves an approximation ratio of $2.414 + \epsilon$ in $\tilde{O}(n^2/\epsilon)$ time. This also yields a bicriteria approximation tradeoff of $(1 + \gamma, 1 + 2/\gamma)$ for facility cost versus service cost, which is better than previously known tradeoffs and close to the best possible. Combining greedy improvement and cost scaling with a recent primal-dual algorithm for facility location due to Jain and Vazirani, we get an approximation ratio of 1.853 in $\tilde{O}(n^3)$ time. This is already very close to the approximation guarantee of the best known algorithm, which is LP-based. Further, combined with the best known LP-based algorithm for facility location, we get a very slight improvement in the approximation factor for facility location, achieving 1.728....
A constant-factor approximation algorithm for the k-median problem
- In Proceedings of the 31st Annual ACM Symposium on Theory of Computing, 1999
Cited by 168 (12 self)
We present the first constant-factor approximation algorithm for the metric k-median problem. The k-median problem is one of the most well-studied clustering problems, i.e., those problems in which the aim is to partition a given set of points into clusters so that the points within a cluster are relatively close with respect to some measure. For the metric k-median problem, we are given n points in a metric space. We select k of these to be cluster centers, and then assign each point to its closest selected center. If point j is assigned to a center i, the cost incurred is proportional to the distance between i and j. The goal is to select the k centers that minimize the sum of the assignment costs. We give a $6\frac{2}{3}$-approximation algorithm for this problem. This improves upon the best previously known result of O(log k log log k), which was obtained by refining and derandomizing a randomized O(log n log log n)-approximation algorithm of Bartal.
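The objective described in this abstract can be made concrete with a small sketch. The code below is not the approximation algorithm from the paper; it is an exhaustive search over all k-subsets, usable only on tiny inputs, and it takes the assignment cost to be exactly the distance (the abstract says only "proportional to"). The function names and the four-point example are illustrative assumptions.

```python
from itertools import combinations


def k_median_cost(dist, centers):
    """Sum, over all points j, of the distance from j to its closest chosen center."""
    return sum(min(dist[j][i] for i in centers) for j in range(len(dist)))


def brute_force_k_median(dist, k):
    """Try every k-subset of the points as centers (exponential time, tiny inputs only)."""
    n = len(dist)
    return min(combinations(range(n), k), key=lambda c: k_median_cost(dist, c))


# Hypothetical instance: four points on a line at positions 0, 1, 10, 11.
positions = [0, 1, 10, 11]
dist = [[abs(a - b) for b in positions] for a in positions]

best = brute_force_k_median(dist, 2)
best_cost = k_median_cost(dist, best)
```

On this instance any choice of one center from each pair of nearby points gives the same total cost, so the search returns the first such pair it encounters.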
Incremental Clustering and Dynamic Information Retrieval
- 1997
Cited by 129 (3 self)
Motivated by applications such as document and image classification in information retrieval, we consider the problem of clustering dynamic point sets in a metric space. We propose a model called incremental clustering which is based on a careful analysis of the requirements of the information retrieval application, and which should also be useful in other applications. The goal is to efficiently maintain clusters of small diameter as new points are inserted. We analyze several natural greedy algorithms and demonstrate that they perform poorly. We propose new deterministic and randomized incremental clustering algorithms which have provably good performance. We complement our positive results with lower bounds on the performance of incremental algorithms. Finally, we consider the dual clustering problem where the clusters are of fixed diameter, and the goal is to minimize the number of clusters.
The Cache Location Problem
- IEEE/ACM Transactions on Networking
Cited by 102 (6 self)
This paper studies the problem of where to place network caches. Emphasis is given to caches that are transparent to the clients since they are easier to manage and they require no cooperation from the clients. Our goal is to minimize the overall flow or the average delay by placing a given number of caches in the network.
On the Complexity of Some Common Geometric Location Problems
- SIAM J. Computing, 1984
Cited by 96 (1 self)
Given n demand points in the plane, the p-center problem is to find p supply points (anywhere in the plane) so as to minimize the maximum distance from a demand point to its respective nearest supply point. The p-median problem is to minimize the sum of distances from demand points to their respective nearest supply points. We prove that the p-center and the p-median problems relative to both the Euclidean and the rectilinear metrics are NP-hard. In fact, we prove that it is NP-hard even to approximate the p-center problems sufficiently closely. The reductions are from 3-satisfiability.
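To illustrate the two objectives defined in this abstract, here is a minimal sketch that evaluates a fixed set of supply points under both metrics. It only scores a given placement; it does not search for an optimal one, which the abstract shows to be NP-hard. All function names and the sample coordinates are assumptions made for illustration.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def rectilinear(p, q):
    """Rectilinear (L1) distance between two points in the plane."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def nearest_distances(demand, supply, metric):
    """Distance from each demand point to its nearest supply point."""
    return [min(metric(d, s) for s in supply) for d in demand]


def p_center_objective(demand, supply, metric):
    """p-center: the maximum of the nearest-supply distances."""
    return max(nearest_distances(demand, supply, metric))


def p_median_objective(demand, supply, metric):
    """p-median: the sum of the nearest-supply distances."""
    return sum(nearest_distances(demand, supply, metric))


# Hypothetical instance with p = 2 supply points.
demand = [(0, 0), (3, 4), (10, 0), (0, 1)]
supply = [(0, 0), (10, 0)]

center_euclid = p_center_objective(demand, supply, euclidean)
median_euclid = p_median_objective(demand, supply, euclidean)
center_rect = p_center_objective(demand, supply, rectilinear)
median_rect = p_median_objective(demand, supply, rectilinear)
```

The same placement scores differently under the two metrics because the point (3, 4) is at distance 5 from the origin in the Euclidean metric but 7 in the rectilinear one.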
Clustering data streams: Theory and practice
- IEEE TKDE, 2003
Cited by 75 (2 self)
The data stream model has recently attracted attention for its applicability to numerous types of data, including telephone records, Web documents, and clickstreams. For analysis of such data, the ability to process the data in a single pass, or a small number of passes, while using little memory, is crucial. We describe such a streaming algorithm that effectively clusters large data streams. We also provide empirical evidence of the algorithm's performance on synthetic and real data streams. Index Terms: clustering, data streams, approximation algorithms.
Variable neighborhood search: Principles and applications
- 2001
Cited by 65 (8 self)
Systematic change of neighborhood within a possibly randomized local search algorithm yields a simple and effective metaheuristic for combinatorial and global optimization, called variable neighborhood search (VNS). We present a basic scheme for this purpose, which can easily be implemented using any local search algorithm as a subroutine. Its effectiveness is illustrated by solving several classical combinatorial or global optimization problems. Moreover, several extensions are proposed for solving large problem instances: using VNS within the successive approximation method yields a two-level VNS, called variable neighborhood decomposition search (VNDS); modifying the basic scheme to easily explore valleys far from the incumbent solution yields an efficient skewed VNS (SVNS) heuristic. Finally, we show how to stabilize column generation algorithms with the help of VNS and discuss various ways to use VNS in graph theory, i.e., to suggest, disprove or give hints on how to prove conjectures, an area where metaheuristics do not appear
Better Streaming Algorithms for Clustering Problems
- In Proc. of 35th ACM Symposium on Theory of Computing (STOC), 2003
Cited by 63 (1 self)
We study clustering problems in the streaming model, where the goal is to cluster a set of points by making one pass (or a few passes) over the data using a small amount of storage space. Our main result is a randomized algorithm for the k-Median problem which produces a constant factor approximation in one pass using storage space $O(k\,\mathrm{polylog}\,n)$. This is a significant improvement of the previous best algorithm, which yielded a $2^{O(1/\epsilon)}$ approximation using $O(n^{\epsilon})$ space.
Rounding via Trees: Deterministic Approximation Algorithms for Group Steiner Trees and k-median
- In Proceedings of the 30th Annual ACM Symposium on Theory of Computing, 1998
Cited by 53 (7 self)
Most optimization problems on an undirected graph reduce in complexity when restricted to instances on a tree. A recent result [3] for probabilistically approximating graph metrics by trees such that no edge stretches (in an expected sense) by more than a factor of $O(\log^2 n)$ has resulted in several approximation algorithms which exploit the ease of solving problems on trees. The tree construction in [3] is inherently randomized, and a natural question to ask is whether approximation algorithms which use this construction can be derandomized. We present a general framework for derandomizing approximation algorithms which use the above tree construction as a primitive. Let $\Pi$ be a graph optimization problem which can be expressed as an integer program with 0-1 variables $\bar{x}(e)$ for each edge and with an objective function expressible as...

