Approximation algorithms for metric facility location and k-median problems using the primal-dual schema and Lagrangian relaxation

by K. Jain, V. Vazirani
Venue: Journal of the ACM

Results 1 - 10 of 370

Correlation Clustering

by Nikhil Bansal, Avrim Blum, Shuchi Chawla - Machine Learning, 2002
Cited by 332 (4 self)
We consider the following clustering problem: we have a complete graph on n vertices (items), where each edge (u, v) is labeled either + or − depending on whether u and v have been deemed to be similar or different. The goal is to produce a partition of the vertices (a clustering) that agrees as much as possible with the edge labels. That is, we want a clustering that maximizes the number of + edges within clusters, plus the number of − edges between clusters (equivalently, minimizes the number of disagreements: the number of − edges inside clusters plus the number of + edges between clusters). This formulation is motivated from a document clustering problem in which one has a pairwise similarity function f learned from past data, and the goal is to partition the current set of documents in a way that correlates with f as much as possible; it can also be viewed as a kind of "agnostic learning" problem. An interesting

Citation Context

...interesting about the clustering problem defined here is that unlike most clustering formulations, we do not need to specify the number of clusters k as a separate parameter. For example, in k-median [7, 15] or min-sum clustering [20] or min-max clustering [14], one can always get a perfect score by putting each node into its own cluster; the question is how well one can do with only k clusters. In our ...
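
The disagreement objective stated in the abstract above can be made concrete with a small sketch. It only evaluates a given clustering. It is not the approximation algorithm of the paper, and the names `count_disagreements`, `labels`, and `cluster_of` are hypothetical, chosen for illustration.

```python
from itertools import combinations


def count_disagreements(labels, cluster_of):
    """Count edges whose label disagrees with a clustering.

    labels:     dict mapping frozenset({u, v}) -> "+" (similar) or "-" (different)
    cluster_of: dict mapping each vertex -> cluster id

    A "-" edge inside a cluster or a "+" edge between clusters is a disagreement.
    """
    disagreements = 0
    for u, v in combinations(sorted(cluster_of), 2):
        same_cluster = cluster_of[u] == cluster_of[v]
        label = labels[frozenset((u, v))]
        if (label == "-" and same_cluster) or (label == "+" and not same_cluster):
            disagreements += 1
    return disagreements


# Hypothetical toy instance on three items (a complete graph, so three edges).
labels = {
    frozenset(("a", "b")): "+",
    frozenset(("a", "c")): "-",
    frozenset(("b", "c")): "+",
}
clustering = {"a": 0, "b": 0, "c": 1}
print(count_disagreements(labels, clustering))  # -> 1 (the "+" edge b-c is cut)
```

Note that the number of clusters is not an input: on this toy instance no clustering reaches zero disagreements, because the labels are inconsistent (a is similar to b, b is similar to c, yet a differs from c).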

On Clusterings: Good, Bad and Spectral

by Ravi Kannan, Santosh Vempala, Adrian Vetta, 2003
Cited by 332 (11 self)
We motivate and develop a natural bicriteria measure for assessing the quality of a clustering which avoids the drawbacks of existing measures. A simple recursive heuristic is shown to have poly-logarithmic worst-case guarantees under the new measure. The main result of the paper is the analysis of a popular spectral algorithm. One variant of spectral clustering turns out to have effective worst-case guarantees; another finds a "good" clustering, if one exists.

Local search heuristics for k-median and facility location problems

by Vijay Arya, Naveen Garg, Rohit Khandekar, Adam Meyerson, Kamesh Munagala, V. Pandit - STOC'01, 2001
Cited by 300 (13 self)
Abstract not found

Clustering Data Streams

by Sudipto Guha, et al., 2000
Cited by 295 (12 self)
Abstract not found

Citation Context

...Theorem 2.1). Throughout the paper we also assume that the input points are drawn from a metric space. In the recent past, several approximation algorithms have been proposed for the k-Median problem [3, 10, 2]. These algorithms require O(n^2) space to compute the dual variables or primal constraints. We will be interested in algorithms which use more than k medians but run in linear space [12, 2, 9]. Char...

A constant-factor approximation algorithm for the k-median problem

by Moses Charikar, Sudipto Guha, Éva Tardos, David B. Shmoys - In Proceedings of the 31st Annual ACM Symposium on Theory of Computing, 1999
Cited by 249 (13 self)
We present the first constant-factor approximation algorithm for the metric k-median problem. The k-median problem is one of the most well-studied clustering problems, i.e., those problems in which the aim is to partition a given set of points into clusters so that the points within a cluster are relatively close with respect to some measure. For the metric k-median problem, we are given n points in a metric space. We select k of these to be cluster centers, and then assign each point to its closest selected center. If point j is assigned to a center i, the cost incurred is proportional to the distance between i and j. The goal is to select the k centers that minimize the sum of the assignment costs. We give a 6 2/3-approximation algorithm for this problem. This improves upon the best previously known result of O(log k log log k), which was obtained by refining and derandomizing a randomized O(log n log log n)-approximation algorithm of Bartal.

Citation Context

... of cost at most 16 times the true optimal cost, but allows centers to be assigned at most 4U locations. There has also been a great deal of subsequent work to improve on our results. Jain & Vazirani [20] give an extremely elegant primal-dual 3-approximation algorithm for the uncapacitated facility location problem, and show how to use that procedure to obtain a 6-approximation algorithm for the k-med...
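
The k-median objective defined in the abstract above (pick k of the n points as centers, assign every point to its closest center, and sum the distances) can be sketched as follows. This is an exhaustive search over all center sets, feasible only for tiny inputs, and it is not the approximation algorithm of any paper listed here. The function names and the one-dimensional toy data are assumptions made for illustration.

```python
from itertools import combinations


def kmedian_cost(points, centers, dist):
    """Sum over all points of the distance to the closest chosen center."""
    return sum(min(dist(p, c) for c in centers) for p in points)


def brute_force_kmedian(points, k, dist):
    """Try every k-subset of the points as centers and return the cheapest.

    Exponential in k; shown only to pin down the objective being approximated.
    """
    best = min(
        combinations(points, k),
        key=lambda centers: kmedian_cost(points, centers, dist),
    )
    return list(best), kmedian_cost(points, best, dist)


# Hypothetical toy instance: six points on a line with the absolute-difference metric.
points = [0, 1, 2, 10, 11, 12]


def line_distance(a, b):
    return abs(a - b)


centers, cost = brute_force_kmedian(points, 2, line_distance)
print(centers, cost)  # -> [1, 11] 4
```

An approximation algorithm with factor c would, on any metric instance, return centers whose cost is at most c times the optimum found by this kind of exhaustive search.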

Improved Combinatorial Algorithms for the Facility Location and k-Median Problems

by Moses Charikar, Sudipto Guha - In Proceedings of the 40th Annual IEEE Symposium on Foundations of Computer Science, 1999
Cited by 225 (12 self)
We present improved combinatorial approximation algorithms for the uncapacitated facility location and k-median problems. Two central ideas in most of our results are cost scaling and greedy improvement. We present a simple greedy local search algorithm which achieves an approximation ratio of 2.414 + ε in Õ(n^2/ε) time. This also yields a bicriteria approximation tradeoff of (1 + γ, 1 + 2/γ) for facility cost versus service cost which is better than previously known tradeoffs and close to the best possible. Combining greedy improvement and cost scaling with a recent primal-dual algorithm for facility location due to Jain and Vazirani, we get an approximation ratio of 1.853 in Õ(n^3) time. This is already very close to the approximation guarantee of the best known algorithm which is LP-based. Further, combined with the best known LP-based algorithm for facility location, we get a very slight improvement in the approximation factor for facility location, achieving 1.728.

Citation Context

...oys [6] gave the first constant factor approximation algorithm for the k-median problem. This is also based on rounding the solution to a linear programming relaxation. Very recently, Jain and Vazirani [15] gave primal-dual algorithms for facility location and k-median problems, achieving approximation ratios of 3 and 6 for the two problems. The running time of their algorithms is O(n^2 log n) for facil...

Incremental Clustering and Dynamic Information Retrieval

by Moses Charikar, Chandra Chekuri, Tomás Feder, Rajeev Motwani, 1997
Cited by 191 (4 self)
Motivated by applications such as document and image classification in information retrieval, we consider the problem of clustering dynamic point sets in a metric space. We propose a model called incremental clustering which is based on a careful analysis of the requirements of the information retrieval application, and which should also be useful in other applications. The goal is to efficiently maintain clusters of small diameter as new points are inserted. We analyze several natural greedy algorithms and demonstrate that they perform poorly. We propose new deterministic and randomized incremental clustering algorithms which have a provably good performance. We complement our positive results with lower bounds on the performance of incremental algorithms. Finally, we consider the dual clustering problem where the clusters are of fixed diameter, and the goal is to minimize the number of clusters.

Citation Context

... to the minimum diameter and radius measures described above, several other objective functions have also been considered in the literature. A lot of recent work has focused on the k-median objective [11, 36, 2]. Here the goal is to assign points to k centers such that the sum of distances of points to their centers is minimized. Other objectives that have been studied include the objective of minimizing the...

Clustering data streams: Theory and practice

by Sudipto Guha, Adam Meyerson, Nina Mishra, Rajeev Motwani, Liadan O'Callaghan - IEEE TKDE, 2003
Cited by 157 (5 self)
The data stream model has recently attracted attention for its applicability to numerous types of data, including telephone records, Web documents, and clickstreams. For analysis of such data, the ability to process the data in a single pass, or a small number of passes, while using little memory, is crucial. We describe such a streaming algorithm that effectively clusters large data streams. We also provide empirical evidence of the algorithm’s performance on synthetic and real data streams.

Citation Context

...additional cost for each center included in the solution. There is abundant literature on these, books [45], [62], [55], provable algorithms [41], [49], [54], [53], [71], [31], [16], [15], [6], [52], [47], [14], [59], [7], [46], the running time of provable clustering heuristics [21], [10], [42], [34], [73], [60], and special metric spaces [6], [52], [43], [68]. The k-Median problem is also relevant i...

Influence Sets Based on Reverse Nearest Neighbor Queries

by Flip Korn, S. Muthukrishnan - In SIGMOD, 2000
Cited by 148 (1 self)
Inherent in the operation of many decision support and continuous referral systems is the notion of the "influence" of a data point on the database. This notion arises in examples such as finding the set of customers affected by the opening of a new store outlet location, notifying the subset of subscribers to a digital library who will find a newly added document most relevant, etc. Standard approaches to determining the influence set of a data point involve range searching and nearest neighbor queries. In this paper, we formalize a novel notion of influence based on reverse neighbor queries and its variants. Since the nearest neighbor relation is not symmetric, the set of points that are closest to a query point (i.e., the nearest neighbors) differs from the set of points that have the query point as their nearest neighbor (called the reverse nearest neighbors). Influence sets based on reverse nearest neighbor (RNN) queries seem to capture the intuitive notion of influence from our ...

Greedy Facility Location Algorithms analyzed using Dual Fitting with Factor-Revealing LP

by Mohammad Mahdian, Evangelos Markakis, Amin Saberi, Vijay Vazirani - Journal of the ACM, 2001
Cited by 146 (12 self)
We present a natural greedy algorithm for the metric uncapacitated facility location problem and use the method of dual fitting to analyze its approximation ratio, which turns out to be 1.861. The running time of our algorithm is O(m log m), where m is the total number of edges in the underlying complete bipartite graph between cities and facilities. We use our algorithm to improve recent results for some variants of the problem, such as the fault tolerant and outlier versions. In addition, we introduce a new variant which can be seen as a special case of the concave cost version of this problem.

Citation Context

..., Tardos, and Aardal [39]. Later, the factor was improved by Chudak and Shmoys [9] to 1 + 2/e. Both these algorithms were based on LP-rounding, and therefore had high running times. Jain and Vazirani [22] gave a primal–dual algorithm, achieving a factor of 3, and having the same running time as ours (we will refer to this as the JV algorithm). Their algorithm was adapted for solving several related pr...
