On Clusterings: Good, Bad and Spectral
2000
Cited by 203 (10 self)
We motivate and develop a natural bicriteria measure for assessing the quality of a clustering which avoids the drawbacks of existing measures. A simple recursive heuristic has poly-logarithmic worst-case guarantees under the new measure. The main result of the paper is the analysis of a popular spectral algorithm. One variant of spectral clustering turns out to have effective worst-case guarantees
Improved Combinatorial Algorithms for the Facility Location and k-Median Problems
In Proceedings of the 40th Annual IEEE Symposium on Foundations of Computer Science, 1999
Cited by 187 (12 self)
We present improved combinatorial approximation algorithms for the uncapacitated facility location and k-median problems. Two central ideas in most of our results are cost scaling and greedy improvement. We present a simple greedy local search algorithm which achieves an approximation ratio of $2.414 + \epsilon$ in $\tilde{O}(n^2/\epsilon)$ time. This also yields a bicriteria approximation tradeoff of $(1 + \gamma, 1 + 2/\gamma)$ for facility cost versus service cost, which is better than previously known tradeoffs and close to the best possible. Combining greedy improvement and cost scaling with a recent primal-dual algorithm for facility location due to Jain and Vazirani, we get an approximation ratio of 1.853 in $\tilde{O}(n^3)$ time. This is already very close to the approximation guarantee of the best known algorithm, which is LP-based. Further, combined with the best known LP-based algorithm for facility location, we get a very slight improvement in the approximation factor for facility location, achieving 1.728....
Local search heuristics for k-median and facility location problems
2001
Cited by 179 (9 self)
In this paper, we analyze local search heuristics for the k-median and facility location problems. We define the locality gap of a local search procedure as the maximum ratio of a locally optimum solution (obtained using this procedure) to the global optimum. For k-median, we show that local search with swaps has a locality gap of exactly 5. When we permit p facilities to be swapped simultaneously, the locality gap of the local search procedure is exactly 3 + 2/p. This is the first analysis of local search for k-median that provides a bounded performance guarantee with only k medians. This also improves the previously known approximation for this problem. For uncapacitated facility location, we show that local search which permits adding, dropping and swapping a facility has a locality gap of exactly 3. This improves the bound of Korupolu et al. We also consider a capacitated facility location problem where each facility has a capacity and we are allowed to open multiple copies of a facility. For this problem we introduce a new operation which opens one or more copies of a facility and drops zero or more facilities. We prove that local search which permits this new operation has a locality gap between 3 and 4.
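As an illustration of the kind of procedure this entry's title refers to, here is a minimal sketch of single-swap local search for k-median. It is not the authors' exact procedure: the starting solution (the first k points), the accept-any-strict-improvement rule, and the names `cost` and `single_swap_local_search` are assumptions made for the example, and analyses of running time typically require each swap to improve the cost by some minimum factor.

```python
def cost(points, centers, dist):
    """k-median objective: each point pays the distance to its closest center."""
    return sum(min(dist(p, c) for c in centers) for p in points)


def single_swap_local_search(points, k, dist):
    """Swap one center for one non-center while the cost strictly decreases.

    Assumption: the search starts from the first k points. It stops at a
    local optimum, which need not be a global one.
    """
    centers = list(points[:k])
    improved = True
    while improved:
        improved = False
        current = cost(points, centers, dist)
        for i in range(k):
            for q in points:
                if q in centers:
                    continue
                # Candidate solution: replace the i-th center with q.
                trial = centers[:i] + [q] + centers[i + 1:]
                if cost(points, trial, dist) < current:
                    centers, improved = trial, True
                    break
            if improved:
                break
    return centers


# Hypothetical 1-D instance with absolute difference as the metric.
points = [0, 1, 2, 10, 11, 12]
result = single_swap_local_search(points, 2, lambda a, b: abs(a - b))
print(sorted(result), cost(points, result, lambda a, b: abs(a - b)))
```

The loop terminates because every accepted swap strictly lowers the cost and there are finitely many center sets.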
A constant-factor approximation algorithm for the k-median problem
In Proceedings of the 31st Annual ACM Symposium on Theory of Computing, 1999
Cited by 168 (12 self)
We present the first constant-factor approximation algorithm for the metric k-median problem. The k-median problem is one of the most well-studied clustering problems, i.e., those problems in which the aim is to partition a given set of points into clusters so that the points within a cluster are relatively close with respect to some measure. For the metric k-median problem, we are given n points in a metric space. We select k of these to be cluster centers, and then assign each point to its closest selected center. If point j is assigned to a center i, the cost incurred is proportional to the distance between i and j. The goal is to select the k centers that minimize the sum of the assignment costs. We give a $6\frac{2}{3}$-approximation algorithm for this problem. This improves upon the best previously known result of O(log k log log k), which was obtained by refining and derandomizing a randomized O(log n log log n)-approximation algorithm of Bartal.
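The objective defined in this abstract can be made concrete with a short sketch. The code below only evaluates the k-median cost and finds an optimum by exhaustive search on a tiny instance; it is not the approximation algorithm from the paper, and the function names and the sample data are assumptions made for illustration.

```python
from itertools import combinations


def kmedian_cost(points, centers, dist):
    """Sum over all points of the distance to the closest selected center."""
    return sum(min(dist(p, c) for c in centers) for p in points)


def brute_force_kmedian(points, k, dist):
    """Try every k-subset of the points as centers.

    Exponential in k, so only usable on very small inputs.
    """
    return min(combinations(points, k),
               key=lambda cs: kmedian_cost(points, cs, dist))


# Hypothetical 1-D instance with absolute difference as the metric.
points = [0, 1, 2, 10, 11, 12]
dist = lambda a, b: abs(a - b)
best = brute_force_kmedian(points, 2, dist)
print(best, kmedian_cost(points, best, dist))
```

Exhaustive search examines every one of the C(n, k) center sets, which is why approximation algorithms with provable guarantees are of interest for this problem.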
Correlation Clustering
Machine Learning, 2002
Cited by 158 (4 self)
We consider the following clustering problem: we have a complete graph on n vertices (items), where each edge (u, v) is labeled either + or - depending on whether u and v have been deemed to be similar or different. The goal is to produce a partition of the vertices (a clustering) that agrees as much as possible with the edge labels. That is, we want a clustering that maximizes the number of + edges within clusters, plus the number of - edges between clusters (equivalently, minimizes the number of disagreements: the number of - edges inside clusters plus the number of + edges between clusters). This formulation is motivated by a document clustering problem in which one has a pairwise similarity function f learned from past data, and the goal is to partition the current set of documents in a way that correlates with f as much as possible; it can also be viewed as a kind of "agnostic learning" problem. An interesting ...
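The disagreement count described in this abstract can be sketched as follows. This only evaluates a given clustering against the edge labels; it is not an algorithm from the paper, and the data layout (a dict keyed by unordered pairs, a dict from vertex to cluster id) and the sample instance are assumptions made for illustration.

```python
from itertools import combinations


def disagreements(labels, clustering):
    """Count '-' edges inside clusters plus '+' edges between clusters.

    labels: dict mapping frozenset({u, v}) to '+' or '-' for every pair.
    clustering: dict mapping each vertex to a cluster id.
    """
    count = 0
    for u, v in combinations(sorted(clustering), 2):
        same = clustering[u] == clustering[v]
        sign = labels[frozenset((u, v))]
        if (same and sign == '-') or (not same and sign == '+'):
            count += 1
    return count


# Hypothetical 4-item instance: a is similar to b and c, b and c differ,
# and d differs from everyone.
labels = {frozenset(pair): sign for pair, sign in [
    (("a", "b"), "+"), (("a", "c"), "+"), (("b", "c"), "-"),
    (("a", "d"), "-"), (("b", "d"), "-"), (("c", "d"), "-"),
]}
grouped = disagreements(labels, {"a": 0, "b": 0, "c": 0, "d": 1})
singletons = disagreements(labels, {"a": 0, "b": 1, "c": 2, "d": 3})
print(grouped, singletons)
```

Note that the number of clusters is not an input: the objective itself determines how many clusters a good solution uses.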
Incremental Clustering and Dynamic Information Retrieval
1997
Cited by 129 (3 self)
Motivated by applications such as document and image classification in information retrieval, we consider the problem of clustering dynamic point sets in a metric space. We propose a model called incremental clustering which is based on a careful analysis of the requirements of the information retrieval application, and which should also be useful in other applications. The goal is to efficiently maintain clusters of small diameter as new points are inserted. We analyze several natural greedy algorithms and demonstrate that they perform poorly. We propose new deterministic and randomized incremental clustering algorithms which have a provably good performance. We complement our positive results with lower bounds on the performance of incremental algorithms. Finally, we consider the dual clustering problem where the clusters are of fixed diameter, and the goal is to minimize the number of clusters.
The Primal-Dual Method for Approximation Algorithms and Its Application to Network Design Problems
Cited by 107 (7 self)
The primal-dual method is a standard tool in the design of algorithms for combinatorial optimization problems. This chapter shows how the primal-dual method can be modified to provide good approximation algorithms for a wide variety of NP-hard problems. We concentrate on results from recent research applying the primal-dual method to problems in network design.
Improved Approximation Algorithms for Metric Facility Location Problems
In Proceedings of the 5th International Workshop on Approximation Algorithms for Combinatorial Optimization, 2002
Cited by 100 (11 self)
In this paper we present a 1.52-approximation algorithm for the metric uncapacitated facility location problem, and a 2-approximation algorithm for the metric capacitated facility location problem with soft capacities. Both these algorithms improve the best previously known approximation factor for the corresponding problem, and our soft-capacitated facility location algorithm achieves the integrality gap of the standard LP relaxation of the problem. Furthermore, we will show, using a result of Thorup, that our algorithms can be implemented in quasi-linear time.
A new greedy approach for facility location problems
Cited by 94 (9 self)
We present a simple and natural greedy algorithm for the metric uncapacitated facility location problem achieving an approximation guarantee of 1.61, whereas the best previously known was 1.73. Furthermore, we will show that our algorithm has a property which allows us to apply the technique of Lagrangian relaxation. Using this property, we can find better approximation algorithms for many variants of the facility location problem, such as the capacitated facility location problem with soft capacities and a common generalization of the k-median and facility location problems. We will also prove a lower bound on the approximability of the k-median problem.

