Results 1-10 of 91
Improved Combinatorial Algorithms for the Facility Location and k-Median Problems
In Proceedings of the 40th Annual IEEE Symposium on Foundations of Computer Science, 1999
Cited by 209 (14 self)
We present improved combinatorial approximation algorithms for the uncapacitated facility location and k-median problems. Two central ideas in most of our results are cost scaling and greedy improvement. We present a simple greedy local search algorithm which achieves an approximation ratio of $2.414 + \epsilon$ in $\tilde{O}(n^2/\epsilon)$ time. This also yields a bicriteria approximation tradeoff of $(1 + \gamma, 1 + 2/\gamma)$ for facility cost versus service cost which is better than previously known tradeoffs and close to the best possible. Combining greedy improvement and cost scaling with a recent primal-dual algorithm for facility location due to Jain and Vazirani, we get an approximation ratio of 1.853 in $\tilde{O}(n^3)$ time. This is already very close to the approximation guarantee of the best known algorithm, which is LP-based. Further, combined with the best known LP-based algorithm for facility location, we get a very slight improvement in the approximation factor for facility location, achieving 1.728....
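To make the idea of local search for uncapacitated facility location concrete, here is a minimal sketch in Python. It is not the procedure from the paper and carries none of its guarantees: it is a plain add/drop local search that toggles one facility at a time and keeps any change that lowers the total cost. The names `total_cost` and `local_search`, and the layout `dist[client][facility]`, are assumptions made for illustration.

```python
def total_cost(open_set, facility_cost, dist):
    """Opening cost of the open facilities plus each client's distance
    to its nearest open facility. dist[j][i] is client j to facility i."""
    if not open_set:
        return float("inf")  # every client needs at least one open facility
    opening = sum(facility_cost[i] for i in open_set)
    service = sum(min(row[i] for i in open_set) for row in dist)
    return opening + service


def local_search(facility_cost, dist):
    """Add/drop local search: repeatedly toggle a single facility
    whenever doing so strictly reduces the total cost."""
    n = len(facility_cost)
    # Start from the best single facility.
    start = min(range(n), key=lambda i: total_cost({i}, facility_cost, dist))
    open_set = {start}
    best = total_cost(open_set, facility_cost, dist)
    improved = True
    while improved:
        improved = False
        for i in range(n):
            candidate = open_set ^ {i}  # open i if closed, close it if open
            cost = total_cost(candidate, facility_cost, dist)
            if cost < best:
                open_set, best, improved = candidate, cost, True
    return open_set, best


if __name__ == "__main__":
    facility_cost = [1, 1, 10]
    dist = [[0, 5, 1], [5, 0, 1], [0, 5, 1]]
    print(local_search(facility_cost, dist))
```

The loop terminates because the cost strictly decreases with every accepted move and there are finitely many subsets of facilities. The result is a local optimum for single-facility toggles only, which need not be globally optimal.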
Clustering data streams: Theory and practice
IEEE TKDE, 2003
Cited by 106 (2 self)
The data stream model has recently attracted attention for its applicability to numerous types of data, including telephone records, Web documents, and clickstreams. For analysis of such data, the ability to process the data in a single pass, or a small number of passes, while using little memory, is crucial. We describe such a streaming algorithm that effectively clusters large data streams. We also provide empirical evidence of the algorithm’s performance on synthetic and real data streams. Index Terms: clustering, data streams, approximation algorithms.
Greedy Facility Location Algorithms Analyzed Using Dual Fitting with Factor-Revealing LP
Journal of the ACM, 2001
Cited by 100 (13 self)
We present a natural greedy algorithm for the metric uncapacitated facility location problem and use the method of dual fitting to analyze its approximation ratio, which turns out to be 1.861. The running time of our algorithm is O(m log m), where m is the total number of edges in the underlying complete bipartite graph between cities and facilities. We use our algorithm to improve recent results for some variants of the problem, such as the fault tolerant and outlier versions. In addition, we introduce a new variant which can be seen as a special case of the concave cost version of this problem.
Adwords and generalized online matching
In FOCS ’05: Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science, 2005
Cited by 99 (5 self)
How does a search engine company decide what ads to display with each query so as to maximize its revenue? This turns out to be a generalization of the online bipartite matching problem. We introduce the notion of a tradeoff-revealing LP and use it to derive two optimal algorithms achieving competitive ratios of 1 − 1/e for this problem.
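The abstract does not spell out the allocation rule, so the following is only a hedged sketch of one commonly described approach to budgeted online ad allocation. Each bid is discounted by a factor that shrinks as the advertiser's budget is used up, here assumed to be ψ(f) = 1 − e^(f − 1) for spent fraction f, and the query goes to the highest discounted bid. The function names, the dictionary-based inputs, and the tie-breaking by iteration order are all illustrative assumptions.

```python
import math


def discount(fraction_spent):
    """Assumed discount factor: 1 - e^(f - 1), which falls to 0
    as the spent fraction f of the budget approaches 1."""
    return 1.0 - math.exp(fraction_spent - 1.0)


def allocate(queries, bids, budgets):
    """Assign each arriving query to the advertiser with the largest
    discounted bid among those who can still afford their bid.
    bids maps (advertiser, query) to a bid; budgets maps advertiser to a budget."""
    spent = {a: 0.0 for a in budgets}
    assignment = []
    for q in queries:
        best, best_score = None, 0.0
        for a, budget in budgets.items():
            bid = bids.get((a, q), 0.0)
            if bid <= 0 or spent[a] + bid > budget:
                continue  # no bid, or the bid would exceed the remaining budget
            score = bid * discount(spent[a] / budget)
            if score > best_score:
                best, best_score = a, score
        if best is not None:
            spent[best] += bids[(best, q)]
        assignment.append(best)  # None means the query went unassigned
    return assignment, spent


if __name__ == "__main__":
    budgets = {"a": 2.0, "b": 2.0}
    bids = {("a", "x"): 1.0, ("b", "x"): 1.0}
    print(allocate(["x"] * 4, bids, budgets))
```

With equal bids, the discount makes the rule alternate between advertisers rather than exhaust one budget first, which is the balancing behaviour that distinguishes it from a purely greedy highest-bid rule.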
Packing Steiner trees
Cited by 88 (5 self)
The Steiner packing problem is to find the maximum number of edge-disjoint subgraphs of a given graph G that connect a given set of required points S. This problem is motivated by practical applications in VLSI layout and broadcasting, as well as theoretical reasons. In this paper, we study this problem and present an algorithm with an asymptotic approximation factor of |S|/4. This gives a sufficient condition for the existence of k edge-disjoint Steiner trees in a graph in terms of the edge-connectivity of the graph. We will show that this condition is the best possible if the number of terminals is 3. At the end, we consider the fractional version of this problem, and observe that it can be reduced to the minimum Steiner tree problem via the ellipsoid algorithm.
Allocating online advertisement space with unreliable estimates
 In Proceedings of the 8th ACM Conference on Electronic Commerce (EC
, 2007
Cited by 45 (7 self)
We study the problem of optimally allocating online advertisement space to budget-constrained advertisers. This problem was defined and studied from the perspective of worst-case online competitive analysis by Mehta et al. Our objective is to find an algorithm that takes advantage of the given estimates of the frequencies of keywords to compute a near-optimal solution when the estimates are accurate, while at the same time maintaining a good worst-case competitive ratio in case the estimates are totally incorrect. This is motivated by real-world situations where search engines have stochastic information that provides reasonably accurate estimates of the frequency of search queries except in certain highly unpredictable yet economically valuable spikes in the search pattern. Our approach is a black-box approach: we assume we have access to an oracle that uses the given estimates to recommend an advertiser every time a query arrives. We use this oracle to design an algorithm that provides two performance guarantees: the performance guarantee in the case that the oracle gives an accurate estimate, and its worst-case performance guarantee. Our algorithm can be fine-tuned by adjusting a parameter α, giving a tradeoff curve between the two performance measures with the best competitive ratio for the worst-case scenario at one end of the curve and the optimal solution for the scenario where estimates are accurate at the other end. Finally, we demonstrate the applicability of our framework by applying it to two classical online problems, namely the lost cow and the ski rental problems.
Strategyproof Cost-sharing Mechanisms for Set Cover and Facility Location Games
2003
Cited by 44 (0 self)
In this paper, we obtain strategyproof cost allocations for two fundamental games whose underlying optimization problems are NP-hard, the set cover game and the facility location game. For the latter game, this is made possible by new approximation algorithms for the underlying optimization problem using the technique of dual fitting [7]. In retrospect, the natural greedy algorithm for the set cover problem (see [17]) can also be analyzed using this technique; we utilize this viewpoint for handling the set cover game. The facility location game was studied in [9, 4], who left the open problem of obtaining a group strategyproof mechanism based on a constant-factor approximation algorithm. Our paper partially answers this question. We give a strategyproof mechanism, but cannot achieve group strategyproofness. More recently, Pal and Tardos [15] have announced a 3-approximately budget-balanced cross-monotonic cost-sharing method for the facility location problem. This gives a group strategyproof mechanism for the facility location game that recovers 1/3rd of the cost
Approximate k-MSTs and k-Steiner trees via the primal-dual method and Lagrangean relaxation
Mathematical Programming, 2001
Cited by 41 (4 self)
Garg [10] gives two approximation algorithms for the minimum-cost tree spanning k vertices in an undirected graph. Recently Jain and Vazirani [16] discovered primal-dual approximation
Approximate Clustering without the Approximation
Cited by 35 (18 self)
The design of approximation algorithms for clustering points in metric spaces is a flourishing area of research, with much research effort spent on getting a better understanding of the approximation guarantees possible for many objective functions such as k-median, k-means, and min-sum clustering. This quest for better approximation algorithms is further fueled by the implicit hope that these better approximations also give us more accurate clusterings. For example, for many problems such as clustering proteins by function, or clustering images by subject, there is some unknown “correct” target clustering and the implicit hope is that approximately optimizing these objective functions will in fact produce a clustering that is close (in symmetric difference) to the truth. In this paper, we show that if we make this implicit assumption explicit (that is, if we assume that any c-approximation to the given clustering objective F is ε-close to the target), then we can produce clusterings that are O(ε)-close to the target, even for values c for which obtaining a c-approximation is NP-hard. In particular, for k-median and k-means objectives, we show that we can achieve this guarantee for any constant c > 1, and for the min-sum objective we can do this for any constant c > 2. Our results also highlight a somewhat surprising conceptual difference between assuming that the optimal solution to, say, the k-median objective is ε-close to the target, and assuming that any approximately optimal solution is ε-close to the target, even for approximation factor say c = 1.01. In the former case, the problem of finding a solution that is O(ε)-close to the target remains computationally hard, and yet for the latter we have an efficient algorithm.
Optimal Time Bounds for Approximate Clustering
2002
Cited by 32 (2 self)
Clustering is a fundamental problem in unsupervised learning, and has been studied widely both as a problem of learning mixture models and as an optimization problem. In this paper, we study clustering with respect to the k-median objective function, a natural formulation of clustering in which we attempt to minimize the average distance to cluster centers. One of the main contributions of this paper is a simple but powerful sampling technique that we call successive sampling that could be of independent interest. We show that our sampling procedure can rapidly identify a small set of points (of size just $O(k \log(n/k))$) that summarize the input points for the purpose of clustering. Using successive sampling, we develop an algorithm for the k-median problem that runs in $O(nk)$ time for a wide range of values of k and is guaranteed, with high probability, to return a solution with cost at most a constant factor times optimal. We also establish a lower bound of $\Omega(nk)$ on any randomized constant-factor approximation algorithm for the k-median problem that succeeds with even a negligible (say