The Online Median Problem
In Proceedings of the 41st Annual IEEE Symposium on Foundations of Computer Science, 2000
Cited by 75 (2 self)
We introduce a natural variant of the (metric uncapacitated) k-median problem that we call the online median problem. Whereas the k-median problem involves optimizing the simultaneous placement of k facilities, the online median problem imposes the following additional constraints: the facilities are placed one at a time, a facility cannot be moved once it is placed, and the total number of facilities to be placed, k, is not known in advance. The objective of an online median algorithm is to minimize the competitive ratio, that is, the worst-case ratio of the cost of an online placement to that of an optimal offline placement. Our main result is a linear-time constant-competitive algorithm for the online median problem. In addition, we present a related, though substantially simpler, linear-time constant-factor approximation algorithm for the (metric uncapacitated) facility location problem. The latter algorithm is similar in spirit to the recent primal-dual-based facility location algorithm of Jain and Vazirani, but our approach is more elementary and yields an improved running time.
Better Streaming Algorithms for Clustering Problems
In Proc. of 35th ACM Symposium on Theory of Computing (STOC), 2003
Cited by 70 (1 self)
We study clustering problems in the streaming model, where the goal is to cluster a set of points by making one pass (or a few passes) over the data using a small amount of storage space. Our main result is a randomized algorithm for the k-Median problem which produces a constant factor approximation in one pass using storage space O(k polylog n). This is a significant improvement of the previous best algorithm which yielded a 2^O(1/ε) approximation using O(n^ε) space.
Sequential and parallel algorithms for mixed packing and covering
In 42nd Annual IEEE Symposium on Foundations of Computer Science, 2001
Cited by 43 (2 self)
We describe sequential and parallel algorithms that approximately solve linear programs with no negative coefficients (a.k.a. mixed packing and covering problems). For explicitly given problems, our fastest sequential algorithm returns a solution satisfying all constraints within a 1 ± ε factor in O(m d log(m) / ε^2) time, where m is the number of constraints and d is the maximum number of constraints any variable appears in. Our parallel algorithm runs in time polylogarithmic in the input size times 1/ε^4 and uses a total number of operations comparable to the sequential algorithm. The main contribution is that the algorithms solve mixed packing and covering problems (in contrast to pure packing or pure covering problems, which have only "≤" or only "≥" inequalities, but not both) and run in time independent of the so-called width of the problem.
The reverse greedy algorithm for the metric kmedian problem
Information Processing Letters
Cited by 7 (0 self)
The Reverse Greedy algorithm (RGreedy) for the k-median problem works as follows. It starts by placing facilities on all nodes. At each step, it removes a facility to minimize the total distance to the remaining facilities. It stops when k facilities remain. We prove that, if the distance function is metric, then the approximation ratio of RGreedy is between Ω(log n / log log n) and O(log n).
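The steps in this abstract can be sketched in a few lines of Python. This is a minimal illustration that assumes the instance is given as a full distance matrix; the names `kmedian_cost` and `reverse_greedy`, the tie-breaking rule (lowest node index), and the four-point instance are assumptions made for the example, not details from the paper.

```python
def kmedian_cost(dist, facilities):
    """Total distance from every node to its nearest open facility."""
    return sum(min(dist[v][f] for f in facilities) for v in range(len(dist)))


def reverse_greedy(dist, k):
    """Open a facility on every node, then repeatedly close the facility
    whose removal leaves the cheapest solution, until k facilities remain."""
    facilities = set(range(len(dist)))
    while len(facilities) > k:
        # Ties are broken by lowest node index (an assumption of this sketch).
        victim = min(
            sorted(facilities),
            key=lambda f: kmedian_cost(dist, facilities - {f}),
        )
        facilities.remove(victim)
    return facilities


# Hypothetical instance: four points on a line.
points = [0, 1, 10, 11]
dist = [[abs(a - b) for b in points] for a in points]
chosen = reverse_greedy(dist, 2)
print(sorted(chosen), kmedian_cost(dist, chosen))
```

Each removal step re-evaluates the full cost for every candidate, so this naive version is far from efficient; it is meant only to make the procedure concrete.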
Playing push vs pull: Models and algorithms for disseminating dynamic data in networks
In Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures, 2006
Cited by 3 (1 self)
Consider a network in which a collection of source nodes maintain and periodically update data objects for a collection of sink nodes, each of which periodically accesses the data originating from some specified subset of the source nodes. We consider the task of efficiently relaying the dynamically changing data objects to the sinks from their sources of interest. Our focus is on the following "push-pull" approach for this data dissemination problem. Whenever a data object is updated, its source relays the update to a designated subset of nodes, its push set; similarly, whenever a sink requires an update, it propagates its query to a designated subset of nodes, its pull set. The push and pull sets need to be chosen such that every pull set of a sink intersects the push sets of all its sources of interest. We study the problem of choosing push sets and pull sets to minimize total global communication while satisfying all communication requirements. We formulate and study several variants of the above data dissemination problem that take into account different paradigms for routing between sources (resp., sinks) and their push sets (resp., pull sets), namely multicast, unicast, and controlled broadcast, as well as the aggregability of the data objects. Under the multicast model, we present an optimal polynomial time algorithm for tree networks, which yields a randomized O(log n)-approximation algorithm for n-node general networks, for which the problem is hard to approximate within a constant factor. Under the unicast ...
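The feasibility requirement stated in this abstract (every sink's pull set must intersect the push set of each of its sources of interest) can be checked directly. The sketch below is only an illustration: the function name `is_feasible`, the dictionary representation, and the node labels are assumptions, and it does not model routing cost or any of the paper's algorithms.

```python
def is_feasible(push_sets, pull_sets, interests):
    """Return True if each sink's pull set shares at least one node with
    the push set of every source that the sink is interested in.

    push_sets: source -> set of nodes receiving its updates
    pull_sets: sink   -> set of nodes receiving its queries
    interests: sink   -> set of sources the sink reads from
    """
    return all(
        bool(pull_sets[sink] & push_sets[source])
        for sink, sources in interests.items()
        for source in sources
    )


# Hypothetical network with two sources and one sink.
push_sets = {"s1": {"a", "b"}, "s2": {"c"}}
interests = {"t1": {"s1", "s2"}}

ok = is_feasible(push_sets, {"t1": {"b", "c"}}, interests)
broken = is_feasible(push_sets, {"t1": {"b"}}, interests)
print(ok, broken)
```

In the second call the sink's pull set misses every node in the push set of `s2`, so the assignment is rejected.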
Online Medians via Online Bribery (Extended Abstract)
We then consider the competitive ratio with respect to size. An algorithm is s-size-competitive if, for each k, the cost of F_k is at most the minimum cost of any set of k facilities, while the size of F_k is at most sk. We present optimally competitive algorithms for this problem. Our proofs reduce online medians to the following online bribery problem: faced with some unknown threshold T ∈ R+, an algorithm must submit "bids" b ∈ R+ until it submits a bid as large as T. The algorithm pays the sum of its bids. We describe optimally competitive algorithms for online bribery. Our results on cost-competitive online medians extend to approximately metric distance functions, online fractional medians, and online bicriteria approximation.
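To make the online bribery problem concrete, here is a minimal sketch of the standard deterministic doubling strategy. It is not claimed to be the paper's algorithm; the function name `doubling_bribery` and the `first_bid` parameter are assumptions for illustration. If the threshold T is at least the first bid, the bids form a geometric series whose sum is less than 4T.

```python
def doubling_bribery(threshold, first_bid=1.0):
    """Submit bids first_bid, 2*first_bid, 4*first_bid, ... until a bid
    reaches the (unknown to the bidder) threshold; pay the sum of all bids."""
    bids = []
    bid = first_bid
    while True:
        bids.append(bid)
        if bid >= threshold:
            # The last bid met the threshold; the total paid is the sum.
            return bids, sum(bids)
        bid *= 2


bids, paid = doubling_bribery(5.0)
print(bids, paid, paid / 5.0)
```

For a threshold of 5 the strategy bids 1, 2, 4, 8 and pays 15, which is 3 times the threshold and within the factor-4 bound.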
Slicing Distributed Systems
2008
Peer-to-peer (P2P) architectures support collaboration to accomplish tasks such as downloading data, VoIP telephony, and cooperative backups. However, end-user systems can be extremely heterogeneous, with heavy-tailed distributions of attributes such as storage space and bandwidth. In systems that ignore heterogeneity, performance suffers; hence most of the popular peer-to-peer applications are forced to classify nodes according to capacity, distinguishing superpeers (which play more active roles) from regular ones (which have limited roles). Similar issues arise in large data centers, where nodes may have widely variable configurations and performance. Our paper solves a generalized classification problem called slicing, which involves partitioning the nodes into k subsets using a one-dimensional attribute. Here, we start by arguing that slicing is the most appropriate generalization of existing classification mechanisms. We review prior work on the problem, and introduce our new Sliver protocol. Theoretical and experimental evaluations show that Sliver converges more rapidly than alternatives, and its low cost makes it appealing in a wide range of practical settings.
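As a point of reference for what a slicing protocol might be expected to compute, the following centralized sketch partitions nodes into k slices by the rank of a single attribute. It is not the Sliver protocol, which is distributed. The equal-size, rank-based slices, the function name `slice_nodes`, and the sample bandwidth values are all assumptions made for illustration.

```python
def slice_nodes(attributes, k):
    """Map each node to a slice index in 0..k-1 according to the rank of
    its attribute value (slice 0 holds the smallest values).

    attributes: node -> one-dimensional attribute (e.g. bandwidth)
    """
    ranked = sorted(attributes, key=attributes.get)
    n = len(ranked)
    # A node of rank r (0-based) lands in slice floor(r * k / n).
    return {node: rank * k // n for rank, node in enumerate(ranked)}


# Hypothetical bandwidth values with a heavy tail.
bandwidth = {"a": 10, "b": 500, "c": 20, "d": 80, "e": 1000, "f": 5}
slices = slice_nodes(bandwidth, 3)
print(slices)
```

With global knowledge the answer is a single sort; a distributed protocol has to approximate each node's rank without collecting every attribute value in one place.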
General Terms
We study clustering problems in the streaming model, where the goal is to cluster a set of points by making one pass (or a few passes) over the data using a small amount of storage space. Our main result is a randomized algorithm for the k-Median problem which produces a constant factor approximation in one pass using storage space O(k polylog n). This is a significant improvement of the previous best algorithm which yielded a 2^O(1/ε) approximation using O(n^ε) space. Next we give a streaming algorithm for the k-Median problem with an arbitrary distance function. We also study algorithms for clustering problems with outliers in the streaming model. Here, we give bicriterion guarantees, producing constant factor approximations by increasing the allowed fraction of outliers slightly.