Applying parallel computation algorithms in the design of serial algorithms
 J. ACM
, 1983
Cited by 232 (7 self)
Abstract. The goal of this paper is to point out that analyses of parallelism in computational problems have practical implications even when multiprocessor machines are not available. This is true because, in many cases, a good parallel algorithm for one problem may turn out to be useful for designing an efficient serial algorithm for another problem. A unified framework for cases like this is presented. Particular cases, which are discussed in this paper, provide motivation for examining parallelism in sorting, selection, minimum-spanning-tree, shortest route, max-flow, and matrix multiplication problems, as well as in scheduling and locational problems.
Coresets for weighted facilities and their applications
 In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS’06)
, 2006
Cited by 15 (7 self)
We develop efficient (1 + ε)-approximation algorithms for generalized facility location problems. Such facilities are not restricted to being points in R^d, and can represent more complex structures such as linear facilities (lines in R^d, j-dimensional flats), etc. We introduce coresets for weighted (point) facilities. These prove to be useful for such generalized facility location problems, and we provide efficient algorithms for their construction. Applications include k-mean and k-median generalizations, i.e., finding k lines that minimize the sum (or sum of squares) of the distances from each input point to its nearest line. Other applications are generalizations of linear regression problems to multiple regression lines, new SVD/PCA generalizations, and many more. The results significantly improve on previous work, which deals efficiently only with special cases. Open source code for the algorithms in this paper is also available.
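The k-lines objective described in this abstract can be made concrete with a small sketch. The code below only evaluates the cost of a candidate set of lines in the plane; it is not the paper's coreset construction or approximation algorithm. The representation of a line as coefficients (a, b, c) of a*x + b*y + c = 0 and the function names are assumptions made for illustration.

```python
import math

def point_line_distance(p, line):
    """Distance from point p = (x, y) to the line a*x + b*y + c = 0."""
    a, b, c = line
    x, y = p
    return abs(a * x + b * y + c) / math.hypot(a, b)

def k_lines_cost(points, lines, squared=False):
    """Sum (or sum of squares) of distances from each point to its nearest line."""
    total = 0.0
    for p in points:
        d = min(point_line_distance(p, line) for line in lines)
        total += d * d if squared else d
    return total

# Illustrative data (hypothetical): three points and the lines y = 0 and y = x.
points = [(0.0, 1.0), (2.0, -1.0), (3.0, 3.0)]
lines = [(0.0, 1.0, 0.0), (1.0, -1.0, 0.0)]
median_style_cost = k_lines_cost(points, lines)               # sum of distances
mean_style_cost = k_lines_cost(points, lines, squared=True)   # sum of squared distances
```

A coreset, in this setting, would be a small weighted point set on which such a cost function returns approximately the same value for every candidate set of k lines.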
Consecutive optimizers for a partitioning problem with applications to optimal inventory groupings for joint replenishment
 Operations Research
, 1985
Cited by 14 (5 self)
We consider several subclasses of the problem of grouping n items (indexed 1, 2, ..., n) into m subsets so as to minimize the function g(S_1, ..., S_m). In general, these problems are very difficult to solve to optimality, even for the case m = 2. We provide several sufficient conditions on g(·) that guarantee that there is an optimum partition in which each subset consists of consecutive integers (or else the partition S_1, ..., S_m satisfies a more general condition called "semiconsecutiveness"). Moreover, by restricting attention to "consecutive" (or "semiconsecutive") partitions, we can solve the partition problem in polynomial time for small values of m. If, in addition, g is symmetric, then the partition problem is solvable in purely polynomial time. We apply these results to generalizations of a problem in inventory groupings considered by the authors in a previous paper. We also relate the results to the Neyman-Pearson lemma in statistical hypothesis testing and to a graph partitioning problem of Barnes and Hoffman. Let a_1, ..., a_n and b_1, ..., b_n be real numbers ordered so that, for some integer 0 ≤ r ≤ n, b_1, ..., b_r are negative, b_{r+1}, ..., b_n are nonnegative, and the ratios a_1/b_1, ..., a_r/b_r and a_{r+1}/b_{r+1}, ..., a_n/b_n are ordered as in inequality (1). For b_i = 0, we consider a_i/b_i to be +∞ or −∞ according to a_i > 0 or a_i < 0. If a_i = b_i = 0, a_i/b_i is defined arbitrarily so that inequality (1) holds. Subject classification: 334 partitioning items into subgroups, 625 optimal inventory groupings.
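To see why restricting attention to consecutive partitions helps, note that a partition of 1..n into m consecutive blocks is determined by m - 1 cut points, so there are only C(n-1, m-1) candidates, which is polynomial in n for fixed m. The sketch below enumerates them by brute force; it is not the authors' algorithm, and the objective `sum_of_squared_block_sums` is a hypothetical stand-in for g that is not checked against the paper's sufficient conditions.

```python
from itertools import combinations

def consecutive_partitions(n, m):
    """Yield every partition of 1..n into m nonempty blocks of consecutive integers."""
    for cuts in combinations(range(1, n), m - 1):
        bounds = (0,) + cuts + (n,)
        yield [list(range(bounds[i] + 1, bounds[i + 1] + 1)) for i in range(m)]

def best_consecutive_partition(n, m, g):
    """Minimize g over the C(n-1, m-1) consecutive partitions by enumeration."""
    return min(consecutive_partitions(n, m), key=g)

def sum_of_squared_block_sums(partition):
    """Hypothetical objective g, used only to exercise the enumeration."""
    return sum(sum(block) ** 2 for block in partition)

candidates = list(consecutive_partitions(4, 2))
best = best_consecutive_partition(4, 2, sum_of_squared_block_sums)
```

For n = 4 and m = 2 the enumeration visits three candidates, one per cut position.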
A unified framework for approximating and clustering data
 Manuscript available at arXiv.org
, 2011
Cited by 13 (6 self)
Given a set F of n positive functions over a ground set X, we consider the problem of computing x* that minimizes the expression ∑_{f∈F} f(x), over x ∈ X. A typical application is shape fitting, where we wish to approximate a set P of n elements (say, points) by a shape x from a (possibly infinite) family X of shapes. Here, each point p ∈ P corresponds to a function f such that f(x) is the distance from p to x, and we seek a shape x that minimizes the sum of distances from each point in P. In the k-clustering variant, each x ∈ X is a tuple of k shapes, and f(x) is the distance from p to its closest shape in x. Our main result is a unified framework for constructing coresets and approximate clustering for such general sets of functions. To achieve our results, we forge a link between the classic and well defined notion of ε-approximations from the theory of PAC Learning and VC dimension, to the relatively new (and not so consistent) paradigm of coresets, which are some kind of "compressed representation" of the input set F. Using traditional techniques, a coreset usually implies an LTAS (linear time approximation scheme) for the corresponding optimization problem, which can be computed in parallel, via one pass over the data, and using only polylogarithmic space (i.e., in the streaming model). For several function families F for which coresets are known not to exist, or the corresponding (approximate) optimization problems are hard, our framework yields bicriteria approximations, or coresets that are large, but contained in a low-dimensional space. We demonstrate our unified framework by applying it on projective clustering problems. We obtain new coreset constructions and significantly smaller coresets, over the ones that
An Effective Coreset Compression Algorithm for Large Scale Sensor Networks
Cited by 3 (1 self)
The wide availability of networked sensors such as GPS and cameras is enabling the creation of sensor networks that generate huge amounts of data. For example, vehicular sensor networks where in-car GPS sensor probes are used to model and monitor traffic can generate on the order of gigabytes of data in real time. How can we compress streaming high-frequency data from distributed sensors? In this paper we construct coresets for streaming motion. The coreset of a data set is a small set which approximately represents the original data. Running queries or fitting models on the coreset will yield similar results when applied to the original data set. We present an algorithm for computing a small coreset of a large sensor data set. Surprisingly, the size of the coreset is independent of the size of the original data set. Combining map-and-reduce techniques with our coreset yields a system capable of compressing in parallel a stream of O(n) points using space and update time that is only O(log n). We provide experimental results and compare the algorithm to the popular Douglas-Peucker heuristic for compressing GPS data.
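For readers unfamiliar with the baseline named at the end of this abstract, the following is a minimal sketch of the standard recursive Douglas-Peucker polyline simplification. It is the comparison heuristic, not the paper's coreset algorithm; the function names, the 2D tuple representation of points, and the sample track are assumptions made for illustration.

```python
import math

def perpendicular_distance(p, a, b):
    """Distance from point p to the line through a and b (to a itself if a == b)."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)

def douglas_peucker(points, tolerance):
    """Simplify a polyline, keeping points that deviate more than tolerance."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point farthest from the chord first-last.
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    # Split at the farthest point and simplify both halves recursively.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right

# Hypothetical GPS-like track; the small wobble at (1, 0.05) is dropped.
track = [(0, 0), (1, 0.05), (2, 0), (3, 3), (4, 0)]
simplified = douglas_peucker(track, 0.1)
```

Unlike a coreset, the output of this heuristic grows with the shape of the input track and needs the whole polyline in memory, which is the contrast the abstract draws.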
Locating Least-Distant Lines in the Plane
, 1996
In this paper we deal with locating a line in the plane. If d is a distance measure, our objective is to find a straight line l which minimizes f(l) = ∑_{m=1}^{M} w_m d(Ex_m