Geometric approximation via coresets
- Combinatorial and Computational Geometry, MSRI, 2005
Cited by 47 (7 self)
The paradigm of coresets has recently emerged as a powerful tool for efficiently approximating various extent measures of a point set P. Using this paradigm, one quickly computes a small subset Q of P, called a coreset, that approximates the original set P, and then solves the problem on Q using a relatively inefficient algorithm. The solution for Q is then translated to an approximate solution to the original point set P. This paper describes the ways in which this paradigm has been successfully applied to various optimization and extent measure problems.
Coresets for k-Means and k-Median Clustering and their Applications
- In Proc. 36th Annu. ACM Sympos. Theory Comput., 2003
Cited by 41 (13 self)
In this paper, we show the existence of small coresets for the problems of computing k-median and k-means clustering for points in low dimension. In other words, we show that given a point set P in R^d, one can compute a weighted set S ⊆ P, of size O(k ε^{-d} log n), such that one can compute the k-median/means clustering on S instead of on P, and get a (1 + ε)-approximation.
Shape Fitting with Outliers
- SIAM J. Comput., 2003
Cited by 26 (11 self)
We present an algorithm that ε-approximates the extent between the top and bottom k levels of the arrangement of H in time O(n + (k/ε)^c), where c is a constant depending on d. The algorithm relies on computing a subset of H of size O(k/ε^{d-1}), in near linear time, such that the k-level of the arrangement of the subset approximates that of the original arrangement. Using this algorithm, we propose efficient approximation algorithms for shape fitting with outliers for various shapes. These are the first algorithms to handle outliers efficiently for the shape fitting problems considered.
Practical Methods for Shape Fitting and Kinetic Data Structures using Core Sets
- In Proc. 20th Annu. ACM Sympos. Comput. Geom., 2004
Cited by 26 (8 self)
The notion of ε-kernel was introduced by Agarwal et al. [5] to set up a unified framework for computing various extent measures of a point set P approximately. Roughly speaking, a subset Q ⊆ P is an ε-kernel of P if for every slab W containing Q, the expanded slab (1 + ε)W contains P. They illustrated the significance of ε-kernels by showing that they yield approximation algorithms for a wide range of geometric optimization problems. We present a simpler and more practical algorithm for computing the ε-kernel of a set P of points in R^d. We demonstrate the practicality of our algorithm by showing its empirical performance on various inputs. We then describe an incremental algorithm for fitting various shapes and use the ideas of our algorithm for computing ε-kernels to analyze the performance of this algorithm. We illustrate the versatility and practicality of this technique by implementing approximation algorithms for minimum enclosing cylinder, minimum-volume bounding box, and minimum-width annulus. Finally, we show that ε-kernels can be effectively used to expedite the algorithms for maintaining extents of moving points.
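The slab condition in the abstract above can be explored numerically. The following is a minimal sketch, not the paper's algorithm: it assumes 2D points, uses the closely related directional-width view (how much of P's width along a direction is retained by Q), and picks Q with a naive heuristic of extreme points in evenly spaced directions. All function names are illustrative.

```python
import math
import random

def directional_width(points, theta):
    """Width of the point set along the unit direction at angle theta."""
    ux, uy = math.cos(theta), math.sin(theta)
    proj = [x * ux + y * uy for x, y in points]
    return max(proj) - min(proj)

def extreme_point_subset(points, num_directions):
    """Heuristic candidate subset: extreme points in evenly spaced directions."""
    chosen = set()
    for i in range(num_directions):
        theta = math.pi * i / num_directions
        ux, uy = math.cos(theta), math.sin(theta)
        key = lambda p: p[0] * ux + p[1] * uy
        chosen.add(max(points, key=key))
        chosen.add(min(points, key=key))
    return sorted(chosen)

def worst_width_ratio(P, Q, num_test_directions=360):
    """Smallest width(Q) / width(P) over sampled directions (1.0 means no loss)."""
    ratios = []
    for i in range(num_test_directions):
        theta = math.pi * i / num_test_directions
        wp = directional_width(P, theta)
        if wp > 0:
            ratios.append(directional_width(Q, theta) / wp)
    return min(ratios)

random.seed(0)
# An elongated Gaussian cloud, chosen only for illustration.
P = [(random.gauss(0, 1), random.gauss(0, 0.2)) for _ in range(2000)]
Q = extreme_point_subset(P, num_directions=16)
ratio = worst_width_ratio(P, Q)
print(len(Q), ratio)
```

A ratio close to 1 in every sampled direction suggests that Q retains almost all of P's extent. The check covers only the sampled directions, so it is evidence rather than a proof of the ε-kernel property.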
Clustering Motion
- In Proc. 42nd Annu. IEEE Sympos. Found. Comput. Sci., 2003
Cited by 24 (4 self)
Given a set of moving points in R^d, we show how to cluster them in advance, using a small number of clusters, so that at any time this static clustering is competitive with the optimal k-center clustering at that time. The advantage of this approach is that it avoids updating the clustering as time passes. We also show how to maintain this static clustering efficiently under insertions and deletions.
Relative-Error CUR Matrix Decompositions
- SIAM J. Matrix Anal. Appl., 2008
Cited by 21 (7 self)
Many data analysis applications deal with large matrices and involve approximating the matrix using a small number of “components.” Typically, these components are linear combinations of the rows and columns of the matrix, and are thus difficult to interpret in terms of the original features of the input data. In this paper, we propose and study matrix approximations that are explicitly expressed in terms of a small number of columns and/or rows of the data matrix, and thereby more amenable to interpretation in terms of the original data. Our main algorithmic results are two randomized algorithms which take as input an m × n matrix A and a rank parameter k. In our first algorithm, C is chosen, and we let A′ = CC⁺A, where C⁺ is the Moore–Penrose generalized inverse of C. In our second algorithm C, U, R are chosen, and we let A′ = CUR. (C and R are matrices that consist of actual columns and rows, respectively, of A, and U is a generalized inverse of their intersection.) For each algorithm, we show that with probability at least 1 − δ, ‖A − A′‖_F ≤ (1 + ε) ‖A − A_k‖_F, where A_k is the “best” rank-k approximation provided by truncating the SVD of A, and where ‖X‖_F is the Frobenius norm of the matrix X. The number of columns of C and rows of R is a low-degree polynomial in k, 1/ε, and log(1/δ). Both the Numerical Linear Algebra community and the Theoretical Computer Science community have studied variants
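The first construction in the abstract above approximates A by projecting it onto the span of a few of its own columns, since C times its Moore–Penrose inverse is the orthogonal projector onto the column space of C. The following is a minimal sketch under stated assumptions: the columns are chosen by hand rather than by the paper's randomized sampling, the projector is built with Gram-Schmidt instead of an explicit pseudoinverse, and all helper names are illustrative.

```python
import math

def column(M, j):
    """Return column j of a matrix stored as a list of rows."""
    return [row[j] for row in M]

def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: orthonormal basis for the span of the given vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            coeff = sum(wi * qi for wi, qi in zip(w, q))
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > tol:  # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis

def project_onto_columns(A, col_indices):
    """Projection of A onto the span of the chosen columns of A."""
    m, n = len(A), len(A[0])
    basis = orthonormal_basis([column(A, j) for j in col_indices])
    A_proj = [[0.0] * n for _ in range(m)]
    for j in range(n):
        a = column(A, j)
        for q in basis:
            coeff = sum(ai * qi for ai, qi in zip(a, q))
            for i in range(m):
                A_proj[i][j] += coeff * q[i]
    return A_proj

def frobenius_error(A, B):
    """Frobenius norm of A - B."""
    return math.sqrt(sum((a - b) ** 2
                         for ra, rb in zip(A, B) for a, b in zip(ra, rb)))

# Rank-2 example: column 2 = col0 + col1, column 3 = 2*col0 - col1.
A = [
    [1.0, 0.0, 1.0, 2.0],
    [0.0, 1.0, 1.0, -1.0],
    [1.0, 1.0, 2.0, 1.0],
]
exact = project_onto_columns(A, [0, 1])   # two independent columns span A
lossy = project_onto_columns(A, [0])      # one column cannot capture rank 2
print(frobenius_error(A, exact), frobenius_error(A, lossy))
```

Because A has rank 2, two independent columns reproduce it up to rounding, while a single column leaves a clearly nonzero Frobenius error. This mirrors the role of the rank parameter k in the abstract, without reproducing its probabilistic guarantee.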
Adaptive spatial partitioning for multidimensional data streams
- In ISAAC, 2004
Cited by 17 (5 self)
We propose a space-efficient scheme for summarizing multidimensional data streams. Our sketch can be used to solve spatial versions of several classical data stream queries efficiently. For instance, we can track ε-hotspots, which are congruent boxes containing at least an ε fraction of the stream, and maintain hierarchical heavy hitters in d dimensions. Our sketch can also be viewed as a multidimensional generalization of the ε-approximate quantile summary. The space complexity of our scheme is O((1/ε) log R) if the points lie in the domain [0, R]^d, where d is assumed to be a constant. The scheme extends to the sliding window model with a log(εn) factor increase in space, where n is the size of the sliding window. Our sketch can also be used to answer ε-approximate rectangular range queries over a stream of d-dimensional points.
Adaptive sampling for geometric problems over data streams
- In Proc. 23rd ACM Sympos. Principles Database Syst., 2004
Cited by 15 (3 self)
Geometric coordinates are an integral part of many data streams. Examples include sensor locations in environmental monitoring, vehicle locations in traffic monitoring or battlefield simulations, scientific measurements of earth or atmospheric phenomena, etc. How can one summarize such data streams using limited storage so that many natural geometric queries can be answered faithfully? Some examples of such queries are: report the smallest convex region in which a chemical leak has been sensed, or track the diameter of the dataset. One can also pose queries over multiple streams: track the minimum distance between the convex hulls of two data streams; or report when datasets A and B are no longer linearly separable. In this paper, we propose an adaptive sampling scheme that gives provably optimal error bounds for extremal problems of this nature. All our results follow from a single technique for computing the approximate convex hull of a point stream in a single pass. Our main result is this: given a stream of two-dimensional points and an integer r, we can maintain an adaptive sample of at most 2r + 1 points such that the distance between the true convex hull and the convex hull of the sample points is O(D/r^2), where D is the diameter of the sample set. With our sample convex hull, all the queries mentioned above can be answered in either O(log r) or O(r) time.
Fast Algorithms for Computing the Smallest k-Enclosing Disc
- In Proc. 11th Annu. European Sympos. Algorithms, volume 2832 of Lect. Notes in Comp. Sci., 2003
Cited by 14 (3 self)
We consider the problem of finding, for a given n-point set P in the plane and an integer k ≤ n, the smallest circle enclosing at least k points of P. We present a randomized algorithm that computes such a circle in O(nk) expected time, improving over previously known algorithms.
Embeddings of surfaces, curves, and moving points in euclidean space
- In Proc. 23rd Annu. ACM Sympos. Comput. Geom., 2007
Cited by 13 (3 self)
In this paper we show that dimensionality reduction (i.e., the Johnson-Lindenstrauss lemma) preserves not only the distances between static points, but also between moving points, and more generally between low-dimensional flats, polynomial curves, curves with low winding degree, and polynomial surfaces. We also show that surfaces with bounded doubling dimension can be embedded into low dimension with small additive error. Finally, we show that for points with polynomial motion, the radius of the smallest enclosing ball can be preserved under dimensionality reduction.
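The static-point case mentioned in the abstract above can be illustrated with a random Gaussian projection, one standard way to realize Johnson-Lindenstrauss style dimensionality reduction. This is a rough sketch only: the dimensions, the point distribution, and the function names are assumptions made for illustration, and the moving-point, curve, and surface results of the paper are not reproduced.

```python
import math
import random

def gaussian_projection_matrix(k, d, rng):
    """k x d matrix of N(0, 1) entries scaled by 1/sqrt(k)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]

def project(matrix, point):
    """Apply the linear map given by the matrix to a single point."""
    return [sum(m * x for m, x in zip(row, point)) for row in matrix]

def distance(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

rng = random.Random(1)
d, k, n = 500, 100, 10   # source dimension, target dimension, number of points
points = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(n)]
R = gaussian_projection_matrix(k, d, rng)
projected = [project(R, p) for p in points]

# Ratio of projected distance to original distance for every pair of points.
ratios = [
    distance(projected[i], projected[j]) / distance(points[i], points[j])
    for i in range(n) for j in range(i + 1, n)
]
print(min(ratios), max(ratios))
```

The printed minimum and maximum ratios indicate how much pairwise distances were distorted by the projection. Increasing k should tighten them around 1, at the cost of a larger target dimension.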

