On approximating the depth and related problems
- In Proc. 16th ACM-SIAM Sympos. Discrete Algorithms, 2005
Cited by 54 (10 self)
We study the question of finding a deepest point in an arrangement of regions, and provide a fast algorithm for this problem using random sampling, showing it sufficient to solve this problem when the deepest point is shallow. This implies, among other results, a fast algorithm for solving linear programming with violations approximately. We also use this technique to approximate the disk covering the largest number of red points, while avoiding all the blue points, given two such sets in the plane. Similar techniques imply that approximate range counting queries have roughly the same time
On approximate range counting and depth
- In Proc. 23rd Annu. ACM Sympos. Comput. Geom, 2007
Cited by 18 (1 self)
We improve the previous results by Aronov and Har-Peled (SODA'05) and Kaplan and Sharir (SODA'06) and present a randomized data structure of O(n) expected size which can answer 3D approximate halfspace range counting queries in O(log(n/k)) expected time, where k is the actual value of the count. This is the first optimal method for the problem in the standard decision tree model; moreover, unlike previous methods, the new method is Las Vegas instead of Monte Carlo. In addition, we describe new results for several related problems, including approximate Tukey depth queries in 3D, approximate regression depth queries in 2D, and approximate linear programming with violations in low dimensions.
Categories and Subject Descriptors: F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Algorithms and Problems -- geometrical problems and computations
On approximate halfspace range counting and relative epsilon-approximations
- In Proc. 23rd Annu. ACM Sympos. Comput. Geom, 2007
Cited by 15 (7 self)
The paper consists of two major parts. In the first part, we re-examine relative ε-approximations, previously studied in [12, 13, 18, 25], and their relation to certain geometric problems, most notably to approximate range counting. We give a simple constructive proof of their existence in general range spaces with finite VC dimension, and of a sharp bound on their size, close to the best known one. We then give a construction of smaller-size relative ε-approximations for range spaces that involve points and halfspaces in two and higher dimensions. The planar construction is based on a new structure, spanning trees with small relative crossing number, which we believe to be of independent interest. In the second part, we consider the approximate halfspace range-counting problem in R^d with relative error ε, and show that relative ε-approximations, combined with the shallow partitioning data structures of Matoušek, yield efficient solutions to this problem. For example, one of our data structures requires linear storage and O(n^{1+δ}) preprocessing time, for any δ > 0, and answers a query in time O(ε^{−γ} n^{1−1/⌊d/2⌋} 2^{b log* n}), for any γ > 2/⌊d/2⌋; the choice of γ and δ affects b and the implied constants. Several variants and extensions are also discussed.
Bottom-k sketches: Better and more efficient estimation of aggregates
- In Proceedings of the ACM SIGMETRICS’07 Conference, 2007
Cited by 7 (5 self)
A bottom-k sketch is a summary of a set of items with nonnegative weights. Each such summary allows us to compute approximate aggregates over the set of items. Bottom-k sketches are obtained by associating with each item in a ground set an independent random rank drawn from a probability distribution that depends on the weight of the item. For each subset of interest, the bottom-k sketch is the set of the k minimum-ranked items and their ranks. Bottom-k sketches have numerous applications. We develop and analyze data structures and estimators for bottom-k sketches to facilitate their deployment. We develop novel estimators and algorithms that show that they are a superior alternative to other sketching methods in both efficiency of obtaining the sketches and the accuracy of the estimates derived from the sketches.
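As a rough illustration of the sketch described in this abstract, the following Python fragment builds a bottom-k sketch under simplifying assumptions that are not taken from the paper: every item has unit weight, ranks are uniform on [0, 1), and the size estimate is the textbook (k − 1)/r_k formula, where r_k is the k-th smallest rank. The function names `bottom_k_sketch` and `estimate_size` are hypothetical, and the paper's own estimators for weighted items are not reproduced here.

```python
import heapq
import random


def bottom_k_sketch(items, k, seed=0):
    """Return the k (rank, item) pairs with the smallest ranks, sorted by rank.

    Assumption: items are distinct and have unit weight, so each one gets an
    independent uniform rank in [0, 1).
    """
    rng = random.Random(seed)
    ranked = ((rng.random(), item) for item in items)
    return heapq.nsmallest(k, ranked)


def estimate_size(sketch, k):
    """Estimate how many items were summarized by the sketch."""
    if len(sketch) < k:
        # Fewer than k items were seen, so the sketch holds all of them.
        return float(len(sketch))
    kth_rank = sketch[-1][0]  # largest rank kept, i.e. the k-th smallest overall
    return (k - 1) / kth_rank


if __name__ == "__main__":
    sketch = bottom_k_sketch(range(10000), k=256, seed=1)
    print(estimate_size(sketch, 256))
```

The sketch keeps only k pairs regardless of input size, and the relative error of the size estimate shrinks roughly like 1/sqrt(k) under these assumptions.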
Range minima queries with respect to a random permutation, and approximate range counting
- Discrete Comput
Cited by 5 (4 self)
In approximate halfspace range counting, one is given a set P of n points in R^d, and an ε > 0, and the goal is to preprocess P into a data structure which can answer efficiently queries of the form: Given a halfspace h, compute an estimate N such that (1−ε)|P ∩ h| ≤ N ≤ (1+ε)|P ∩ h|. Several recent papers have addressed this problem, including a study by the authors [18], which is based, as is the present paper, on Cohen’s technique for approximate range counting [9]. In this approach, one chooses a small number of random permutations of P, and then constructs, for each permutation π, a data structure that answers efficiently minimum range queries: Given a query halfspace h, find the minimum-rank element (according to π) in P ∩ h. By repeating this process for all chosen permutations, the approximate count can be obtained, with high probability, using a certain averaging process over the minimum-rank outputs. In the previous study, the authors have constructed such a data structure in R^3, using a combinatorial result about the overlay of minimization diagrams in a randomized incremental construction of lower envelopes. In the present work, we propose an alternative approach to the range-minimum problem,
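The min-rank idea summarized in this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: each "permutation" is represented by i.i.d. exponential ranks (sorting by such ranks gives a uniformly random order), the minimum-rank query is answered by a brute-force scan rather than by the data structures the paper is about, and the averaging step uses the standard (m − 1)/(sum of minima) estimator for m independent rank assignments. The names `build_ranks` and `approx_count` are hypothetical.

```python
import random


def in_halfspace(point, normal, offset):
    """True if the point satisfies normal . point <= offset."""
    return sum(a * x for a, x in zip(normal, point)) <= offset


def build_ranks(num_points, num_rounds, seed=0):
    """One list of i.i.d. Exp(1) ranks per round (assumed rank distribution)."""
    rng = random.Random(seed)
    return [[rng.expovariate(1.0) for _ in range(num_points)]
            for _ in range(num_rounds)]


def approx_count(points, ranks, normal, offset):
    """Estimate |P ∩ h| from the minimum rank inside h in each round."""
    # Brute-force stand-in for the range-minimum data structure.
    inside = [i for i, p in enumerate(points) if in_halfspace(p, normal, offset)]
    if not inside:
        return 0.0
    # The minimum of c Exp(1) ranks is Exp(c), so the sum of m minima is
    # Gamma(m, c) and (m - 1) / sum is an unbiased estimate of c.
    total = sum(min(round_ranks[i] for i in inside) for round_ranks in ranks)
    return (len(ranks) - 1) / total


if __name__ == "__main__":
    rng = random.Random(1)
    pts = [(rng.random(), rng.random()) for _ in range(2000)]
    ranks = build_ranks(len(pts), 400, seed=2)
    print(approx_count(pts, ranks, (1.0, 0.0), 0.5))
```

Only the minimum-rank element per round is needed to form the estimate, which is why an efficient range-minimum structure translates into an efficient approximate counting structure.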
The overlay of minimization diagrams in a randomized incremental construction
- 2008
Cited by 3 (2 self)
In a randomized incremental construction of the minimization diagram of a collection of n hyperplanes in R^d, the hyperplanes are inserted one by one, in a random order, and the minimization diagram is updated after each insertion. We show that if we retain all the versions of the diagram, without removing any old feature that is now replaced by new features, the expected combinatorial complexity of the resulting overlay does not grow significantly. Specifically, this complexity is O(n^{⌊d/2⌋} log n), for d odd, and O(n^{⌊d/2⌋}), for d even. The bound is asymptotically tight in the worst case for d even, and we show that this is also the case for d = 3. Several implications of this bound, mainly its relation to approximate halfspace range counting, are also discussed.
Approximate Halfspace Range Counting
- 2008
Cited by 3 (3 self)
We present a simple scheme extending the shallow partitioning data structures of Matoušek, that supports efficient approximate halfspace range-counting queries in R^d with relative error ε. Specifically, the problem is, given a set P of n points in R^d, to preprocess them into a data structure that returns, for a query halfspace h, a number t so that (1−ε)|h ∩ P| ≤ t ≤ (1+ε)|h ∩ P|. One of our data structures requires linear storage and O(n^{1+δ}) preprocessing time, for any δ > 0, and answers a query in time O(ε^{−γ} n^{1−1/⌊d/2⌋} 2^{b log* n}), for any γ > 2/⌊d/2⌋; the choice of γ and δ affects b and the implied constants. Several variants and extensions are also discussed. As presented, the construction of the structure is mostly deterministic, except for one critical randomized step. The query efficiency is guaranteed with high probability, for all queries. The construction can also be fully derandomized, at the expense of increasing preprocessing time.
Relative ε-Approximations in Geometry
- 2007
Cited by 2 (0 self)
We re-examine relative ε-approximations, previously studied in [Pol86, Hau92, LLS01, CKMS06], and their relation to certain geometric problems. We give a simple constructive proof of their existence in general range spaces with finite VC-dimension, and of a sharp bound on their size, close to the best known one. We then give a construction of smaller-size relative ε-approximations for range spaces that involve points and halfspaces in two and higher dimensions. The planar construction is based on a new structure, spanning trees with small relative crossing number, which we believe to be of independent interest. We also consider applications of the new structures for approximate range counting and related problems.
A general approach for cache-oblivious range reporting and approximate range counting
- Computational Geometry: Theory and Applications
Cited by 2 (2 self)
We present cache-oblivious solutions to two important variants of range searching: range reporting and approximate range counting. The main contribution of our paper is a general approach for constructing cache-oblivious data structures that provide relative (1+ε)-approximations for a general class of range counting queries. This class includes three-sided range counting, 3-d dominance counting, and 3-d halfspace range counting. Our technique allows us to obtain data structures that use linear space and answer queries in the optimal query bound of O(log_B(N/K)) block transfers in the worst case, where K is the number of points in the query range. Using the same technique, we also obtain the first approximate 3-d halfspace range counting and 3-d dominance counting data structures with a worst-case query time of O(log(N/K)) in internal memory. An easy but important consequence of our main result is the existence of O(N log N)-space cache-oblivious data structures with an optimal query bound of O(log_B N + K/B) block transfers for the reporting versions of the above problems. Using standard reductions, these data structures allow us to obtain the first cache-oblivious data structures that use near-linear space and achieve the optimal query bound for circular range reporting and K-nearest neighbour searching in the plane, as well as for orthogonal range reporting in three dimensions. Part of this work was done while visiting Dalhousie University.
Leveraging discarded samples for tighter estimation of multiple-set aggregates
- In Proceedings of the ACM SIGMETRICS’09 Conference, 2009
Cited by 2 (1 self)
Many datasets such as market basket data, text or hypertext documents, and sensor observations recorded in different locations or time periods, are modeled as a collection of sets over a ground set of keys. We are interested in basic aggregates such as the weight or selectivity of keys that satisfy some selection predicate defined over keys' attributes and membership in particular sets. This general formulation includes basic aggregates such as the Jaccard coefficient, Hamming distance, and association rules. On massive data sets, exact computation can be inefficient or infeasible. Sketches based on coordinated random samples are classic summaries that support approximate query processing. Queries are resolved by generating a sketch (sample) of the union of sets used in the predicate from the sketches of these sets and then applying an estimator to this union-sketch. We derive novel tighter (unbiased) estimators that leverage sampled keys that are present in the union of applicable sketches but excluded from the union sketch. We establish analytically that our estimators dominate estimators applied to the union-sketch for all queries and data sets. Empirical evaluation on synthetic and real data reveals that on typical applications we can expect a 25% to 4-fold reduction in estimation error.
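The baseline that this abstract improves upon, an estimator applied to the union-sketch, might look roughly like the Python fragment below for the Jaccard coefficient of two sets. It is a sketch under assumptions not stated in the abstract: keys have unit weight, coordination is achieved by hashing each key to a shared rank with SHA-256, and each set is summarized by its k smallest-ranked keys. The helper names are hypothetical, and the tighter estimators derived in the paper, which also use sampled keys excluded from the union sketch, are not implemented here.

```python
import hashlib


def rank(key, seed="s"):
    """Shared pseudo-random rank in [0, 1); the same key gets the same rank in every set."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def bottom_k(keys, k):
    """Coordinated sketch of one set: its k smallest (rank, key) pairs."""
    return sorted((rank(x), x) for x in set(keys))[:k]


def union_sketch(sketch_a, sketch_b, k):
    """Sketch of the union of both sets, built only from the two sketches."""
    return sorted(set(sketch_a) | set(sketch_b))[:k]


def jaccard_estimate(sketch_a, sketch_b, k):
    """Fraction of union-sketch keys that appear in both input sketches."""
    union = union_sketch(sketch_a, sketch_b, k)
    if not union:
        return 0.0
    a, b = set(sketch_a), set(sketch_b)
    both = sum(1 for entry in union if entry in a and entry in b)
    return both / len(union)


if __name__ == "__main__":
    k = 512
    sa = bottom_k(range(0, 6000), k)
    sb = bottom_k(range(3000, 9000), k)
    print(jaccard_estimate(sa, sb, k))  # true Jaccard coefficient is 1/3
```

Because ranks are shared across sets, the k smallest-ranked keys of the union are always contained in the two input sketches, so the union sketch can be formed without revisiting the original data.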

