Results 1-10 of 126
Approximation Algorithms for Projective Clustering
 Proceedings of the ACM SIGMOD International Conference on Management of data, Philadelphia
, 2000
Cited by 246 (21 self)
We consider the following two instances of the projective clustering problem: Given a set S of n points in R^d and an integer k > 0, cover S by k hyperstrips (resp. hypercylinders) so that the maximum width of a hyperstrip (resp., the maximum diameter of a hypercylinder) is minimized. Let w be the smallest value so that S can be covered by k hyperstrips (resp. hypercylinders), each of width (resp. diameter) at most w. In the plane, the two problems are equivalent. It is NP-hard to compute k planar strips of width even at most Cw, for any constant C > 0 [50]. This paper contains four main results related to projective clustering: (i) For d = 2, we present a randomized algorithm that computes O(k log k) strips of width at most 6w that cover S. Its expected running time is O(n k^2 log^4 n) if k^2 log k ≤ n; it also works for larger values of k, but then the expected running time is O(n^(2/3) k^(8/3) log^4 n). We also propose another algorithm that computes a c...
An Introduction to Spatial Database Systems
 THE VLDB JOURNAL
, 1994
Cited by 170 (7 self)
We propose a definition of a spatial database system as a database system that offers spatial data types in its data model and query language, and supports
Fast Isocontouring for Improved Interactivity
 In Proceedings of 1996 Symposium on Volume Visualization
, 1996
Cited by 121 (31 self)
We present an isocontouring algorithm which is near-optimal for real-time interaction and modification of isovalues in large datasets. A preprocessing step selects a subset S of the cells which are considered as seed cells. Given a particular isovalue, all cells in S which intersect the given isocontour are extracted using a high-performance range search. Each connected component is swept out using a fast isocontour propagation algorithm. The computational complexity for the repeated action of seed point selection and isocontour propagation is O(log n' + k), where n' is the size of S and k is the size of the output. In the worst case, n' = O(n), where n is the number of cells, while in practical cases, n' is smaller than n by one to two orders of magnitude. The general case of seed set construction for a convex complex of cells is described, in addition to a specialized algorithm suitable for meshes of regular topology, including rectilinear and curvilinear meshes. Keyword...
Optimal Dynamic Interval Management in External Memory (Extended Abstract)
 IN PROC. IEEE SYMP. ON FOUNDATIONS OF COMP. SCI
, 1996
Cited by 84 (23 self)
We present a space- and I/O-optimal external-memory data structure for answering stabbing queries on a set of dynamically maintained intervals. Our data structure settles an open problem in databases and I/O algorithms by providing the first optimal external-memory solution to the dynamic interval management problem, which is a special case of 2-dimensional range searching and a central problem for object-oriented and temporal databases and for constraint logic programming. Our data structure simultaneously uses optimal linear space (that is, O(N/B) blocks of disk space) and achieves the optimal O(log_B N + T/B) I/O query bound and O(log_B N) I/O update bound, where B is the I/O block size and T the number of elements in the answer to a query. Our structure is also the first optimal external data structure for a 2-dimensional range searching problem that has worst-case as opposed to amortized update bounds. Part of the data structure uses a novel balancing technique for efficient worst-case manipulation of balanced trees, which is of independent interest.
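For readers unfamiliar with the term, a stabbing query asks for every stored interval that contains a given query point. The sketch below is only an in-memory linear-scan baseline that illustrates the query semantics; it is not the external-memory structure described in the abstract, and the names `Interval` and `stab` are hypothetical.

```python
from typing import List, Tuple

# Assumption: intervals are closed, [lo, hi], and held in main memory.
Interval = Tuple[float, float]


def stab(intervals: List[Interval], q: float) -> List[Interval]:
    """Report every interval containing q by scanning all N intervals.

    This costs O(N) per query; the abstract's structure instead answers
    the same question in O(log_B N + T/B) I/Os.
    """
    return [(lo, hi) for lo, hi in intervals if lo <= q <= hi]


if __name__ == "__main__":
    intervals = [(1, 5), (3, 8), (6, 9), (10, 12)]
    print(stab(intervals, 4))    # intervals stabbed by the point 4
    print(stab(intervals, 9.5))  # a point that falls in a gap
```

A dynamic version of this baseline would simply append to or delete from the list, which is why the scan cost, not the update cost, is what an index needs to improve on.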
On the Analysis of Indexing Schemes
 In Proc. 16th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems
, 1997
Cited by 76 (8 self)
We consider the problem of indexing general database workloads (combinations of data sets and sets of potential queries). We define a framework for measuring the efficiency of an indexing scheme for a workload based on two characterizations: storage redundancy (how many times each item in the data set is stored), and access overhead (how many times more blocks than necessary does a query retrieve). Using this framework we present some initial results, showing upper and lower bounds and tradeoffs between them in the case of multidimensional range queries and set queries. 1 Introduction The success and ubiquity of the relational data model arguably owes much to the B-tree, the access method breakthrough that accompanied it with superb timing [2]. It seems likely that access methods will continue to play an important role in, and largely determine the viability of, the novel data models currently under intense scrutiny in the database research community. The B-tree is widely recognized...
Geo-Relational Algebra: A Model and Query Language for Geometric Database Systems
 Int. Conf. on Extending Database Technology
, 1988
Cited by 69 (7 self)
The user's conceptual model of a database system for geometric data should be simple and precise: easy to learn and understand, with clearly defined semantics; expressive: allow to express with ease all desired query and data manipulation tasks; efficiently implementable. To achieve these goals we propose to extend relational database management systems by integrating geometry at all levels: At the conceptual level, relational algebra is extended to include geometric data types and operators. At the implementation level, the wealth of algorithms and data structures for geometric problems developed in the past decade in the field of Computational Geometry is exploited. The paper starts from a view of relational algebra as a many-sorted algebra which allows to easily embed geometric data types and operators. A concrete algebra for two-dimensional applications is developed. It can be used as a highly expressive retrieval and data manipulation language for geometric as well as standard...
Range Searching
, 1996
Cited by 69 (1 self)
Range searching is one of the central problems in computational geometry, because it arises in many applications and a wide variety of geometric problems can be formulated as a range-searching problem. A typical range-searching problem has the following form. Let S be a set of n points in R^d, and let R be a family of subsets; elements of R are called ranges. We wish to preprocess S into a data structure so that for a query range R, the points in S ∩ R can be reported or counted efficiently. Typical examples of ranges include rectangles, halfspaces, simplices, and balls. If we are only interested in answering a single query, it can be done in linear time, using linear space, by simply checking for each point p ∈ S whether p lies in the query range.
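The single-query linear scan mentioned at the end of this abstract can be sketched as follows for the rectangle (orthogonal box) case. This is an illustrative sketch only: the function names `in_box`, `report`, and `count` are hypothetical, and the box is assumed to be closed on all sides.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def in_box(p: Sequence[float], lo: Sequence[float], hi: Sequence[float]) -> bool:
    """True if lo[i] <= p[i] <= hi[i] holds in every coordinate i."""
    return all(l <= x <= h for x, l, h in zip(p, lo, hi))


def report(points: List[Point], lo: Sequence[float], hi: Sequence[float]) -> List[Point]:
    """Range reporting: list the points of S inside the query box, O(n d) time."""
    return [p for p in points if in_box(p, lo, hi)]


def count(points: List[Point], lo: Sequence[float], hi: Sequence[float]) -> int:
    """Range counting: number of points of S inside the query box."""
    return sum(1 for p in points if in_box(p, lo, hi))


if __name__ == "__main__":
    S = [(1, 1), (2, 5), (4, 3), (6, 2)]
    print(report(S, (1, 1), (4, 4)))
    print(count(S, (1, 1), (4, 4)))
```

Other range families (halfspaces, simplices, balls) would only change the membership test `in_box`; the scan itself is the same. Preprocessing pays off when many queries are asked against the same S.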
New data structures for orthogonal range searching
 In Proc. 41st IEEE Symposium on Foundations of Computer Science
, 2000
Stabbing the sky: Efficient skyline computation over sliding windows
 In ICDE
, 2005
Cited by 65 (6 self)
We consider the problem of efficiently computing the skyline against the most recent N elements in a data stream seen so far. Specifically, we study the n-of-N skyline queries; that is, computing the skyline for the most recent n (∀n ≤ N) elements. Firstly, we developed an effective pruning technique to minimize the number of elements to be kept. It can be shown that on average storing only O(log^d N) elements from the most recent N elements is sufficient to support the precise computation of all n-of-N skyline queries in a d-dimension space if the data distribution on each dimension is independent. Then, a novel encoding scheme is proposed, together with efficient update techniques, for the stored elements, so that computing an n-of-N skyline query in a d-dimension space takes O(log N + s) time that is reduced to O(d log log N + s) if the data distribution is independent, where s is the number of skyline points. Thirdly, a novel trigger-based technique is provided to process continuous n-of-N skyline queries with O(δ) time to update the current result per new data element and O(log s) time to update the trigger list per result change, where δ is the number of element changes from the current result to the new result. Finally, we extend our techniques to computing the skyline against an arbitrary window in the most recent N elements. Besides theoretical performance guarantees, our extensive experiments demonstrated that the new techniques can support online skyline query computation over very rapid data streams.
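To make the n-of-N query concrete, here is a naive baseline that keeps the last N elements and recomputes the skyline of the most recent n on demand. It does not implement the pruning, encoding, or trigger techniques from the abstract. The class name `NaiveWindowSkyline` is hypothetical, and the dominance convention (smaller is better in every dimension) is an assumption, since the abstract does not state one.

```python
from collections import deque
from typing import Deque, List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """Assumed convention: a dominates b if a is <= b in every
    dimension and strictly < in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


class NaiveWindowSkyline:
    """Keeps the most recent N stream elements; no pruning is applied."""

    def __init__(self, N: int) -> None:
        self.window: Deque[Point] = deque(maxlen=N)

    def append(self, p: Point) -> None:
        # The deque silently evicts the oldest element once N are stored.
        self.window.append(p)

    def skyline(self, n: int) -> List[Point]:
        """Skyline of the most recent n elements, by O(n^2) pairwise checks."""
        items = list(self.window)
        recent = items[max(0, len(items) - n):]
        return [p for p in recent if not any(dominates(q, p) for q in recent)]


if __name__ == "__main__":
    w = NaiveWindowSkyline(N=4)
    for p in [(1, 9), (5, 5), (3, 3), (4, 4), (2, 8)]:
        w.append(p)
    print(w.skyline(4))
    print(w.skyline(2))
```

Storing all N elements and rescanning per query is exactly the cost that the pruning and encoding steps in the abstract are meant to avoid.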
Lower bounds for orthogonal range searching: I. the reporting case
 Journal of the ACM
, 1990
Cited by 64 (4 self)
We establish lower bounds on the complexity of orthogonal range reporting in the static case. Given a collection of n points in d-space and a box [a_1, b_1] × ... × [a_d, b_d], report every point whose ith coordinate lies in [a_i, b_i], for each i = 1, ..., d. The collection of points is fixed once and for all and can be preprocessed. The box, on the other hand, constitutes a query that must be answered online. It is shown that on a pointer machine a query time of O(k + polylog(n)), where k is the number of points to be reported, can only be achieved at the expense of Ω(n (log n / log log n)^(d-1)) storage. Interestingly, these bounds are optimal in the pointer machine model, but they can be improved (ever so slightly) on a random access machine. In a companion paper, we address the related problem of adding up weights assigned to the points in the query box.