Multidimensional Access Methods
1998
Cited by 508 (3 self)
Search operations in databases require special support at the physical level. This is true for conventional databases as well as spatial databases, where typical search operations include the point query (find all objects that contain a given search point) and the region query (find all objects that overlap a given search region). More
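The two query types defined in this abstract can be illustrated with a minimal sketch. It assumes objects are axis-aligned rectangles with closed boundaries and answers both queries by linear scan, which is exactly the cost an access method is meant to avoid. The `Rect` type and the function names are hypothetical and do not come from the surveyed paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle with closed boundaries (hypothetical helper)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def contains_point(self, x, y):
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax

    def overlaps(self, other):
        return (self.xmin <= other.xmax and other.xmin <= self.xmax
                and self.ymin <= other.ymax and other.ymin <= self.ymax)


def point_query(objects, x, y):
    """Return the ids of all objects that contain the point (x, y)."""
    return [oid for oid, rect in objects.items() if rect.contains_point(x, y)]


def region_query(objects, region):
    """Return the ids of all objects that overlap the search region."""
    return [oid for oid, rect in objects.items() if rect.overlaps(region)]
```

A multidimensional access method would replace the linear scans with a structure that prunes most objects without inspecting them.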
Indexing the Positions of Continuously Moving Objects
2000
Cited by 282 (18 self)
The coming years will witness dramatic advances in wireless communications as well as positioning technologies. As a result, tracking the changing positions of objects capable of continuous movement is becoming increasingly feasible and necessary. The present paper proposes a novel, R*-tree based indexing technique that supports the efficient querying of the current and projected future positions of such moving objects. The technique is capable of indexing objects moving in one-, two-, and three-dimensional space. Update algorithms enable the index to accommodate a dynamic data set, where objects may appear and disappear, and where changes occur in the anticipated positions of existing objects. A comprehensive performance study is reported.
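To make "current and projected future positions" concrete, here is a small sketch of a query at a single time instant. It assumes each object moves linearly in two dimensions from a reference position with a constant velocity; that motion model, the tuple layout, and the function names are assumptions of this example, and the linear scan stands in for the index the paper proposes.

```python
def position_at(obj, t):
    """Project an object's position to time t.

    obj is ((x0, y0), (vx, vy), t_ref): a reference position, a constant
    velocity, and the time at which the reference position was recorded.
    """
    (x0, y0), (vx, vy), t_ref = obj
    dt = t - t_ref
    return (x0 + vx * dt, y0 + vy * dt)


def timeslice_query(objects, rect, t):
    """Return ids of objects whose projected position at time t lies in rect.

    rect is (xmin, ymin, xmax, ymax) with closed boundaries.
    """
    xmin, ymin, xmax, ymax = rect
    hits = []
    for oid, obj in objects.items():
        x, y = position_at(obj, t)
        if xmin <= x <= xmax and ymin <= y <= ymax:
            hits.append(oid)
    return hits
```

An update in this model simply replaces an object's reference position, velocity, and reference time.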
On Indexing Mobile Objects
1999
Cited by 187 (14 self)
We show how to index mobile objects in one and two dimensions using efficient dynamic external memory data structures. The problem is motivated by real-life applications in traffic monitoring, intelligent navigation and mobile communications domains. For the 1-dimensional case, we give (i) a dynamic, external memory algorithm with guaranteed worst-case performance and linear space and (ii) a practical approximation algorithm also in the dynamic, external memory setting, which has linear space and expected logarithmic query time. We also give an algorithm with guaranteed logarithmic query time for a restricted version of the problem. We present extensions of our techniques to two dimensions. In addition we give a lower bound on the number of I/Os needed to answer the d-dimensional problem. Initial experimental results and comparisons to traditional indexing approaches are also included. 1 Introduction Traditional database management systems assume that data stored in the database rem...
Generalized Search Trees for Database Systems
In Proc. 21st International Conference on VLDB, 1995
Cited by 186 (19 self)
This paper introduces the Generalized Search Tree (GiST), an index structure supporting an extensible set of queries and data types. The GiST allows new data types to be indexed in a manner supporting queries natural to the types; this is in contrast to previous work on tree extensibility which only supported the traditional set of equality and range predicates. In a single data structure, the GiST provides all the basic search tree logic required by a database system, thereby unifying disparate structures such as B+-trees and R-trees in a single piece of code, and opening the application of search trees to general extensibility. To illustrate the flexibility of the GiST, we provide simple method implementations that allow it to behave like a B+-tree, an R-tree, and an RD-tree, a new index for data with set-valued attributes. We also present a preliminary performance analysis of RD-trees, which leads to discussion on the nature of tree indices and how they behave for various datasets.
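The idea of one search routine serving several key types can be sketched as follows. This is a simplified illustration, not the paper's code: the method names `consistent` and `union` echo the GiST key methods, but the classes, the hand-built two-level tree, and the omission of insertion and node splitting are all assumptions of this example.

```python
class IntervalKey:
    """1-D range key, as a B+-tree-like instantiation might use."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def consistent(self, query):
        qlo, qhi = query
        return self.lo <= qhi and qlo <= self.hi

    @staticmethod
    def union(keys):
        return IntervalKey(min(k.lo for k in keys), max(k.hi for k in keys))


class RectKey:
    """2-D bounding-box key, as an R-tree-like instantiation might use."""

    def __init__(self, xmin, ymin, xmax, ymax):
        self.xmin, self.ymin, self.xmax, self.ymax = xmin, ymin, xmax, ymax

    def consistent(self, query):
        qx0, qy0, qx1, qy1 = query
        return (self.xmin <= qx1 and qx0 <= self.xmax
                and self.ymin <= qy1 and qy0 <= self.ymax)

    @staticmethod
    def union(keys):
        return RectKey(min(k.xmin for k in keys), min(k.ymin for k in keys),
                       max(k.xmax for k in keys), max(k.ymax for k in keys))


class Node:
    def __init__(self, entries, leaf):
        self.entries = entries  # list of (key, child node or stored value)
        self.leaf = leaf


def search(node, query):
    """Generic search: descend only into entries whose key is consistent."""
    results = []
    for key, ptr in node.entries:
        if key.consistent(query):
            if node.leaf:
                results.append(ptr)
            else:
                results.extend(search(ptr, query))
    return results


def build_two_level(key_cls, leaf_groups):
    """Hand-build a root over the given leaves; stands in for real insertion."""
    leaves = [Node(group, leaf=True) for group in leaf_groups]
    root_entries = [(key_cls.union([k for k, _ in leaf.entries]), leaf)
                    for leaf in leaves]
    return Node(root_entries, leaf=False)
```

The `search` function never inspects the key type; swapping `IntervalKey` for `RectKey` changes the behavior of the tree without touching the traversal logic.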
Top-k selection queries over relational databases: Mapping strategies and performance evaluation
TODS, 2002
Cited by 82 (6 self)
In many applications, users specify target values for certain attributes, without requiring exact matches to these values in return. Instead, the result to such queries is typically a rank of the “top k” tuples that best match the given attribute values. In this paper, we study the advantages and limitations of processing a top-k query by translating it into a single range query that a traditional relational database management system (RDBMS) can process efficiently. In particular, we study how to determine a range query to evaluate a top-k query by exploiting the statistics available to an RDBMS, and the impact of the quality of these statistics on the retrieval efficiency of the resulting scheme. We also report the first experimental evaluation of the mapping strategies over a real RDBMS, namely over Microsoft’s SQL Server 7.0. The experiments show that our new techniques are robust and significantly more efficient than previously known strategies requiring at least one sequential scan of the data sets.
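The mapping from a top-k query to a range query can be sketched roughly as below. The sketch assumes numeric attributes and a maximum-norm distance, so a box of half-width `r` around the target holds exactly the tuples within distance `r`. The starting radius is supplied by the caller here, whereas the paper derives the range from RDBMS statistics; the doubling step used when the range returns too few tuples is an assumption of this example, as are all function names.

```python
def max_dist(t, q):
    """Maximum-norm distance between tuple t and target q."""
    return max(abs(a - b) for a, b in zip(t, q))


def range_query(table, q, r):
    """Stand-in for an RDBMS range selection: q[i]-r <= t[i] <= q[i]+r."""
    return [t for t in table
            if all(qi - r <= ti <= qi + r for ti, qi in zip(t, q))]


def top_k(table, q, k, r):
    """Answer a top-k query with range queries of growing half-width r."""
    if r <= 0:
        raise ValueError("initial radius must be positive")
    k = min(k, len(table))
    while True:
        candidates = range_query(table, q, r)
        if len(candidates) >= k:
            # Every tuple outside the box is farther than r, so the k
            # closest candidates are the true top-k under max_dist.
            return sorted(candidates, key=lambda t: max_dist(t, q))[:k]
        r *= 2  # too few results: retry with a larger range
```

A well-chosen initial radius returns at least k tuples on the first try without fetching much more than k, which is why the quality of the statistics matters.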
On the Analysis of Indexing Schemes
In Proc. 16th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 1997
Cited by 70 (8 self)
We consider the problem of indexing general database workloads (combinations of data sets and sets of potential queries). We define a framework for measuring the efficiency of an indexing scheme for a workload based on two characterizations: storage redundancy (how many times each item in the data set is stored), and access overhead (how many times more blocks than necessary a query retrieves). Using this framework we present some initial results, showing upper and lower bounds and trade-offs between them in the case of multi-dimensional range queries and set queries. 1 Introduction The success and ubiquity of the relational data model arguably owes much to the B-tree, the access method breakthrough that accompanied it with superb timing [2]. It seems likely that access methods will continue to play an important role in, and largely determine the viability of, the novel data models currently under intense scrutiny in the database research community. The B-tree is widely recognized...
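The two measures named in this abstract can be computed for toy workloads as in the sketch below. It assumes an indexing scheme is a list of blocks (sets of items), takes redundancy as the average number of copies per item, and takes access overhead as the smallest number of blocks covering a non-empty query divided by the ideal count ceil(|Q| / B). The brute-force cover search is only practical for tiny examples, and the function names are hypothetical.

```python
from itertools import combinations
from math import ceil


def storage_redundancy(blocks, items):
    """Average number of times an item is stored across all blocks."""
    return sum(len(b) for b in blocks) / len(items)


def min_cover_size(blocks, query):
    """Smallest number of blocks whose union contains the query (brute force)."""
    for n in range(1, len(blocks) + 1):
        for combo in combinations(blocks, n):
            if query <= set().union(*combo):
                return n
    raise ValueError("query is not covered by the blocks")


def access_overhead(blocks, query, block_size):
    """Blocks actually needed divided by the ideal ceil(|Q| / B)."""
    ideal = ceil(len(query) / block_size)
    return min_cover_size(blocks, query) / ideal
```

Adding a block that stores the query's items together lowers the overhead for that query and raises the redundancy, which is the trade-off the framework quantifies.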
Indexing multi-dimensional uncertain data with arbitrary probability density functions
In Proc. VLDB, 2005
Cited by 69 (10 self)
In an “uncertain database”, an object o is associated with a multi-dimensional probability density function (pdf), which describes the likelihood that o appears at each position in the data space. A fundamental operation is the “probabilistic range search” which, given a value pq and a rectangular area rq, retrieves the objects that appear in rq with probabilities at least pq. In this paper, we propose the U-tree, an access method designed to optimize both the I/O and CPU time of range retrieval on multi-dimensional imprecise data. The new structure is fully dynamic (i.e., objects can be incrementally inserted/deleted in any order), and does not place any constraints on the data pdfs. We verify the query and update efficiency of U-trees with extensive experiments.
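A probabilistic range search can be sketched for one simple special case. The example assumes each object's pdf is uniform over an axis-aligned rectangle of positive area, so its appearance probability in rq is the overlapping area divided by the object's own area. The paper places no such restriction on the pdfs, and the linear scan here is not the U-tree; the function names are hypothetical.

```python
def overlap_area(a, b):
    """Area of the intersection of rectangles given as (xmin, ymin, xmax, ymax)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0.0) * max(h, 0.0)


def appearance_probability(region, rq):
    """Probability that an object uniform over region lies inside rq."""
    area = (region[2] - region[0]) * (region[3] - region[1])
    return overlap_area(region, rq) / area


def prob_range_search(objects, rq, pq):
    """Return ids of objects that appear in rq with probability at least pq."""
    return [oid for oid, region in objects.items()
            if appearance_probability(region, rq) >= pq]
```

For general pdfs the probability requires numerical integration per object, which is the CPU cost the abstract refers to.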
Efficient Indexing of Spatiotemporal Objects
2002
Cited by 54 (10 self)
Spatiotemporal objects, i.e., objects which change their position and/or extent over time, appear in many applications. In this paper we examine the problem of indexing large volumes of such data. Important in this environment is how the spatiotemporal objects move and/or change. We consider a rather general case where object movements/changes are defined by combinations of polynomial functions. We further concentrate on "snapshot" as well as small "interval" queries as these are quite common when examining the history of the gathered data. The obvious approach that approximates each spatiotemporal object by an MBR and uses a traditional multidimensional access method to index them is inefficient. Objects that "live" for long time intervals have large MBRs which introduce a lot of empty space. Clustering long intervals has been dealt with in temporal databases by the use of partially persistent indices. What differentiates this problem from traditional temporal indexing is that objects are allowed to move/change during their lifetime. Better ways are thus needed to approximate general spatiotemporal objects. One obvious solution is to introduce artificial splits: the lifetime of a long-lived object is split into smaller consecutive pieces. This decreases the empty space but increases the number of indexed MBRs. We first give an optimal algorithm and a heuristic for splitting a given spatiotemporal object into a predefined number of pieces. Then, given an upper bound on the total number of possible splits, we present three algorithms that decide how the splits are distributed among all the objects so that the total empty space is minimized. The number of splits cannot be increased indefinitely since the extra objects will eventually affect query performance. Usi...
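The effect of artificial splits on MBR size can be illustrated with a small calculation. The sketch assumes a point moving in one dimension, measures each MBR as an area in (time, position) space, and cuts the lifetime into equal-length pieces. Equal-length splitting is chosen only for simplicity and is not the optimal algorithm or the heuristic from the paper; the spatial extent is estimated by sampling, which is exact for monotone motion because both endpoints are sampled.

```python
def mbr_area(x, t0, t1, samples=100):
    """Area of the (time, position) MBR of trajectory x over [t0, t1]."""
    ts = [t0 + (t1 - t0) * i / samples for i in range(samples + 1)]
    xs = [x(t) for t in ts]
    return (t1 - t0) * (max(xs) - min(xs))


def total_area_after_split(x, t0, t1, pieces):
    """Sum of MBR areas when the lifetime is cut into equal-length pieces."""
    step = (t1 - t0) / pieces
    return sum(mbr_area(x, t0 + i * step, t0 + (i + 1) * step)
               for i in range(pieces))
```

For linear motion the total area shrinks in proportion to the number of pieces, while the number of indexed MBRs grows by the same factor.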
The Effect of Buffering on the Performance of R-Trees
1996
Cited by 52 (7 self)
Past R-tree studies have focused on the number of nodes visited as a metric of query performance. Since database systems usually include a buffering mechanism, we propose that the number of disk accesses is a more realistic measure of performance. We develop a buffer model to analyze the number of disk accesses required for spatial queries using R-trees. The model can be used to evaluate the quality of R-tree update operations, such as various node splitting and tree restructuring policies, as measured by query performance on the resulting tree. We use our model to study the performance of three well-known R-tree packing algorithms. We show that ignoring buffer behavior and using number of nodes accessed as a performance metric can lead to incorrect conclusions, not only quantitatively, but also qualitatively. In addition, we consider the problem of how many levels of the R-tree should be pinned in the buffer. This research was supported in part by the National Aeronautics and Space A...
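The gap between nodes visited and disk accesses can be shown with a small simulation. This sketch replays a sequence of node visits against a buffer and counts only the misses; the LRU replacement policy, the node names, and the function name are assumptions of this example, and a simulation like this is a stand-in for the analytical buffer model the abstract describes.

```python
from collections import OrderedDict


def count_disk_accesses(node_visits, buffer_size):
    """Replay node visits through an LRU buffer and count the misses."""
    buffer = OrderedDict()
    misses = 0
    for node in node_visits:
        if node in buffer:
            buffer.move_to_end(node)  # mark as most recently used
        else:
            misses += 1  # node must be fetched from disk
            buffer[node] = True
            if len(buffer) > buffer_size:
                buffer.popitem(last=False)  # evict least recently used
    return misses


# Three queries, each walking root -> internal node -> leaf.
visits = ["root", "n1", "leaf1", "root", "n1", "leaf2", "root", "n2", "leaf3"]
```

With this trace the node-visit count is 9 regardless of the buffer, while the number of disk accesses depends on how many pages the buffer holds.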

