Results 1-10 of 117
An Introduction to Spatial Database Systems
The VLDB Journal, 1994
Abstract

Cited by 176 (7 self)
We propose a definition of a spatial database system as a database system that offers spatial data types in its data model and query language, and supports
Index-driven similarity search in metric spaces
ACM Transactions on Database Systems, 2003
Abstract

Cited by 139 (6 self)
Similarity search is a very important operation in multimedia databases and other database applications involving complex objects, and involves finding objects in a data set S similar to a query object q, based on some similarity measure. In this article, we focus on methods for similarity search that make the general assumption that similarity is represented with a distance metric d. Existing methods for handling similarity search in this setting typically fall into one of two classes. The first directly indexes the objects based on distances (distance-based indexing), while the second is based on mapping to a vector space (mapping-based approach). The main part of this article is dedicated to a survey of distance-based indexing methods, but we also briefly outline how search occurs in mapping-based methods. We also present a general framework for performing search based on distances, and present algorithms for common types of queries that operate on an arbitrary “search hierarchy.” These algorithms can be applied on each of the methods presented, provided a suitable search hierarchy is defined.
Continuous Nearest Neighbor Search
2002
Abstract

Cited by 117 (10 self)
A continuous nearest neighbor query retrieves the nearest neighbor (NN) of every point on a line segment (e.g., "find all my nearest gas stations during my route from point s to point e"). The result contains a set of <point, interval> tuples, such that point is the NN of all points in the corresponding interval. Existing methods for continuous nearest neighbor search are based on the repetitive application of simple NN algorithms, which incurs significant overhead. In this paper we propose techniques that solve the problem by performing a single query for the whole input segment. As a result the cost, depending on the query and dataset characteristics, may drop by orders of magnitude.
Influence Sets Based on Reverse Nearest Neighbor Queries
In SIGMOD, 2000
Abstract

Cited by 112 (1 self)
Inherent in the operation of many decision support and continuous referral systems is the notion of the "influence" of a data point on the database. This notion arises in examples such as finding the set of customers affected by the opening of a new store outlet location, notifying the subset of subscribers to a digital library who will find a newly added document most relevant, etc. Standard approaches to determining the influence set of a data point involve range searching and nearest neighbor queries. In this paper, we formalize a novel notion of influence based on reverse neighbor queries and its variants. Since the nearest neighbor relation is not symmetric, the set of points that are closest to a query point (i.e., the nearest neighbors) differs from the set of points that have the query point as their nearest neighbor (called the reverse nearest neighbors). Influence sets based on reverse nearest neighbor (RNN) queries seem to capture the intuitive notion of influence from our ...
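The asymmetry between nearest neighbors and reverse nearest neighbors described in this abstract can be illustrated with a brute-force sketch. This is not the paper's method; the function names, the Euclidean metric, and the sample points are assumptions made for illustration only.

```python
import math

def nearest_neighbor(q, points):
    """Return the point in `points` closest to q (Euclidean distance)."""
    return min(points, key=lambda p: math.dist(p, q))

def reverse_nearest_neighbors(q, points):
    """Return every p in `points` for which q is at least as close as
    any other data point, i.e. q is a nearest neighbor of p."""
    result = []
    for i, p in enumerate(points):
        others = points[:i] + points[i + 1:]
        d_q = math.dist(p, q)
        if all(d_q <= math.dist(p, o) for o in others):
            result.append(p)
    return result

# Hypothetical data: two clustered points and one isolated point.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
q = (3.0, 0.0)

nn = nearest_neighbor(q, points)            # the point closest to q
rnn = reverse_nearest_neighbors(q, points)  # points that have q as their NN
```

With this data, the nearest neighbor of q is (1.0, 0.0), yet that point is closer to (0.0, 0.0) than to q, so it is not a reverse nearest neighbor. The isolated point (10.0, 0.0) is the only one whose nearest neighbor is q.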
Closest Pair Queries in Spatial Databases
In Proceedings of the ACM SIGMOD Conference on Management of Data, 2000
Abstract

Cited by 68 (9 self)
This paper addresses the problem of finding the K closest pairs between two spatial data sets, where each set is stored in a structure belonging in the R-tree family. Five different algorithms (four recursive and one iterative) are presented for solving this problem. The case of 1 closest pair is treated as a special case. An extensive study, based on experiments performed with synthetic as well as with real point data sets, is presented. A wide range of values for the basic parameters affecting the performance of the algorithms, especially the effect of overlap between the two data sets, is explored. Moreover, an algorithmic as well as an experimental comparison with existing incremental algorithms addressing the same problem is presented. In most settings, the new algorithms proposed clearly outperform the existing ones.
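To make the query itself concrete, the K closest pairs problem can be stated as a brute-force baseline that examines every cross pair. This is only a sketch of the problem definition, not any of the paper's R-tree algorithms; the function name, the Euclidean metric, and the sample data are assumptions.

```python
import heapq
import math
from itertools import product

def k_closest_pairs(A, B, k):
    """Return the k pairs (a, b), a in A and b in B, with the smallest
    Euclidean distance, in ascending order of distance.
    Brute force: inspects all len(A) * len(B) pairs."""
    return heapq.nsmallest(k, product(A, B), key=lambda pair: math.dist(*pair))

# Hypothetical point sets.
A = [(0.0, 0.0), (4.0, 0.0)]
B = [(1.0, 0.0), (10.0, 0.0)]

pairs = k_closest_pairs(A, B, 2)
```

Here the four cross distances are 1, 10, 3 and 6, so the two closest pairs are ((0.0, 0.0), (1.0, 0.0)) followed by ((4.0, 0.0), (1.0, 0.0)). The quadratic cost of this baseline is what index-based methods aim to avoid.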
Efficient Record Linkage in Large Data Sets
2003
Abstract

Cited by 60 (11 self)
This paper describes an efficient approach to record linkage. Given two lists of records, the record-linkage problem consists of determining all pairs that are similar to each other, where the overall similarity between two records is defined based on domain-specific similarities over individual attributes constituting the record. The record-linkage problem arises naturally in the context of data cleansing that usually precedes data analysis and mining. The scalability issue of record linkage has previously been addressed in [15], where the authors proposed a sorted neighborhood approach to improve performance. Since that original work, the repertoire of database techniques dealing with multidimensional data sets has significantly increased. Specifically, many effective and efficient approaches for distance-preserving transforms and similarity joins have been developed. Based on these advances, we explore a novel approach to record linkage. For each attribute of records, we first map values to a multidimensional Euclidean space that preserves domain-specific similarity. Many mapping algorithms can be applied, and we use the FastMap approach [12] as an example. Given the merging rule that defines when two records are similar based on their attribute-level similarities, a set of attributes is chosen along which the merge will proceed. A multidimensional similarity join (using the algorithm proposed by Hjaltason and Samet [16]) over the chosen attributes is used to determine similar pairs of records. Our extensive experiments using real data sets show that our solution has very good efficiency and accuracy.
Dynamic Queries over Mobile Objects
In EDBT, 2002
Abstract

Cited by 51 (2 self)
Increasingly, applications require the storage and retrieval of spatiotemporal information in a database management system. A type of such information is mobile objects, i.e., objects whose location changes continuously with time. Various techniques have been proposed to address problems of incorporating such objects in databases. In this paper, we introduce new query processing techniques for dynamic queries over mobile objects, i.e., queries that are themselves continuously changing with time. Dynamic queries are natural in situational awareness systems when an observer is navigating through space. All objects visible by the observer must be retrieved and presented to her at very high rates, to ensure a high-quality visualization. We show how our proposed techniques offer a great performance improvement over a traditional approach of multiple instantaneous queries.
Aggregate Nearest Neighbor Queries in Spatial Databases
TODS, 2005
Abstract

Cited by 36 (6 self)
Given two spatial datasets P (e.g., facilities) and Q (queries), an aggregate nearest neighbor (ANN) query retrieves the point(s) of P with the smallest aggregate distance(s) to points in Q. Assuming, for example, n users at locations q1, ..., qn, an ANN query outputs the facility p ∈ P that minimizes the sum of distances |pqi| for 1 ≤ i ≤ n that the users have to travel in order to meet there. Similarly, another ANN query may report the point p ∈ P that minimizes the maximum distance that any user has to travel, or the minimum distance from some user to his/her closest facility. If Q fits in memory and P is indexed by an R-tree, we develop algorithms for aggregate nearest neighbors that capture several versions of the problem, including weighted queries and incremental reporting of results. Then, we analyze their performance and propose cost models for query optimization. Finally, we extend our techniques for disk-resident queries and approximate ANN retrieval. The efficiency of the algorithms and the accuracy of the cost models are evaluated through extensive experiments with real and synthetic datasets.
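The three aggregate variants named in this abstract (sum, maximum, minimum) can be sketched with a linear scan over P. This is a brute-force illustration of the query semantics under assumed Euclidean distances, not the R-tree algorithms the paper develops; `aggregate_nn` and the sample data are hypothetical.

```python
import math

def aggregate_nn(P, Q, agg=sum):
    """Return the point of P minimizing agg(|pq| for q in Q).
    agg=sum gives the sum variant, agg=max the maximum-distance
    variant, agg=min the minimum-distance variant."""
    return min(P, key=lambda p: agg(math.dist(p, q) for q in Q))

# Hypothetical facilities P and user locations Q on a line.
P = [(1.0, 0.0), (6.0, 0.0), (10.0, 0.0)]
Q = [(0.0, 0.0), (2.0, 0.0), (10.5, 0.0)]

best_sum = aggregate_nn(P, Q, sum)  # minimizes total travel distance
best_max = aggregate_nn(P, Q, max)  # minimizes the farthest user's distance
best_min = aggregate_nn(P, Q, min)  # facility closest to some user
```

For this data the three variants pick three different facilities: (1.0, 0.0) for the sum, (6.0, 0.0) for the maximum, and (10.0, 0.0) for the minimum, which shows why the choice of aggregate function matters.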
Multiway Spatial Joins
ACM Transactions on Database Systems (TODS), 2001
Abstract

Cited by 35 (8 self)
Due to the evolution of Geographical Information Systems, large collections of spatial data having various thematic contents are currently available. As a result, the interest of users is not limited to simple spatial selections and joins, but complex query types that implicate numerous spatial inputs become more common. Although several algorithms have been proposed for computing the result of pairwise spatial joins, limited work exists on processing and optimization of multiway spatial joins. In this article, we review pairwise spatial join algorithms and show how they can be combined for multiple inputs. In addition, we explore the application of synchronous traversal (ST), a methodology that processes synchronously all inputs without producing intermediate results. Then, we integrate the two approaches in an engine that includes ST and pairwise algorithms, using dynamic programming to determine the optimal execution plan. The results show that, in most cases, multiway spatial joins are best processed by combining ST with pairwise methods. Finally, we study the optimization of very large queries by employing randomized search algorithms.
Group nearest neighbor queries
In ICDE, 2004
Abstract

Cited by 31 (2 self)
Given two sets of points P and Q, a group nearest neighbor (GNN) query retrieves the point(s) of P with the smallest sum of distances to all points in Q. Consider, for instance, three users at locations q1, q2 and q3 that want to find a meeting point (e.g., a restaurant); the corresponding query returns the data point p that minimizes the sum of Euclidean distances |pqi| for 1 ≤ i ≤ 3. Assuming that Q fits in memory and P is indexed by an R-tree, we propose several algorithms for finding the group nearest neighbors efficiently. As a second step, we extend our techniques for situations where Q cannot fit in memory, covering both indexed and non-indexed query points. An experimental evaluation identifies the best alternative based on the data and query properties.