Spatial Data Structures
, 1995
Cited by 273 (13 self)
An overview is presented of the use of spatial data structures in spatial databases. The focus is on hierarchical data structures, including a number of variants of quadtrees, which sort the data with respect to the space occupied by it. Such techniques are known as spatial indexing methods. Hierarchical data structures are based on the principle of recursive decomposition. They are attractive because they are compact and, depending on the nature of the data, they save space as well as time and also facilitate operations such as search. Examples are given of the use of these data structures in the representation of different data types such as regions, points, rectangles, lines, and volumes.
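As an illustration of the recursive decomposition principle mentioned in this abstract, the following is a minimal sketch of a region quadtree built over a small binary image. It assumes a square image whose side is a power of two; the function names `build_quadtree` and `count_leaves` are hypothetical and are not taken from the surveyed work.

```python
def build_quadtree(image, x=0, y=0, size=None):
    """Recursively decompose a square binary image into uniform blocks.

    Returns the pixel value for a uniform block (a leaf), or a list of
    four subtrees in the order NW, NE, SW, SE for a mixed block.
    Assumes len(image) is a power of two and every row has that length.
    """
    if size is None:
        size = len(image)
    first = image[y][x]
    uniform = all(
        image[r][c] == first
        for r in range(y, y + size)
        for c in range(x, x + size)
    )
    if uniform:
        return first  # leaf: the whole block has a single value
    half = size // 2
    return [
        build_quadtree(image, x, y, half),                # NW quadrant
        build_quadtree(image, x + half, y, half),         # NE quadrant
        build_quadtree(image, x, y + half, half),         # SW quadrant
        build_quadtree(image, x + half, y + half, half),  # SE quadrant
    ]


def count_leaves(node):
    """Count the leaf blocks of a tree produced by build_quadtree."""
    if not isinstance(node, list):
        return 1
    return sum(count_leaves(child) for child in node)
```

When large areas of the image are uniform, the tree has far fewer leaves than the image has pixels, which is the space saving the abstract refers to.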
PK-TREE: A SPATIAL INDEX STRUCTURE FOR HIGH DIMENSIONAL POINT DATA
Cited by 15 (1 self)
In this chapter we present the PK-tree, which is an index structure for high dimensional point data. The proposed indexing structure can be viewed as combining aspects of the PR-quad or K-D tree but where unnecessary nodes are eliminated. The unnecessary nodes are typically the result of skew in the point distribution, and we show that by eliminating these nodes the performance of the resulting index is robust to skewed data distributions. The index structure is formally defined and efficiently updatable, and bounds on the number of nodes and the mean height of the tree can be proved. Bounds on the expected height of the tree can be given under certain mild constraints on the spatial distribution of points. Empirical evidence both on real data sets and generated data sets shows that the PK-tree outperforms the recently proposed spatial indexes based on the R-tree, such as the SR-tree and X-tree, by a wide margin. It is also significant that the relative performance advantage of the PK-tree grows with the dimensionality of the data set.
Hashing by proximity to process duplicates in spatial databases
- In Proceedings of the 3rd International Conference on Information and Knowledge Management (CIKM
, 1994
Cited by 10 (6 self)
In a spatial database, an object may extend arbitrarily in space. As a result, many spatial data structures (e.g., the quadtree, the cell tree, the R+-tree) represent an object by partitioning it into multiple, yet simple, pieces, each of which is stored separately inside the data structure. Many operations on these data structures are likely to produce duplicate results because of the multiplicity of object pieces. A novel approach for duplicate processing based on proximity of spatial objects is presented. This is different from conventional duplicate elimination in database systems because, with spatial databases, different pieces of the same object can span multiple buckets of the underlying data structure. Example algorithms are presented to perform duplicate processing using proximity for a quadtree representation of line segments and arbitrary rectangles. The complexity of the algorithms is seen to depend on a geometric classification of different instances of the spatial objects. By using proximity and the spatial properties of the objects, the number of disk-I/O requests as well as the run-time storage during duplicate processing can be reduced.
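To make the duplicate problem concrete, here is a hedged sketch that uses a uniform grid of buckets in place of a quadtree and a generic reference-point rule. It is not the algorithm of the paper above: the bucket size `CELL`, the helper names, and the choice of the lower-left corner of the intersection as the reference point are all assumptions made for illustration. Each rectangle is stored in every bucket it overlaps, and a window query reports it only from the one bucket that contains the reference point, so no result set has to be kept in memory to filter duplicates.

```python
from collections import defaultdict

CELL = 10  # hypothetical bucket side length


def cells_overlapping(rect):
    """Yield the grid cells overlapped by rect = (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = rect
    for cx in range(int(x1 // CELL), int(x2 // CELL) + 1):
        for cy in range(int(y1 // CELL), int(y2 // CELL) + 1):
            yield (cx, cy)


def build_index(rects):
    """Store each object id in every bucket its rectangle overlaps."""
    index = defaultdict(list)
    for oid, rect in rects.items():
        for cell in cells_overlapping(rect):
            index[cell].append(oid)
    return index


def window_query(index, rects, window):
    """Report each rectangle intersecting the window exactly once."""
    wx1, wy1, wx2, wy2 = window
    results = []
    for cell in cells_overlapping(window):
        for oid in index.get(cell, []):
            x1, y1, x2, y2 = rects[oid]
            if x1 > wx2 or x2 < wx1 or y1 > wy2 or y2 < wy1:
                continue  # this object does not intersect the window
            # Reference point: lower-left corner of the intersection of
            # the object and the window. It lies in exactly one bucket.
            rx, ry = max(x1, wx1), max(y1, wy1)
            if (int(rx // CELL), int(ry // CELL)) == cell:
                results.append(oid)
    return results
```

The check uses only the geometry of the object and the window, which loosely mirrors the abstract's point that spatial properties can replace run-time storage during duplicate processing.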
An overview of the SAND Spatial Database System
- Communications of the ACM
, 2001
Cited by 5 (3 self)
An overview is given of the SAND spatial database system, an environment for developing applications involving both spatial and non-spatial data. The SAND kernel implements a relational data model extended with several geometric functions and predicates as well as a spatial index. The main interface to SAND is through an embedded interpreted language. This permits the rapid prototyping of algorithms and makes SAND a useful tool both for applications and research. A graphical user interface that allows for easy database querying, and a client/server approach that simplifies remote access are also outlined.
Adaptive context features for toponym resolution in streaming news
- In SIGIR’12: Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval
, 2012
Cited by 3 (3 self)
News sources around the world generate constant streams of information, but effective streaming news retrieval requires an intimate understanding of the geographic content of news. This process of understanding, known as geotagging, consists of first finding words in article text that correspond to location names (toponyms), and second, assigning each toponym its correct lat/long values. The latter step, called toponym resolution, can also be considered a classification problem, where each of the possible interpretations for each toponym is classified as correct or incorrect. Hence, techniques from supervised machine learning can be applied to improve accuracy. New classification features to improve toponym resolution, termed adaptive context features, are introduced that consider a window of context around each toponym, and use geographic attributes of toponyms in the window to aid in their correct resolution. Adaptive parameters controlling the window's breadth and depth afford flexibility in managing a tradeoff between feature computation speed and resolution accuracy, allowing the features to potentially apply to a variety of textual domains. Extensive experiments with three large datasets of streaming news demonstrate the new features' effectiveness over two widely-used competing methods.
An Optimal Resolution Sensitive Pyramid Representation for Hierarchical Memory Models
- Journal of Computing and Information
, 1994
Cited by 1 (0 self)
In this paper we propose and analyze a new hierarchical direct access data structure, namely the up-down pyramid, for storing and processing efficiently large sets of 2-dimensional data. We use as performance measure the cost of storage accesses on a hierarchical memory model with different access cost functions of theoretical and practical significance. We analyze, for our structure, the time complexity of the operation of retrieving the whole information associated to a datum, and prove that it is dependent only on the accessed resolution level without any overhead costs. Therefore, the up-down pyramid is an optimal representation for the considered model, with respect to other proposed direct access hierarchical structures. Namely we prove that, given a set of 2-dimensional data of size T×T, the up-down pyramid guarantees retrieving of information associated to location x in optimal time O(f(x)), where f(x) is the considered cost function in the hierarchical memory model, while in the...
A performance comparison of quadtree-based access methods for thematic maps
, 2000
Cited by 1 (0 self)
In this paper, the efficient manipulation of thematic maps that contain multiple non-overlapping features is investigated. New methods based on Linear quadtrees are proposed and their performance is compared to that of similar structures. More specifically, window queries involving multiple features are described and tested having the number of disk accesses as a performance measure. Experimentally, it is shown that the proposed methods have a stable behavior and, in general, outperform the previous structures with respect to time and space complexity. Keywords: spatial databases, region quadtrees, multiple features, superimposed bitstrings, window queries
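A linear quadtree stores only the leaf blocks, each identified by a locational code, instead of an explicit pointer-based tree. One common choice of code is the Morton (Z-order) key obtained by interleaving the bits of the block's coordinates. The sketch below shows that interleaving only; it is not the method proposed in the paper above, and the function names and the 16-bit default are assumptions.

```python
def morton_encode(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one Morton key.

    Bit i of x goes to position 2*i and bit i of y to position 2*i + 1,
    so sorting by key visits cells in Z-order.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def morton_decode(code, bits=16):
    """Recover (x, y) from a key produced by morton_encode."""
    x = y = 0
    for i in range(bits):
        x |= ((code >> (2 * i)) & 1) << i
        y |= ((code >> (2 * i + 1)) & 1) << i
    return x, y
```

Because the cells of any quadtree block occupy a contiguous range of keys, the leaves can be kept in an ordinary one-dimensional ordered index, and a window query becomes a set of key-range scans.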
Object-Based and Image-Based Object Representations
- ACM Computing Surveys
, 2004
Cited by 1 (0 self)
An overview is presented of object-based and image-based representations of objects by their interiors. The representations are distinguished by the manner in which they can be used to answer two fundamental queries in database applications: (1) Feature query: given an object, determine its constituent cells (i.e., their locations in space). (2) Location query: given a cell (i.e., a location in space), determine the identity of the object (or objects) of which it is a member as well as the remaining constituent cells of the object (or objects). Regardless of the representation that is used, the generation of responses to the feature and location queries is facilitated by building an index (i.e., the result of a sort) either on the objects or on their locations in space, and implementing it using an access structure that correlates the objects with the locations. Assuming the presence of an access structure, implicit (i.e., image-based) representations are described that are good for finding the objects associated with a particular location or cell (i.e., the location query), while requiring that all cells be examined when determining the locations associated with a particular object (i.e., the feature query). In contrast, explicit (i.e., object-based) representations are good for the feature query,
The New Casper: Query Processing for Location Services without Compromising Privacy
Cited by 1 (0 self)
This paper tackles a major privacy concern in current location-based services where users have to continuously report their locations to the database server in order to obtain the service. For example, a user asking about the nearest gas station has to report her exact location. With untrusted servers, reporting the location information may lead to several privacy threats. In this paper, we present Casper, a new framework in which mobile and stationary users can entertain location-based services without revealing their location information. Casper consists of two main components, the location anonymizer and the privacy-aware query processor. The location anonymizer blurs the users' exact location information into cloaked spatial regions based on user-specified privacy requirements. The privacy-aware query processor is embedded inside the location-based database server in order to deal with the cloaked spatial areas rather than the exact location information. Experimental results show that Casper achieves high quality location-based services while providing anonymity for both data and queries.
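The idea of blurring an exact location into a cloaked region can be sketched with a simple grid pyramid over the unit square. This is a simplified illustration under stated assumptions, not Casper's actual anonymizer: the only privacy requirement modeled is a hypothetical parameter `k` (the region must contain at least `k` users), and the names `cell_of`, `cloak`, and `height` are invented for the example.

```python
def cell_of(x, y, level):
    """Grid cell of point (x, y) in [0, 1]^2 at a pyramid level.

    Level h partitions the unit square into 2**h by 2**h cells.
    """
    n = 1 << level
    return (min(int(x * n), n - 1), min(int(y * n), n - 1))


def cloak(user, users, k, height=4):
    """Return the smallest pyramid cell around `user` holding >= k users.

    `users` maps a user id to an (x, y) location. The search starts at
    the finest level and moves up one level at a time. Returns the cell
    as (x1, y1, x2, y2), or None if even the root cell has fewer than
    k users.
    """
    ux, uy = users[user]
    for level in range(height, -1, -1):
        cell = cell_of(ux, uy, level)
        count = sum(
            1 for (x, y) in users.values() if cell_of(x, y, level) == cell
        )
        if count >= k:
            n = 1 << level
            cx, cy = cell
            return (cx / n, cy / n, (cx + 1) / n, (cy + 1) / n)
    return None
```

The server then receives only the returned rectangle, so a query such as the nearest gas station has to be answered for a region rather than a point, which is the role the abstract assigns to the privacy-aware query processor.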
Identification of live news events using Twitter
- In Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Location-Based Social Networks (LBSN '11)
, 2011
Cited by 1 (0 self)
Twitter presents a source of information that cannot easily be obtained anywhere else. However, though many posts on Twitter reveal up-to-the-minute information about events in the world or interesting sentiments, far more posts are of no interest to the general audience. A method to determine which Twitter users are posting reliable information and which posts are interesting is presented. Using this information a search through a large, online news corpus is conducted to discover future events before they occur along with information about the location of the event. These events can be identified with a high degree of accuracy by verifying that an event found in one news article is found in other similar news articles, since any event interesting to a general audience will likely have more than one news story written about it. Twitter posts near the time of the event can then be identified as interesting if they match the event in terms of keywords or location. This method enables the discovery of interesting posts about current and future events and helps in the identification of reliable users.

