Benchmarking Spatial Join Operations with Spatial Output
Proceedings of the 21st International Conference on Very Large Data Bases, 1998
Cited by 27 (6 self)
The spatial join operation is benchmarked using variants of well-known spatial data structures such as the R-tree, R*-tree, R+-tree, and the PMR quadtree. The focus is on a spatial join with spatial output because the result of the spatial join frequently serves as input to subsequent spatial operations (i.e., a cascaded spatial join as would be common in a spatial spreadsheet). Thus, in addition to the time required to perform the spatial join itself (whose output is not always required to be spatial), the time to build the spatial data structure also plays an important role in the benchmark. The studied quantities are the time to build the data structure and the time to do the spatial join in an application domain consisting of planar line segment data. Experiments reveal that spatial data structures based on a disjoint decomposition of space and bounding boxes (i.e., the R+-tree and the PMR quadtree with bounding boxes) outperform the other structures that are based upon ...
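As a rough illustration of what a spatial join over planar line segments computes, the following sketch pairs every segment of one set with every intersecting segment of another, using a bounding-box test as a cheap filter before the exact intersection test. It is a plain nested-loop baseline, not any of the indexed methods benchmarked in the paper; the names `Segment`, `mbr`, `boxes_overlap`, `segments_intersect`, and `spatial_join` are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Segment(NamedTuple):
    x1: float
    y1: float
    x2: float
    y2: float


Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def mbr(s: Segment) -> Box:
    """Minimum bounding rectangle of a segment."""
    return (min(s.x1, s.x2), min(s.y1, s.y2), max(s.x1, s.x2), max(s.y1, s.y2))


def boxes_overlap(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def _orient(px, py, qx, qy, rx, ry) -> int:
    """Sign of the cross product (q - p) x (r - p): 1, -1, or 0 if collinear."""
    v = (qx - px) * (ry - py) - (qy - py) * (rx - px)
    return (v > 0) - (v < 0)


def segments_intersect(s: Segment, t: Segment) -> bool:
    """Exact test (refinement step): do the two closed segments share a point?"""
    d1 = _orient(t.x1, t.y1, t.x2, t.y2, s.x1, s.y1)
    d2 = _orient(t.x1, t.y1, t.x2, t.y2, s.x2, s.y2)
    d3 = _orient(s.x1, s.y1, s.x2, s.y2, t.x1, t.y1)
    d4 = _orient(s.x1, s.y1, s.x2, s.y2, t.x2, t.y2)
    if d1 != d2 and d3 != d4:
        return True
    if d1 == 0 and d2 == 0 and d3 == 0 and d4 == 0:
        # All four endpoints are collinear: the segments meet exactly
        # when their bounding boxes overlap.
        return boxes_overlap(mbr(s), mbr(t))
    return False


def spatial_join(left: List[Segment], right: List[Segment]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) such that left[i] intersects right[j]."""
    right_boxes = [mbr(t) for t in right]
    result = []
    for i, s in enumerate(left):
        s_box = mbr(s)
        for j, t in enumerate(right):
            # Filter on bounding boxes first, then refine with the exact test.
            if boxes_overlap(s_box, right_boxes[j]) and segments_intersect(s, t):
                result.append((i, j))
    return result
```

An index such as an R-tree or a PMR quadtree would replace the inner loop with a search that visits only candidates whose bounding boxes can overlap; the filter-then-refine shape stays the same.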
A Performance Evaluation of Spatial Join Processing Strategies
Cited by 27 (6 self)
Performing a fair comparison of the various spatial join techniques that have been proposed in the past decade is a challenging task, since they have been validated through experimental evaluation with no common methodology, on different platforms, with various datasets and implementation choices. It is then even more difficult to provide guidelines for generating optimal plans for complex spatial queries involving several spatial joins (multi-way joins). The objective of this paper is twofold: (i) to propose a common framework and evaluation platform for spatial query processing, and (ii) to use it to experimentally evaluate spatial join processing strategies. We provide a first evaluation of query execution plans (QEPs) in the case of queries with one or two spatial joins. The QEPs assume R*-tree indexed relations and use a common set of spatial join algorithms, among which one is a novel extension of a strategy based on an on-the-fly index creation prior to the join with another...
Lightweight Structured Text Processing
In Proc. of USENIX 1999 Annual Technical Conference, 1999
Cited by 23 (7 self)
Text is a popular storage and distribution format for information, partly due to generic text-processing tools like Unix grep and sort. Unfortunately, existing generic tools make assumptions about text format (e.g., each line is a record) that limit their applicability. Custom-built tools are one alternative, but they require substantial time investment and programming expertise. We describe a new approach, lightweight structured text processing, which overcomes these difficulties by enabling users to define text structure interactively and manipulate the structure with generic tools. Our prototype system, LAPIS, is a web browser that can highlight, filter, and sort text regions described by the user. LAPIS has several advantages over other systems: (1) the ability to define custom structure with a simple, intuitive pattern language; (2) interactive specification, showing pattern matches in context and letting users choose the most convenient combination of manual selection and patter...
The 3W Model and Algebra for Unified Data Mining
2000
Cited by 19 (1 self)
Real data mining/analysis applications call for a framework which adequately supports knowledge discovery as a multi-step process, where the input of one mining operation can be the output of another. Previous studies, primarily focusing on fast computation of one specific mining task at a time, ignore this vital issue. Motivated by
Rapid and Accurate Contact Determination between Spline Models using ShellTrees
1998
Cited by 18 (3 self)
In this paper, we present an efficient algorithm for contact determination between spline models. We make use of a new hierarchy, called ShellTree, that comprises spherical shells and oriented bounding boxes. Each spherical shell corresponds to a portion of the volume between two concentric spheres. Given large spline models, our algorithm decomposes each surface into Bezier patches as part of preprocessing. At runtime it dynamically computes a tight-fitting axis-aligned bounding box across each Bezier patch and efficiently checks all such boxes for overlap. Using off-line and on-line techniques for tree construction, our algorithm computes ShellTrees for Bezier patches and performs fast overlap tests between them to detect collisions. The overall approach can trade off runtime performance for reduced memory requirements. We have implemented the algorithm and tested it on large models, each composed of hundreds of patches. Its performance varies with the configurations of the objects. For many complex models composed of hundreds of patches, it can accurately compute the contacts in a few milliseconds.
GBI: A Generalized R-Tree Bulk-Insertion Strategy
In Proceedings of International Symposium on Spatial Databases, 1999
Cited by 13 (0 self)
A lot of recent work has studied strategies related to bulk loading of large data sets into multidimensional index structures. In this paper, we address the problem of bulk insertions into existing index structures with particular focus on R-trees -- which are an important class of index structures used widely in commercial database systems. We propose a new technique, which, as opposed to the current technique of inserting data one by one, bulk inserts entire new incoming datasets into an active R-tree. This technique, called GBI (for Generalized Bulk Insertion), partitions the new datasets into sets of clusters and outliers, constructs an R-tree (small tree) from each cluster, identifies and prepares suitable locations in the original R-tree (large tree) for insertion, and lastly performs the insertions of the small trees and the outliers into the large tree in bulk. Our experimental studies demonstrate that GBI does especially well (over 200% better than the existing technique) fo...
Dynamic Granular Locking Approach to Phantom Protection in R-trees
In Proc. of ICDE, 1998
Cited by 12 (2 self)
Over the last decade, the R-tree has emerged as one of the most robust multidimensional access methods. However, before the R-tree can be integrated as an access method to a commercial strength database management system, efficient techniques to provide transactional access to data via R-trees need to be developed. Concurrent access to data through a multidimensional data structure introduces the problem of protecting ranges specified in the retrieval from phantom insertions and deletions (the phantom problem). Existing approaches to phantom protection in B-trees (namely, key-range locking) cannot be applied to multidimensional data structures since they rely on a total order over the key space on which the B-tree is designed. This paper presents a dynamic granular locking approach to phantom protection in R-trees. To the best of our knowledge, this paper provides the first solution to the phantom problem in multidimensional access methods based on granular locking.
R-Tree Index Optimization
In Sixth International Symposium on Spatial Data Handling, 1994
Cited by 11 (0 self)
The optimization of spatial indexing is an increasingly important issue considering the fact that spatial databases, in such diverse areas as geographical, CAD/CAM and image applications, are growing rapidly in size and often contain on the order of millions of items or more. This necessitates the storage of the index on disk, which has the potential of slowing down the access time significantly. In this paper, we discuss ways of minimizing the disk access frequency by grouping together data items which are close to one another in the spatial domain ("packing"). The data structure which we seek to optimize here is the R-tree for a given set of data objects.
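To make the idea of "packing" concrete, here is a deliberately simple sketch: rectangles are sorted by their center coordinates and cut into consecutive groups of a fixed leaf capacity, so that each leaf holds items that are near one another in the sort order. This is a stand-in chosen for brevity, not the grouping strategy studied in the paper; `pack_leaves`, `enclose`, and the `capacity` parameter are hypothetical names.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def center(r: Rect) -> Tuple[float, float]:
    """Center point of a rectangle, used as the sort key."""
    return ((r[0] + r[2]) / 2.0, (r[1] + r[3]) / 2.0)


def enclose(rects: List[Rect]) -> Rect:
    """Smallest rectangle covering all rectangles in a non-empty list."""
    return (
        min(r[0] for r in rects),
        min(r[1] for r in rects),
        max(r[2] for r in rects),
        max(r[3] for r in rects),
    )


def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])


def pack_leaves(rects: List[Rect], capacity: int) -> List[Tuple[Rect, List[Rect]]]:
    """Group rectangles into leaves of at most `capacity` entries.

    Assumption: sorting by (center x, center y) is a good enough proxy for
    spatial proximity; real packing schemes use more careful orderings.
    Returns a list of (leaf bounding rectangle, leaf entries).
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    ordered = sorted(rects, key=center)
    leaves = []
    for i in range(0, len(ordered), capacity):
        group = ordered[i:i + capacity]
        leaves.append((enclose(group), group))
    return leaves
```

Smaller leaf rectangles mean that a window query overlaps fewer leaves, and each leaf that is not visited is typically one disk page that is not read.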
Searching Near-Replicas of Images via Clustering
Proc. SPIE Symposium of Voice, Video, and Data Communications, 1999
Cited by 10 (2 self)
Internet piracy has been one of the major concerns for Web publishing. In this study we present a system, RIME, that we have prototyped for detecting unauthorized image copying on the World-Wide Web. To speed up the copy detection, RIME uses a new clustering/hashing approach that first clusters similar images on adjacent disk cylinders and then builds indexes to access the clusters made in this way. Searching for the replicas of an image often takes just one IO to look up the location of the cluster containing similar objects and one sequential file IO to read in this cluster. Our experimental results show that RIME can detect image copies both more efficiently and effectively than the traditional content-based image retrieval systems that use tree-like structures to index images. In addition, RIME copes well with image format conversion, resampling, requantization and geometric transformations. Keywords: clustering, copy detection, multidimensional indexes, similarity search.
A Feature-Based Image Retrieval Database for the Fashion, Textile, and Clothing Industry in Hong Kong
Cited by 6 (2 self)
We present an image database system for the fashion, textile, and clothing industry in Hong Kong supporting feature-based retrieval by color histogram, color sketch, shape, and texture. The Query-by-color-histogram method uses feature vectors extracted from the color distribution of an image for retrieval. The Query-by-color-sketch method makes use of the regionalized color information of the image for retrieval. The Query-by-shape method uses a two-stage polygon representation for shape searching. The Query-by-texture method uses statistical texture analysis to compute textural features for texture matching. One important feature of our system is the Open Architecture design. This design allows the system to be extensible, maintainable, and flexible. There are two aspects of this open architecture: (1) Open DataBase Connectivity (ODBC) and (2) plug-in framework. Moreover, we describe a server/client system design enabling internet access to the system. Based on our system design, we demo...
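The query-by-color-histogram idea can be sketched in a few lines. The abstract does not say how colors are quantized or how feature vectors are compared, so the choices below are assumptions made for illustration: a uniform quantization of each RGB channel into `bins_per_channel` levels, and histogram intersection as the similarity measure. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (r, g, b), each channel in 0..255


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Normalized color histogram used as the image's feature vector.

    Assumption: uniform quantization per channel; the vector has
    bins_per_channel ** 3 entries that sum to 1.
    """
    if not pixels:
        raise ValueError("image has no pixels")
    b = bins_per_channel
    hist = [0.0] * (b ** 3)
    for r, g, bl in pixels:
        # Map each 0..255 channel value to a bin index in 0..b-1.
        idx = ((r * b // 256) * b + (g * b // 256)) * b + (bl * b // 256)
        hist[idx] += 1.0
    n = float(len(pixels))
    return [count / n for count in hist]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in [0, 1]: 1 for identical distributions, 0 for disjoint ones."""
    return sum(min(a, c) for a, c in zip(h1, h2))


def rank_by_color(query: Sequence[Pixel], database: dict) -> List[str]:
    """Return image names from `database` ordered from most to least similar."""
    q = color_histogram(query)
    scores = {name: histogram_intersection(q, color_histogram(px))
              for name, px in database.items()}
    return sorted(scores, key=lambda name: scores[name], reverse=True)
```

In a real system the feature vectors would be computed once at insertion time and stored alongside the images, so that a query only computes one histogram and compares it against stored vectors.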

