Results 1 - 10 of 37
Multidimensional Access Methods
, 1998
Cited by 561 (3 self)
Search operations in databases require special support at the physical level. This is true for conventional databases as well as spatial databases, where typical search operations include the point query (find all objects that contain a given search point) and the region query (find all objects that overlap a given search region). More
Fast Subsequence Matching in Time-Series Databases
 SIGMOD 94
, 1994
Cited by 430 (21 self)
We present an efficient indexing method to locate 1-dimensional subsequences within a collection of sequences, such that the subsequences match a given (query) pattern within a specified tolerance. The idea is to map each data sequence into a small set of multidimensional rectangles in feature space. Then, these rectangles can be readily indexed using traditional spatial access methods, like the R*-tree [9]. In more detail, we use a sliding window over the data sequence and extract its features; the result is a trail in feature space. We propose an efficient and effective algorithm to divide such trails into sub-trails, which are subsequently represented by their Minimum Bounding Rectangles (MBRs). We also examine queries of varying lengths, and we show how to handle each case efficiently. We implemented our method and carried out experiments on synthetic and real data (stock price movements). We compared the method to sequential scanning, which is the only obvious competitor. The results were excellent: our method accelerated the search time from 3 times up to 100 times.
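The sliding-window idea in this abstract can be illustrated with a minimal sketch. It assumes the features are the first few DFT coefficients of each window and, as a simplification, cuts the trail into fixed-size sub-trails; the paper proposes its own division algorithm, which is not reproduced here. The function names (`window_features`, `trail`, `mbrs`) are hypothetical.

```python
import cmath
import math

def window_features(window, n_coeffs=2):
    """First n_coeffs DFT coefficients of one window, as a flat list of real and imaginary parts."""
    n = len(window)
    feats = []
    for k in range(n_coeffs):
        c = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(window))
        c /= math.sqrt(n)  # unitary normalisation, so distances are preserved
        feats.extend((c.real, c.imag))
    return feats

def trail(sequence, w, n_coeffs=2):
    """Slide a window of length w over the sequence; each position yields one feature-space point."""
    return [window_features(sequence[i:i + w], n_coeffs)
            for i in range(len(sequence) - w + 1)]

def mbrs(points, chunk):
    """Assumption: fixed-size sub-trails of `chunk` consecutive points, each summarised by (low, high) corners."""
    boxes = []
    for start in range(0, len(points), chunk):
        group = points[start:start + chunk]
        low = [min(col) for col in zip(*group)]
        high = [max(col) for col in zip(*group)]
        boxes.append((low, high))
    return boxes
```

The resulting boxes are what a spatial access method such as an R*-tree would then index; a query window maps to a point in the same feature space and is matched against the boxes within the given tolerance.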
Efficient and Effective Querying by Image Content
 Journal of Intelligent Information Systems
, 1994
Cited by 429 (12 self)
In the QBIC (Query By Image Content) project we are studying methods to query large online image databases using the images' content as the basis of the queries. Examples of the content we use include color, texture, and shape of image objects and regions. Potential applications include medical ("Give me other images that contain a tumor with a texture like this one"), photojournalism ("Give me images that have blue at the top and red at the bottom"), and many others in art, fashion, cataloging, retailing, and industry. We describe a set of novel features and similarity measures allowing query by color, texture, and shape of image object. We demonstrate the effectiveness of the QBIC system with normalized precision and recall experiments on test databases containing over 1000 images and 1000 objects populated from commercially available photo clip art images, and of images of airplane silhouettes. We also consider the efficient indexing of these features, specifically addre...
Efficient similarity search in sequence databases
, 1994
Cited by 415 (20 self)
We propose an indexing method for time sequences for processing similarity queries. We use the Discrete Fourier Transform (DFT) to map time sequences to the frequency domain, the crucial observation being that, for most sequences of practical interest, only the first few frequencies are strong. Another important observation is Parseval's theorem, which specifies that the Fourier transform preserves the Euclidean distance in the time or frequency domain. Having thus mapped sequences to a lower-dimensionality space by using only the first few Fourier coefficients, we use R-trees to index the sequences and efficiently answer similarity queries. We provide experimental results which show that our method is superior to search based on sequential scanning. Our experiments show that a few coefficients (1-3) are adequate to provide good performance. The performance gain of our method increases with the number and length of sequences.
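The role of Parseval's theorem here can be checked numerically. The sketch below assumes a unitary DFT (scaled by 1/sqrt(n)); under that assumption the distance computed on the first k coefficients never exceeds the true Euclidean distance, so filtering on the truncated features cannot dismiss a true match. The helper names are hypothetical, and the R-tree itself is omitted.

```python
import cmath
import math

def dft(x):
    """Unitary DFT of a real sequence (assumption: 1/sqrt(n) scaling)."""
    n = len(x)
    return [sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(x)) / math.sqrt(n)
            for k in range(n)]

def euclid(a, b):
    """Euclidean distance; works for real or complex coordinates."""
    return math.sqrt(sum(abs(p - q) ** 2 for p, q in zip(a, b)))

def feature_distance(x, y, k=3):
    """Distance using only the first k DFT coefficients: a lower bound on euclid(x, y)."""
    return euclid(dft(x)[:k], dft(y)[:k])

def candidates(query, sequences, eps, k=3):
    """Filter step: keep every sequence whose feature distance is within eps (may include false alarms)."""
    return [s for s in sequences if feature_distance(query, s, k) <= eps]
```

Candidates returned by the filter step still need a check against the full sequences, since the truncated distance can admit false alarms.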
On Packing R-trees
 In ACM CIKM
, 1993
Cited by 220 (16 self)
- main idea; file structure
- algorithms: insertion/split
- deletion
- search: range, nn, spatial joins
- performance analysis
- variations (packed; Hilbert; ...)
15-721 Copyright: C. Faloutsos (2001)
Problem
• Given a collection of geometric objects (points, lines, polygons, ...)
• organize them on disk, to answer spatial queries (range, nn, etc.)
(Who cares?)
The TV-tree: an index structure for high-dimensional data
 VLDB Journal
, 1994
Cited by 193 (7 self)
We propose a file structure to index high-dimensionality data, typically, points in some feature space. The idea is to use only a few of the features, utilizing additional features whenever the additional discriminatory power is absolutely necessary. We present in detail the design of our tree structure and the associated algorithms that handle such `varying length' feature vectors. Finally we report simulation results, comparing the proposed structure with the R*-tree, which is one of the most successful methods for low-dimensionality spaces. The results illustrate the superiority of our method, with up to 80% savings in disk accesses. Type of Contribution: New Index Structure, for high-dimensionality feature spaces. Algorithms and performance measurements. Keywords: Spatial Index, Similarity Retrieval, Query by Content 1 Introduction Many applications require enhanced indexing, capable of performing similarity searching on several, non-traditional (`exotic') data types. The targ...
Hilbert R-tree: An improved R-tree using fractals
, 1994
Cited by 184 (9 self)
We propose a new R-tree structure that outperforms all the older ones. The heart of the idea is to facilitate the deferred splitting approach in R-trees. This is done by proposing an ordering on the R-tree nodes. This ordering has to be 'good', in the sense that it should group 'similar' data rectangles together, to minimize the area and perimeter of the resulting minimum bounding rectangles (MBRs). Following [19] we have chosen the so-called '2D-c' method, which sorts rectangles according to the Hilbert value of the center of the rectangles. Given the ordering, every node has a well-defined set of sibling nodes; thus, we can use deferred splitting. By adjusting the split policy, the Hilbert R-tree can achieve as high utilization as desired. To the contrary, the R*-tree has no control over the space utilization, typically achieving up to 70%. We designed the manipulation algorithms in detail, and we did a full implementation of the Hilbert R-tree. Our experiments show that the '2to...
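The ordering step described in this abstract (sort rectangles by the Hilbert value of their centers) can be sketched as follows. This is only an illustration under stated assumptions: rectangles are `(xmin, ymin, xmax, ymax)` tuples with integer coordinates on a `2**order` by `2**order` grid, and leaves are filled by static packing. The deferred-splitting insertion algorithms of the Hilbert R-tree are not shown, and the function names are hypothetical.

```python
def hilbert_index(order, x, y):
    """Position of grid cell (x, y) along a Hilbert curve covering a 2**order by 2**order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d

def center_key(rect, order):
    """Hilbert value of the rectangle's center (integer division keeps it on the grid)."""
    xmin, ymin, xmax, ymax = rect
    return hilbert_index(order, (xmin + xmax) // 2, (ymin + ymax) // 2)

def pack_leaves(rects, order, capacity):
    """Sort by Hilbert value of the center, then fill leaves of `capacity` entries and compute each MBR."""
    ordered = sorted(rects, key=lambda r: center_key(r, order))
    leaves = []
    for i in range(0, len(ordered), capacity):
        group = ordered[i:i + capacity]
        mbr = (min(r[0] for r in group), min(r[1] for r in group),
               max(r[2] for r in group), max(r[3] for r in group))
        leaves.append((mbr, group))
    return leaves
```

Because neighbours in the sorted order tend to be close in space, consecutive rectangles usually produce compact MBRs, which is the property the abstract calls a 'good' ordering.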
Beyond uniformity and independence: Analysis of R-trees using the concept of fractal dimension
 In Proc. PODS
, 1994
Cited by 154 (19 self)
We propose the concept of fractal dimension of a set of points, in order to quantify the deviation from the uniformity distribution. Using measurements on real data sets (road intersections of U.S. counties, star coordinates from NASA’s Infrared-Ultraviolet Explorer etc.) we provide evidence that real data indeed are skewed, and, moreover, we show that they behave as mathematical fractals, with a measurable, non-integer fractal dimension. Armed with this tool, we then show its practical use in predicting the performance of spatial access methods, and specifically of the R-trees. We provide the first analysis of R-trees for skewed distributions of points: We develop a formula that estimates the number of disk accesses for range queries, given only the fractal dimension of the point set, and its count. Experiments on real data sets show that the formula is very accurate: the relative error is usually below 5%, and it rarely exceeds 10%. We believe that the fractal dimension will help replace the uniformity and independence assumptions, allowing more accurate analysis for any spatial access method, as well as better estimates for query optimization on multi-attribute queries.
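One common way to measure a fractal dimension of a point set is box counting: overlay grids of increasing resolution, count the occupied cells, and fit the slope of log(count) against log(resolution). The sketch below is a generic box-counting estimate, assuming 2-D points inside the unit square; it is not the paper's measurement procedure or its disk-access formula, and the function names are hypothetical.

```python
import math

def box_count(points, cells_per_side):
    """Number of grid cells (cells_per_side per axis) that contain at least one point."""
    k = cells_per_side
    return len({(min(int(x * k), k - 1), min(int(y * k), k - 1)) for x, y in points})

def fractal_dimension(points, levels=(2, 4, 8, 16, 32)):
    """Least-squares slope of log(occupied cells) versus log(cells per side)."""
    xs = [math.log(k) for k in levels]
    ys = [math.log(box_count(points, k)) for k in levels]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum((a - mx) ** 2 for a in xs)
```

Points spread along a line give an estimate near 1 and points filling the square give an estimate near 2; skewed real data would be expected to fall in between, which is the deviation from uniformity the abstract refers to.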
Analysis of the clustering properties of the Hilbert space-filling curve
 IEEE Transactions on Knowledge and Data Engineering
, 2001
Cited by 141 (10 self)
Several schemes for the linear mapping of a multidimensional space have been proposed for various applications, such as access methods for spatio-temporal databases and image compression. In these applications, one of the most desired properties from such linear mappings is clustering, which means that the locality between objects in the multidimensional space is preserved in the linear space. It is widely believed that the Hilbert space-filling curve achieves the best clustering [1], [14]. In this paper, we analyze the clustering property of the Hilbert space-filling curve by deriving closed-form formulas for the number of clusters in a given query region of an arbitrary shape (e.g., polygons and polyhedra). Both the asymptotic solution for the general case and the exact solution for a special case generalize previous work [14]. They agree with the empirical results that the number of clusters depends on the hypersurface area of the query region and not on its hypervolume. We also show that the Hilbert curve achieves better clustering than the z-curve. From a practical point of view, the formulas given in this paper provide a simple measure that can be used to predict the required disk access behaviors and, hence, the total access time.
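The notion of a cluster can be made concrete by brute force: a cluster is a maximal run of consecutive curve positions whose cells all lie inside the query region. The sketch below counts clusters for an axis-aligned rectangular query on a small 2-D grid, for both the Hilbert order and the z-order. It only enumerates cells; it does not implement the closed-form formulas of the paper, and all function names are hypothetical.

```python
def hilbert_index(order, x, y):
    """Position of cell (x, y) along a Hilbert curve on a 2**order by 2**order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d

def z_index(order, x, y):
    """Position of cell (x, y) along the z-curve (bit interleaving, x in the even bits)."""
    d = 0
    for bit in range(order):
        d |= ((x >> bit) & 1) << (2 * bit)
        d |= ((y >> bit) & 1) << (2 * bit + 1)
    return d

def count_clusters(index_fn, order, xlo, xhi, ylo, yhi):
    """Number of maximal runs of consecutive curve positions inside the inclusive query rectangle."""
    positions = sorted(index_fn(order, x, y)
                       for x in range(xlo, xhi + 1) for y in range(ylo, yhi + 1))
    return sum(1 for i, p in enumerate(positions) if i == 0 or p != positions[i - 1] + 1)
```

On a 4 by 4 grid, the 2 by 2 query with corners (1, 0) and (2, 1) falls into 2 clusters under this Hilbert order and 3 under this z-order; each cluster corresponds to one contiguous run of positions to read.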
Multi-Step Processing of Spatial Joins
Cited by 136 (14 self)
Spatial joins are one of the most important operations for combining spatial objects of several relations. In this paper, spatial join processing is studied in detail for extended spatial objects in two-dimensional data space. We present an approach for spatial join processing that is based on three steps. First, a spatial join is performed on the minimum bounding rectangles of the objects returning a set of candidates. Various approaches for accelerating this step of join processing have been examined at last year’s conference [BKS 93a]. In this paper, we focus on the problem of how to compute the answers from the set of candidates, which is handled by the following two steps. First of all, sophisticated approximations are used to identify answers as well as to filter out false hits from the set of candidates. For this purpose, we investigate various types of conservative and progressive approximations. In the last step, the exact geometry of the remaining candidates has to be tested against the join predicate. The time required for computing spatial join predicates can essentially be reduced when objects are adequately organized in main memory. In our approach, objects are first decomposed into simple components which are exclusively organized by a main-memory resident spatial data structure. Overall, we present a complete approach of spatial join processing on complex spatial objects. The performance of the individual steps of our approach is evaluated with data sets from real cartographic applications. The results show that our approach reduces the total execution time of the spatial join by factors.
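The three-step structure (MBR filter, approximation step, exact geometry test) can be illustrated with a deliberately simple sketch. It assumes the spatial objects are disks given as `(x, y, r)`, uses a nested loop in place of an index-based MBR join, and uses an inscribed axis-aligned square as the progressive approximation; none of these choices come from the paper, and the function names are hypothetical.

```python
import math

def boxes_overlap(a, b):
    """Axis-aligned boxes (xmin, ymin, xmax, ymax) intersect, boundaries included."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def mbr(disk):
    """Conservative approximation: the minimum bounding rectangle of the disk."""
    x, y, r = disk
    return (x - r, y - r, x + r, y + r)

def inner_box(disk):
    """Progressive approximation: an axis-aligned square inscribed in the disk."""
    x, y, r = disk
    h = r / math.sqrt(2)
    return (x - h, y - h, x + h, y + h)

def exact_intersect(a, b):
    """Exact geometry test for two disks."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) <= a[2] + b[2]

def spatial_join(rs, ss):
    """Return intersecting pairs plus counters showing how much work each step did."""
    answers = []
    stats = {"candidates": 0, "proven_hits": 0, "exact_tests": 0}
    for a in rs:
        for b in ss:
            # Step 1: MBR filter (nested loop stands in for an index-based join).
            if not boxes_overlap(mbr(a), mbr(b)):
                continue
            stats["candidates"] += 1
            # Step 2: overlapping inner boxes prove a hit without the exact test.
            if boxes_overlap(inner_box(a), inner_box(b)):
                stats["proven_hits"] += 1
                answers.append((a, b))
                continue
            # Step 3: exact geometry for the remaining candidates.
            stats["exact_tests"] += 1
            if exact_intersect(a, b):
                answers.append((a, b))
    return answers, stats
```

For disks the exact test is cheap, so the sketch only shows the control flow; the savings described in the abstract arise when the exact geometry is expensive, as with complex polygons.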