Results 1-10 of 8,779
The X-tree: An index structure for high-dimensional data
 In Proceedings of the Int'l Conference on Very Large Data Bases
, 1996
"... In this paper, we propose a new method for indexing large amounts of point and spatial data in high-dimensional space. An analysis shows that index structures such as the R*-tree are not adequate for indexing high-dimensional data sets. The major problem of R-tree-based index structures is the over ..."
Cited by 592 (17 self)
Probabilistic Latent Semantic Indexing
, 1999
"... Probabilistic Latent Semantic Indexing is a novel approach to automated document indexing which is based on a statistical latent class model for factor analysis of count data. Fitted from a training corpus of text documents by a generalization of the Expectation Maximization algorithm, the utilized ..."
Cited by 1225 (10 self)
R-trees: A Dynamic Index Structure for Spatial Searching
 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA
, 1984
"... In order to handle spatial data efficiently, as required in computer aided design and geo-data applications, a database system needs an index mechanism that will help it retrieve data items quickly according to their spatial locations. However, traditional indexing methods are not well suited to data ..."
Cited by 2750 (0 self)
to data objects of nonzero size located in multidimensional spaces. In this paper we describe a dynamic index structure called an R-tree which meets this need, and give algorithms for searching and updating it. We present the results of a series of tests which indicate that the structure performs well
FastMap: A fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets
, 1995
"... A very promising idea for fast searching in traditional and multimedia databases is to map objects into points in k-d space, using k feature-extraction functions, provided by a domain expert [Jag91]. Thus, we can subsequently use highly fine-tuned spatial access methods (SAMs), to answer several ..."
Cited by 502 (22 self)
easier for a domain expert to assess the similarity/distance of two objects. Given only the distance information though, it is not obvious how to map objects into points. This is exactly the topic of this paper. We describe a fast algorithm to map objects into points in some k-dimensional space (k
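The snippet above does not spell out how objects are mapped to points. As a rough illustration only, the sketch below shows one way a single coordinate can be derived from pairwise distances alone: project each object onto the line through two pivot objects using the law of cosines. The function names, the choice of pivots, and the toy distance function are all assumptions made for this example and are not taken from the listing.

```python
def project_on_pivots(obj, pivot_a, pivot_b, dist):
    """Return the coordinate of obj along the line through two pivot objects.

    Only pairwise distances are used (law of cosines), so the objects
    themselves never need explicit coordinates. Hypothetical helper.
    """
    d_ab = dist(pivot_a, pivot_b)
    if d_ab == 0:
        # Degenerate pivots: every object collapses onto the same point.
        return 0.0
    d_ai = dist(pivot_a, obj)
    d_bi = dist(pivot_b, obj)
    return (d_ai ** 2 + d_ab ** 2 - d_bi ** 2) / (2.0 * d_ab)


def abs_distance(p, q):
    """Toy distance for the example: objects are numbers on a line."""
    return abs(p - q)


# Objects that really do lie on a line; the two extremes serve as pivots.
objects = [0.0, 1.0, 3.0, 10.0]
coords = [
    project_on_pivots(o, objects[0], objects[-1], abs_distance)
    for o in objects
]
```

Because the toy objects already lie on a line, the recovered coordinates reproduce their positions, which makes the projection easy to check by hand. Repeating such a step on residual distances would yield further coordinates, but that part is left out here.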
The SR-tree: An Index Structure for High-Dimensional Nearest Neighbor Queries
, 1997
"... Recently, similarity queries on feature vectors have been widely used to perform content-based retrieval of images. To apply this technique to large databases, it is required to develop multidimensional index structures supporting nearest neighbor queries efficiently. The SS-tree had been proposed for ..."
Cited by 438 (3 self)
volume than bounding rectangles with high-dimensional data and that this reduces search efficiency. To overcome this drawback, we propose a new index structure called the SR-tree (Sphere/Rectangle-tree) which integrates bounding spheres and bounding rectangles. A region of the SR-tree is specified
On Effective Multi-Dimensional Indexing for Strings
 In SIGMOD
, 2000
"... As databases have expanded in scope from storing purely business data to include XML documents, product catalogs, email messages, and directory data, it has become increasingly important to search databases based on wildcard string matching: prefix matching, for example, is more common (and useful ..."
Cited by 34 (2 self)
useful) than exact matching, for such data. In many cases, matches need to be on multiple attributes/dimensions, with correlations between the dimensions. Traditional multidimensional index structures, designed with (fixed length) numeric data in mind, are not suitable for matching unbounded length
Efficient similarity search in sequence databases
, 1994
"... We propose an indexing method for time sequences for processing similarity queries. We use the Discrete Fourier Transform (DFT) to map time sequences to the frequency domain, the crucial observation being that, for most sequences of practical interest, only the first few frequencies are strong. Anot ..."
Cited by 515 (19 self)
Another important observation is Parseval's theorem, which specifies that the Fourier transform preserves the Euclidean distance in the time or frequency domain. Having thus mapped sequences to a lower-dimensionality space by using only the first few Fourier coefficients, we use R-trees to index
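The two observations in this abstract can be checked numerically. The sketch below is a minimal illustration, not the authors' implementation: it uses a hand-written unitary DFT (scaled by 1/sqrt(n)) so that Euclidean distance is the same in the time and frequency domains, then keeps only the first k coefficients as a feature vector. The helper names and the sample sequences are assumptions made for the example.

```python
import cmath
import math


def unitary_dft(seq):
    """Naive O(n^2) DFT with 1/sqrt(n) scaling, so norms are preserved."""
    n = len(seq)
    scale = 1.0 / math.sqrt(n)
    return [
        scale * sum(
            x * cmath.exp(-2j * math.pi * f * t / n)
            for t, x in enumerate(seq)
        )
        for f in range(n)
    ]


def euclidean(a, b):
    """Euclidean distance between two real or complex vectors."""
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(a, b)))


def dft_features(seq, k):
    """Keep only the first k Fourier coefficients (hypothetical helper)."""
    return unitary_dft(seq)[:k]


# Two made-up sequences of equal length.
x = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 0.0]
y = [1.5, 2.5, 2.5, 4.5, 2.0, 2.0, 0.5, 0.5]

dist_time = euclidean(x, y)
dist_freq = euclidean(unitary_dft(x), unitary_dft(y))
dist_reduced = euclidean(dft_features(x, 2), dft_features(y, 2))
```

Here `dist_time` and `dist_freq` agree up to rounding, and `dist_reduced` can never exceed them because truncation only drops non-negative terms from the sum. That lower-bound property is what lets a range query on the reduced features avoid missing true matches, at the cost of some false hits that must be filtered afterwards.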
Similarity search in high dimensions via hashing
, 1999
"... The nearest or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building search/index structures for performing similarity search over high-dimensional data, e.g., image dat ..."
Cited by 641 (10 self)
Fast subsequence matching in time-series databases
 PROCEEDINGS OF THE 1994 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA
, 1994
"... We present an efficient indexing method to locate 1-dimensional subsequences within a collection of sequences, such that the subsequences match a given (query) pattern within a specified tolerance. The idea is to map each data sequence into a small set of multidimensional rectangles in feature space ..."
Cited by 533 (24 self)
When Is "Nearest Neighbor" Meaningful?
 In Int. Conf. on Database Theory
, 1999
"... We explore the effect of dimensionality on the "nearest neighbor" problem. We show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance ..."
Cited by 408 (2 self)
the distance to the farthest data point. To provide a practical perspective, we present empirical results on both real and synthetic data sets that demonstrate that this effect can occur for as few as 10-15 dimensions. These results should not be interpreted to mean that high-dimensional indexing is never
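The effect described in this abstract is easy to observe on synthetic data. The following sketch is an illustration under stated assumptions, not a reproduction of the paper's experiments: it draws points uniformly from the unit hypercube (an i.i.d. setting, narrower than the conditions the abstract mentions) and reports the ratio of the nearest to the farthest distance from a random query point. The function name, sample size, and seed are arbitrary choices.

```python
import math
import random


def nearest_to_farthest_ratio(dim, n_points=200, seed=0):
    """Ratio of nearest to farthest distance from a random query point.

    Points and query are uniform in the unit hypercube. A ratio close
    to 1 means the nearest neighbor is barely closer than the farthest.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return min(dists) / max(dists)


# Compare a low-dimensional and a high-dimensional setting.
ratio_low = nearest_to_farthest_ratio(2)
ratio_high = nearest_to_farthest_ratio(200)
```

In this setting `ratio_high` comes out much closer to 1 than `ratio_low`, which is the loss of contrast the abstract refers to. Real data sets with correlated or clustered dimensions may behave quite differently.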