Compressing relations and indexes
- In proceedings of IEEE International Conference on Data Engineering
, 1998
Abstract - Cited by 39 (0 self)
We propose a new compression algorithm that is tailored to database applications. It can be applied to a collection of records, and is especially effective for records with many low to medium cardinality fields and numeric fields. In addition, this new technique supports very fast decompression. Promising application domains include decision support systems (DSS), since "fact tables", which are by far the largest tables in these applications, contain many low and medium cardinality fields and typically no text fields. Further, our decompression rates are faster than typical disk throughputs for sequential scans; in contrast, gzip is slower. This is important in DSS applications, which often scan large ranges of records. An important distinguishing characteristic of our algorithm, in contrast to compression algorithms proposed earlier, is that we can decompress individual tuples (even individual fields), rather than a full page (or an entire relation) at a time. Also, all the information needed for tuple decompression resides on the same page with the tuple. This means that a page can be stored in the buffer pool and used in compressed form, simplifying the job of the buffer manager and improving memory utilization. Our compression algorithm also improves index structures such as B-trees and R-trees significantly by reducing the number of leaf pages and compressing index entries, which greatly increases the fan-out. We can also use lossy compression on the internal nodes of an index.
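The abstract does not spell out the algorithm, so the following is only a minimal sketch of one scheme with the stated properties, assuming integer-coded fields: each page stores a per-field minimum and bit width, and every tuple is packed as fixed-width offsets from those minimums. The function names and the page layout (`encode_page`, `decode_field`, `decode_tuple`, a dict with `mins`, `widths`, `tuples`) are hypothetical and not taken from the paper.

```python
def encode_page(rows):
    """Pack a page of integer tuples as offsets from per-field minimums.

    Assumption: every field is already an integer code. All metadata needed
    for decoding (mins, widths) is kept in the page itself.
    """
    columns = list(zip(*rows))
    mins = [min(col) for col in columns]
    # Bits needed per field to hold the largest offset (at least 1 bit).
    widths = [max(max(col) - lo, 1).bit_length() for col, lo in zip(columns, mins)]
    packed = []
    for row in rows:
        word = 0
        for value, lo, width in zip(row, mins, widths):
            word = (word << width) | (value - lo)
        packed.append(word)
    return {"mins": mins, "widths": widths, "tuples": packed}


def decode_field(page, row_idx, field_idx):
    """Decode one field of one tuple without touching any other tuple."""
    widths = page["widths"]
    shift = sum(widths[field_idx + 1:])       # bits occupied by later fields
    mask = (1 << widths[field_idx]) - 1
    offset = (page["tuples"][row_idx] >> shift) & mask
    return page["mins"][field_idx] + offset


def decode_tuple(page, row_idx):
    """Decode a single tuple using only page-local metadata."""
    return tuple(decode_field(page, row_idx, i) for i in range(len(page["mins"])))
```

Because every packed tuple has the same bit layout, a single field can be read with one shift and one mask, which is the kind of tuple-level and field-level access the abstract describes. Low-cardinality and narrow-range numeric fields need few bits per value, which is where the space saving in this sketch comes from.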
Tree-Based Indexes for Image Data
- Journal of Visual Communication and Image Representation, Volume 9, Number
, 1998
Abstract - Cited by 5 (2 self)
As in conventional DataBase Management Systems (DBMSs), to allow users to efficiently access and retrieve data objects, a MultiMedia DataBase Management System (MMDBMS) must employ an effective access method such as indexing and hashing. This paper provides a survey of tree-based multidimensional indexing techniques for MMDBMSs that maintain image data represented as feature vectors. These techniques support such data while maintaining desirable characteristics of a B-tree, the index structure most commonly used in traditional DBMSs. In this survey, we describe each tree and give examples of the different data organization schemes. We also describe the advantages and disadvantages of using each technique. In addition, we provide classifications of the trees using several different properties. These classifications should assist researchers in identifying the strengths and weaknesses of any new indexing technique they develop as well as help users determine the most appropriate data structure for their applications.
Using Constraints to Query R*-Trees
, 1996
Abstract - Cited by 4 (0 self)
The R*-Tree index is a popular multidimensional index used in several extensible and GIS-oriented database systems. In this paper, we show that a simple refinement of the search algorithm of the R*-Tree, which is common to all variants of the R-Tree, offers significant speedups in most cases, with little or no worst-case performance penalty. The idea is essentially to use a conjunction of linear constraints (rather than a minimum bounding rectangle) to approximate the query and to use this tighter bounding envelope to determine when the query overlaps with an R*-Tree node. This raises an important question: How can we efficiently check whether the query envelope overlaps the minimum bounding box for a tree node? Linear Programming (LP) offers one solution, but it is susceptible to numeric approximation errors. One of the contributions of this paper is a new algorithm for performing this check that is more efficient than LP and free from numeric errors. We also present...
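To make the overlap question concrete, here is a minimal two-dimensional sketch. It is not the algorithm contributed by the paper, which the abstract does not describe; it swaps in a standard technique, clipping the node's bounding box against each constraint (Sutherland-Hodgman style) with exact rational arithmetic so that no floating-point error can creep in. The names `clip` and `overlaps` and the constraint encoding `(a, b, c)` meaning `a*x + b*y <= c` are assumptions made for illustration.

```python
from fractions import Fraction


def clip(polygon, a, b, c):
    """Clip a convex polygon to the closed half-plane a*x + b*y <= c."""
    result = []
    n = len(polygon)
    for i in range(n):
        p, q = polygon[i], polygon[(i + 1) % n]
        fp = a * p[0] + b * p[1] - c
        fq = a * q[0] + b * q[1] - c
        if fp <= 0:
            result.append(p)                  # p satisfies the constraint
        if (fp < 0 < fq) or (fq < 0 < fp):    # edge crosses the boundary line
            t = fp / (fp - fq)
            result.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return result


def overlaps(constraints, box):
    """Return True if the query envelope intersects the bounding box.

    constraints: iterable of (a, b, c), each meaning a*x + b*y <= c.
    box: (xmin, ymin, xmax, ymax) of a tree node.
    """
    xmin, ymin, xmax, ymax = (Fraction(v) for v in box)
    polygon = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    for a, b, c in constraints:
        polygon = clip(polygon, Fraction(a), Fraction(b), Fraction(c))
        if not polygon:
            return False                      # nothing left: no overlap
    return True


# Query envelope: the triangle x >= 0, y >= 0, x + y <= 4.
triangle = [(-1, 0, 0), (0, -1, 0), (1, 1, 4)]
near = overlaps(triangle, (1, 1, 2, 2))
far = overlaps(triangle, (3, 3, 4, 4))
```

In this example the triangle's minimum bounding rectangle is `(0, 0, 4, 4)`, which intersects the node box `(3, 3, 4, 4)`, yet the triangle itself does not, so the constraint-based envelope lets the search skip a node that a rectangle test would visit.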

