Results 1 - 2 of 2
On the Performance of Object Clustering Techniques
Cited by 65 (0 self)
We investigate the performance of some of the best-known object clustering algorithms on four different workloads based upon the Tektronix benchmark. For all four workloads, stochastic clustering gave the best performance for a variety of performance metrics. Since stochastic clustering is computationally expensive, it is interesting that for every workload there was at least one cheaper clustering algorithm that matched or almost matched stochastic clustering. Unfortunately, for each workload, the algorithm that approximated stochastic clustering was different. Our experiments also demonstrated that even when the workload and object graph are fixed, the choice of the clustering algorithm depends upon the goals of the system. For example, if the goal is to perform well on traversals of small portions of the database starting with a cold cache, the important metric is the per-traversal expansion factor, and a well-chosen placement tree will be nearly optimal; if the goal is to achieve a...
Clustering techniques for minimizing external path length
Proceedings of the International Conference on Very Large Databases, 1996
Cited by 17 (0 self)
There are a variety of main-memory access structures, such as segment trees and quad trees, whose properties, such as good worst-case behaviour, make them attractive for database applications. Unfortunately, the structures are typically ‘long and skinny’, whereas disk data structures must be ‘short-and-fat’ (that is, have a high fanout and low height) in order to minimize I/O. We consider how to cluster the nodes (that is, map the nodes to disk pages) of main-memory access structures such that although a path may traverse many nodes, it only traverses a few disk pages. The number of disk pages traversed in a path is called the external path length. We address several versions of the clustering problem. We present a clustering algorithm for tree structures that generates optimal worst-case external path length mappings; we also show how to make it dynamic, to support updates. We extend the algorithm to generate mappings that minimize the average weighted external path lengths. We also show that some other clustering problems, such as finding optimal external path lengths for DAG structures and minimizing
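
The definition of external path length in this abstract can be illustrated with a small sketch. It is not the paper's clustering algorithm: it only measures the external path length of a given node-to-page mapping. The names `external_path_lengths`, `children`, and `page_of`, the example tree, and the page capacity of three nodes are all assumptions made for illustration, and the sketch counts the distinct pages on each root-to-leaf path.

```python
from typing import Dict, Hashable, List


def external_path_lengths(
    children: Dict[Hashable, List[Hashable]],
    page_of: Dict[Hashable, Hashable],
    root: Hashable,
) -> Dict[Hashable, int]:
    """Map each leaf to the number of distinct pages on its root-to-leaf path.

    `children` maps a node to its child nodes (hypothetical representation);
    `page_of` maps a node to the disk page it is stored on.
    """
    result: Dict[Hashable, int] = {}
    # Iterative depth-first walk carrying the set of pages seen so far.
    stack = [(root, frozenset([page_of[root]]))]
    while stack:
        node, pages = stack.pop()
        kids = children.get(node, [])
        if not kids:
            result[node] = len(pages)
        for kid in kids:
            stack.append((kid, pages | {page_of[kid]}))
    return result


# A small binary tree with 7 nodes and 3 levels (illustrative only).
children = {1: [2, 3], 2: [4, 5], 3: [6, 7]}

# Baseline: every node on its own page, so a path touches one page per node.
one_node_per_page = {n: n for n in range(1, 8)}

# Assumed clustering with at most 3 nodes per page: the root and its
# children share page "A"; each pair of sibling leaves shares a page.
clustered = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "C", 7: "C"}

worst_unclustered = max(external_path_lengths(children, one_node_per_page, 1).values())
worst_clustered = max(external_path_lengths(children, clustered, 1).values())
print(worst_unclustered, worst_clustered)
```

In this toy case every path still visits three nodes, but the clustered mapping reduces the worst-case external path length from three pages to two.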

