Fully Dynamic Spatial Approximation Trees
In Proceedings of the 9th International Symposium on String Processing and Information Retrieval (SPIRE 2002), LNCS 2476, 2002
Cited by 22 (12 self)
The Spatial Approximation Tree (sa-tree) is a recently proposed data structure for searching in metric spaces. It has been shown that it compares favorably against alternative data structures in spaces of high dimension or queries with low selectivity. Its main drawbacks are: costly construction time, poor performance in low dimensional spaces or queries with high selectivity, and the fact of being a static data structure, that is, once built, one cannot add or delete elements.
B-trees with Inserts and Deletes: Why Free-at-empty is Better Than Merge-at-half
Journal of Computer and System Sciences, 1992
Cited by 15 (0 self)
The space utilization of B-tree nodes determines the number of levels in the B-tree and hence its performance. Until now, the only analytical aid to the determination of a B-tree's utilization has been the analysis by Yao and related work. Yao showed that the utilization of B-tree nodes under pure inserts is 69%. We derive analytically and verify by simulation the utilization of B-tree nodes constructed from a mixture of insert and delete operations. Assuming that nodes only merge (i.e. are freed) when they are empty we show that the utilization is 39% when the number of inserts is the same as the number of deletes. However, if there are just 5% more inserts than deletes, then the utilization is over 62%. We also calculate the probability of splitting and merging. We derive a simple rule-of-thumb that accurately calculates the probability of splitting. We also model B-trees that merge half-empty nodes. The utilization of merge-at-half B-trees is slightly larger than the utilization of ...
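The 69% figure for pure inserts can be checked informally with a small simulation. The sketch below is not the paper's analytical model: it assumes uniform random keys, tracks leaf nodes only (the index levels are ignored), and splits an overflowing leaf in half. The function name and parameters are hypothetical. Under these assumptions the measured fill fraction should land close to ln 2 ≈ 0.693.

```python
import bisect
import random


def simulate_leaf_utilization(num_keys, capacity, seed=0):
    """Insert random keys into split-on-overflow leaves; return the fill fraction.

    Assumption: only the leaf level is modelled, and every leaf has the
    same fixed capacity.
    """
    rng = random.Random(seed)
    leaves = [[]]            # each leaf is a sorted list of keys
    lows = [float("-inf")]   # lows[i] is the lower key bound of leaf i
    for _ in range(num_keys):
        key = rng.random()
        # Locate the leaf whose key range contains the new key.
        i = bisect.bisect_right(lows, key) - 1
        leaf = leaves[i]
        bisect.insort(leaf, key)
        if len(leaf) > capacity:
            # Overflow: move the upper half into a newly allocated leaf.
            mid = len(leaf) // 2
            upper = leaf[mid:]
            del leaf[mid:]
            leaves.insert(i + 1, upper)
            lows.insert(i + 1, upper[0])
    # Utilization = stored keys / total key slots allocated.
    return num_keys / (len(leaves) * capacity)


if __name__ == "__main__":
    print(round(simulate_leaf_utilization(20000, 20, seed=1), 3))
```

Modelling deletes with a free-at-empty rule would need an extra step that removes a random stored key and frees a leaf once it holds no keys; that extension is left out here because the abstract does not say how the deleted keys are chosen.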
The Performance of Concurrent Data Structure Algorithms
Transactions on Database Systems, 1994
Cited by 13 (9 self)
This thesis develops a validated model of concurrent data structure algorithm performance, concentrating on concurrent B-trees. The thesis first develops two analytical tools, which are explained in the next two paragraphs, for the analysis. Yao showed that the space utilization of a B-tree built from random inserts is 69%. Assuming that nodes merge only when empty, we show that the utilization is 39% when the number of insert and delete operations is the same. However, if there are just 5% more inserts than deletes, then the utilization is at least 62%. In addition to the utilization, we calculate the probabilities of splitting and merging, important parameters for calculating concurrent B-tree algorithm performance. We compare merge-at-empty B-trees with merge-at-half B-trees. We conclude that merge-at-empty B-trees have a slightly lower space utilization but a much lower restructuring rate than merge-at-half B-trees, making merge-at-empty B-trees preferable for concurrent B-tree algo...
Fringe Analysis Revisited
Cited by 12 (6 self)
Fringe analysis is a technique used to study the average behavior of search trees. In this paper we survey the main results regarding this technique, and we improve a previous asymptotic theorem. At the same time we present new developments and applications of the theory which allow improvements in several bounds on the behavior of search trees. Our examples cover binary search trees, AVL trees, 2-3 trees, and B-trees.
Categories and Subject Descriptors: F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Algorithms and Problems -- computations on discrete structures; sorting and searching; E.1 [Data Structures]; trees.
Contents: 1 Introduction; 2 The Theory of Fringe Analysis; 3 Weakly Closed Collections; 4 Including the Level Information; 5 Fringe Analysis, Markov Chains, and Urn Processes.
This work was partially funded by Research Grant FONDECYT 93-0765. e-mail: rbaeza@dcc.uchile.cl
1 Introduction
Search trees are one of the most used data structures t...
A Multivariate View of Random Bucket Digital Search Trees
2002
Cited by 7 (5 self)
We take a multivariate view of digital search trees by studying the number of nodes of different types that may coexist in a bucket digital search tree as it grows under an arbitrary memory management system. We obtain the mean of each type of node, as well as the entire covariance matrix between types, whereupon weak laws of large numbers follow from the orders of magnitude (the norming constants include oscillating functions). The result can be easily interpreted for practical systems like paging, heaps and UNIX's buddy system. The covariance results call for developing a Mellin convolution method, where convoluted numerical sequences are handled by convolutions of their Mellin transforms. Furthermore, we use a method of moments to show that the distribution is asymptotically normal. The method of proof is of some generality and is applicable to other parameters like path length and size in random tries and Patricia tries.
An Adaptive Overflow Technique for B-trees
Extending Data Base Technology Conference (EDBT 90), 1990
Cited by 1 (1 self)
We present a new overflow technique for B-trees. The technique is a hybrid of partial expansions and unbalanced splits. This technique is asymmetric and adaptive. Considering a growing file (only insertions), the storage utilization is 77% for random keys, 70% for sorted keys, and over 75% for non-uniform distributed keys. Similar results are achieved when we have deletions mixed with insertions. One of the main properties of this technique is that the storage utilization is very stable with respect to changes of the data distribution. This technique may be used for other bucket-based file structures, like extendible hashing or bounded disorder files.
1 Introduction
The B+-tree is one of the most widely used file organizations. In a B+-tree all the information is stored at the lowest level (buckets), and the upper levels are a B-tree index. File growth is handled by bucket splitting, that is, when a bucket overflows, a new bucket is allocated and half of the records from the o...
Bounded Disorder: The Effect of the Index
In this paper we complete the analysis done by Ramakrishna and Mukhopadhyay for a data node in the Bounded Disorder (BD) file organization of Litwin and Lomet, by introducing the B-tree index into the model. Also, we extend the analysis to the case of BD files with two partial expansions as proposed by Lomet. Our main contribution is a detailed analysis of search and insertion costs, and its comparison with B+-trees.
1 Introduction
Nowadays there are two main file organizations: hashing and tree indexing. New hashing techniques achieve single access retrieval, but are very inefficient for range search or key sequential access. On the other hand tree indices preserve the key order with a higher search cost. Litwin and Lomet [9] proposed the Bounded Disorder (BD) file organization to combine the advantages of both methods. This paper complements the analysis presented by Ramakrishna and Mukhopadhyay [14] concerning the performance of BD files, by including the index in their model. ...
The Performance Of A Multiversion Access Method
In Proc. of the ACM SIGMOD Conference, 1990
The Time-Split B-tree is an integrated index structure for a versioned timestamped database. It gradually migrates data from a current database to an historical database, records migrating when nodes split. Records valid at the split time are placed in both an historical node and a current node. This implies some redundancy. Using both analysis and simulation, we characterize the amount of redundancy, the space utilization, and the record addition (insert or update) performance for a spectrum of different rates of insertion versus update. Three splitting policies are studied which alter the conditions under which either time splits or key space splits are performed.
1. INTRODUCTION
A growing area of interest in the database community is in the support of multiversioned data [LoSa, AhSn, JeMR, Ston]. Multiversioned data, when updated, results in a new version of the data being created. Because the old version is retained, several versions of a record can exist, each appropriate to some...
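The redundancy caused by a time split can be illustrated with a minimal sketch. It assumes each record version carries an explicit half-open validity interval [start, end), which is a simplification chosen for illustration and not necessarily the paper's node layout. The `Version` type and the `time_split` function are hypothetical names.

```python
from typing import List, NamedTuple, Optional, Tuple


class Version(NamedTuple):
    key: str
    start: int            # timestamp at which this version became valid
    end: Optional[int]    # timestamp at which it was superseded; None if still current


def time_split(node: List[Version], split_time: int) -> Tuple[List[Version], List[Version]]:
    """Split a node's versions at split_time into (historical, current).

    A version valid at split_time (start < split_time and not yet ended
    by then) is copied into both halves, which is the redundancy the
    abstract refers to.
    """
    historical = [v for v in node if v.start < split_time]
    current = [v for v in node if v.end is None or v.end > split_time]
    return historical, current


node = [
    Version("a", 1, 5),      # superseded before the split: historical only
    Version("a", 5, None),   # valid at the split: both halves
    Version("b", 2, None),   # valid at the split: both halves
    Version("c", 8, None),   # created after the split: current only
]
historical, current = time_split(node, split_time=6)
redundant = [v for v in historical if v in current]
```

In this example two of the four versions end up in both nodes. A key space split, by contrast, would partition the versions by key range and copy nothing.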
B-Trees with Lazy Parent Split
A B-tree variant that postpones parent node splittings due to upcoming items until a later access of the same node is examined. This technique aims to decrease the possibility of propagating splittings to upper levels so that more concurrency is achieved. Insertion and deletion algorithms are given. Time and space performance results are also reported and comparison with conventional B-trees is carried out. It is shown that this technique substantially improves the performance of small degree B-trees so that, indeed, concurrency is enhanced.

