Fast Text Searching for Regular Expressions or Automaton Searching on Tries
Cited by 43 (6 self)
We present algorithms for efficient searching of regular expressions on preprocessed text, using a Patricia tree as a logical model for the index. We obtain searching algorithms that run in logarithmic expected time in the size of the text for a wide subclass of regular expressions, and in sublinear expected time for any regular expression. This is the first such algorithm to be found with this complexity.
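The idea of running an automaton over an index instead of over the raw text can be illustrated with a minimal sketch. It makes several simplifying assumptions that are not taken from the paper: an uncompressed suffix trie stands in for the Patricia tree, the DFA is written by hand as a transition dictionary, and the names `build_suffix_trie` and `search` are hypothetical. The sketch does not reproduce the paper's complexity bounds.

```python
def build_suffix_trie(text):
    """Insert every suffix of `text` into a plain (uncompressed) trie.

    Each node records the start positions of all suffixes passing through it.
    This costs quadratic space; it is a teaching stand-in for a Patricia tree.
    """
    root = {"children": {}, "starts": list(range(len(text)))}
    for i in range(len(text)):
        node = root
        for ch in text[i:]:
            node = node["children"].setdefault(ch, {"children": {}, "starts": []})
            node["starts"].append(i)
    return root


def search(trie, dfa, start, accepting):
    """Return the sorted start positions where some prefix of the suffix is accepted.

    `dfa` maps (state, char) -> next state; a missing entry means a dead state.
    """
    hits = []
    stack = [(trie, start)]
    while stack:
        node, state = stack.pop()
        if state in accepting:
            # Every suffix below this node begins with an accepted string,
            # so the whole subtree is reported without descending further.
            hits.extend(node["starts"])
            continue
        for ch, child in node["children"].items():
            nxt = dfa.get((state, ch))
            if nxt is not None:  # prune branches the automaton cannot follow
                stack.append((child, nxt))
    return sorted(hits)


# Hand-built DFA for the expression ab*c (state 2 is accepting).
dfa = {(0, "a"): 1, (1, "b"): 1, (1, "c"): 2}
print(search(build_suffix_trie("xabbcabcac"), dfa, 0, {2}))  # [1, 5, 8]
```

The traversal visits only trie branches that the automaton can still follow, which is what lets an index-based search avoid scanning the whole text.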
Fringe Analysis of Synchronized Parallel Algorithms on 2--3 Trees
Cited by 1 (1 self)
We are interested in the fringe analysis of synchronized parallel insertion algorithms on 2-3 trees, namely the algorithm of W. Paul, U. Vishkin and H. Wagener (PVW). This algorithm inserts k keys into a tree of size n in parallel time O(log n + log k). Fringe analysis studies the distribution of the bottom subtrees, and it is still an open problem for parallel algorithms on search trees. To tackle this problem we introduce a new kind of algorithm whose two extreme cases seem to bound the performance of the PVW algorithm from above and below. We extend the fringe analysis to parallel algorithms and obtain a rich mathematical structure that gives new interpretations even in the sequential case. The process of insertions is modeled by a Markov chain, and the coefficients of the transition matrix are related to the expected local behavior of our algorithm. Finally, we show that this matrix has a power expansion over $(n+1)^{-1}$ where the coefficients are the binomial transform of the ...
An Adaptive Overflow Technique for B-trees
- Extending Data Base Technology Conference (EDBT 90), 1990
Cited by 1 (1 self)
We present a new overflow technique for B-trees. The technique is a hybrid of partial expansions and unbalanced splits, and it is asymmetric and adaptive. For a growing file (insertions only), the storage utilization is 77% for random keys, 70% for sorted keys, and over 75% for non-uniformly distributed keys. Similar results are achieved when deletions are mixed with insertions. One of the main properties of this technique is that the storage utilization is very stable with respect to changes in the data distribution. The technique may also be used for other bucket-based file structures, such as extendible hashing or bounded disorder files.
1 Introduction
The B+-tree is one of the most widely used file organizations. In a B+-tree all the information is stored at the lowest level (buckets), and the upper levels are a B-tree index. File growth is handled by bucket splitting, that is, when a bucket overflows, a new bucket is allocated and half of the records from the o...
Fringe Analysis of Synchronized Parallel Insertion Algorithms on 2-3 Trees
Cited by 1 (0 self)
Fringe analysis studies the distribution of the bottom subtrees, or fringe, of trees under the assumption of random selection of keys, yielding an average-case analysis of the fringe. We are interested in the fringe analysis of the synchronized parallel insertion algorithm of Paul, Vishkin, and Wagener (PVW) on 2-3 trees. This algorithm inserts k keys with k processors into a tree of size n in time O(log n + log k). Because the direct analysis of this algorithm is very difficult, we tackle the problem by introducing a new family of algorithms, denoted MacroSplit algorithms, and our main theorem proves that two algorithms of this family, denoted MaxMacroSplit and MinMacroSplit, bound the fringe of the PVW algorithm from above and below. Published papers deal with the fringe analysis of sequential algorithms, and it was an open problem for parallel algorithms on search trees. We extend the fringe analysis to parallel algorithms and we get a rich mathematical structure giving new interp...
Left-leaning Red-Black Trees
The red-black tree model for implementing balanced search trees, introduced by Guibas and Sedgewick thirty years ago, is now found throughout our computational infrastructure. Red-black trees are described in standard textbooks and are the underlying data structure for symbol-table implementations within C++, Java, Python, BSD Unix, and many other modern systems. However, many of these implementations have sacrificed some of the original design goals (primarily in order to develop an effective implementation of the delete operation, which was incompletely specified in the original paper), so a new look is worthwhile. In this paper, we describe a new variant of red-black trees that meets many of the original design goals and leads to substantially simpler code for insert/delete, less than one-fourth as much code as in implementations in common use. All red-black trees are based on implementing 2-3 or 2-3-4 trees within a binary tree, using red links to bind together internal nodes into 3-nodes or 4-nodes. The new code is based on combining three ideas:
• Use a recursive implementation.
• Require that all 3-nodes lean left.
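A minimal sketch of recursive insertion into a left-leaning red-black tree may help make the two listed ideas concrete. It follows the commonly taught 2-3 variant; the fix-up order after the recursive call, the `Node` layout, and the helper names are assumptions of this sketch rather than details taken from the abstract.

```python
RED, BLACK = True, False


class Node:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.left = self.right = None
        self.color = RED  # a new node is attached to its parent by a red link


def is_red(node):
    return node is not None and node.color == RED


def rotate_left(h):
    # Turn a right-leaning red link into a left-leaning one.
    x = h.right
    h.right, x.left = x.left, h
    x.color, h.color = h.color, RED
    return x


def rotate_right(h):
    # Mirror image of rotate_left.
    x = h.left
    h.left, x.right = x.right, h
    x.color, h.color = h.color, RED
    return x


def flip_colors(h):
    # Split a temporary 4-node by passing the red link up to the parent.
    h.color = not h.color
    h.left.color = not h.left.color
    h.right.color = not h.right.color


def _insert(h, key, value):
    if h is None:
        return Node(key, value)
    if key < h.key:
        h.left = _insert(h.left, key, value)
    elif key > h.key:
        h.right = _insert(h.right, key, value)
    else:
        h.value = value
    # Restore the invariants after the recursive call returns (assumed order).
    if is_red(h.right) and not is_red(h.left):
        h = rotate_left(h)   # enforce "all 3-nodes lean left"
    if is_red(h.left) and is_red(h.left.left):
        h = rotate_right(h)  # balance two consecutive red left links
    if is_red(h.left) and is_red(h.right):
        flip_colors(h)
    return h


def insert(root, key, value):
    """Insert into the tree rooted at `root` and return the new root."""
    root = _insert(root, key, value)
    root.color = BLACK
    return root
```

Usage is `root = insert(root, key, value)` starting from `root = None`; deletion is omitted from this sketch.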
MacroSplit
We extend the fringe analysis (used to study the expected behavior of balanced search trees under sequential insertions) to deal with synchronous parallel insertions on 2-3 trees. Given an insertion of k keys in a tree with n nodes, the fringe evolves following the transition matrix
$$T_{n,k} = \left(1 + \frac{k}{n+1}\right) I + \sum_{j=0}^{k} \frac{(-1)^j}{(n+1)^j} \binom{k}{j} \begin{pmatrix} \alpha_j & -\beta_j \\ -\alpha_j & \beta_j \end{pmatrix},$$
where the coefficients $\alpha_j$ and $\beta_j$ take care of the precise form of the algorithm but do not depend on k or n. The derivation of this matrix uses the binomial transform recently developed by P. Poblete, J. Munro and Th. Papadakis. Due to the complexity of the preceding exact analysis, we also develop two approximations: the first is based on a simplified parallel model, and the second on the sequential model. These two approximate analyses prove that the parallel insertions case does not differ significantly from the sequential case, namely on the terms $O(1/n^2)$. Ke...
Higher-Order Analysis of 2-3 Trees
- Int. J. Foundations Comp. Sci., 1995
We present a fourth-order fringe analysis for the expected behavior of 2-3 trees, which includes 97% of the elements in the tree. It is accomplished by exploiting the structure of the transition matrix. Our results improve a number of bounds, in particular the bounds on the expected number of nodes and the expected space utilization. We also study 2-3 trees built by using overflow techniques.
1 Introduction
Fringe analysis was formally introduced by Yao in 1974 [Yao74, Yao78] as a method to analyze search trees that considers only the bottom part or fringe of the tree. From the behavior of the subtrees in the fringe, it is possible to obtain bounds on most complexity measures for the complete tree, as well as some exact results. Classical fringe analysis considers only insertions. The model assumes that the n! possible permutations of the n keys used as input are equally likely. A search tree built under this model is called a random tree. This is equivalent to saying that the n-th in...
Randomized Load Balancing by Joining and Splitting Bins
- 2008
We study the following load balancing game: initially there is only one bin, which contains all the load; and at each time, either one of the bins is split into two bins, or two bins are joined into one bin. The join operation is defined as joining two uniformly random bins. Two types of split operations are considered: a weighted split, where the bin to split is sampled with probability proportional to the weights of the bins; and a non-weighted split, where the bin to split is sampled uniformly from all bins. We analyze the load factor of the bins, which is the ratio between the maximum load and the average load. We show that for weighted splits with uniform joins, the expected load factor of n bins is always Θ(log n), independent of the sequence of joins and splits. For non-weighted splits, we show that the expected load factor after applying n non-weighted splits to one initial bin is between Ω(n^0.5) and O(n^0.741). We study the performance of mixing joins and non-weighted splits, and show that the expected load factor approaches O(n1/ √ 1 2 log2 n) after alternately applying sufficiently many joins and non-weighted splits to an arbitrary initial load assignment of n bins.
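The game described above can be simulated with a short sketch. The abstract does not say how a split divides the load, so this sketch assumes the chosen bin is cut at a uniformly random fraction; that rule, and all function names, are assumptions made for illustration only.

```python
import random


def _split(bins, i, rng):
    # Assumption: the load of bin i is cut at a uniformly random fraction.
    u = rng.random()
    w = bins[i]
    bins[i] = u * w
    bins.append((1 - u) * w)


def weighted_split(bins, rng):
    """Split a bin chosen with probability proportional to its load."""
    i = rng.choices(range(len(bins)), weights=bins, k=1)[0]
    _split(bins, i, rng)


def uniform_split(bins, rng):
    """Non-weighted split: the bin to split is chosen uniformly at random."""
    _split(bins, rng.randrange(len(bins)), rng)


def uniform_join(bins, rng):
    """Join two distinct bins chosen uniformly at random (needs >= 2 bins)."""
    i, j = rng.sample(range(len(bins)), 2)
    bins[i] += bins[j]
    bins.pop(j)


def load_factor(bins):
    """Ratio between the maximum load and the average load."""
    return max(bins) / (sum(bins) / len(bins))


rng = random.Random(0)      # fixed seed so runs are repeatable
bins = [1.0]                # one initial bin holding all the load
for _ in range(1000):
    weighted_split(bins, rng)
print(len(bins), load_factor(bins))
```

Replacing `weighted_split` with `uniform_split` in the loop gives an informal way to compare the two split rules; a single run is only an illustration and says nothing rigorous about the expected load factor.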

