Lookahead and Pathology in Decision Tree Induction
Proceedings of the 14th International Joint Conference on Artificial Intelligence, 1995
Cited by 54 (2 self)
The standard approach to decision tree induction is a top-down, greedy algorithm that makes locally optimal, irrevocable decisions at each node of a tree. In this paper, we study an alternative approach, in which the algorithms use limited lookahead to decide what test to use at a node. We systematically compare, using a very large number of decision trees, the quality of decision trees induced by the greedy approach to that of trees induced using lookahead. The main results of our experiments are: (i) the greedy approach produces trees that are just as accurate as trees produced with the much more expensive lookahead step; and (ii) decision tree induction exhibits pathology, in the sense that lookahead can produce trees that are both larger and less accurate than trees produced without it.
1. Introduction
The standard algorithm for constructing decision trees from a set of examples is greedy induction: a tree is induced top-down with locally optimal choices made at each node, with...
General balanced trees
Journal of Algorithms, 1999
Cited by 22 (0 self)
We show that, in order to achieve efficient maintenance of a balanced binary search tree, no shape restriction other than a logarithmic height is required. The obtained class of trees, general balanced trees, may be maintained at a logarithmic amortized cost with no balance information stored in the nodes. Thus, in the case when amortized bounds are sufficient, there is no need for sophisticated balance criteria. The maintenance algorithms use partial rebuilding. This is important for certain applications and has previously been used with weight-balanced trees. We show that the amortized cost incurred by general balanced trees is lower than what has been shown for weight-balanced trees. © 1999 Academic Press
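The partial-rebuilding idea in this abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the class name `PartialRebuildTree`, the height limit ceil(c * log2(n + 1)), the constant `c = 1.5`, and the rule for picking the subtree to rebuild are all assumptions made for illustration. Nodes store no balance information; when an insertion lands too deep, the lowest ancestor whose subtree is too tall for its size is rebuilt into a perfectly balanced subtree.

```python
import math


class Node:
    __slots__ = ("key", "left", "right")

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def size(node):
    """Number of nodes in the subtree (recomputed on demand, nothing is stored)."""
    return 0 if node is None else 1 + size(node.left) + size(node.right)


def height(node):
    """Number of nodes on the longest root-to-leaf path."""
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


def flatten(node, out):
    """Append the subtree's nodes to `out` in sorted (in-order) order."""
    if node is not None:
        flatten(node.left, out)
        out.append(node)
        flatten(node.right, out)


def build_balanced(nodes, lo, hi):
    """Relink nodes[lo:hi] into a perfectly balanced subtree and return its root."""
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    root = nodes[mid]
    root.left = build_balanced(nodes, lo, mid)
    root.right = build_balanced(nodes, mid + 1, hi)
    return root


class PartialRebuildTree:
    def __init__(self, c=1.5):
        self.root = None
        self.n = 0
        self.c = c  # hypothetical constant; any c > 1 works for this sketch

    def limit(self, n):
        """Assumed height limit for a (sub)tree holding n nodes."""
        return math.ceil(self.c * math.log2(n + 1))

    def max_height(self):
        return self.limit(self.n)

    def insert(self, key):
        path = []  # ancestors of the new node, root first
        node = self.root
        while node is not None:
            if key == node.key:
                return False  # duplicate keys are ignored
            path.append(node)
            node = node.left if key < node.key else node.right
        new = Node(key)
        self.n += 1
        if not path:
            self.root = new
            return True
        parent = path[-1]
        if key < parent.key:
            parent.left = new
        else:
            parent.right = new
        if len(path) + 1 > self.max_height():
            self._rebuild_on_path(path)
        return True

    def _rebuild_on_path(self, path):
        """Rebuild the lowest ancestor whose subtree is too tall for its size."""
        h = 1  # nodes on the path from the current ancestor down to the new node
        for i in range(len(path) - 1, -1, -1):
            h += 1
            ancestor = path[i]
            if h > self.limit(size(ancestor)):
                nodes = []
                flatten(ancestor, nodes)
                rebuilt = build_balanced(nodes, 0, len(nodes))
                if i == 0:
                    self.root = rebuilt
                elif path[i - 1].left is ancestor:
                    path[i - 1].left = rebuilt
                else:
                    path[i - 1].right = rebuilt
                return
```

Inserting keys in sorted order, the worst case for a plain binary search tree, keeps the height of this sketch within `max_height()`. Recomputing `size` on the way up costs time proportional to the subtree that is about to be rebuilt, which is the kind of work an amortized analysis charges to earlier insertions.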
An Evaluation of Self-adjusting Binary Search Tree Techniques
Software Practice and Experience, 1993
Cited by 18 (1 self)
Much has been said in praise of... this paper, we compare the performance of three different techniques for self-adjusting trees with that of AVL and random binary search trees. Comparisons are made for various tree sizes, levels of key-access-frequency skewness and ratios of insertions and deletions to searches. The results show that, because of the high cost of maintaining self-adjusting trees, in almost all cases the AVL tree outperforms all the self-adjusting trees and in many cases even a random binary search tree has better performance, in terms of CPU time, than any of the self-adjusting trees. Self-adjusting trees seem to perform best in a highly dynamic environment, contrary to intuition.
Fast updating of well-balanced trees
In SWAT 90, 2nd Scandinavian Workshop on Algorithm Theory, 1990
Cited by 14 (0 self)
Trees of optimal and near-optimal height may be represented as a pointer-free structure in an array of size $O(n)$. In this way we obtain an array implementation of a dictionary with $O(\log n)$ search cost and $O(\log^2 n)$ update cost, allowing interpolation search to improve the expected search time.
1 Introduction
The binary search tree is a fundamental and well studied data structure, commonly used in computer applications to implement the abstract data type dictionary. In a comparison-based model of computation, the lower bound on the three basic operations insert, delete and search is $\lceil \log(n + 1) \rceil$ comparisons per operation. This bound may be achieved by storing the set in a binary search tree of optimal height.
Definition 1 A binary tree has optimal height if and only if the height of the tree is $\lceil \log(n + 1) \rceil$. A special case of a tree of optimal height is an optimally balanced tree, as defined below.
Definition 2 A binary tree is optimally balanced if and only if the difference in length between the longest and shortest paths is at most one.
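The two definitions can be checked mechanically. The Python sketch below rests on several assumptions that the excerpt does not state: the logarithm is base 2, height counts the nodes on the longest root-to-leaf path, a path runs from the root to a missing child, and a tree is written as a nested tuple `(left, key, right)` with `None` for an empty subtree. The function names are hypothetical.

```python
def count(t):
    """Number of nodes in the tree."""
    return 0 if t is None else 1 + count(t[0]) + count(t[2])


def longest_path(t):
    """Nodes on the longest path from the root to a missing child (the height)."""
    return 0 if t is None else 1 + max(longest_path(t[0]), longest_path(t[2]))


def shortest_path(t):
    """Nodes on the shortest path from the root to a missing child."""
    return 0 if t is None else 1 + min(shortest_path(t[0]), shortest_path(t[2]))


def optimal_height(n):
    """ceil(log2(n + 1)); for n >= 0 this equals the bit length of n, so no floats are needed."""
    return n.bit_length()


def has_optimal_height(t):
    """Definition 1 under the assumptions above."""
    return longest_path(t) == optimal_height(count(t))


def is_optimally_balanced(t):
    """Definition 2 under the assumptions above."""
    return longest_path(t) - shortest_path(t) <= 1


# Four keys: the shallowest missing child is at depth 2, the deepest leaf at depth 3.
balanced = ((None, 1, None), 2, (None, 3, (None, 4, None)))

# Four keys, optimal height 3, but the root has no left child.
lopsided = (None, 1, ((None, 2, None), 3, (None, 4, None)))

# Four keys in a chain: height 4 exceeds the optimal height 3.
chain = (None, 1, (None, 2, (None, 3, (None, 4, None))))
```

Under these conventions `balanced` satisfies both definitions, `lopsided` has optimal height without being optimally balanced, and `chain` satisfies neither.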
Binary Search Trees of Almost Optimal Height
Acta Informatica, 1990
Cited by 11 (1 self)
First we present a generalization of symmetric binary B-trees, SBB(k) trees. The obtained structure has a height of only $\lceil (1 + 1/k) \log(n + 1) \rceil$, where k may be chosen to be any positive integer. The maintenance algorithms require only a constant number of rotations per updating operation in the worst case. These properties, together with the fact that the structure is relatively simple to implement, make it a useful alternative to other search trees in practical applications. Then, by using an SBB(k)-tree with a varying k we achieve a structure with a logarithmic amortized cost per update and a height of $\log n + o(\log n)$. This result is an improvement of the upper bound on the height of a dynamic binary search tree. By maintaining two trees simultaneously the amortized cost is transformed into a worst-case cost. Thus, we have improved the worst-case complexity of the dictionary problem.
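To see how the parameter k trades height against balance, the bound can be evaluated numerically. The sketch below reads the garbled height expression in the abstract as ceil((1 + 1/k) * log2(n + 1)); that reading, the base-2 logarithm, and the function names are assumptions.

```python
import math


def sbb_height_bound(n, k):
    """Assumed bound ceil((1 + 1/k) * log2(n + 1)) for n keys and positive integer k."""
    # Multiply by (k + 1) before dividing by k so that exact cases
    # (n + 1 a power of two) are not disturbed by floating-point rounding.
    return math.ceil((k + 1) * math.log2(n + 1) / k)


def optimal_height(n):
    """Smallest possible height of a binary tree holding n keys."""
    return math.ceil(math.log2(n + 1))
```

For n = 1023 keys the optimal height is 10, and the assumed bound gives 20 for k = 1, 15 for k = 2 and 11 for k = 10, so larger k brings the guaranteed height closer to the optimum.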
Concurrent Perfect Balancing of Binary Search Trees
Cited by 2 (0 self)
When a balanced data structure is updated and searched concurrently, updating and balancing should be decoupled so as to make updating faster. The balancing is done by special maintenance processes that run concurrently with the search and update tasks. We show that it is not necessary to use a weak balance condition like the AVL or red-black condition, since balancing a binary tree perfectly, so that the search paths become as short as possible, is not much more expensive: a process must lock only 5 nodes at a time even when perfect balance is desired. In contrast to other algorithms that perfectly balance a binary search tree, our algorithm keeps the tree (weakly) balanced during the further balancing. This is important if the data structure is used by concurrent search and update processes.
Memory Reference Locality in Binary Search Trees
1995
Cited by 2 (1 self)
Balanced binary search trees are widely used main memory index structures. They provide for logarithmic cost for searching, insertion, deletion, and efficient ordered scanning of keys. Long term trends in computer technology have emphasized the effect of memory reference locality on algorithm performance. For example, the search performance of large structurally equivalent binary trees can double if nodes are located optimally in memory relative to each other. Unfortunately the traditional Random Access Memory (RAM) model cannot distinguish algorithms with good memory reference locality from algorithms with poor memory reference locality. We therefore define a new ...
On Consulting a Set of Experts and Searching
1996
Two chapters of this thesis analyze expert consulting problems via game theoretic models; the first points out a close connection between the problem of consulting a set of experts and the problem of searching. The last chapter presents a solution to the dictionary problem of supporting and update (Insert and Delete) operations on a set of key values. The first chapter shows...
Collision Resolution in Hash Tables for Vocabulary Accumulation During Parallel Indexing
During indexing, the vocabulary of a collection needs to be built. The structure used for this needs to account for the skewed distribution of terms. Parallel indexing allows for a large reduction in the number of times the global vocabulary needs to be examined; however, this also raises a new set of challenges. In this paper we examine the structures used to resolve collisions in a hash table during parallel indexing, and find that the best structure is different from those suggested previously.
Puneet Kumar
An Exponential Tree in the form of a forest is proposed in such a manner that (a) it provides faster access to a node and (b) it becomes more compatible with the parallel environment. Empirically, it has been shown that the proposed method decreases the total internal path length of an Exponential Tree quite considerably. The experiments were conducted by creating three different data structures using the same input: a conventional binary tree, a forest of hashed binary trees and a forest of hashed exponential trees. It has been shown that the forest of hashed exponential trees so produced has a smaller internal path length and height than the other two. It also increases the degree of parallelism.