Results 1 - 5 of 5
The ubiquitous B-tree
ACM Computing Surveys, 1979
Abstract

Cited by 553 (0 self)
B-trees have become, de facto, a standard for file organization. File indexes of users, dedicated database systems, and general-purpose access methods have all been proposed and implemented using B-trees. This paper reviews B-trees and shows why they have been so successful. It discusses the major variations of the B-tree, especially the B+-tree,
LSH Forest: self-tuning indexes for similarity search
In WWW, 2005
Abstract

Cited by 35 (0 self)
We consider the problem of indexing high-dimensional data for answering (approximate) similarity-search queries. Similarity indexes prove to be important in a wide variety of settings: Web search engines desire fast, parallel, main-memory-based indexes for similarity search on text data; database systems desire disk-based similarity indexes for high-dimensional data, including text and images; peer-to-peer systems desire distributed similarity indexes with low communication cost. We propose an indexing scheme called LSH Forest which is applicable in all the above contexts. Our index uses the well-known technique of locality-sensitive hashing (LSH), but improves upon previous designs by (a) eliminating the different data-dependent parameters for which LSH must be constantly hand-tuned, and (b) improving on LSH's performance guarantees for skewed data distributions while retaining the same storage and query overhead. We show how to construct this index in main memory, on disk, in parallel systems, and in peer-to-peer systems. We evaluate the design with experiments on multiple text corpora and demonstrate both the self-tuning nature and the superior performance of LSH Forest.
Efficient Optimal Pagination of Scrolls
COMMUNICATIONS OF THE ACM, 1985
Abstract

Cited by 5 (2 self)
Diehr and Faaland developed an algorithm that finds the minimum sum of key length pagination of a scroll of n items, and which uses O(n lg n) time and O(n) space, solving a problem posed by McCreight. An improved algorithm is given which uses O(n) time and O(n) space.
New Applications of Failure Functions
JOURNAL OF THE ASSOCIATION FOR COMPUTING MACHINERY, 1987
Abstract

Cited by 4 (0 self)
Several algorithms are presented whose operations are governed by a principle of failure functions: when searching for an extremal value within a sequence, it suffices to consider only the subsequence of items each of which is the first possible improvement of its predecessor. These algorithms are more efficient than their more traditional counterparts.
Performance guarantees for B-trees . . .
2010
Abstract
Most B-tree papers assume that all $N$ keys have the same size $K$, that $f = B/K$ keys fit in a disk block, and therefore that the search cost is $O(\log_{f+1} N)$ block transfers. When keys have variable size, however, B-tree operations have no nontrivial performance guarantees. This paper provides B-tree-like performance guarantees on dictionaries that contain keys of different sizes in a model in which keys must be stored and compared as opaque objects. The resulting atomic-key dictionaries exhibit performance bounds in terms of the average key size and match the bounds when all keys are the same size. Atomic-key dictionaries can be built with minimal modification to the B-tree structure, simply by choosing the pivot keys properly. This paper describes both static and dynamic atomic-key dictionaries. In the static case, if there are $N$ keys with average size $K$, the search cost is $O(\lceil K/B \rceil \log_{1+\lceil B/K \rceil} N)$ expected transfers. The paper proves that it is not possible to transform these expected bounds into worst-case bounds. The cost to build the tree is $O(NK)$ operations and $O(NK/B)$ transfers if all keys are presented in sorted order. If not, the cost is the sorting cost. For the dynamic dictionaries, the amortized cost to insert a key $\kappa$ of arbitrary length at an arbitrary rank is dominated by the cost to search for $\kappa$. Specifically, the amortized cost to insert a key $\kappa$ of arbitrary length and random rank is $O(\lceil K/B \rceil \log_{1+\lceil B/K \rceil} N + |\kappa|/B)$ transfers. A dynamic-programming algorithm is shown for constructing a search tree with minimal expected cost.
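
To get a feel for the two search-cost expressions quoted in this abstract, the following minimal Python sketch evaluates the terms inside the O(...) with the hidden constant taken as 1. That constant, the function names (`levels`, `fixed_key_cost`, `atomic_key_cost`), and the sample block and key sizes are all assumptions made for illustration. None of this is code from the paper.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling of a / b for positive integers."""
    return -(-a // b)


def levels(n_keys: int, base: int) -> int:
    """Smallest h with base**h >= n_keys, i.e. ceil(log_base(n_keys)).

    Computed with integers to avoid floating-point rounding at exact powers.
    """
    h, capacity = 0, 1
    while capacity < n_keys:
        capacity *= base
        h += 1
    return h


def fixed_key_cost(n_keys: int, block_size: int, key_size: int) -> int:
    """log_{f+1} N with f = B/K, the fixed-size-key expression (constant assumed 1)."""
    f = block_size // key_size
    return levels(n_keys, f + 1)


def atomic_key_cost(n_keys: int, block_size: int, avg_key_size: int) -> int:
    """ceil(K/B) * log_{1 + ceil(B/K)} N, the static atomic-key expression
    (constant assumed 1; the abstract states this as an expected bound)."""
    return ceil_div(avg_key_size, block_size) * levels(
        n_keys, 1 + ceil_div(block_size, avg_key_size)
    )


if __name__ == "__main__":
    # Hypothetical sizes: 4096-byte blocks, one million keys.
    print(fixed_key_cost(10**6, 4096, 16))     # small fixed-size keys
    print(atomic_key_cost(10**6, 4096, 16))    # same sizes, atomic-key expression
    print(atomic_key_cost(10**6, 4096, 8192))  # keys larger than a block
```

With keys much smaller than a block, the two expressions coincide, which is consistent with the abstract's statement that the atomic-key bounds match the fixed-size bounds. Once the average key exceeds the block size, the base of the logarithm drops to 2 and the factor ⌈K/B⌉ multiplies the cost.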