Skip Graphs
 Proc. of the 14th Annual ACM-SIAM Symp. on Discrete Algorithms
, 2003
Cited by 235 (9 self)
Skip graphs are a novel distributed data structure, based on skip lists, that provide the full functionality of a balanced tree in a distributed system where resources are stored in separate nodes that may fail at any time. They are designed for use in searching peer-to-peer systems, and by providing the ability to perform queries based on key ordering, they improve on existing search tools that provide only hash table functionality. Unlike skip lists or other tree data structures, skip graphs are highly resilient, tolerating a large fraction of failed nodes without losing connectivity. In addition, constructing a skip graph, inserting new nodes into it, searching it, and detecting and repairing errors introduced into the data structure by node failures can all be done using simple and straightforward algorithms.
Fast Text Searching for Regular Expressions or Automaton Searching on Tries
Cited by 49 (6 self)
We present algorithms for efficient searching of regular expressions on preprocessed text, using a Patricia tree as a logical model for the index. We obtain searching algorithms that run in logarithmic expected time in the size of the text for a wide subclass of regular expressions, and in sublinear expected time for any regular expression. This is the first such algorithm to be found with this complexity.
Generalizing Generalized Tries
, 1999
Cited by 31 (8 self)
A trie is a search tree scheme that employs the structure of search keys to organize information. Tries were originally devised as a means to represent a collection of records indexed by strings over a fixed alphabet. Based on work by C.P. Wadsworth and others, R.H. Connelly and F.L. Morris generalized the concept to permit indexing by elements of an arbitrary monomorphic datatype. Here we go one step further and define tries and operations on tries generically for arbitrary first-order polymorphic datatypes. The derivation is based on techniques recently developed in the context of polytypic programming. It is well known that implementing generalized tries requires nested datatypes and polymorphic recursion. Implementing tries for polymorphic datatypes places even greater demands on the type system: it requires rank-2 type signatures and higher-order polymorphic nested datatypes. Despite these requirements the definition of generalized tries for polymorphic datatypes is...
SP-GiST: An Extensible Database Index for Supporting Space Partitioning Trees
 J. Intell. Inf. Syst
Cited by 20 (8 self)
Emerging database applications require the use of new indexing structures beyond B-trees and R-trees. Examples are the kD-tree, the trie, the quadtree, and their variants. They are often proposed as supporting structures in data mining, GIS, and CAD/CAM applications. A common feature of all these indexes is that they recursively divide the space into partitions. A new extensible index structure, termed SP-GiST, is presented that supports this class of data structures, mainly the class of space-partitioning unbalanced trees. Simple method implementations are provided that demonstrate how SP-GiST can behave as a kD-tree, a trie, a quadtree, or any of their variants. Issues related to clustering tree nodes into pages as well as concurrency control for SP-GiST are addressed. A dynamic minimum-height clustering technique is applied to minimize disk accesses and to make using such trees in database systems possible and efficient. A prototype implementation of SP-GiST is presented as well as performance studies of the various SP-GiST tuning parameters. Keywords: space-partitioning trees, spatial databases, extensible index, generalized search trees, clustering
HMS: A Predictive Text Entry Method Using Bigrams
 Association for Computational Linguistics
, 2003
Cited by 16 (1 self)
Due to the emergence of SMS messages, the significance of effective text entry on limited-size keyboards has increased.
The oscillatory distribution of distances in random tries
 ANNALS OF APPLIED PROBABILITY
, 2005
Cited by 9 (3 self)
We investigate $\Delta_n$, the distance between randomly selected pairs of nodes among $n$ keys in a random trie, which is a kind of digital tree. Analytical techniques, such as the Mellin transform and an excursion between poissonization and depoissonization, capture small fluctuations in the mean and variance of these random distances. The mean increases logarithmically in the number of keys, but curiously enough the variance remains $O(1)$ as $n \to \infty$. It is demonstrated that the centered random variable $\Delta_n^* = \Delta_n - \lfloor 2\log_2 n \rfloor$ does not have a limit distribution, but rather oscillates between two distributions.
A SCRABBLE Crossword Game Playing Program
, 1977
Cited by 7 (0 self)
A program that plays the SCRABBLE Crossword Game has been designed and implemented in SIMULA 67 on a DECSystem-10 and in Pascal on a CYBER 173. The heart of the design is the data structure for the lexicon and the algorithm for searching it. The lexicon is represented as a letter table, or trie, using a canonical ordering of the letters in the words rather than the original spelling. The algorithm takes the trie and a collection of letters, including blanks, and finds all words that can be formed from any combination and permutation of the letters. Words are found in approximately the order of their value in the game.
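The lexicon idea in this abstract can be illustrated with a minimal Python sketch. It assumes that "canonical ordering" means alphabetically sorted letters, and the names `build_trie`, `find_words`, the `"$"` end marker, and the `"?"` blank symbol are hypothetical choices for illustration, not the program's actual design. It does not attempt the paper's ordering of results by game value.

```python
from collections import Counter

END = "$"  # hypothetical marker key holding the words that end at a node


def build_trie(words):
    """Store each word under its alphabetically sorted letters."""
    root = {}
    for word in words:
        node = root
        for ch in sorted(word):
            node = node.setdefault(ch, {})
        node.setdefault(END, []).append(word)
    return root


def find_words(trie, rack, blank="?"):
    """Return every stored word that some subset of the rack can spell.

    A blank tile may stand in for any one letter.
    """
    counts = Counter(rack)
    blanks = counts.pop(blank, 0)
    found = set()

    def walk(node, blanks_left):
        found.update(node.get(END, []))
        for ch, child in node.items():
            if ch == END:
                continue
            if counts[ch] > 0:
                # Prefer a real tile; this never wastes a blank.
                counts[ch] -= 1
                walk(child, blanks_left)
                counts[ch] += 1
            elif blanks_left > 0:
                walk(child, blanks_left - 1)

    walk(trie, blanks)
    return sorted(found)


lexicon = build_trie(["cat", "act", "at", "tac", "dog", "a"])
print(find_words(lexicon, "tca"))
print(find_words(lexicon, "d?g"))
```

Because anagrams share one sorted key, "cat", "act", and "tac" all sit at the same trie node, so a single walk over the rack's letters finds every permutation at once.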
Redesigning the String Hash Table, Burst Trie, and BST to Exploit Cache
, 2011
Cited by 2 (1 self)
A key decision when developing in-memory computing applications is choice of a mechanism to store and retrieve strings. The most efficient current data structures for this task are the hash table with move-to-front chains and the burst trie, both of which use linked lists as a substructure, and variants of binary search tree. These data structures are computationally efficient, but typical implementations use large numbers of nodes and pointers to manage strings, which is not efficient in use of cache. In this article, we explore two alternatives to the standard representation: the simple expedient of including the string in its node, and, for linked lists, the more drastic step of replacing each list of nodes by a contiguous array of characters. Our experiments show that, for large sets of strings, the improvement is dramatic. For hashing, in the best case the total space overhead is reduced to less than 1 bit per string. For the burst trie, over 300MB of strings can be stored in a total of under 200MB of memory with significantly improved search time. These results, on a variety of data sets, show that cache-friendly variants of fundamental data structures can yield remarkable gains in performance.
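The second alternative, replacing each chain of nodes by one contiguous array of characters, might look roughly like the Python sketch below. The class name `ArrayHashSet`, the 2-byte length prefix, and the use of Python's built-in `hash` are assumptions made for illustration; the article's actual encoding and hash function are not given in this abstract. Python will not show the cache effects the article measures, so this only demonstrates the layout.

```python
class ArrayHashSet:
    """String set whose buckets are flat byte arrays instead of linked lists.

    Each bucket stores records back to back as:
        [2-byte big-endian length][UTF-8 bytes of the string]
    The length-prefix layout is an assumption, not the article's format.
    """

    def __init__(self, num_buckets=1024):
        self.buckets = [bytearray() for _ in range(num_buckets)]

    def _bucket(self, key: bytes) -> bytearray:
        return self.buckets[hash(key) % len(self.buckets)]

    @staticmethod
    def _scan(bucket: bytearray, key: bytes) -> bool:
        """Walk the records in one bucket sequentially, looking for key."""
        pos = 0
        while pos < len(bucket):
            length = int.from_bytes(bucket[pos:pos + 2], "big")
            start = pos + 2
            if bucket[start:start + length] == key:
                return True
            pos = start + length
        return False

    def add(self, s: str) -> bool:
        """Insert s; return False if it was already present."""
        key = s.encode("utf-8")
        if len(key) > 0xFFFF:
            raise ValueError("string too long for a 2-byte length prefix")
        bucket = self._bucket(key)
        if self._scan(bucket, key):
            return False
        bucket += len(key).to_bytes(2, "big")  # in-place append
        bucket += key
        return True

    def __contains__(self, s: str) -> bool:
        key = s.encode("utf-8")
        return self._scan(self._bucket(key), key)


words = ArrayHashSet(num_buckets=8)
words.add("trie")
words.add("burst")
print("trie" in words, "tree" in words)
```

A lookup touches one contiguous buffer per bucket rather than following a pointer per string, which is the property the article credits for the improved cache behavior.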
Average-case analysis of approximate trie search
 IN PROC. 15TH SYMP. ON COMBINATORIAL PATTERN MATCHING (CPM), VOLUME 3109 OF LNCS
, 2004