Skip graphs
- in SODA
, 2003
Cited by 202 (8 self)
Skip graphs are a novel distributed data structure, based on skip lists, that provide the full functionality of a balanced tree in a distributed system where resources are stored in separate nodes that may fail at any time. They are designed for use in searching peer-to-peer systems, and by providing the ability to perform queries based on key ordering, they improve on existing search tools that provide only hash table functionality. Unlike skip lists or other tree data structures, skip graphs are highly resilient, tolerating a large fraction of failed nodes without losing connectivity. In addition, simple and straightforward algorithms can be used to construct a skip graph, insert new nodes into it, search it, and detect and repair errors in a skip graph introduced due to node failures.
Fast Text Searching for Regular Expressions or Automaton Searching on Tries
Cited by 43 (6 self)
We present algorithms for efficient searching of regular expressions on preprocessed text, using a Patricia tree as a logical model for the index. We obtain searching algorithms that run in logarithmic expected time in the size of the text for a wide subclass of regular expressions, and in sublinear expected time for any regular expression. This is the first such algorithm to be found with this complexity.
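The idea of running an automaton over a trie can be sketched in a few lines. The following is only an illustration under simplifying assumptions, not the paper's algorithm: it uses an uncompressed trie of whole words rather than a Patricia tree over text suffixes, and the DFA (for the expression `ab*c`) is written by hand as a transition table. The names `build_trie`, `automaton_search` and `AB_STAR_C` are hypothetical. The point it shows is that a subtree is skipped as soon as the automaton has no transition for the next character.

```python
END = "$"  # end-of-word marker; assumes "$" never occurs inside a key


def build_trie(words):
    """Build an uncompressed trie as nested dicts."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def automaton_search(trie, dfa, start, accepting):
    """Return all stored words accepted by the DFA, in sorted order.

    The trie and the DFA are walked in lockstep; a trie branch is
    abandoned when the DFA has no transition on its edge label.
    """
    matches = []
    stack = [(trie, start, "")]
    while stack:
        node, state, prefix = stack.pop()
        if END in node and state in accepting:
            matches.append(prefix)
        for ch, child in node.items():
            if ch == END:
                continue
            nxt = dfa.get((state, ch))
            if nxt is not None:  # otherwise prune the whole subtree
                stack.append((child, nxt, prefix + ch))
    return sorted(matches)


# Hand-written DFA for ab*c: state 0 is the start state, state 2 accepts.
AB_STAR_C = {(0, "a"): 1, (1, "b"): 1, (1, "c"): 2}

trie = build_trie(["ac", "abc", "abbc", "abd", "bc", "abcc"])
print(automaton_search(trie, AB_STAR_C, 0, {2}))
```

Here the branch starting with `b` is never explored, because the DFA has no transition on `b` from its start state.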
Generalizing Generalized Tries
, 1999
Cited by 29 (8 self)
A trie is a search tree scheme that employs the structure of search keys to organize information. Tries were originally devised as a means to represent a collection of records indexed by strings over a fixed alphabet. Based on work by C.P. Wadsworth and others, R.H. Connelly and F.L. Morris generalized the concept to permit indexing by elements of an arbitrary monomorphic datatype. Here we go one step further and define tries and operations on tries generically for arbitrary first-order polymorphic datatypes. The derivation is based on techniques recently developed in the context of polytypic programming. It is well known that implementing generalized tries requires nested datatypes and polymorphic recursion. Implementing tries for polymorphic datatypes places even greater demands on the type system: it requires rank-2 type signatures and higher-order polymorphic nested datatypes. Despite these requirements, the definition of generalized tries for polymorphic datatypes is...
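As a small illustration of how a trie uses the structure of its keys, here is a minimal sketch of a trie indexed by sequences of arbitrary hashable elements, so the same code serves strings and tuples. It does not attempt the generic, type-indexed construction the abstract describes; the class name `SeqTrie` and its methods are hypothetical.

```python
class SeqTrie:
    """A trie keyed by sequences: each node holds an optional value
    and a map from the next key element to a child trie."""

    def __init__(self):
        self.has_value = False
        self.value = None
        self.children = {}

    def insert(self, key, value):
        """Store value under key, creating nodes along the key's path."""
        node = self
        for elem in key:
            node = node.children.setdefault(elem, SeqTrie())
        node.has_value = True
        node.value = value

    def lookup(self, key, default=None):
        """Follow the key one element at a time; return default if absent."""
        node = self
        for elem in key:
            node = node.children.get(elem)
            if node is None:
                return default
        return node.value if node.has_value else default


trie = SeqTrie()
trie.insert("tea", 1)
trie.insert("ten", 2)          # shares the path "te" with "tea"
trie.insert((1, 2, 3), "tuple key")
print(trie.lookup("ten"), trie.lookup("te"), trie.lookup((1, 2, 3)))
```

Keys with a common prefix share nodes, which is what makes the lookup cost depend on the key length rather than on the number of stored keys.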
HMS: A Predictive Text Entry Method Using Bigrams
- Association for Computational Linguistics
, 2003
Cited by 15 (1 self)
Due to the emergence of SMS messages, the significance of effective text entry on limited-size keyboards has increased.
SP-GiST: An extensible database index for supporting space partitioning trees
- J. Intell. Inf. Syst
Cited by 15 (7 self)
Emerging database applications require the use of new indexing structures beyond B-trees and R-trees. Examples are the k-D tree, the trie, the quadtree, and their variants. They are often proposed as supporting structures in data mining, GIS, and CAD/CAM applications. A common feature of all these indexes is that they recursively divide the space into partitions. A new extensible index structure, termed SP-GiST, is presented that supports this class of data structures, mainly the class of space-partitioning unbalanced trees. Simple method implementations are provided that demonstrate how SP-GiST can behave as a k-D tree, a trie, a quadtree, or any of their variants. Issues related to clustering tree nodes into pages, as well as concurrency control for SP-GiST, are addressed. A dynamic minimum-height clustering technique is applied to minimize disk accesses and to make using such trees in database systems possible and efficient. A prototype implementation of SP-GiST is presented, as well as performance studies of the various SP-GiST tuning parameters.
Keywords: space-partitioning trees, spatial databases, extensible index, generalized search trees, clustering
Average-case analysis of approximate trie search
- In Proc. 15th Symp. on Combinatorial Pattern Matching (CPM), volume 3109 of LNCS
, 2004
Distribution of inter-node distances in digital trees
- in 2005 International Conference on Analysis of Algorithms, C. Martínez (ed.), Discrete Mathematics and Theoretical Computer Science, Proceedings AD
, 2005
Cited by 1 (0 self)
We investigate distances between pairs of nodes in digital trees (digital search trees (DST), and tries). By analytic techniques, such as the Mellin transform and poissonization, we describe a program to determine the moments of these distances. The program is illustrated on the mean and variance. One encounters delayed Mellin transform equations, which we solve by inspection. Interestingly, the unbiased case gives a bounded variance, whereas the biased case gives a variance growing with the number of keys. It is therefore possible in the biased case to show that an appropriately normalized version of the distance converges to a limit. The complexity of moment calculation increases substantially with each higher moment; a shortcut to the limit is needed, via a method that avoids the computation of all moments. Toward this end, we utilize the contraction method to show that in biased digital search trees the distribution of a suitably normalized version of the distances approaches a limit that is the fixed-point solution (in the Wasserstein space) of a distributional equation. An explicit solution to the fixed-point equation is readily demonstrated to be Gaussian.
PAT-Trees with the Deletion Function as the Learning Device for Linguistic Patterns
In this study, a learning device based on the PAT-tree data structure was developed. The original PAT-trees were enhanced with a deletion function to emulate human learning competence. The learning process works as follows. Linguistic patterns from the text corpus are inserted into the PAT-tree one by one. Since memory is limited, the intent is that important and new patterns are retained in the PAT-tree, while old and unimportant patterns are released from the tree automatically. The proposed PAT-trees with the deletion function have the following advantages. 1) They are easy to construct and maintain. 2) Any prefix substring and its frequency count can be looked up very quickly through the PAT-tree. 3) The space requirement for a PAT-tree is linear with respect to the size of the input text. 4) The insertion of a new element can be carried out at any time without being blocked by memory constraints, because free space is released through the deletion of unimportant elements.
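The learning process described above can be sketched as a bounded prefix counter. This is a rough illustration under stated assumptions, not the authors' device: it uses a plain uncompressed trie instead of a PAT-tree, and the eviction rule (drop the pattern with the lowest frequency, breaking ties by least recent insertion) is an assumption, since the abstract does not say how importance is measured. The names `BoundedPrefixCounter`, `insert` and `prefix_count` are hypothetical.

```python
class _Node:
    __slots__ = ("children", "count")

    def __init__(self):
        self.children = {}
        self.count = 0  # total occurrences of stored patterns passing through


class BoundedPrefixCounter:
    """Trie of patterns with per-prefix counts and a cap on distinct patterns."""

    def __init__(self, capacity):
        self.capacity = capacity      # maximum number of distinct patterns kept
        self.root = _Node()
        self.patterns = {}            # pattern -> [frequency, last_seen_tick]
        self.tick = 0

    def insert(self, pattern):
        """Insert one occurrence, evicting a pattern first if memory is full."""
        self.tick += 1
        if pattern not in self.patterns and len(self.patterns) >= self.capacity:
            self._evict()
        entry = self.patterns.setdefault(pattern, [0, 0])
        entry[0] += 1
        entry[1] = self.tick
        node = self.root
        for ch in pattern:
            node = node.children.setdefault(ch, _Node())
            node.count += 1

    def prefix_count(self, prefix):
        """Occurrences of retained patterns that start with a non-empty prefix."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return 0
        return node.count

    def _evict(self):
        # Assumed policy: lowest frequency first, then least recently seen.
        victim = min(self.patterns, key=lambda p: tuple(self.patterns[p]))
        freq, _ = self.patterns.pop(victim)
        node = self.root
        for ch in victim:
            child = node.children[ch]
            child.count -= freq
            if child.count == 0:
                # Nothing else passes through here, so drop the whole subtree.
                del node.children[ch]
                break
            node = child


counter = BoundedPrefixCounter(capacity=2)
for pattern in ["the", "the", "then", "cat"]:
    counter.insert(pattern)
print(counter.prefix_count("the"), counter.prefix_count("then"))
```

Inserting `cat` into the full structure evicts `then`, the least frequent pattern, so the insertion is never blocked and the counts for the surviving prefix `the` stay consistent.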
Query Based Client Indexing in Client/Server Information Systems
- © 2007 Science Publications
One issue in client/server information systems is the storage of the relationships between clients and the data used by these clients. In particular, in scenarios that allow the caching of data on the client site, this information can be used to keep the global database consistent. Thus, if the data on the server are updated, it is possible to detect the caches affected by the update. In a subsequent step, these caches can be either patched or invalidated. In this study we discuss approaches that use posted queries to index the clients on the server site.
Average-Case Analysis of Cousins in m-ary Tries
- Applied Probability Trust (13 May 2008)
We investigate the average similarity of random strings as captured by the average number of “cousins” in the underlying tree structures. Analytical techniques including poissonization and the Mellin transform are used for accurate calculation of the mean. The string alphabets we consider are m-ary, and the corresponding trees are m-ary trees. Certain analytic issues arise in the m-ary case that do not have an analog in the binary case.

