Results 1 - 9 of 9
Reducing the Space Requirement of Suffix Trees
 Software – Practice and Experience
, 1999
Abstract

Cited by 118 (10 self)
We show that suffix trees store various kinds of redundant information. We exploit these redundancies to obtain more space efficient representations. The most space efficient of our representations requires 20 bytes per input character in the worst case, and 10.1 bytes per input character on average for a collection of 42 files of different type. This is an advantage of more than 8 bytes per input character over previous work. Our representations can be constructed without extra space, and as fast as previous representations. The asymptotic running times of suffix tree applications are retained. Copyright © 1999 John Wiley & Sons, Ltd. KEY WORDS: data structures; suffix trees; implementation techniques; space reduction
A simple optimal representation for balanced parentheses
 In Proc. 15th Annual Symposium on Combinatorial Pattern Matching (CPM), LNCS v. 3109
, 2004
Abstract

Cited by 41 (3 self)
Institute of Mathematical Sciences, Chennai 600 113, India. We consider succinct, or highly space-efficient, representations of a (static) string consisting of n pairs of balanced parentheses, that support natural operations such as finding the matching parenthesis for a given parenthesis, or finding the pair of parentheses that most tightly enclose a given pair. This problem was considered by Jacobson [Proc. 30th FOCS, 549–554, 1989] and Munro and Raman [SIAM J. Comput. 31 (2001), 762–776], who gave O(n)-bit and 2n + o(n)-bit representations, respectively, that supported the above operations in O(1) time on the RAM model of computation. This data structure is a fundamental tool in succinct representations, and has applications in representing suffix trees, ordinal trees, planar graphs and permutations. We consider the practical performance of parenthesis representations. First, we give a new 2n + o(n)-bit representation that supports all the above operations in O(1) time. This representation is conceptually simpler, its space bound has a smaller o(n) term and it also has a simple and uniform o(n) time and space construction algorithm. We implement our data structure and a variant of Jacobson’s, and evaluate their practical performance (speed and memory usage), when used in a succinct representation of trees derived from XML documents. As a baseline, we compare our representations against a widely-used implementation of the standard DOM (Document Object Model) representation of XML documents. Both succinct representations use orders of magnitude less space than DOM and tree traversal operations are usually only slightly slower than in DOM. Key words: Succinct data structures, parentheses representation of trees, compressed dictionaries, XML DOM. Preprint submitted to Theoretical Computer Science, 29 November 2006.
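To make the two operations named in this abstract concrete, here is a minimal sketch of their semantics on a plain Python string. It is not the paper's data structure: it uses linear scans rather than a 2n + o(n)-bit index with O(1) queries, and the function names `find_close` and `enclose` are assumptions chosen for illustration.

```python
def find_close(parens: str, i: int) -> int:
    """Return the index of the ')' matching the '(' at index i (linear scan)."""
    if parens[i] != "(":
        raise ValueError("index i must hold an opening parenthesis")
    depth = 0
    for j in range(i, len(parens)):
        depth += 1 if parens[j] == "(" else -1
        if depth == 0:  # the pair opened at i has just been closed
            return j
    raise ValueError("sequence is not balanced")


def enclose(parens: str, i: int) -> int:
    """Return the index of the '(' of the pair most tightly enclosing
    the pair opened at i, or -1 if no pair encloses it."""
    if parens[i] != "(":
        raise ValueError("index i must hold an opening parenthesis")
    depth = 0
    for j in range(i - 1, -1, -1):
        depth += 1 if parens[j] == "(" else -1
        if depth == 1:  # first '(' to the left with no matching ')' before i
            return j
    return -1


# "(()())" encodes a root with two leaf children (one pair per node).
tree = "(()())"
second_child = 3
parent = enclose(tree, second_child)         # opening index of the parent node
subtree_end = find_close(tree, second_child)  # closing index of the child node
```

When a tree is encoded with one pair of parentheses per node, `enclose` corresponds to a parent lookup and `find_close` delimits a node's subtree.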
Correcting spelling errors by modelling their causes
 International Journal of Applied Mathematics and Computer Science
, 2005
Abstract

Cited by 12 (0 self)
This paper presents a new technique for correcting isolated words in typed texts. A language-dependent set of string substitutions reflects the surface form of errors that result from vocabulary incompetence, misspellings, or mistypings. Candidate corrections are formed by applying the substitutions to text words absent from the computer lexicon. A minimal acyclic deterministic finite automaton storing the lexicon allows quick rejection of nonsense corrections, while costs associated with the substitutions serve to rank the remaining ones. A comparison of the correction lists generated by several spellcheckers for two corpora of English spelling errors shows that our technique suggests the right words more accurately than the others.
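The pipeline described in this abstract (apply substitutions, reject candidates not in the lexicon, rank by cost) can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: the lexicon is a Python `set` rather than a minimal acyclic automaton, only one substitution is applied per candidate, and the substitution rules and costs shown are invented for the example.

```python
def candidate_corrections(word, lexicon, substitutions):
    """Apply each (wrong, right, cost) rule at every position where `wrong`
    occurs, keep only results found in the lexicon, and rank by cost."""
    best = {}
    for wrong, right, cost in substitutions:
        start = word.find(wrong)
        while start != -1:
            candidate = word[:start] + right + word[start + len(wrong):]
            # The lexicon lookup rejects nonsense corrections.
            if candidate in lexicon and cost < best.get(candidate, float("inf")):
                best[candidate] = cost
            start = word.find(wrong, start + 1)
    # Cheapest substitutions first; ties broken alphabetically.
    return sorted(best.items(), key=lambda item: (item[1], item[0]))


# Hypothetical lexicon and rules, for illustration only.
lexicon = {"receive", "relieve", "their"}
substitutions = [("ie", "ei", 1.0), ("ei", "ie", 1.0), ("c", "l", 3.0)]
suggestions = candidate_corrections("recieve", lexicon, substitutions)
# suggestions == [("receive", 1.0), ("relieve", 3.0)]
```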
Implementing a Dynamic Compressed Trie
 PROCEEDINGS WAE'98, SAARBRUCKEN, GERMANY, AUGUST 20-22, 1998
, 1998
Abstract

Cited by 9 (1 self)
We present an order-preserving general purpose data structure for binary data, the LPC-trie. The structure
An Experimental Study of Compression Methods for Dynamic Tries
Abstract

Cited by 9 (1 self)
We study an order-preserving general purpose data structure for binary data, the LPC-trie. The structure is a compressed trie, using both level and path compression. The memory usage is similar to that of a balanced binary search tree, but the expected average depth is smaller. The LPC-trie is well suited to modern language environments with efficient memory allocation and garbage collection. We present an implementation in the Java programming language and show that the structure compares favorably to a balanced binary search tree.
Statistical Models for Term Compression
 In Data Compression Conference
, 2000
Abstract

Cited by 8 (0 self)
Symbolic tree data structures, or terms, are used in many computing systems. Although terms can be compressed by hand, using specialized algorithms, or using universal compression utilities, all of these approaches have drawbacks. We propose an approach which avoids these problems by using knowledge of term structure to obtain more accurate predictive models for term compression. We describe two models that predict child symbols based on their parents and locations. Our experiments compared these models with first-order Markov sequence models using Huffman coding and found that one model can obtain 20% better compression in similar time, and the other, simpler model can obtain similar compression 40% faster. These compression models also approach, but do not exceed, the performance of gzip.
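As a rough illustration of what "predicting child symbols based on their parents and locations" could look like, the sketch below counts how often each child symbol appears under a given parent symbol at a given argument position. The term encoding `(symbol, [children])`, the function names, and the plain frequency estimate are all assumptions made for this example; the abstract does not specify the paper's two models, and no entropy coder is included here.

```python
from collections import Counter, defaultdict


def child_contexts(term):
    """Yield (parent_symbol, child_index, child_symbol) for every edge
    of a term encoded as (symbol, [child_terms])."""
    symbol, children = term
    for index, child in enumerate(children):
        yield symbol, index, child[0]
        yield from child_contexts(child)


def train(terms):
    """Count child symbols per (parent symbol, child position) context."""
    counts = defaultdict(Counter)
    for term in terms:
        for parent, index, child in child_contexts(term):
            counts[(parent, index)][child] += 1
    return counts


def probability(counts, parent, index, child):
    """Relative frequency of `child` in the given context (0.0 if unseen)."""
    seen = counts[(parent, index)]
    total = sum(seen.values())
    return seen[child] / total if total else 0.0


# Two small hypothetical terms: add(num, mul(num, var)) and add(var, num).
terms = [
    ("add", [("num", []), ("mul", [("num", []), ("var", [])])]),
    ("add", [("var", []), ("num", [])]),
]
model = train(terms)
```

A coder driven by such a model would spend fewer bits on child symbols that are frequent in their context, which is the intuition behind using term structure instead of a flat symbol sequence.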
An Experimental Study of Compression Methods for Functional Tries
 in: Workshop on Algorithmic Aspects of Advanced Programming Languages (WAAAPL'99)
, 1999
Abstract

Cited by 6 (0 self)
We develop compression methods for functional tries and study them experimentally. Trie compression is usually implemented either as a combination of path compression and width compression or as a combination of path compression and level compression. We develop a new efficient implementation for width compression and show that in functional tries the combination of all of the three compression methods yields the best results when taking into account the trie size, the trie depth, the copy cost, and the search and update performance. We also show that the 2-3 tree minimizes the copy cost in balanced trees and compare our experimental results for tries to approximate analytical results for internal and external 2-3 trees. Our conclusion is that the path-, width-, and level-compressed trie is an ideal choice for a functional main-memory index structure. Keywords: functional data structures, imperative data structures, trie, shadowing, path copying, path compression, width compression, level compression
Proceedings WAE'98, Saarbrucken, Germany, August 20-22, 1998
Abstract
We present an order-preserving general purpose data structure for binary data, the LPC-trie.
unknown title
Abstract
Centre d'études et de recherches en informatique linguistique, Université de Marne-la-Vallée, France