Verifying Candidate Matches in Sparse and Wildcard Matching (Extended Abstract)
, 2002
Cited by 35 (3 self)
This paper obtains the following results on pattern matching problems in which the text has length n and the pattern has length m.
Perfect hashing for strings: Formalization and Algorithms
- In Proc. 7th CPM
, 1996
Cited by 6 (2 self)
Numbers and strings are two objects manipulated by most programs. Hashing has been well-studied for numbers and it has been effective in practice. In contrast, basic hashing issues for strings remain largely unexplored. In this paper, we identify and formulate the core hashing problem for strings that we call substring hashing. Our main technical results are highly efficient sequential/parallel (CRCW PRAM) Las Vegas type algorithms that determine a perfect hash function for substring hashing. For example, given a binary string of length n, one of our algorithms finds a perfect hash function in O(log n) time, O(n) work, and O(n) space; the hash value for any substring can then be computed in O(log log n) time using a single processor. Our approach relies on a novel use of the suffix tree of a string. In implementing our approach, we design optimal parallel algorithms for the problem of determining weighted ancestors on an edge-weighted tree that may be of independent interest.
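To make the goal of substring hashing concrete, the following is a brute-force baseline in Python, not the suffix-tree algorithm of the paper. It assumes that a perfect hash function for substrings means one that gives equal substrings the same value and distinct substrings distinct values; the function names are hypothetical. It enumerates all substrings, so it needs far more time and space than the bounds quoted in the abstract.

```python
def build_substring_ids(text):
    """Assign a distinct integer to every distinct substring of text.

    Brute-force illustration only: it enumerates all O(n^2) substrings,
    whereas the abstract above reports O(n) space via suffix trees.
    """
    ids = {}
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            sub = text[start:end]
            if sub not in ids:
                # Next unused integer, so the range is 0 .. len(ids) - 1.
                ids[sub] = len(ids)
    return ids


def substring_hash(ids, text, start, end):
    """Return the hash value of text[start:end] (end is exclusive)."""
    return ids[text[start:end]]


text = "abab"
ids = build_substring_ids(text)
# The two occurrences of "ab" collide on purpose; "a" and "b" do not.
same = substring_hash(ids, text, 0, 2) == substring_hash(ids, text, 2, 4)
different = substring_hash(ids, text, 0, 1) != substring_hash(ids, text, 1, 2)
```

The dictionary here stands in for the hash function; an efficient construction would avoid materializing every substring.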
The architecture of a software library for string processing
, 1997
Cited by 1 (0 self)
We present our project to develop a software library of basic tools and data structures for string processing. Our goal is to provide an environment for testing new algorithms as well as for prototyping. The library has a natural hierarchy comprising basic objects such as the alphabet and strings, data structures to manipulate these objects, and powerful algorithmic techniques driving these data structures. Furthermore, it has the natural taxonomy imposed by the underlying string processing tasks (such as static/dynamic, off-line/on-line, exact/approximate). We believe that our architecture presents a unified view of string processing encompassing recently developed techniques and insights; this may be of independent interest to those who seek an introduction to this field. Our design is preliminary and we hope to refine it based on feedback.
Solving Classical String Problems on Compressed Texts
Cited by 1 (0 self)
Here we study the complexity of string problems as a function of the size of a program that generates the input. We consider straight-line programs (SLPs), since all algorithms on SLP-generated strings could be applied to processing LZ-compressed texts. The main result is a new algorithm for pattern matching when both a text T and a pattern P are presented by SLPs (the so-called fully compressed pattern matching problem). We show how to find a first occurrence, count all occurrences, and check whether any given position is an occurrence or not in time O(n^2 m). Here m and n are the sizes of the straight-line programs generating P and T, respectively. Then we present polynomial algorithms for computing the fingerprint table and a compressed representation of all covers (for the first time) and for finding periods of a given compressed string (our algorithm is faster than previously known). On the other hand, we show that computing the Hamming distance between two SLP-generated strings is NP- and coNP-hard.
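As a small illustration of what a straight-line program is, the Python sketch below uses a hypothetical encoding (not taken from the paper): rule i is either a single character or a pair (j, k) of earlier rule indices meaning X_i = X_j X_k, and the last rule generates the whole string. It shows two operations that work on the compressed form directly, computing lengths and reading one character, and it does not attempt the paper's pattern matching algorithm.

```python
def slp_lengths(rules):
    """Length of the string generated by each rule, without expanding it."""
    lengths = []
    for rule in rules:
        if isinstance(rule, str):
            lengths.append(1)  # terminal rule: a single character
        else:
            left, right = rule  # concatenation of two earlier rules
            lengths.append(lengths[left] + lengths[right])
    return lengths


def slp_char_at(rules, lengths, pos):
    """Character at index pos of the string generated by the last rule."""
    node = len(rules) - 1
    if not 0 <= pos < lengths[node]:
        raise IndexError("position out of range")
    # Walk down the derivation, choosing the half that contains pos.
    while not isinstance(rules[node], str):
        left, right = rules[node]
        if pos < lengths[left]:
            node = left
        else:
            pos -= lengths[left]
            node = right
    return rules[node]


def slp_expand(rules):
    """Fully decompress; for checking only, output can be exponentially long."""
    expanded = []
    for rule in rules:
        if isinstance(rule, str):
            expanded.append(rule)
        else:
            expanded.append(expanded[rule[0]] + expanded[rule[1]])
    return expanded[-1]


# Fibonacci-style example: X0 = b, X1 = a, X2 = X1 X0, X3 = X2 X1, X4 = X3 X2.
rules = ["b", "a", (1, 0), (2, 1), (3, 2)]
lengths = slp_lengths(rules)
```

Each added rule can double the length of the generated string, which is why algorithms that run in time polynomial in the number of rules can be much faster than decompressing first.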

