Results 1  10
of
14
Transducers and repetitions
 Theoretical Computer Science
, 1986
"... Abstract. The factor transducer of a word associates to each of its factors (or subwc~rds) their first occurrence. Optimal bounds on the size of minimal factor transducers together with an algorithm for building them are given. Analogue results and a simple algorithm are given for the case of subseq ..."
Abstract

Cited by 94 (19 self)
 Add to MetaCart
Abstract. The factor transducer of a word associates to each of its factors (or subwc~rds) their first occurrence. Optimal bounds on the size of minimal factor transducers together with an algorithm for building them are given. Analogue results and a simple algorithm are given for the case of subsequential suffix transducers. Algorithms are applied to repetition searching in words. Rl~sum~. Le transducteur des facteurs d'un mot associe a chacun de ses facteurs leur premiere occurrence. On donne des bornes optimales sur la taille du transducteur minimal d'un mot ainsi qu'un algorithme pour sa construction. On donne des r6sultats analogues et un algorithme simple dans le cas du transducteur souss~luentiel des suffixes d'un mot. On donne une application la d6tection de r6p6titions dans les mots. Contents
Linear Time Algorithms for Finding and Representing all Tandem Repeats in a String
 TREES, AND SEQUENCES: COMPUTER SCIENCE AND COMPUTATIONAL BIOLOGY
, 1998
"... A tandem repeat (or square) is a string ffff, where ff is a nonempty string. We present an O(jSj)time algorithm that operates on the suffix tree T (S) for a string S, finding and marking the endpoint in T (S) of every tandem repeat that occurs in S. This decorated suffix tree implicitly represents ..."
Abstract

Cited by 34 (2 self)
 Add to MetaCart
A tandem repeat (or square) is a string ffff, where ff is a nonempty string. We present an O(jSj)time algorithm that operates on the suffix tree T (S) for a string S, finding and marking the endpoint in T (S) of every tandem repeat that occurs in S. This decorated suffix tree implicitly represents all occurrences of tandem repeats in S, and can be used to efficiently solve many questions concerning tandem repeats and tandem arrays in S. This improves and generalizes several prior efforts to efficiently capture large subsets of tandem repeats.
Optimal Parallel Algorithms for Periods, Palindromes and Squares (Extended Abstract)
, 1992
"... ) Alberto Apostolico Purdue University and Universit`a di Padova Dany Breslauer yyz Columbia University Zvi Galil z Columbia University and TelAviv University Summary of results Optimal concurrentread concurrentwrite parallel algorithms for two problems are presented: ffl Finding all the pe ..."
Abstract

Cited by 32 (13 self)
 Add to MetaCart
) Alberto Apostolico Purdue University and Universit`a di Padova Dany Breslauer yyz Columbia University Zvi Galil z Columbia University and TelAviv University Summary of results Optimal concurrentread concurrentwrite parallel algorithms for two problems are presented: ffl Finding all the periods of a string. The period of a string can be computed by previous efficient parallel algorithms only if it is shorter than half of the length of the string. Our new algorithm computes all the periods in optimal O(log log n) time, even if they are longer. The algorithm can be used to compute all initial palindromes of a string within the same bounds. ffl Testing if a string is squarefree. We present an optimal O(log log n) time algorithm for testing if a string is squarefree, improving the previous bound of O(log n) given by Apostolico [1] and Crochemore and Rytter [12]. We show matching lower bounds for the optimal parallel algorithms that solve the problems above on a general alphab...
Finding maximal pairs with bounded gap
 Proceedings of the 10th Annual Symposium on Combinatorial Pattern Matching (CPM), volume 1645 of Lecture Notes in Computer Science
, 1999
"... A pair in a string is the occurrence of the same substring twice. A pair is maximal if the two occurrences of the substring cannot be extended to the left and right without making them different. The gap of a pair is the number of characters between the two occurrences of the substring. In this pape ..."
Abstract

Cited by 26 (6 self)
 Add to MetaCart
A pair in a string is the occurrence of the same substring twice. A pair is maximal if the two occurrences of the substring cannot be extended to the left and right without making them different. The gap of a pair is the number of characters between the two occurrences of the substring. In this paper we present methods for finding all maximal pairs under various constraints on the gap. In a string of length n we can find all maximal pairs with gap in an upper and lower bounded interval in time O(n log n + z) where z is the number of reported pairs. If the upper bound is removed the time reduces to O(n+z). Since a tandem repeat is a pair where the gap is zero, our methods can be seen as a generalization of finding tandem repeats. The running time of our methods equals the running time of well known methods for finding tandem repeats.
An Optimal O(log log n) Time Parallel Algorithm for Detecting all Squares in a String
, 1995
"... An optimal O(log log n) time concurrentread concurrentwrite parallel algorithm for detecting all squares in a string is presented. A tight lower bound shows that over general alphabets this is the fastest possible optimal algorithm. When p processors are available the bounds become \Theta(d n ..."
Abstract

Cited by 11 (6 self)
 Add to MetaCart
An optimal O(log log n) time concurrentread concurrentwrite parallel algorithm for detecting all squares in a string is presented. A tight lower bound shows that over general alphabets this is the fastest possible optimal algorithm. When p processors are available the bounds become \Theta(d n log n p e + log log d1+p=ne 2p). The algorithm uses an optimal parallel stringmatching algorithm together with periodicity properties to locate the squares within the input string.
Efficient String Algorithmics
, 1992
"... Problems involving strings arise in many areas of computer science and have numerous practical applications. We consider several problems from a theoretical perspective and provide efficient algorithms and lower bounds for these problems in sequential and parallel models of computation. In the sequ ..."
Abstract

Cited by 8 (6 self)
 Add to MetaCart
Problems involving strings arise in many areas of computer science and have numerous practical applications. We consider several problems from a theoretical perspective and provide efficient algorithms and lower bounds for these problems in sequential and parallel models of computation. In the sequential setting, we present new algorithms for the string matching problem improving the previous bounds on the number of comparisons performed by such algorithms. In parallel computation, we present tight algorithms and lower bounds for the string matching problem, for finding the periods of a string, for detecting squares and for finding initial palindromes.
Finding maximal quasiperiodicities in strings
 In Proceedings of the 11th Annual Symposium on Combinatorial Pattern Matching (CPM
, 2000
"... Abstract. Apostolico and Ehrenfeucht defined the notion of a maximal quasiperiodic substring and gave an algorithm that finds all maximal quasiperiodic substrings in a string of length n in time O(n log 2 n). In this paper we give an algorithm that finds all maximal quasiperiodic substrings in a str ..."
Abstract

Cited by 7 (3 self)
 Add to MetaCart
Abstract. Apostolico and Ehrenfeucht defined the notion of a maximal quasiperiodic substring and gave an algorithm that finds all maximal quasiperiodic substrings in a string of length n in time O(n log 2 n). In this paper we give an algorithm that finds all maximal quasiperiodic substrings in a string of length n in time O(n log n) andspaceO(n). Our algorithm uses the suffix tree as the fundamental data structure combined with efficient methods for merging and performing multiple searches in search trees. Besides finding all maximal quasiperiodic substrings, our algorithm also marks the nodes in the suffix tree that have a superprimitive pathlabel. 1
An Optimal OnLine Algorithm To Compute All The Covers Of A String
, 1998
"... Let x denote a given nonempty string of length n = jxj. A substring u of x is a cover of x if and only if every position of x lies within an occurrence of u within x. This paper extends the work of Moore and Smyth on computing the covers of a string: our algorithm computes all the covers of every ..."
Abstract

Cited by 3 (2 self)
 Add to MetaCart
Let x denote a given nonempty string of length n = jxj. A substring u of x is a cover of x if and only if every position of x lies within an occurrence of u within x. This paper extends the work of Moore and Smyth on computing the covers of a string: our algorithm computes all the covers of every prefix of x in time \Theta(n). 1 INTRODUCTION Let x denote a nonempty string of length n 1. A substring u of x is called a cover of x if and only if x can be constructed from concatenated and/or overlapped copies of u, so that every position of x lies within an occurrence of u within x. Thus x is always a cover of itself; if a proper substring u of x is also a cover of x, then u is called a proper cover of x. For example, the string x = abcabcaabca has covers x and abca, and proper cover abca. A string that has a proper cover is called coverable. The idea of a coverable string generalizes the idea of a repetition; that is, a string that can be constructed from concatenated copies of so...
Efficient String Matching on Coded Texts
 In Proceedings of Combinatorial Pattern Matching, 6th Annual Symposium (CPM'95
, 1994
"... The so called "four Russians technique" is often used to speed up algorithms by encoding several data items in a single memory cell. Given a sequence of n symbols over a constant size alphabet, one can encode the sequence into O(n=) memory cells in O(log ) time using n= log processors. This paper ..."
Abstract

Cited by 2 (1 self)
 Add to MetaCart
The so called "four Russians technique" is often used to speed up algorithms by encoding several data items in a single memory cell. Given a sequence of n symbols over a constant size alphabet, one can encode the sequence into O(n=) memory cells in O(log ) time using n= log processors. This paper presents an efficient CRCWPRAM stringmatching algorithm for coded texts that takes O(log log(m=)) time 1 making only O(n=) operations, an improvement by a factor of = O(logn) on the number of operations used in previous algorithms. Using this stringmatching algorithm one can test if a string is squarefree and find all palindromes in a string in O(log log n) time using n= log log n processors. 1 Introduction In the stringmatching problem one is searching for occurrences of a pattern string P[1::m] in a text string T [1::n]. There exist several O(n + m) time sequential stringmatching algorithms that are used in a large variety of applications. Galil [23] published the first efficient...