Transducers and repetitions
Theoretical Computer Science, 1986
Cited by 94 (19 self)
The factor transducer of a word associates to each of its factors (or subwords) their first occurrence. Optimal bounds on the size of minimal factor transducers together with an algorithm for building them are given. Analogous results and a simple algorithm are given for the case of subsequential suffix transducers. Algorithms are applied to repetition searching in words.
Parameterized Pattern Matching: Algorithms and Applications
1994
Cited by 71 (5 self)
The problem of finding sections of code that either are identical or are related by the systematic renaming of variables or constants can be modeled in terms of parameterized strings (p-strings) and parameterized matches (p-matches) [Baker93a]. P-strings are strings over two alphabets, one of which represents parameters. Two p-strings are a parameterized match (p-match) if one p-string is obtained by renaming the parameters of the other by a one-to-one function. In this paper, we investigate parameterized pattern matching via parameterized suffix trees (p-suffix trees), defined in [Baker93a]. We give two algorithms for constructing p-suffix trees: one (eager) that runs in linear time for fixed alphabets, and another that uses auxiliary data structures and runs in $O(n \log n)$ time for variable alphabets, where $n$ is input length. We show that using a p-suffix tree for a pattern p-string $P$, it is possible to search for all p-matches of $P$ within a text p-string $T$ in space linear in $|P|$...
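The definition of a parameterized match above can be illustrated with a short sketch. It does not build a suffix tree; it only tests whether two equal-length strings match under a one-to-one renaming of parameters, using the "previous occurrence" encoding commonly attributed to Baker. The function names `prev_encode` and `is_pmatch`, and the representation of the parameter alphabet as a Python set, are assumptions made for illustration.

```python
def prev_encode(s, params):
    """Encode s so that each parameter becomes the distance to its previous
    occurrence (0 for a first occurrence); constants are kept unchanged."""
    last_seen = {}
    encoded = []
    for i, ch in enumerate(s):
        if ch in params:
            encoded.append(i - last_seen[ch] if ch in last_seen else 0)
            last_seen[ch] = i
        else:
            encoded.append(ch)
    return encoded


def is_pmatch(s, t, params):
    """True if t can be obtained from s by a one-to-one renaming of parameters."""
    return len(s) == len(t) and prev_encode(s, params) == prev_encode(t, params)


if __name__ == "__main__":
    params = {"x", "y"}
    print(is_pmatch("axbyax", "aybxay", params))  # x and y swapped consistently
    print(is_pmatch("axbx", "axby", params))      # renaming is not one-to-one
```

Two strings have the same encoding exactly when their parameters occur in the same positional pattern, which is why comparing encodings suffices for this check.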
Finding Maximal Repetitions in a Word in Linear Time
In Symposium on Foundations of Computer Science, 1999
Cited by 50 (4 self)
A repetition in a word is a subword with the period of at most half of the subword length. We study maximal repetitions occurring in $w$, that is, those for which any extended subword of $w$ has a bigger period. The set of such repetitions represents in a compact way all repetitions in $w$. We first prove a combinatorial result asserting that the sum of exponents of all maximal repetitions of a word of length $n$ is bounded by a linear function in $n$. This implies, in particular, that there is only a linear number of maximal repetitions in a word. This allows us to construct a linear-time algorithm for finding all maximal repetitions. Some consequences and applications of these results are discussed, as well as related works.
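To make the notion of a maximal repetition concrete, here is a deliberately naive sketch that lists them for a short word. It runs in quadratic time and is not the linear-time algorithm the abstract refers to; the function name `maximal_repetitions` and the output format `(start, end, period)` with an exclusive `end` are assumptions made for illustration.

```python
def maximal_repetitions(w):
    """Return sorted (start, end, period) triples, end exclusive, for every
    maximal repetition of w. Naive O(n^2) scan over candidate periods."""
    n = len(w)
    seen = set()
    runs = []
    for p in range(1, n // 2 + 1):
        i = 0
        while i + p < n:
            if w[i] != w[i + p]:
                i += 1
                continue
            # Extend the stretch of positions k with w[k] == w[k + p].
            j = i
            while j + p < n and w[j] == w[j + p]:
                j += 1
            # w[i:j + p] has period p; keep it only if it is at least 2p long.
            # Scanning periods in increasing order means a span already seen
            # was recorded with its smallest period.
            if j - i >= p and (i, j + p) not in seen:
                seen.add((i, j + p))
                runs.append((i, j + p, p))
            i = j
    return sorted(runs)


if __name__ == "__main__":
    for start, end, period in maximal_repetitions("mississippi"):
        print(start, end, period, "mississippi"[start:end])
```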
Algorithms for Discovering Repeated Patterns in Multidimensional Representations of Polyphonic Music
2003
Cited by 47 (14 self)
In this paper we give an overview of four algorithms that we have developed for pattern matching, pattern discovery and data compression in multidimensional datasets. We show that these algorithms can fruitfully be used for processing musical data. In particular, we show that our algorithms can discover instances of perceptually significant musical repetition that cannot be found using previous approaches. We also describe results that suggest the possibility of using our data-compression algorithm for modelling expert motivic-thematic music analysis.
Linear Time Algorithms for Finding and Representing all Tandem Repeats in a String
TREES, AND SEQUENCES: COMPUTER SCIENCE AND COMPUTATIONAL BIOLOGY, 1998
Cited by 34 (2 self)
A tandem repeat (or square) is a string $\alpha\alpha$, where $\alpha$ is a nonempty string. We present an $O(|S|)$-time algorithm that operates on the suffix tree $T(S)$ for a string $S$, finding and marking the endpoint in $T(S)$ of every tandem repeat that occurs in $S$. This decorated suffix tree implicitly represents all occurrences of tandem repeats in $S$, and can be used to efficiently solve many questions concerning tandem repeats and tandem arrays in $S$. This improves and generalizes several prior efforts to efficiently capture large subsets of tandem repeats.
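The definition of a tandem repeat can be checked directly by brute force. The sketch below enumerates every occurrence in a short string by comparing adjacent halves; it is cubic in the worst case and is unrelated to the suffix-tree algorithm described in the abstract. The function name `tandem_repeats` and the `(start, half_length)` output format are assumptions made for illustration.

```python
def tandem_repeats(s):
    """Return every (start, k) such that s[start:start + 2*k] is a square,
    i.e. its first half of length k equals its second half."""
    n = len(s)
    found = []
    for start in range(n):
        # k is the length of the repeated unit; the square must fit in s.
        for k in range(1, (n - start) // 2 + 1):
            if s[start:start + k] == s[start + k:start + 2 * k]:
                found.append((start, k))
    return found


if __name__ == "__main__":
    for start, k in tandem_repeats("abaabaab"):
        print(start, k, "abaabaab"[start:start + 2 * k])
```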
Optimal Parallel Algorithms for Periods, Palindromes and Squares (Extended Abstract)
1992
Cited by 32 (13 self)
Alberto Apostolico (Purdue University and Università di Padova), Dany Breslauer (Columbia University), Zvi Galil (Columbia University and Tel-Aviv University).
Summary of results. Optimal concurrent-read concurrent-write parallel algorithms for two problems are presented:
• Finding all the periods of a string. The period of a string can be computed by previous efficient parallel algorithms only if it is shorter than half of the length of the string. Our new algorithm computes all the periods in optimal $O(\log \log n)$ time, even if they are longer. The algorithm can be used to compute all initial palindromes of a string within the same bounds.
• Testing if a string is square-free. We present an optimal $O(\log \log n)$ time algorithm for testing if a string is square-free, improving the previous bound of $O(\log n)$ given by Apostolico [1] and Crochemore and Rytter [12].
We show matching lower bounds for the optimal parallel algorithms that solve the problems above on a general alphab...
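For readers unfamiliar with the first problem, the following sketch computes all periods of a string sequentially, using the standard border (failure-function) array. It is a plain linear-time sequential method, not the parallel algorithm summarized above; the function name `all_periods` is an assumption made for illustration. Here $p$ is a period of $s$ if $s[i] = s[i+p]$ wherever both positions exist, so the length of $s$ is always a period.

```python
def all_periods(s):
    """Return all periods of s in increasing order (len(s) is always one)."""
    n = len(s)
    if n == 0:
        return []
    # border[i] = length of the longest proper border of the prefix s[:i].
    border = [0] * (n + 1)
    k = 0
    for i in range(1, n):
        while k > 0 and s[i] != s[k]:
            k = border[k]
        if s[i] == s[k]:
            k += 1
        border[i + 1] = k
    # Each border of length b of the whole string yields the period n - b.
    periods = []
    b = border[n]
    while b > 0:
        periods.append(n - b)
        b = border[b]
    periods.append(n)
    return periods


if __name__ == "__main__":
    print(all_periods("abcabcab"))
```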
Optimal Parallel Suffix Tree Construction
1997
Cited by 17 (1 self)
An $O(m)$-work, $O(m)$-space, $O(\log m)$-time CREW-PRAM algorithm for constructing the suffix tree of a string $s$ of length $m$ drawn from any fixed alphabet set is obtained. This is the first known work and space optimal parallel algorithm for this problem. It can be generalized to a string $s$ drawn from any general alphabet set to perform in $O(\log m)$ time and $O(m \log |\Sigma|)$ work and space, after the characters in $s$ have been sorted alphabetically, where $|\Sigma|$ is the number of distinct characters in $s$. In this case too, the algorithm is work-optimal.
Suffix Trees and their Applications in String Algorithms
1993
Cited by 17 (0 self)
The suffix tree is a compacted trie that stores all suffixes of a given text string. This data structure has been intensively employed in pattern matching on strings and trees, with a wide range of applications, such as molecular biology, data processing, text editing, term rewriting, interpreter design, information retrieval, abstract data types and many others. In this paper, we survey some applications of suffix trees and some algorithmic techniques for their construction. Special emphasis is given to the most recent developments in this area, such as parallel algorithms for suffix tree construction and generalizations of suffix trees to higher dimensions, which are important in multidimensional pattern matching. Work partially supported by the ESPRIT BRA ALCOM II under contract no. 7141 and by the Italian MURST Project "Algoritmi, Modelli di Calcolo e Strutture Informative". Part of this work was done while the author was visiting AT&T Bell Laboratories. Email: grossi@di.uni...
A characterization of the Squares in a Fibonacci string
THEORETICAL COMPUTER SCIENCE
Cited by 13 (0 self)
A (finite) Fibonacci string $F_n$ is defined as follows: $F_0 = b$, $F_1 = a$; for every integer $n \ge 2$, $F_n = F_{n-1} F_{n-2}$. For $n \ge 1$, the length of $F_n$ is denoted by $f_n = |F_n|$. The infinite Fibonacci string $F$ is the string which contains every $F_n$, $n \ge 1$, as a prefix. Apart from their general theoretical importance, Fibonacci strings are often cited as worst case examples for algorithms which compute all the repetitions or all the "Abelian squares" in a given string. In this paper we provide a characterization of all the squares in $F$, hence in every prefix $F_n$; this characterization naturally gives rise to a $\Theta(f_n)$ algorithm which specifies all the squares of $F_n$ in an appropriate encoding. This encoding is made possible by the fact that the squares of $F_n$ occur consecutively, in "runs", the number of which is $\Theta(f_n)$. By contrast, the known general algorithms for the computation of the repetitions in an arbitrary string require $\Theta(f_n \log f_n)$ time (and pro...
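The recursive definition of Fibonacci strings translates directly into a few lines of code. The sketch below only builds $F_n$ from the recurrence and checks the prefix property stated in the abstract; it does not implement the paper's characterization of squares. The function name `fibonacci_string` is an assumption made for illustration.

```python
def fibonacci_string(n):
    """Return F_n, where F_0 = 'b', F_1 = 'a' and F_n = F_{n-1} + F_{n-2}."""
    if n < 0:
        raise ValueError("n must be non-negative")
    prev, cur = "b", "a"  # F_0 and F_1
    if n == 0:
        return prev
    for _ in range(2, n + 1):
        prev, cur = cur, cur + prev
    return cur


if __name__ == "__main__":
    for n in range(7):
        print(n, len(fibonacci_string(n)), fibonacci_string(n))
    # For n >= 1, each F_n is a prefix of F_{n+1}.
    print(all(fibonacci_string(n + 1).startswith(fibonacci_string(n))
              for n in range(1, 12)))
```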