## Finding Maximal Repetitions in a Word in Linear Time (1999)

Venue: Symposium on Foundations of Computer Science

Citations: 50 (4 self)

### BibTeX

@INPROCEEDINGS{Kolpakov99findingmaximal,
  author = {Roman Kolpakov and Gregory Kucherov},
  title = {Finding Maximal Repetitions in a Word in Linear Time},
  booktitle = {Symposium on Foundations of Computer Science},
  year = {1999},
  pages = {596--604},
  publisher = {IEEE Computer Society}
}

### Abstract

A repetition in a word $w$ is a subword whose period is at most half of the subword length. We study maximal repetitions occurring in $w$, that is, those for which any extended subword of $w$ has a bigger period. The set of such repetitions represents in a compact way all repetitions in $w$. We first prove a combinatorial result asserting that the sum of exponents of all maximal repetitions of a word of length $n$ is bounded by a linear function in $n$. This implies, in particular, that there is only a linear number of maximal repetitions in a word. This allows us to construct a linear-time algorithm for finding all maximal repetitions. Some consequences and applications of these results are discussed, as well as related works.
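To make the definitions in the abstract concrete, here is a minimal Python sketch that lists the maximal repetitions of a short word by brute force. It is not the linear-time algorithm of the paper: it is a naive quadratic scan over all candidate periods, and the function names `maximal_repetitions` and `exponent_sum`, as well as the half-open `(start, end, period)` output format, are assumptions made for illustration only.

```python
def maximal_repetitions(w):
    """Return sorted (start, end, period) triples, with end exclusive.

    Naive O(n^2) scan, for illustration only. For each candidate period p
    it finds the maximal stretches on which w[i] == w[i + p], and keeps
    those whose length is at least 2 * p.
    """
    n = len(w)
    found = {}  # (start, end) -> smallest period seen for that interval
    for p in range(1, n // 2 + 1):
        i = 0
        while i + p < n:
            if w[i] != w[i + p]:
                i += 1
                continue
            start = i
            # Extend to the right as long as period p still holds.
            while i + p < n and w[i] == w[i + p]:
                i += 1
            end = i + p  # exclusive end of the stretch with period p
            if end - start >= 2 * p:
                # Periods are tried in increasing order, so the first one
                # recorded for an interval is its smallest period.
                found.setdefault((start, end), p)
    return sorted((s, e, p) for (s, e), p in found.items())


def exponent_sum(w):
    """Sum of exponents (length / period) over all maximal repetitions."""
    return sum((e - s) / p for s, e, p in maximal_repetitions(w))


if __name__ == "__main__":
    print(maximal_repetitions("mississippi"))
    # [(1, 8, 3), (2, 4, 1), (5, 7, 1), (8, 10, 1)]
    print(exponent_sum("aaaa"))  # 4.0
```

In `mississippi` the subword `ississi` (positions 1 to 7) has period 3 and length 7, so it is a repetition, and it cannot be extended in either direction without increasing the period. The quantity returned by `exponent_sum` is the one that the abstract states is bounded by a linear function of the word length.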

### Citations

902 | Algorithms on Strings, Trees and Sequences
- Gusfield
- 1997
Citation Context: ...dicities) in words are fundamental objects, due to their primary importance in word combinatorics [14] as well as in various applications, such as string matching algorithms [8, 5], molecular biology [9], or text compression [20]. Several notions of repetitions have been studied, and to make it precise, we start with basic definitions. Recall that the period of a word $w = a_1 \dots a_n$ is the smallest...

550 | A space-economical suffix tree construction algorithm
- McCreight
- 1976
Citation Context: ...s of type 1 in a word $w$, the Main's algorithm proceeds as follows. First compute the s-factorization $w = u_1 u_2 \dots u_k$. This computation can be done in time $O(|w|)$ using suffix tree construction [18, 23]. Then for each $i$ from 2 to $k$ compute, using the above method, the maximal repetitions in word $tu$, where $u$ is $u_i$ and $t$ is the suffix of $u_1 \dots u_{i-1}$ of length $2|u_{i-1}| + |u_i|$. Each such computa...

330 | Text Algorithms
- Crochemore, Rytter
- 1994
Citation Context: ...act way, all repetitions in the word, hence their importance. Let us now survey the known algorithmic results on searching for repetitions in a word, which is a classical string matching problem (see [4]). In early 80s, Slisenko [19] proposed a linear (real-time) algorithm for finding all syntactically distinct maximal repetitions in a word. Independently, Crochemore [3] described a simple and elegan...

329 | On-line construction of suffix trees
- Ukkonen
- 1995
Citation Context: ...s of type 1 in a word $w$, the Main's algorithm proceeds as follows. First compute the s-factorization $w = u_1 u_2 \dots u_k$. This computation can be done in time $O(|w|)$ using suffix tree construction [18, 23]. Then for each $i$ from 2 to $k$ compute, using the above method, the maximal repetitions in word $tu$, where $u$ is $u_i$ and $t$ is the suffix of $u_1 \dots u_{i-1}$ of length $2|u_{i-1}| + |u_i|$. Each such computa...

157 | Data compression: methods and theory
- Storer
- 1988
Citation Context: ...ndamental objects, due to their primary importance in word combinatorics [14] as well as in various applications, such as string matching algorithms [8, 5], molecular biology [9], or text compression [20]. Several notions of repetitions have been studied, and to make it precise, we start with basic definitions. Recall that the period of a word $w = a_1 \dots a_n$ is the smallest positive integer $p$ such t...

78 | An optimal algorithm for computing the repetitions in a word
- Crochemore
- 1981
Citation Context: ... whether a given word contains a square was proposed in [16]. However, it is known that there may be up to $\Omega(n \log n)$ square occurrences in a word, even if only primitively-rooted squares are considered [2] (an integer power $u^k$ is primitively-rooted if $u$ is a primitive word). An example is provided by Fibonacci words, that contain $\Theta(n \log n)$ squares all of which are primitively-rooted (an exact formula...

76 | Optimal off-line detection of repetitions in a string
- Apostolico, Preparata
- 1983
Citation Context: ...re not followed or preceded by another occurrence of $u$). This is an asymptotically optimal bound, as the number of such powers can be $\Omega(n \log n)$. Using a suffix tree technique, Apostolico and Preparata [1] described an $O(n \log n)$ algorithm for finding all right-maximal repetitions, which are repetitions that cannot be extended to the right without increasing the period. Main and Lorentz [15] proposed a...

76 | Combinatorics on Words, volume 17 of Encyclopedia of Mathematics
- Lothaire
- 1983
Citation Context: ...lications of these results are discussed, as well as related works. 1. Introduction Repetitions (periodicities) in words are fundamental objects, due to their primary importance in word combinatorics [14] as well as in various applications, such as string matching algorithms [8, 5], molecular biology [9], or text compression [20]. Several notions of repetitions have been studied, and to make it precis...

70 | An algorithm for finding all repetitions in a string
- Main, Lorentz
- 1978
Citation Context: ...d Preparata [1] described an $O(n \log n)$ algorithm for finding all right-maximal repetitions, which are repetitions that cannot be extended to the right without increasing the period. Main and Lorentz [15] proposed another algorithm which actually finds all maximal repetitions in $O(n \log n)$ time. They also point out the optimality of this bound under the assumption of unbounded alphabet and under the r...

46 | Simple and flexible detection of contiguous repeats using a suffix tree
- Stoye, Gusfield
- 1998
Citation Context: ...s for each position the shortest square starting at this position. He also claims a generalization which finds all primitively-rooted squares in time $O(n + S)$ where $S$ is the number of such squares. In [21], Stoye and Gusfield proposed several algorithms that are based on a unified suffix tree framework. Their results are based on an algorithm which finds in time $O(n \log n)$ all "branching tandem repeats...

34 | Linear time algorithms for finding and representing all the tandem repeats in a string
- Gusfield, Stoye
- 2004
Citation Context: ...subwords in a word which are repetitions. Note that in this paper we are interested in characterizing all occurrences of repetitions in the word, and not in all syntactically distinct repetitions (cf. [19, 22]). Clearly, a word may contain a quadratic number of repetitions (e.g. $a^n$). To represent them in a compact way, we introduce the notion of maximal repetition. A maximal repetition in a word is a r...

24 | Squares, cubes, and time-space efficient string searching
- Crochemore, Rytter
- 1995
Citation Context: ...duction Repetitions (periodicities) in words are fundamental objects, due to their primary importance in word combinatorics [14] as well as in various applications, such as string matching algorithms [8, 5], molecular biology [9], or text compression [20]. Several notions of repetitions have been studied, and to make it precise, we start with basic definitions. Recall that the period of a word $w = a_1 \dots$...

23 | How many squares can a string contain
- Fraenkel, Simpson
- 1998
Citation Context: ...torization and suffix tree techniques. The goal achieved is to find, in linear time, a representative of each syntactically distinct square. The feasibility of this task is supported by the result of [6] asserting that there is a linear number (actually, no more than $2n$) distinct squares in words of length $n$ over an arbitrary alphabet. The approach allows also to solve some other problems, e.g. to ac...

22 | Detecting leftmost maximal periodicities
- Main
- 1989
Citation Context: ...more [3] described a simple and elegant linear algorithm for finding a square in a word (and thus checking if a word is repetition-free). The algorithm [footnote 1: called run in [10] and maximal periodicity in [17]] is based on a special factorization of the word, called s-factorization (f-factorization in [4], or Lempel-Ziv decomposition [9]). Another linear algorithm checking whe...

20 | Time-space optimal string matching
- Galil, Seiferas
- 1983
Citation Context: ...duction Repetitions (periodicities) in words are fundamental objects, due to their primary importance in word combinatorics [14] as well as in various applications, such as string matching algorithms [8, 5], molecular biology [9], or text compression [20]. Several notions of repetitions have been studied, and to make it precise, we start with basic definitions. Recall that the period of a word $w = a_1 \dots$...

19 | Computation of squares in a string
- Kosaraju
- 1994
Citation Context: ...ch as (primitively- or non-primitively-rooted) squares, cubes, or integer powers. Thus, all these tasks can be done in time $O(n + T)$ where $T$ is the output size (these bounds have been also obtained in [13, 22] with more sophisticated algorithms). Another example is the set of branching tandem repeats, notion studied in [21]. In our terminology, branching tandem repeats are (not necessarily primitively-root...

16 | Maximal repetitions in words or how to find all squares in linear time. Rapport Interne LORIA 98-R-227, INRIA-Lorraine/LORIA
- Kolpakov, Kucherov
- 1998
Citation Context: ...$n|f_n| + O(|f_n|)$ ($\phi \approx 1.618$ is the golden ratio). Since general words of length $n$ contain $O(n \log n)$ primitively-rooted squares [5], Fibonacci words contain asymptotically maximal number of them. In [11, 12] we computed the exact number $\#R(f_n)$ of maximal repetitions in Fibonacci words, which turned out to be $2|f_{n-2}| - 3$ (curiously enough, this number is one less than the number of distinct squares, comp...

13 | Recherche linéaire d’un carré dans un mot
- Crochemore
- 1983
Citation Context: ... string matching problem (see [4]). In early 80s, Slisenko [19] proposed a linear (real-time) algorithm for finding all syntactically distinct maximal repetitions in a word. Independently, Crochemore [3] described a simple and elegant linear algorithm for finding a square in a word (and thus checking if a word is repetition-free). The algorithm [footnote 1: called run in [10] and maximal periodicity in [17]] is ...

13 | A characterization of the squares in a Fibonacci string
- Iliopoulos, Moore, et al.
- 1997
Citation Context: ...in a word. Independently, Crochemore [3] described a simple and elegant linear algorithm for finding a square in a word (and thus checking if a word is repetition-free). The algorithm [footnote 1: called run in [10] and maximal periodicity in [17]] is based on a special factorization of the word, called s-factorization (f-factorization in [4], or Lempel-Ziv decomposition [9]). Another linear algorithm checking whe...

8 | Linear time recognition of square free strings
- Main, Lorentz
- 1985
Citation Context: ...al factorization of the word, called s-factorization (f-factorization in [4], or Lempel-Ziv decomposition [9]). Another linear algorithm checking whether a given word contains a square was proposed in [16]. However, it is known that there may be up to $\Omega(n \log n)$ square occurrences in a word, even if only primitively-rooted squares are considered [2] (an integer power $u^k$ is primitively-rooted if $u$ is a pr...

7 | Detection of periodicities and string-matching in real time
- Slisenko
- 1983
Citation Context: ...subwords in a word which are repetitions. Note that in this paper we are interested in characterizing all occurrences of repetitions in the word, and not in all syntactically distinct repetitions (cf. [19, 22]). Clearly, a word may contain a quadratic number of repetitions (e.g. $a^n$). To represent them in a compact way, we introduce the notion of maximal repetition. A maximal repetition in a word is a r...

6 | The exact number of squares in Fibonacci words
- Fraenkel, Simpson
- 1999
Citation Context: ...power $u^k$ is primitively-rooted if $u$ is a primitive word). An example is provided by Fibonacci words, that contain $\Theta(n \log n)$ squares all of which are primitively-rooted (an exact formula is given in [7]). This implies that there is no hope to construct a linear algorithm to explicitly find all squares in a word as their number is super-linear. There are several different $O(n \log n)$ algorithms findin...

1 | On the sum of exponents of maximal repetitions in a word. Rapport Interne 99-R-034, Laboratoire Lorrain de Recherche en Informatique et ses Applications
- Kolpakov, Kucherov
- 1999
Citation Context: ...$n|f_n| + O(|f_n|)$ ($\phi \approx 1.618$ is the golden ratio). Since general words of length $n$ contain $O(n \log n)$ primitively-rooted squares [5], Fibonacci words contain asymptotically maximal number of them. In [11, 12] we computed the exact number $\#R(f_n)$ of maximal repetitions in Fibonacci words, which turned out to be $2|f_{n-2}| - 3$ (curiously enough, this number is one less than the number of distinct squares, comp...
