## Computing longest previous factor in linear time and applications

Citations: 6 (4 self)

### BibTeX

@MISC{Crochemore_computinglongest,
    author = {Maxime Crochemore and Lucian Ilie},
    title = {Computing longest previous factor in linear time and applications},
    year = {}
}

### Abstract

We give two optimal linear-time algorithms for computing the Longest Previous Factor (LPF) array corresponding to a string w. For any position i in w, LPF[i] gives the length of the longest factor of w starting at position i that occurs previously in w. Several properties and applications of LPF are investigated. They include computing the Lempel–Ziv factorization of a string and detecting all repetitions (runs) in a string in linear time independently of the integer alphabet size.

Key words: algorithm design, strings, suffix array, longest common prefix, longest previous factor, Lempel–Ziv factorization, repetitions, runs

MSC: 68W05, 68W40, 68R15
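To make the LPF definition concrete, here is a minimal brute-force sketch in Python. It is not one of the paper's linear-time algorithms; it is only a reference implementation of the definition, and it assumes that the earlier occurrence is allowed to overlap position i. The function name `naive_lpf` is hypothetical.

```python
def naive_lpf(w):
    """Brute-force LPF array: a reference for the definition, not a linear-time method."""
    n = len(w)
    lpf = [0] * n
    for i in range(1, n):
        # Try every earlier starting position j < i (assumption: overlaps are allowed).
        for j in range(i):
            length = 0
            while i + length < n and w[j + length] == w[i + length]:
                length += 1
            lpf[i] = max(lpf[i], length)
    return lpf


if __name__ == "__main__":
    # Example text taken from the citation contexts on this page.
    print(naive_lpf("abbaabbbaaabab"))
```

A quadratic (or worse) routine like this is mainly useful for checking a faster implementation on small inputs.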

### Citations

1207 | A Universal Algorithm for Sequential Data Compression
- Ziv, Lempel
- 1977
Citation Context ... single letter in case this prefix is empty. For our example the Lempel–Ziv factorization is a.b.b.a.abb.baa.ab.ab. The Lempel–Ziv factorization is a basic and powerful technique for text compression [17]. It has many variants used in gzip or PKzip software, and more generally in dictionary compression methods. The Lempel–Ziv factorization is easily computed from LPF. The algorithm is shown in Fig. 5. ...

239 | On the complexity of finite sequences
- Lempel, Ziv
- 1976
Citation Context ...’07 conference, see [5]. (Footnotes: ⋆⋆ Research supported in part by CNRS. ⋆⋆⋆ Research supported in part by NSERC. † Corresponding authors.) One important application is computing the Lempel–Ziv factorization [14]. Recently [1] gave a suffix-array-based algorithm for computing Lempel–Ziv factorization. However, their algorithm is essentially a simulation of the suffix tree using the suffix array. The descripti...

153 | Simple linear work suffix array construction
- Kärkkäinen, Sanders
- 2003
Citation Context ...in time O(n). The algorithm Compute LPF uses O(n) space. Using the fact that the suffix array of a string of length n over an integer alphabet can be computed in O(n) time by any of the algorithms in [7,9,11,12], we obtain: Theorem 1. Given a string of length n over an integer alphabet, the LPF and PrevOcc arrays can be computed in time and space O(n). (Section 4, An algorithm using LCP.) Our second algorithm for comput...

135 | Replacing suffix trees with enhanced suffix arrays
- Abouelhoda, Kurtz, Ohlebusch
Citation Context ... see [5]. (Footnotes: ⋆⋆ Research supported in part by CNRS. ⋆⋆⋆ Research supported in part by NSERC. † Corresponding authors.) One important application is computing the Lempel–Ziv factorization [14]. Recently [1] gave a suffix-array-based algorithm for computing Lempel–Ziv factorization. However, their algorithm is essentially a simulation of the suffix tree using the suffix array. The description in [1] is v...

130 | Optimal suffix tree construction with large alphabets
- Farach
Citation Context ...in time O(n). The algorithm Compute LPF uses O(n) space. Using the fact that the suffix array of a string of length n over an integer alphabet can be computed in O(n) time by any of the algorithms in [7,9,11,12], we obtain: Theorem 1. Given a string of length n over an integer alphabet, the LPF and PrevOcc arrays can be computed in time and space O(n). (Section 4, An algorithm using LCP.) Our second algorithm for comput...

80 | Linear-time longest-common-prefix computation in suffix arrays and its applications
- Kasai, Lee, et al.
Citation Context ...it: top(S).len ← 0 and top(S).pos ← 7. The correctness of the algorithm follows from the above discussion. It runs in O(n) time because each element of SA is pushed only once on to the stack. Also, [10] gives a very simple linear time algorithm to compute the LCP array. [Figure residue omitted: panel (ii) of a graph over SA and LCP values.] Compute LPF using LCP(w, SA, LCP) 1. SA[n] ← −1; LCP[n] ← 0 2. push((0, SA[0]), S...
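The excerpt above refers to a simple linear-time computation of the LCP array from the suffix array. The following Python sketch is written in the spirit of that well-known technique (process suffixes in text order and reuse the previous match length minus one). The indexing convention used here, where `lcp[r]` compares the suffixes of ranks `r - 1` and `r` and `lcp[0] = 0`, is an assumption and may differ from the paper's. The suffix array itself is built by plain sorting purely for illustration.

```python
def naive_suffix_array(w):
    # Illustration only: sorting suffixes directly is not linear time.
    return sorted(range(len(w)), key=lambda i: w[i:])


def lcp_from_suffix_array(w, sa):
    """LCP array in O(n) time, given the text and its suffix array.

    Assumed convention: lcp[r] = longest common prefix of suffixes sa[r-1] and sa[r];
    lcp[0] = 0.
    """
    n = len(w)
    rank = [0] * n
    for r, start in enumerate(sa):
        rank[start] = r
    lcp = [0] * n
    h = 0  # current match length, carried over between consecutive text positions
    for i in range(n):
        if rank[i] == 0:
            h = 0
            continue
        j = sa[rank[i] - 1]  # suffix immediately before suffix i in sorted order
        while i + h < n and j + h < n and w[i + h] == w[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1  # the next suffix shares at least h - 1 characters with its predecessor
    return lcp


if __name__ == "__main__":
    text = "abbaabbbaaabab"
    sa = naive_suffix_array(text)
    print(sa)
    print(lcp_from_suffix_array(text, sa))
```

The total work of the inner `while` loop is linear because `h` decreases by at most one per text position.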

71 | Space efficient linear time construction of suffix arrays
- Ko, Aluru
Citation Context ...are omitted. (Footnote 1: Note that a suffix-tree-based algorithm would compute the leftmost such position in the string whereas our algorithm might produce a different one. For instance, in our example, PrevOcc[12] = 10 but the leftmost occurrence of ab starts at 0.) Compute LPF(w, prev<, prev>) 1. LPF[0] ← LPF<[0] ← LPF>[0] ← 0 2. for i from 1 to n − 1 do 3. j ← max(LPF<[i − 1] − 1, 0); k ← max(LPF>[i − ...

52 | Finding maximal repetitions in a word in linear time
- Kolpakov, Kucherov
- 1999
Citation Context ...adjacent edges are labelled 0 and 1, corresponding to the longest common prefixes of suf_13 with suf_prev<[i] = suf_4 and suf_prev>[i] = suf_7, respectively. Therefore, the maximum of the two gives LPF[13] = 1. On the other hand, the minimum of the [figure residue omitted: panel (i) of a graph over SA and LCP values] Fig. 3. (i) Solid edges form the graph representing SA and LCP for the text a...

46 | Algorithms on Strings
- Crochemore, Hancart, et al.
- 2007
Citation Context ...n dictionary compression methods. The Lempel–Ziv factorization is easily computed from LPF. The algorithm is shown in Fig. 5. For the example text abbaabbbaaabab in Fig. 1, the algorithm outputs lz = [0, 1, 2, 3, 4, 7, 10, 12].

Lempel–Ziv factorization(w, LPF)
1. lz[0] ← 0; i ← 0
2. while (lz[i] < n) do
3. lz[i + 1] ← lz[i] + max(1, LPF[lz[i]])
4. i ← i + 1
5. return lz

Fig. 5. Algorithm for computing Lempel–Ziv factorization...
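The pseudocode quoted in this excerpt translates almost line for line into Python. The sketch below follows it directly; the helper `naive_lpf` is a hypothetical brute-force stand-in for the paper's linear-time LPF computation, and `lz_factors` is an added convenience for printing the factors. Note that a literal transcription of the loop also appends the end position n, so the returned list may contain one more entry than the starting positions quoted in the excerpt.

```python
def naive_lpf(w):
    # Hypothetical brute-force helper; the paper computes LPF in linear time instead.
    n = len(w)
    lpf = [0] * n
    for i in range(1, n):
        for j in range(i):
            length = 0
            while i + length < n and w[j + length] == w[i + length]:
                length += 1
            lpf[i] = max(lpf[i], length)
    return lpf


def lz_factorization(w, lpf):
    """Factor boundaries, following the quoted pseudocode step by step."""
    n = len(w)
    lz = [0]
    while lz[-1] < n:
        # A factor is the longest previous factor, or a single new letter if none exists.
        lz.append(lz[-1] + max(1, lpf[lz[-1]]))
    return lz


def lz_factors(w, lz):
    # Convenience (not in the pseudocode): slice the text at consecutive boundaries.
    return [w[a:b] for a, b in zip(lz, lz[1:])]


if __name__ == "__main__":
    text = "abbaabbbaaabab"
    boundaries = lz_factorization(text, naive_lpf(text))
    print(boundaries)
    print(".".join(lz_factors(text, boundaries)))
```

With a linear-time LPF array in place of the brute-force helper, the factorization loop itself is clearly linear, since each iteration advances by at least one position.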

33 | Linear time algorithms for finding and representing all the tandem repeats in a string
- Gusfield, Stoye
- 1998
Citation Context ...ring [3], computing all leftmost maximal periodicities [15], computing all runs [13], computing all local periods of a string [6], and computing all primitively-rooted squares occurring in a string [8]. In particular, we get: Theorem 3. The runs of a string of length n over an integer alphabet can be computed in O(n) time. (Section 7, Acknowledgements.) We warmly thank Bill Smyth and Simon Puglisi for inte...

20 | Constructing suffix arrays in linear time
- Kim, Sim, et al.
- 2005
Citation Context ...in time O(n). The algorithm Compute LPF uses O(n) space. Using the fact that the suffix array of a string of length n over an integer alphabet can be computed in O(n) time by any of the algorithms in [7,9,11,12], we obtain: Theorem 1. Given a string of length n over an integer alphabet, the LPF and PrevOcc arrays can be computed in time and space O(n). (Section 4, An algorithm using LCP.) Our second algorithm for comput...

16 | Suffix arrays: A new method for on-line string searches
- Manber, Myers
- 1993
Citation Context ...phabet A that is an integer interval of size no more than n^c, for some constant c. The suffix of w starting at position i is denoted by suf_i = w[i..n − 1], for 0 ≤ i ≤ n − 1. The suffix array of w, [16], denoted SA, gives the suffixes of w sorted ascendingly in lexicographical order, that is, suf_SA[0] < suf_SA[1] < ··· < suf_SA[n−1]. The suffix array of the string abbaabbbaaabab is shown in the s...
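As a small illustration of the suffix array definition in this excerpt, the Python sketch below sorts suffix start positions by comparing the suffixes directly. This is only a definitional check, not one of the linear-time constructions the paper relies on; the function name `naive_suffix_array` is hypothetical.

```python
def naive_suffix_array(w):
    """Suffix array by direct sorting of suffixes (simple, but far from linear time)."""
    return sorted(range(len(w)), key=lambda i: w[i:])


if __name__ == "__main__":
    text = "abbaabbbaaabab"
    sa = naive_suffix_array(text)
    # Print each rank with its starting position and suffix, in lexicographic order.
    for r, start in enumerate(sa):
        print(r, start, text[start:])
    # Definitional property: suffixes are strictly increasing along SA.
    print(all(text[sa[r - 1]:] < text[sa[r]:] for r in range(1, len(sa))))
```

Because all suffixes of a string have different lengths, no two are equal, so the order along SA is strict.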

8 | Fast and practical algorithms for computing all the runs in a string
- Chen, Puglisi, et al.
- 2007
Citation Context ... [1] is very brief but it seems that their approach can be used to achieve similar goals with ours, nevertheless in a significantly more complicated way. Simultaneously and independently of our work, [2] gave an algorithm that is similar with our second one. Our first algorithm is more general and our approach for the second gives a clearer explanation as well as more insight into the structure of LP...

3 | Linear-time computation of local periods
- Duval, Kolpakov, et al.
- 2004
Citation Context ... vertex 12 and so on. Fig. 3(ii) shows the graph after having considered the vertices 13, 12, and 11. It is clear that we need not consider the vertices in this order. For instance, we can compute LPF[6] right away. Precisely, any vertex which is a “peak” in our graph can have its LPF value computed. In the algorithm in Fig. 4 we consider the vertices in the order they appear in the SA (that is, left...

3 | Detecting leftmost maximal periodicities
- Main
- 1989
Citation Context ...y string. A general repetition has the form w^e, for any rational exponent e ≥ 2 such that e|w| is an integer; e.g., (aabab)^(7/5) = aababaa. Particularly important turned out to be maximal repetitions [15] or runs. A run is an occurrence of a repetition that cannot be extended. As an example, the string aababaabba contains the runs aa at positions 0 and 5, ababa, and bb. Runs allow the encoding of all ...
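The notion of a run in this excerpt can be checked on small strings with a brute-force search. The Python sketch below is not the linear-time method discussed in the paper; it assumes the usual reading of a run as a maximal factor whose length is at least twice its smallest period. The function name `naive_runs` and the `(start, factor, period)` output format are hypothetical.

```python
def naive_runs(w):
    """Brute-force run detection; returns sorted (start, factor, smallest period) triples."""
    n = len(w)
    found = {}  # (start, end) -> smallest period, filled in increasing order of period
    for p in range(1, n // 2 + 1):
        i = 0
        while i + p < n:
            if w[i] != w[i + p]:
                i += 1
                continue
            # Extend the stretch of positions k with w[k] == w[k + p] as far as possible.
            j = i
            while j + p < n and w[j] == w[j + p]:
                j += 1
            end = j + p  # w[i:end] has period p and cannot be extended on either side
            if end - i >= 2 * p and (i, end) not in found:
                # Skipped if already recorded with a smaller period (e.g. "aaaa" for p = 2).
                found[(i, end)] = p
            i = j
    return sorted((s, w[s:e], p) for (s, e), p in found.items())


if __name__ == "__main__":
    for start, factor, period in naive_runs("aababaabba"):
        print(start, factor, period)
```

On the example string from the excerpt this reports aa at positions 0 and 5, ababa at position 1, and bb at position 7.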

1 | Computing local periodicities
- Crochemore, Ilie
- 2007
Citation Context ...t often appears in the complexity. Our algorithms use suffix arrays, are much simpler, and their complexity is alphabet independent. (Footnotes: ⋆ This work has been presented at the AutoMathA’07 conference, see [5]. ⋆⋆ Research supported in part by CNRS. ⋆⋆⋆ Research supported in part by NSERC. † Corresponding authors.) One important application is computing the Lempel–Ziv factorization [14]. Recently [1] gave...