## Shift-And Approach to Pattern Matching in LZW Compressed Text (1999)

### Other Repositories/Bibliography

Venue: In Proc. CPM'99, LNCS 1645

Citations: 17 (6 self)

### BibTeX

@INPROCEEDINGS{Kida99shift-andapproach,

author = {Takuya Kida and Masayuki Takeda and Ayumi Shinohara and Setsuo Arikawa},

title = {Shift-And Approach to Pattern Matching in LZW Compressed Text},

booktitle = {In Proc. CPM'99, LNCS 1645},

year = {1999},

pages = {1--13},

publisher = {Springer-Verlag}

}

### Abstract

This paper considers the Shift-And approach to the problem of pattern matching in LZW compressed text, and gives a new algorithm that solves it. The algorithm is fast in practice when the pattern length is at most 32, that is, at most the machine word length. After O(m + |Σ|) time and O(|Σ|) space preprocessing of a pattern, it scans an LZW compressed text in O(n + r) time and reports all occurrences of the pattern, where n is the compressed text length, m is the pattern length, and r is the number of pattern occurrences. Experimental results show that it runs approximately 1.5 times faster than decompression followed by a simple search using the Shift-And algorithm. Moreover, like the Shift-And algorithm itself, the algorithm can be extended to generalized pattern matching, to pattern matching with k mismatches, and to multiple pattern matching.
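For orientation, the classical Shift-And method that the abstract builds on can be sketched as follows. This is a minimal illustration on uncompressed text only, not the paper's algorithm for LZW compressed text. The names `shift_and_search` and `masks` are hypothetical, and the dictionary of masks is an assumption made for brevity (a table indexed by the alphabet would correspond to the O(|Σ|) space bound). Python integers are unbounded, so the word-length limit of 32 mentioned above does not arise here; in C the state would be held in a single machine word.

```python
def shift_and_search(pattern: str, text: str) -> list[int]:
    """Return the 0-based start positions of all occurrences of pattern in text."""
    m = len(pattern)
    if m == 0:
        return []

    # Preprocessing: masks[c] has bit i set iff pattern[i] == c.
    masks: dict[str, int] = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    accept = 1 << (m - 1)  # bit m-1 set means the whole pattern has matched
    state = 0              # bit i set iff pattern[0..i] matches a suffix of the text read so far
    hits = []
    for pos, ch in enumerate(text):
        # Extend every partial match by one character and start a new one at bit 0.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            hits.append(pos - m + 1)
    return hits


if __name__ == "__main__":
    print(shift_and_search("aba", "ababa"))
```

Each text character costs one shift, one OR, and one AND, which is why the method is fast when the pattern fits in a machine word.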

### Citations

1149 | A universal algorithm for sequential data compression
- Ziv, Lempel
- 1977
Citation Context: ..., and Vishkin [6], and Amir and Benson [2, 3] and Amir, Benson, and Farach [4] addressed its two-dimensional version. Farach and Thorup [9] and Gąsieniec, et al. [11] addressed the LZ77 compression [18]. Amir, Benson, and Farach [5] addressed the LZW compression [16]. Karpinski, et al. [12] and Miyazaki, et al. [15] addressed the straight-line programs. However, it seems that most of these studies were...

419 | A technique for high performance data compression
- Welch
- 1984
Citation Context: ...and Farach [4] addressed its two-dimensional version. Farach and Thorup [9] and Gąsieniec, et al. [11] addressed the LZ77 compression [18]. Amir, Benson, and Farach [5] addressed the LZW compression [16]. Karpinski, et al. [12] and Miyazaki, et al. [15] addressed the straight-line programs. However, it seems that most of these studies were undertaken mainly from the theoretical viewpoint. Concerning the...

319 | Fast Text Searching Allowing Errors
- Wu, Manber
- 1992
Citation Context: ...g in compressed text is of great importance since there is a remarkable explosion of machine readable text files, which are often stored in compressed forms. On the other hand, the Shift-And approach [1, 7, 17] to the classical pattern matching is widely known to be efficient in many practical applications. This method is simple, but very fast when a pattern length is not greater than the word length of typ...

223 | A new approach to text searching
- Baeza-Yates, Gonnet
- 1992
Citation Context: ...g in compressed text is of great importance since there is a remarkable explosion of machine readable text files, which are often stored in compressed forms. On the other hand, the Shift-And approach [1, 7, 17] to the classical pattern matching is widely known to be efficient in many practical applications. This method is simple, but very fast when a pattern length is not greater than the word length of typ...

104 | Generalized string matching
- Abrahamson
- 1987
Citation Context: ...g in compressed text is of great importance since there is a remarkable explosion of machine readable text files, which are often stored in compressed forms. On the other hand, the Shift-And approach [1, 7, 17] to the classical pattern matching is widely known to be efficient in many practical applications. This method is simple, but very fast when a pattern length is not greater than the word length of typ...

96 | Let sleeping files lie: Pattern matching in Z-compressed files
- Amir, Benson, et al.
- 1996
Citation Context: ...d Benson [2, 3] and Amir, Benson, and Farach [4] addressed its two-dimensional version. Farach and Thorup [9] and Gąsieniec, et al. [11] addressed the LZ77 compression [18]. Amir, Benson, and Farach [5] addressed the LZW compression [16]. Karpinski, et al. [12] and Miyazaki, et al. [15] addressed the straight-line programs. However, it seems that most of these studies were undertaken mainly from the th...

86 | String matching in Lempel-Ziv compressed strings. Algorithmica
- Farach, Thorup
- 1998
Citation Context: ...hkin [8] addressed the run-length compression, and Amir, Landau, and Vishkin [6], and Amir and Benson [2, 3] and Amir, Benson, and Farach [4] addressed its two-dimensional version. Farach and Thorup [9] and Gąsieniec, et al. [11] addressed the LZ77 compression [18]. Amir, Benson, and Farach [5] addressed the LZW compression [16]. Karpinski, et al. [12] and Miyazaki, et al. [15] addressed the straight-...

80 | Efficient two-dimensional compressed matching
- Amir, Benson
- 1992
Citation Context: ...he combinatorial pattern matching. Several researchers tackled this problem. Eilam-Tzoreff and Vishkin [8] addressed the run-length compression, and Amir, Landau, and Vishkin [6], and Amir and Benson [2, 3] and Amir, Benson, and Farach [4] addressed its two-dimensional version. Farach and Thorup [9] and Gąsieniec, et al. [11] addressed the LZ77 compression [18]. Amir, Benson, and Farach [5] addressed t...

63 | A text compression scheme that allows fast searching directly in the compressed file
- Manber
- 1997
Citation Context: ...d Miyazaki, et al. [15] addressed the straight-line programs. However, it seems that most of these studies were undertaken mainly from the theoretical viewpoint. Concerning the practical aspect, Manber [14] pointed out at CPM’94 as follows. It is not clear, for example, whether in practice the compressed search in [5] will indeed be faster than a regular decompression followed by a fast search. In 1998 ...

56 | Data structures and algorithms for approximate string matching
- Galil, Giancarlo
- 1988
Citation Context: ...m | P[i] ∋ a }. (2′) 5.2 Pattern Matching with k Mismatches. This problem is a pattern matching problem in which we allow up to k characters of the pattern to mismatch with the corresponding text [10]. For example, if k = 2, the pattern "pattern" matches the strings "postern" and "cittern", but does not match "eastern". The idea stated in [7] to solve this problem is to count up the number of mismatches u...

48 | Efficient pattern matching with scaling
- Amir, Landau, et al.
- 1992
Citation Context: ...t interesting topics in the combinatorial pattern matching. Several researchers tackled this problem. Eilam-Tzoreff and Vishkin [8] addressed the run-length compression, and Amir, Landau, and Vishkin [6], and Amir and Benson [2, 3] and Amir, Benson, and Farach [4] addressed its two-dimensional version. Farach and Thorup [9] and Gąsieniec, et al. [11] addressed the LZ77 compression [18]. Amir, Benson...

38 | Two-dimensional periodicity and its applications
- Amir, Benson
- 1992
Citation Context: ...he combinatorial pattern matching. Several researchers tackled this problem. Eilam-Tzoreff and Vishkin [8] addressed the run-length compression, and Amir, Landau, and Vishkin [6], and Amir and Benson [2, 3] and Amir, Benson, and Farach [4] addressed its two-dimensional version. Farach and Thorup [9] and Gąsieniec, et al. [11] addressed the LZ77 compression [18]. Amir, Benson, and Farach [5] addressed t...

23 | Multiple pattern matching in LZW compressed text
- Kida, Takeda, et al.
- 1998
Citation Context: ...t at CPM’94 as follows. It is not clear, for example, whether in practice the compressed search in [5] will indeed be faster than a regular decompression followed by a fast search. In 1998 we gave in [13] an affirmative answer to the above question: We presented an algorithm for finding multiple patterns in LZW compressed text, which is a variant of the Amir-Benson-Farach algorithm [5], and showed tha...

22 | Efficient algorithms for Lempel-Ziv encoding
- Gąsieniec, Karpinski, et al.
- 1996
Citation Context: ...length compression, and Amir, Landau, and Vishkin [6], and Amir and Benson [2, 3] and Amir, Benson, and Farach [4] addressed its two-dimensional version. Farach and Thorup [9] and Gąsieniec, et al. [11] addressed the LZ77 compression [18]. Amir, Benson, and Farach [5] addressed the LZW compression [16]. Karpinski, et al. [12] and Miyazaki, et al. [15] addressed the straight-line programs. However, it s...

21 | An improved pattern matching algorithm for strings in terms of straight-line programs
- Miyazaki, Shinohara, et al.
Citation Context: ...rsion. Farach and Thorup [9] and Gąsieniec, et al. [11] addressed the LZ77 compression [18]. Amir, Benson, and Farach [5] addressed the LZW compression [16]. Karpinski, et al. [12] and Miyazaki, et al. [15] addressed the straight-line programs. However, it seems that most of these studies were undertaken mainly from the theoretical viewpoint. Concerning the practical aspect, Manber [14] pointed out at C...

20 | Matching patterns in a string subject to multi-linear transformation, Theoretical Computer Science 60
- Eilam-Tzoreff, Vishkin
- 1988
Citation Context: ...thm. 1 Introduction. Pattern matching in compressed text is one of the most interesting topics in the combinatorial pattern matching. Several researchers tackled this problem. Eilam-Tzoreff and Vishkin [8] addressed the run-length compression, and Amir, Landau, and Vishkin [6], and Amir and Benson [2, 3] and Amir, Benson, and Farach [4] addressed its two-dimensional version. Farach and Thorup [9] and ...

18 | An efficient pattern-matching algorithm for strings with short descriptions
- Karpinski, Rytter, et al.
- 1997
Citation Context: ... its two-dimensional version. Farach and Thorup [9] and Gąsieniec, et al. [11] addressed the LZ77 compression [18]. Amir, Benson, and Farach [5] addressed the LZW compression [16]. Karpinski, et al. [12] and Miyazaki, et al. [15] addressed the straight-line programs. However, it seems that most of these studies were undertaken mainly from the theoretical viewpoint. Concerning the practical aspect, Manbe...

13 | Let sleeping files lie: Pattern matching in Z-compressed files
- Amir, Benson, et al.
- 1994
Citation Context: ...d Benson [2, 3] and Amir, Benson, and Farach [4] addressed its two-dimensional version. Farach and Thorup [9] and Gąsieniec, et al. [11] addressed the LZ77 compression [18]. Amir, Benson, and Farach [5] addressed the LZW compression [16]. Karpinski, et al. [12] and Miyazaki, et al. [15] addressed the straight-line programs. However, it seems that most of these studies were undertaken mainly from the...