## Improved Approximate Pattern Matching on Hypertext (1998)

Venue: Proc. LATIN'98, LNCS 1380

Citations: 5 (2 self)

### BibTeX

@INPROCEEDINGS{Navarro98improvedapproximate,
    author = {Gonzalo Navarro},
    title = {Improved Approximate Pattern Matching on Hypertext},
    booktitle = {In Proc. LATIN'98, LNCS 1380},
    year = {1998},
    pages = {352--357},
    publisher = {Springer-Verlag}
}

### Abstract

The problem of approximate pattern matching on hypertext is defined and solved by Amir et al. in O(m(n log m + e)) time, where m is the length of the pattern, n is the total text size and e is the total number of edges. Their space complexity is O(mn). We present a new algorithm which is O(mk(n + e)) time and needs only O(n) extra space, where k < m is the number of allowed errors in the pattern. If the graph is acyclic, our time complexity drops to O(m(n + e)), improving Amir's results.

1 Introduction

Approximate string matching problems appear in a number of important areas related to string processing: text searching, pattern recognition, computational biology, audio processing, etc. The edit distance between two strings a and b, ed(a, b), is defined as the minimum number of edit operations that must be carried out to make them equal. The allowed operations are insertion, deletion and substitution of characters in a or b. The problem of approximate string matching is defined as...
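The edit distance defined in the introduction excerpt can be illustrated with a short dynamic programming routine. This is a minimal sketch of the textbook computation, not code from the paper; the function name `edit_distance` is an assumption chosen for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions
    needed to make strings a and b equal (textbook dynamic programming)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]  # distance between a[:i] and the empty prefix of b
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j - 1] + (ca != cb),  # substitution, or free match
                prev[j] + 1,               # delete ca from a
                cur[j - 1] + 1,            # insert cb into a
            ))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("survey", "surgery"))
```

Keeping only two rows at a time gives O(|a| |b|) time and O(|b|) space, which is the same space-saving idea that makes a column-by-column text scan possible.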

### Citations

682 | Hypertext: an introduction and survey
- Conklin
- 1987
Citation Context ...ns. For the particular case of ed(), a number of algorithms have been presented to improve the worst case to O(kn) or the average case, e.g. [8, 13, 5, 12, 4, 14, 15, 3]. Pattern matching on hypertext [6] has been considered only recently. The model is that the text forms a graph of N nodes and E edges, where a string is stored inside each node, and the edges indicate alternative texts that may follow...

318 | Fast text search allowing errors
- Manber, Wu
- 1992
Citation Context ...on is the most flexible to allow different distance functions. For the particular case of ed(), a number of algorithms have been presented to improve the worst case to O(kn) or the average case, e.g. [8, 13, 5, 12, 4, 14, 15, 3]. Pattern matching on hypertext [6] has been considered only recently. The model is that the text forms a graph of N nodes and E edges, where a string is stored inside each node, and the edges indicate...

234 | The theory and computation of evolutionary distances: Pattern recognition
- Sellers
- 1980
Citation Context ...o pattern is at most k. That is, report all text positions j such that there is a suffix x of text[1..j] such that ed(x, patt) ≤ k. The classical solution is O(mn) time and involves dynamic programming [11]. This solution is the most flexible to allow different distance functions. For the particular case of ed(), a number of algorithms have been presented to improve the worst case to O(kn) or the averag...
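The classical O(mn) dynamic programming search mentioned in this context can be sketched as follows. This is a textbook column-wise formulation written for illustration, not code from the cited work; the name `approx_search` and the choice to report 1-based end positions are assumptions.

```python
def approx_search(text: str, pattern: str, k: int) -> list[int]:
    """Report all 1-based text positions j such that some suffix of
    text[1..j] is within edit distance k of the pattern."""
    m = len(pattern)
    # col[i] = minimum distance between pattern[:i] and any suffix of the
    # text read so far. Before reading any text, that is simply i.
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, 1):
        new = [0] * (m + 1)  # new[0] = 0: a match may start at any position
        for i in range(1, m + 1):
            new[i] = min(
                col[i - 1] + (pattern[i - 1] != ch),  # match or substitution
                col[i] + 1,                           # extra text character
                new[i - 1] + 1,                       # missing pattern character
            )
        col = new
        if col[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    print(approx_search("abxcd", "abcd", 1))
```

The only difference from a plain edit distance computation is that row 0 is always 0, so an occurrence is allowed to begin anywhere in the text. Only one column of size m + 1 is kept at a time.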

149 | Finding approximate patterns in strings
- Ukkonen
- 1985
Citation Context ...on is the most flexible to allow different distance functions. For the particular case of ed(), a number of algorithms have been presented to improve the worst case to O(kn) or the average case, e.g. [8, 13, 5, 12, 4, 14, 15, 3]. Pattern matching on hypertext [6] has been considered only recently. The model is that the text forms a graph of N nodes and E edges, where a string is stored inside each node, and the edges indicate...

139 | A fast bit-vector algorithm for approximate string matching based on dynamic programming
- Myers
- 1999
Citation Context ...on is the most flexible to allow different distance functions. For the particular case of ed(), a number of algorithms have been presented to improve the worst case to O(kn) or the average case, e.g. [7, 13, 4, 14, 15, 3, 9]. Pattern matching on hypertext [5] has been considered only recently. The model is that the text forms a graph of N nodes and E edges, where a string is stored inside each node, and the edges indicate...

53 | A faster algorithm for approximate string matching
- Baeza-Yates, Navarro
- 1996

53 | Fast string matching with k differences
- Landau, Vishkin
- 1988
Citation Context ...on is the most flexible to allow different distance functions. For the particular case of ed(), a number of algorithms have been presented to improve the worst case to O(kn) or the average case, e.g. [8, 13, 5, 12, 4, 14, 15, 3]. Pattern matching on hypertext [6] has been considered only recently. The model is that the text forms a graph of N nodes and E edges, where a string is stored inside each node, and the edges indicate...

51 | On using q-gram locations in approximate string matching
- Sutinen, Tarhio
- 1995

48 | Theoretical and empirical comparisons of approximate string matching algorithms
- Chang, Lampe
- 1992

48 | A subquadratic algorithm for approximate limited expression matching
- Wu, Manber, et al.
- 1996

31 | Fast and practical approximate pattern matching
- Baeza-Yates, Perleberg
- 1996

27 | Episode matching
- Das, Fleischer, et al.
- 1997
Citation Context ...not only motivated by the structure of the World-Wide-Web and the possibility to search sequences of elements across paths of references, but also because graphs model naturally complex processes. In [7] it is considered the possibility of using approximate string matching as a model for data mining, where the symbols are in fact events and sequences of interesting events (perhaps separated by uninte...

6 | A linear time pattern matching algorithm between a string and a tree
- Akutsu
- 1993
Citation Context ...to transform any hypertext to that form, by ending the node at its first reference). They solve the problem for an acyclic graph in O(N + mE + R log log m) (where R is the size of the answer). Akutsu [1] solved the problem of exact pattern matching on a hypertext which has a tree structure in O(n) time, while Park and Kim [10] extended this result to an O(n + mE) algorithm for directed acyclic graphs...

6 | Approximate string matching with arbitrary costs for text and hypertext
- Manber, Wu
- 1992
Citation Context ...nces of events), and we may want to identify potentially dangerous sequences of events in the process under analysis. The first attempt to define pattern matching on hypertext is due to Manber and Wu [9], which view a hypertext as a graph of files with no links inside (it is easy to transform any hypertext to that form, by ending the node at its first reference). They solve the problem for an acyclic...

4 | Pattern matching in hypertext
- Amir, Lewenstein, et al.
Citation Context ...time, while Park and Kim [10] extended this result to an O(n + mE) algorithm for directed acyclic graphs and for graphs with cycles where no text node can match the pattern in two places. Amir et al. [2] were the first in considering approximate string matching over hypertext. In this case they consider the graph with n nodes and e edges and want to report all nodes v where in the text graph there is...
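For the acyclic case only, the hypertext model described in these contexts (a string in each node, edges indicating which text may follow) admits a simple column-based search. The sketch below is one natural way to do it and is not claimed to be the algorithm of the paper or of Amir et al.; the function name `search_dag`, the input layout, and the `(node, offset)` result format are all assumptions. It relies on `graphlib` from the Python standard library (3.9 or later).

```python
from graphlib import TopologicalSorter


def search_dag(texts: dict, edges: list, pattern: str, k: int) -> list:
    """texts maps node -> string; an edge (u, v) means the text of v may
    follow the text of u. Returns sorted (node, offset) pairs, with 0-based
    offsets, where an occurrence with at most k errors ends.
    Assumes the graph is acyclic; graphlib raises CycleError otherwise."""
    m = len(pattern)
    preds = {v: [] for v in texts}
    for u, v in edges:
        preds[v].append(u)

    exit_col = {}  # column left behind after the last character of a node
    hits = []
    # static_order() yields every node after all of its predecessors.
    for v in TopologicalSorter(preds).static_order():
        # Entry column: best of "start fresh" and every predecessor's exit.
        col = list(range(m + 1))
        for u in preds[v]:
            col = [min(a, b) for a, b in zip(col, exit_col[u])]
        for offset, ch in enumerate(texts[v]):
            new = [0] * (m + 1)
            for i in range(1, m + 1):
                new[i] = min(
                    col[i - 1] + (pattern[i - 1] != ch),
                    col[i] + 1,
                    new[i - 1] + 1,
                )
            col = new
            if col[m] <= k:
                hits.append((v, offset))
        exit_col[v] = col
    return sorted(hits)


if __name__ == "__main__":
    texts = {"a": "ab", "b": "cd", "c": "xd"}
    edges = [("a", "b"), ("a", "c")]
    print(search_dag(texts, edges, "abcd", 1))
```

Merging predecessor columns costs O(m) per edge and each text character costs O(m), so this sketch runs in O(m(n + e)) time on acyclic graphs. It stores one column per node, so its extra space is O(m) per node rather than the O(n) bound quoted in the abstract, and it does not handle cycles.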

4 | String matching in hypertext
- Park, Kim
- 1995
Citation Context ...raph in O(N + mE + R log log m) (where R is the size of the answer). Akutsu [1] solved the problem of exact pattern matching on a hypertext which has a tree structure in O(n) time, while Park and Kim [10] extended this result to an O(n + mE) algorithm for directed acyclic graphs and for graphs with cycles where no text node can match the pattern in two places. Amir et al. [2] were the first in conside...