Abstract

In contrast to the extensively studied k-differences problem, in the weighted local similarity search problem one searches for approximate matches of subwords of a pattern and subwords of a text whose lengths exceed a certain threshold. Moreover, arbitrary gap and substitution weights are allowed. In this paper, two new prefilter algorithms for the weighted local similarity search problem are presented. These overcome the disadvantages of a similar filter algorithm devised by Myers.

1 Introduction

Given a relatively short pattern (query) $P = P[1 \ldots m]$ of length $m$, a long text (database) $T = T[1 \ldots n]$ of length $n$, a threshold $k \in \mathbb{R}$, and a cost (or weight) function $\delta$, the weighted approximate string search problem is the problem of finding all subwords $t$ of the text for which the edit distance of $P$ and $t$ w.r.t. the cost function $\delta$ is below threshold $k$ (or $\mathrm{edist}_\delta(P, t) \le k$ for short). Sellers [Sel80] provided the classical dynamic programming solution to the problem; its ti...
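To make the problem statement above concrete, the following is a minimal Python sketch of a column-wise dynamic programming search in the spirit of the classical approach. It is not taken from the paper. The names `approx_search` and `unit_cost` are hypothetical, the gap symbol is represented by the empty string, costs are assumed to be nonnegative, and the threshold test is assumed to be "at most k".

```python
from typing import Callable, List

GAP = ""  # assumption: the empty string stands for the gap symbol


def unit_cost(a: str, b: str) -> float:
    """Illustrative cost function: 0 for a match, 1 for any other edit operation."""
    return 0.0 if a == b else 1.0


def approx_search(pattern: str, text: str, k: float,
                  delta: Callable[[str, str], float]) -> List[int]:
    """Return all 1-based end positions j in text such that some subword of
    text ending at j has weighted edit distance <= k to pattern.

    Assumes delta is nonnegative, so a match may start anywhere at cost 0.
    """
    m = len(pattern)
    # col[i] = minimal cost of aligning pattern[:i] with some subword of text
    # ending at the current text position (initially: the empty subword).
    col = [0.0] * (m + 1)
    for i in range(1, m + 1):
        col[i] = col[i - 1] + delta(pattern[i - 1], GAP)

    hits = []
    for j, c in enumerate(text, start=1):
        new = [0.0] * (m + 1)  # new[0] = 0: a match may start after position j
        for i in range(1, m + 1):
            new[i] = min(
                col[i - 1] + delta(pattern[i - 1], c),    # substitution or match
                new[i - 1] + delta(pattern[i - 1], GAP),  # pattern symbol against a gap
                col[i] + delta(GAP, c),                   # text symbol against a gap
            )
        if new[m] <= k:
            hits.append(j)
        col = new
    return hits
```

With `unit_cost`, the call `approx_search("abc", "xxabcxx", 0, unit_cost)` returns `[5]`, the end position of the exact occurrence, and raising the threshold to 1 returns `[4, 5, 6]`. The sketch keeps only one column of the table at a time, so it uses memory proportional to the pattern length while performing work proportional to the product of the two lengths.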