Dictionary matching and indexing with errors and don't cares
 In STOC '04
, 2004
k-mismatch with don’t cares
 In ESA
, 2007
Cited by 13 (6 self)
Abstract. We give the first nontrivial algorithms for the k-mismatch pattern matching problem with don’t cares. Given a text t of length n and a pattern p of length m with don’t care symbols and a bound k, our algorithms find all the places that the pattern matches the text with at most k mismatches. We first give an O(n(k + log n log log n) log m) time randomised solution which finds the correct answer with high probability. We then present a new deterministic O(nk² log³ m) time solution that uses tools developed for group testing and finally an approach based on k-selectors that runs in O(nk polylog m) time but requires O(poly m) time preprocessing. In each case, the location of the mismatches at each alignment is also given at no extra cost.
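To make the problem statement above concrete, here is a minimal brute-force sketch of k-mismatch matching with don't cares. It is not one of the algorithms from the paper: it runs in O(nm) time and serves only as a reference for what the faster solutions compute. The wildcard character `?` and the function name are assumptions chosen for illustration.

```python
WILDCARD = "?"  # hypothetical don't-care symbol; the abstract does not fix one


def k_mismatch_with_dont_cares(text, pattern, k):
    """Map each alignment with at most k mismatches to its mismatch positions.

    Brute-force O(n*m) baseline: a don't care in the pattern matches any
    text character, and positions are reported relative to the pattern.
    """
    n, m = len(text), len(pattern)
    result = {}
    for i in range(n - m + 1):
        mismatches = []
        for j in range(m):
            if pattern[j] != WILDCARD and pattern[j] != text[i + j]:
                mismatches.append(j)
                if len(mismatches) > k:
                    break  # this alignment already exceeds the bound
        if len(mismatches) <= k:
            result[i] = mismatches
    return result


if __name__ == "__main__":
    print(k_mismatch_with_dont_cares("abcabd", "a?d", 1))
```

As in the abstract, the mismatch locations come out alongside each reported alignment.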
A fast, randomised, maximal subset matching algorithm for document-level music retrieval
 Ministry of Energy, Telecommunications and Posts
, 2006
Cited by 11 (4 self)
We present MSM, a new maximal subset matching algorithm, for MIR at score level with polyphonic texts and patterns. First, we argue that the problem MSM and its ancestors, the SIA family of algorithms, solve is 3SUM-hard and, therefore, subquadratic solutions must involve approximation. MSM is such a solution; we describe it, and argue that, at O(n log n) time with no large constants, it is orders of magnitude more time-efficient than its closest competitor. We also evaluate MSM’s performance on a retrieval problem addressed by the OMRAS project, and show that it outperforms OMRAS on this task by a considerable margin.
Finding patterns with variable length gaps or don’t cares
 of Lecture Notes in Computer Science
, 2006
Cited by 8 (3 self)
Abstract. In this paper we have presented new algorithms to handle the pattern matching problem where the pattern can contain variable length gaps. Given a pattern P with variable length gaps and a text T our algorithm works in O(n + m + α log(max_{1≤i≤l}(b_i − a_i))) time where n is the length of the text, m is the summation of the lengths of the component subpatterns, α is the total number of occurrences of the component subpatterns in the text and a_i and b_i are, respectively, the minimum and maximum number of don’t cares allowed between the ith and (i+1)st component of the pattern. We also present another algorithm which, given a suffix array of the text, can report whether P occurs in T in O(m + α log log n) time. Both the algorithms record information to report all the occurrences of P in T. Furthermore, the techniques used in our algorithms are shown to be useful in many other contexts.
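The following sketch illustrates the problem definition only: a pattern given as component subpatterns with a gap range (a_i, b_i) between consecutive components. It is a naive baseline, not the algorithm from the paper, and it makes no attempt to reach the stated time bounds. The function name, the argument layout and the assumption that every component is non-empty are choices made for illustration.

```python
def match_with_gaps(text, components, gaps):
    """Return sorted end positions of occurrences of a gapped pattern.

    components: non-empty list of non-empty strings (the subpatterns).
    gaps: gaps[i] = (a_i, b_i), the minimum and maximum number of
          don't-care characters allowed between component i and i + 1.
    Naive baseline: tracks every reachable end position component by component.
    """
    first = components[0]
    ends = {
        i + len(first)
        for i in range(len(text) - len(first) + 1)
        if text.startswith(first, i)
    }
    for comp, (a, b) in zip(components[1:], gaps):
        next_ends = set()
        for e in ends:
            for gap in range(a, b + 1):
                start = e + gap
                # str.startswith with an offset is False past the end of text
                if text.startswith(comp, start):
                    next_ends.add(start + len(comp))
        ends = next_ends
    return sorted(ends)


if __name__ == "__main__":
    print(match_with_gaps("abxxcdxcd", ["ab", "cd"], [(1, 5)]))
```

Each returned value is the index just past the last character of an occurrence, so an empty list means P does not occur in T.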
Finding patterns in given intervals
 of Lecture Notes in Computer Science
, 2007
Cited by 5 (2 self)
Abstract. In this paper, we study the pattern matching problem in given intervals. Depending on whether the intervals are given a priori for preprocessing, or during the query along with the pattern or, even in both cases, we develop solutions for different variants of this problem. In particular, we present efficient indexing schemes for each of the above variants of the problem.
Tree Pattern Matching to Subset Matching in Linear Time
 IN SIAM J. ON COMPUTING
, 2000
Cited by 5 (2 self)
This paper is the first of two papers describing an O(n polylog(m)) time algorithm for the Tree Pattern Matching problem on a pattern of size m and a text of size n. In this paper, we show an O(n + m) time Turing reduction from the Tree Pattern Matching problem to another problem called the Subset Matching problem. The second paper will give efficient deterministic and randomized algorithms for the Subset Matching problem. Together, these two papers will imply an O(n log³ m + m) time deterministic algorithm and an O(n (log³ m / log log m) + m) time randomized algorithm for the Tree Pattern Matching problem.
Sweepline the Music!
 Computer Science in Perspective
, 2003
Cited by 4 (1 self)
The problem of matching sets of points or sets of horizontal line segments in plane under translations is considered. For finding the exact occurrences of a point set of size m within another point set of size n we give an algorithm with running time O(mn), and for finding partial occurrences an algorithm with running time O(mn log m). To find the largest overlap between two line segment patterns we develop an algorithm with running time O(mn log(mn)). All algorithms are based on a simple sweepline traversal of one of the patterns in the lexicographic order. The motivation for the problems studied comes from music retrieval and analysis.
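The exact-occurrence problem described above can be sketched as follows. This is not the sweepline algorithm from the paper: it swaps in a hash-set lookup, which gives O(mn) expected time rather than the worst-case bound obtained by the lexicographic traversal. The function name and the representation of points as integer (x, y) tuples are assumptions made for illustration.

```python
def exact_translations(pattern, text):
    """Return every translation (dx, dy) mapping all pattern points into text.

    pattern, text: iterables of (x, y) tuples.
    Hash-based baseline: anchor the lexicographically smallest pattern point
    on each text point in turn, then verify the remaining pattern points.
    """
    pattern = list(pattern)
    if not pattern:
        return []
    text_set = set(text)
    anchor = min(pattern)  # lexicographically smallest pattern point
    found = []
    for tx, ty in sorted(text_set):
        dx, dy = tx - anchor[0], ty - anchor[1]
        if all((x + dx, y + dy) in text_set for x, y in pattern):
            found.append((dx, dy))
    return found


if __name__ == "__main__":
    notes = [(0, 0), (1, 2), (3, 1), (4, 3), (5, 5)]
    print(exact_translations([(0, 0), (1, 2)], notes))
```

In the music-retrieval setting that motivates the paper, a point could encode a note as (onset time, pitch), so a translation corresponds to a time shift combined with a transposition.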
Flexible music retrieval in sublinear time
 IN PROC. 10TH PRAGUE STRINGOLOGY CONFERENCE (PSC'05)
, 2005
Cited by 4 (3 self)
Music sequences can be treated as texts in order to perform music retrieval tasks on them. However, the text search problems that result from this modeling are unique to music retrieval. Up to date, several approaches derived from classical string matching have been proposed to cope with the new search problems, yet each problem had its own algorithms. In this paper we show that a technique recently developed for multipattern approximate string matching is flexible enough to be successfully extended to solve many different music retrieval problems, as well as combinations thereof not addressed before. We show that the resulting algorithms are close to optimal and much better than existing approaches in many practical cases.
A Black Box for Online Approximate Pattern Matching
Cited by 3 (2 self)
Abstract. We present a deterministic black box solution for online approximate matching. Given a pattern of length m and a streaming text of length n that arrives one character at a time, the task is to report the distance between the pattern and a sliding window of the text as soon as the new character arrives. Our solution requires O(Σ_{j=1}^{log₂ m} T(n, 2^{j−1})/n) time for each input character, where T(n, m) is the total running time of the best offline algorithm. The types of approximation that are supported include exact matching with wildcards, matching under the Hamming norm, approximating the Hamming norm, k-mismatch and numerical measures such as the L2 and L1 norms. For these examples, the resulting online algorithms take O(log² m), O(√(m log m)), O(log² m/ε²), O(√(k log k) log m), O(log² m) and O(√(m log m)) time per character respectively. The space overhead is O(m) which we show is optimal.
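As a worked check of the per-character bound, the sum can be evaluated for two of the listed cases. The offline running times used here are assumptions not stated in the abstract: the standard FFT-based bound T(n, m) = O(n log m) for exact matching with wildcards, and T(n, m) = O(n√(m log m)) for matching under the Hamming norm.

```latex
\begin{align*}
\text{Wildcards: }\quad
\sum_{j=1}^{\log_2 m} \frac{T(n, 2^{j-1})}{n}
  &= O\!\Big(\sum_{j=1}^{\log_2 m} \log 2^{j}\Big)
   = O\!\Big(\sum_{j=1}^{\log_2 m} j\Big)
   = O(\log^2 m), \\
\text{Hamming: }\quad
\sum_{j=1}^{\log_2 m} \frac{T(n, 2^{j-1})}{n}
  &= O\!\Big(\sum_{j=1}^{\log_2 m} \sqrt{2^{j}\, j}\Big)
   = O\!\Big(\sqrt{\log m}\sum_{j=1}^{\log_2 m} 2^{j/2}\Big)
   = O\big(\sqrt{m \log m}\big).
\end{align*}
```

The first sum is arithmetic, which costs an extra log m factor over the offline per-character cost, while the second is geometric and is dominated by its last term, so the online bound matches the offline one up to a constant.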