Efficient pattern matching on binary strings
 In Current Trends in Theory and Practice of Computer Science, 2009
Cited by 3 (1 self)
The binary string matching problem is an interesting problem in computer science, since binary data are omnipresent in telecom and computer network applications. The main reason for using binaries is size. A binary is a much more compact format than the symbolic or textual representation of the same information.
FAST INDEX BASED FILTERS FOR MUSIC RETRIEVAL
 ISMIR 2008 – SESSION 5D – MIR METHODS, 2008
Cited by 3 (1 self)
We consider two content-based music retrieval problems where the music is modeled as sets of points in the Euclidean plane, formed by the (onset time, pitch) pairs. We introduce fast filtering methods based on indexing the underlying database. The filters run in time sublinear in the length of the database, and they are lossless if quadratic space may be used. By taking the application into account, the search space can be narrowed down, yielding practically lossless filters that use linear-size index structures. For the checking phase, which dominates the overall running time, we exploit previously designed algorithms suitable for local checking. In our experiments on a music database, our best filter-based methods performed several orders of magnitude faster than previous solutions.
Nested counters in bit-parallel string matching
 In Proc. 3rd LATA, 2009
Cited by 1 (0 self)
Many algorithms, e.g. in the field of string matching, are based on handling many counters, which can be updated in parallel, even on a sequential machine, using bit-parallelism. The recently presented technique of nested counters (Matryoshka counters) [1] is to handle small counters most of the time, and to refer to larger counters periodically, when the small counters may get full, to prevent overflow. In this work, we present several nontrivial applications of Matryoshka counters in string matching algorithms, improving their worst- or average-case time complexities. The set of problems comprises (δ, α)-matching, matching with k insertions, episode matching, and matching under Levenshtein distance.
Regular expression matching with multistrings and intervals
 In Proc. SODA’10
Cited by 1 (0 self)
Regular expression matching is a key task (and often a computational bottleneck) in a variety of software tools and applications, for instance the standard grep and sed utilities, scripting languages such as Perl, internet traffic analysis, XML querying, and protein searching. The basic definition of a regular expression is that we combine characters with union, concatenation, and Kleene star operators. The length m is proportional to the number of characters. However, often the initial operation is to concatenate characters into fairly long strings, e.g., if we search for certain combinations of words in a firewall. As a result, the number k of strings in the regular expression is significantly smaller than m. Our main result is a new algorithm that essentially replaces m with k in the complexity bounds for regular expression matching. More precisely, after an O(m log k) time and O(m) space preprocessing of the expression, we can match it in a string presented as a stream of characters in O(k log w / w + log k) time per character, where w is the number of bits in a memory word. For large w, this corresponds to the previous best bound of O(m log w / w + log m). Prior to this work no O(k) bound per character was known. We further extend our solution to efficiently handle character class interval operators C{x, y}. Here, C is a set of characters and C{x, y}, where x and y are integers such that 0 ≤ x ≤ y, represents a string of length between x and y from C. These character class intervals generalize variable length gaps, which are frequently used for pattern matching in computational biology applications.
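The semantics of the character class interval operator C{x, y} can be illustrated with the bounded-repetition syntax of Python's standard `re` module. This is only a sketch of what the operator matches; it uses Python's backtracking engine, not the algorithm of the paper, and the helper name `interval_pattern` and the DNA example are assumptions made for illustration.

```python
import re

def interval_pattern(prefix, char_class, x, y, suffix):
    """Build prefix C{x,y} suffix, where C is the set of characters in char_class.

    Hypothetical helper: char_class is assumed to contain only plain
    characters (no ']', '^', '-' or backslash).
    """
    if not 0 <= x <= y:
        raise ValueError("need 0 <= x <= y")
    body = "[%s]{%d,%d}" % (char_class, x, y)
    return re.compile(re.escape(prefix) + body + re.escape(suffix))

# A variable length gap of 2 to 4 nucleotides between "AC" and "GT".
gap = interval_pattern("AC", "ACGT", 2, 4, "GT")

found = gap.search("TTACGGAGTAA")   # gap "GGA" has length 3
missing = gap.search("ACGGT")       # too short for a gap of length >= 2
```

Here `found.group(0)` is the matched substring including the gap, and `missing` is `None` because no gap of admissible length fits.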
Efficient algorithms for (δ, γ, α)-matching
Cited by 1 (0 self)
We propose new algorithms for (δ, γ, α)-matching. In this string matching problem we are given a pattern $P = p_0 p_1 \ldots p_{m-1}$ and a text $T = t_0 t_1 \ldots t_{n-1}$ over some integer alphabet $\Sigma = \{0, \ldots, \sigma - 1\}$. The pattern symbol $p_i$ matches the text symbol $t_j$ iff $|p_i - t_j| \le \delta$. The pattern P (δ, γ)-matches some text substring $t_j \ldots t_{j+m-1}$ iff for all i it holds that $|p_i - t_{j+i}| \le \delta$ and $\sum_i |p_i - t_{j+i}| \le \gamma$. Finally, in (δ, γ, α)-matching we also permit gaps (text substrings) of length at most α between consecutive matching text symbols. The only known previous algorithm runs in O(mn) time. We give several algorithms that improve the average case up to O(n) for small α, and the worst case to $O(\min\{mn, |M|\alpha\})$ or $O(mn \log \gamma / w)$, where $M = \{(i, j) : |p_i - t_j| \le \delta\}$ and w is the number of bits in a machine word. We conclude with experimental results showing that the algorithms are very efficient in practice. Key words: approximate string matching, music information retrieval, bit-parallelism, sparse dynamic programming
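As a concrete reading of the definition, the following is a minimal sketch of a brute-force (δ, γ)-matcher without gaps (that is, α = 0): a window matches when every symbol differs from the pattern by at most δ and the differences sum to at most γ. It is the naive O(mn) check, not one of the algorithms proposed in the paper, and the function name and the MIDI-style pitch values are assumptions chosen for illustration.

```python
def delta_gamma_matches(pattern, text, delta, gamma):
    """Return all start positions j where pattern (delta, gamma)-matches
    text[j:j+m], with no gaps allowed. Naive O(mn) scan."""
    m, n = len(pattern), len(text)
    hits = []
    for j in range(n - m + 1):
        total = 0          # running sum of absolute differences
        ok = True
        for i in range(m):
            diff = abs(pattern[i] - text[j + i])
            total += diff
            # Reject as soon as either bound is violated.
            if diff > delta or total > gamma:
                ok = False
                break
        if ok:
            hits.append(j)
    return hits

# Hypothetical pitch sequences (integer alphabet).
pattern = [60, 62, 64]
text = [59, 62, 65, 60, 61, 64, 70]

loose = delta_gamma_matches(pattern, text, delta=1, gamma=2)
strict = delta_gamma_matches(pattern, text, delta=1, gamma=1)
```

With γ = 2 the windows starting at positions 0 and 3 match; tightening γ to 1 rejects position 0, whose differences sum to 2.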
Filtering Degenerate Patterns with Application to Protein Sequence Analysis
, 2013
"... algorithms ..."
FLEXIBLE MUSIC RETRIEVAL IN SUBLINEAR TIME
 International Journal of Foundations of Computer Science
Music sequences can be treated as texts in order to perform music retrieval tasks on them. However, the text search problems that result from this modeling are unique to music retrieval. To date, several approaches derived from classical string matching have been proposed to cope with the new search problems, yet each problem has had its own algorithms. In this paper we show that a technique recently developed for multipattern approximate string matching is flexible enough to be successfully extended to solve many different music retrieval problems, as well as combinations thereof not addressed before. We show that the resulting algorithms are average-optimal in many cases and close to average-optimal otherwise. Empirically, they are much better than existing approaches in many practical cases.