Results 1-10 of 12
Shift-or string matching with super-alphabets
 In: Proceedings of the 9th International Symposium on String Processing and Information Retrieval (SPIRE'2002). LNCS 2476, 2002
Abstract

Cited by 17 (5 self)
Given a text T[1...n] and a pattern P[1...m] over some alphabet Σ of size σ, we want to find all the (exact) occurrences of P in T. The well-known shift-or algorithm solves this problem in time O(n⌈m/w⌉), where w is the number of bits in a machine word, using bit-parallelism. We show how to extend the bit-parallelism in another direction, using super-alphabets. This gives a speedup by a factor s, where s is the number of characters processed simultaneously. The algorithm is implemented, and we show that it works well in practice too. The result is the fastest known algorithm for exact string matching for short patterns and small alphabets. Key words: Algorithms, bit-parallelism, string matching
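For orientation, the classic shift-or scan that this abstract builds on can be sketched as follows. This is a minimal illustration of the baseline algorithm only, not of the super-alphabet extension the paper proposes. The function name `shift_or_search` is hypothetical, and positions are reported 0-based rather than 1-based as in the abstract. Python integers have arbitrary precision, so the ⌈m/w⌉ factor (splitting the state across machine words) does not appear explicitly here.

```python
def shift_or_search(text, pattern):
    """Return the 0-based start positions of all exact occurrences of pattern in text."""
    m = len(pattern)
    if m == 0:
        return []
    all_ones = (1 << m) - 1

    # masks[c] has bit i cleared exactly when pattern[i] == c.
    masks = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, all_ones) & ~(1 << i)

    # Bit i of state is 0 iff pattern[0..i] matches the text ending at the
    # current position. Initially nothing matches, so every bit is 1.
    state = all_ones
    hit_bit = 1 << (m - 1)
    occurrences = []
    for j, c in enumerate(text):
        # Shifting brings in a 0 at bit 0 (the empty prefix always matches);
        # OR-ing with the mask kills every prefix that c does not extend.
        state = ((state << 1) | masks.get(c, all_ones)) & all_ones
        if state & hit_bit == 0:
            occurrences.append(j - m + 1)
    return occurrences
```

Each text character costs one shift and one OR on the whole state, which is where the O(n⌈m/w⌉) bound comes from when the state is held in machine words.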
Optimal exact and fast approximate two dimensional pattern matching allowing rotations
 In Proc. 13th Annual Symposium on Combinatorial Pattern Matching (CPM 2002), LNCS 2373, 2002
Abstract

Cited by 8 (2 self)
We give fast filtering algorithms to search for a 2-dimensional pattern in a 2-dimensional text allowing any rotation of the pattern. We consider the cases of exact and approximate matching under several matching models, improving the previous results. For a text of size n × n characters and a pattern of size m × m characters, the exact matching takes average time O(n² log m / m²), which is optimal. If we allow k mismatches of characters, then our best algorithm achieves O(n² k log m / m²) average time, for reasonable k values. For large k, we obtain an O(n² k^(3/2) √(log m) / m) average time algorithm. We generalize
Efficient Evaluation of Parameterized Pattern Queries
 In Proc. Intl. Conf. on Information and Knowledge Management (CIKM), 2005
Abstract

Cited by 7 (2 self)
Many applications rely on sequence databases and make extensive use of pattern-matching queries to retrieve data of interest. This paper extends traditional pattern-matching expressions to parameterized patterns, which feature variables. Parameterized patterns are more expressive and allow concise definitions of regular expressions that would be very complex to describe without variables. They can also be used to express additional constraints on the patterns' variables.
Bit-parallel algorithms for exact circular string matching
 The Computer Journal, 2014
Abstract

Cited by 5 (0 self)
In this paper, we deal with the exact circular string matching problem (abbreviated as ECSM). Given a string P = p_1 p_2 ··· p_m, a string P(i) = p_i p_{i+1} ··· p_m p_1 ··· p_{i−1}, for 1 ≤ i ≤ m, is a circular string of P. Given a text string T = t_1 t_2 ··· t_n and a pattern P, the ECSM problem is to find all occurrences of P(i) in text T for 1 ≤ i ≤ m. This paper proposes two algorithms that search for a circular string in a text using the bit-parallel technique. Our algorithms use only compositions of bitwise logical operations and basic arithmetic operations. These algorithms are named CSBNDM and CSBNDNq, respectively. We give several experiments to verify that they have good performance for random strings and DNA sequences.
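To make the ECSM problem statement concrete, here is a small reference sketch. It is a naive baseline, not the CSBNDM algorithm from the paper. It relies on the standard observation that every rotation P(i) is a length-m substring of PP. The function name `circular_occurrences` is hypothetical, and positions are 0-based.

```python
def circular_occurrences(text, pattern):
    """Return 0-based positions in text where some rotation of pattern occurs."""
    m = len(pattern)
    if m == 0 or m > len(text):
        return []
    # Every rotation of the pattern is a window of length m in pattern + pattern.
    doubled = pattern + pattern
    rotations = {doubled[i:i + m] for i in range(m)}
    # Check each length-m text window against the set of rotations.
    return [j for j in range(len(text) - m + 1) if text[j:j + m] in rotations]
```

This takes O(nm) time because each window is hashed in full. The bit-parallel algorithms in the paper aim to avoid that per-window cost.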
Sequential and indexed two-dimensional combinatorial template matching allowing rotations
 Theoretical Computer Science A, 2005
Abstract

Cited by 3 (2 self)
We present new and faster algorithms to search for a 2-dimensional pattern in a 2-dimensional text allowing any rotation of the pattern. This has applications such as image databases and computational biology. We consider the cases of exact and approximate matching under several matching models, using a combinatorial approach that generalizes string matching techniques. We focus on sequential algorithms, where only the pattern can be preprocessed, as well as on indexed algorithms, where the text is preprocessed and an index built on it. On sequential searching we derive average-case lower bounds and then obtain optimal average-case algorithms for all the matching models. At the same time, these algorithms are worst-case optimal. On indexed searching we obtain search time polylogarithmic on the text size, as well as sublinear time in general for approximate searching.
Fast Multipattern Matching for Intrusion Detection
 In "13th Annual EICAR Conference (European Institute for Computer Anti-Virus Research) CD-ROM: Best Paper Proceedings", U. E. Gattiker (editor), EICAR Best Paper Proceedings CD-ROM (ISBN: 8798727168), 2004
Abstract

Cited by 2 (0 self)
M. Rusinowitch is a senior researcher at INRIA. He received a Ph.D. in Computer Science at Nancy in 1987. He is now leader of the CASSIS research team of INRIA Lorraine, with about 20 members, whose activities are focused on automated deduction, software verification and security. M. Rusinowitch's research is concerned with the automated detection of flaws in software using symbolic analysis techniques. He is the author or coauthor of more than 22 journal papers and 50 conference papers, and is the author of a book. He is also co-chairman of the next IJCAR conference, to be held in 2004 at Cork, and a PC member of several events in automated deduction and security.
Faster String Matching With Super-alphabets
 Information Processing Letters, 2002
Abstract
Given a text T[1...n] and a pattern P[1...m] over some alphabet Σ of size σ, finding the exact occurrences of P in T requires at least Ω(n log m / m) character comparisons on average, as shown in [19]. Consequently, it is believed that this lower bound also implies an Ω(n log m / m) lower bound for the execution time of an optimal algorithm. However, in this paper we show how to obtain an O(n/m) average time algorithm. This is achieved by slightly changing the model of computation, and with a modification of an existing algorithm. Our technique uses a super-alphabet for simulating a suffix automaton. The space usage of the algorithm is O(m). The technique can be applied to many other string matching algorithms, including dictionary matching, which is also solved in expected time O(n/m), and approximate matching allowing k edit operations (mismatches, insertions or deletions of characters). This is solved in expected time O(nk/m) for k ≤ O(m / log m).
A Method to Overcome Computer Word Size Limitation in Bit-parallel Pattern Matching
Abstract
The performance of pattern matching algorithms based on bit-parallelism degrades when the input pattern length exceeds the computer word size. Although several divide-and-conquer methods have been proposed to overcome that limitation, the resulting schemes are not very efficient and are hard to implement. This study introduces a new fast bit-parallel pattern matching algorithm that is capable of searching patterns of any length in a common bit-parallel fashion. The proposed bit-parallel length invariant matcher (BLIM) is compared with the Shift-Or and bit-parallel nondeterministic matching (BNDM) algorithms along with the standard Boyer-Moore and Sunday's quick search, which are known to be very fast in general. Benchmarks have been conducted on natural language, DNA sequence, and binary alphabet random texts. Besides the length invariant architecture of the algorithm, experimental results indicate that on average BLIM is 18%, 44%, and 6% faster than BNDM, which is accepted as one of the fastest algorithms of this genre, on natural language, DNA sequence and binary random texts respectively.
Algorithms for the Hybrid Constrained Longest Common Subsequence Problem
 The 27th Workshop on Combinatorial Mathematics and Computation Theory
Abstract
We investigate a variant of the longest common subsequence problem. Given two sequences X, Y and two constrained patterns P, Q of lengths m, n, p, and q, respectively, the hybrid constrained longest common subsequence problem is to find a longest common subsequence of X and Y such that the resulting LCS is both a supersequence of P and a nonsupersequence of Q. Without loss of generality, assume that m ≤ n. We present a new dynamic programming algorithm for solving this problem in O(mnpq) time and space. We also propose another algorithm by restricting the computation to the positions of matches between X and Y. The latter algorithm requires O(pqr log log n + n log n) time over an infinite alphabet and O((pqr + n) log log n) time over a finite alphabet, and O(pq(r + n)) space for both cases, where r denotes the total number of matches between X and Y.
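The problem definition above can be illustrated with a small memoized dynamic program over O(mnpq) states. This is one natural formulation consistent with the stated bound, offered as a sketch. It is not necessarily the recurrence the authors use, and it returns only the length. The function name `hybrid_clcs_length` and the convention of returning -1 when no valid subsequence exists are assumptions made for this example.

```python
from functools import lru_cache


def hybrid_clcs_length(x, y, p, q):
    """Length of a longest common subsequence of x and y that contains p as a
    subsequence and does not contain q as a subsequence; -1 if none exists."""
    if len(q) == 0:
        return -1  # every string contains the empty string as a subsequence
    neg_inf = float("-inf")

    @lru_cache(maxsize=None)
    def best(i, j, a, b):
        # a = number of characters of p matched greedily by the subsequence
        # built so far; b = the same for q (kept strictly below len(q)).
        if i == len(x) or j == len(y):
            return 0 if a == len(p) else neg_inf
        result = max(best(i + 1, j, a, b), best(i, j + 1, a, b))
        if x[i] == y[j]:
            c = x[i]
            a2 = a + 1 if a < len(p) and p[a] == c else a
            b2 = b + 1 if q[b] == c else b
            if b2 < len(q):  # taking c must not complete q
                result = max(result, 1 + best(i + 1, j + 1, a2, b2))
        return result

    answer = best(0, 0, 0, 0)
    return -1 if answer == neg_inf else answer
```

Greedy prefix matching is enough here because a string contains P (or Q) as a subsequence exactly when a left-to-right greedy scan consumes all of it. The pair (a, b) therefore captures everything the remaining choices depend on.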
Today's Network Intrusion Prevention Systems
Abstract
(NIPS) provide an important defense mechanism against security threats. The detection of network attacks utilizes a high-speed pattern matching algorithm that can be implemented in either hardware or software. Adapting a software-based pattern matching algorithm to a hardware-based device is a complicated task. This paper presents a cost-effective multi-pattern matching algorithm based on Field Programmable Gate Arrays (FPGAs) and standard RAM. The algorithm achieves a line-rate speed several orders of magnitude faster than the current state of the art, while attaining similar detection accuracy. The algorithm can be easily adapted to operate in hardware-based NIPS and attain even higher line speed by utilizing a TCAM memory.