Alphabet Independent And Dictionary Scaled Matching
, 1996
Cited by 11 (4 self)
The rapidly growing need for analysis of digitized images in multimedia systems has led to a variety of interesting problems in multidimensional pattern matching. One of the problems is that of scaled matching, finding all appearances of a pattern in a text in all sizes. Another important problem is dictionary matching, quick search through a dictionary of preprocessed patterns in order to find all dictionary patterns that appear in the input text. In this paper we provide a simple algorithm for two-dimensional scaled matching. Our algorithm is the first linear-time alphabet-independent scaled matching algorithm. Its running time is O(|T|), where |T| is the text size, and is independent of |Σ|, the size of the alphabet. The main idea behind our algorithm is identifying and exploiting a scaling-invariant property of patterns. Our technique generalizes to produce the first known algorithm for scaled dictionary matching. We can find all appearances of all dictionary pa...
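To make the scaled matching problem concrete, here is a minimal brute-force sketch. It assumes the usual reading of "scaling" by an integer factor k, in which every pattern symbol becomes a k × k block; that definition and the names `scale` and `naive_scaled_occurrences` are illustrative assumptions, and this is not the linear-time algorithm the abstract describes.

```python
def scale(pattern, k):
    """Scale a 2D pattern (list of equal-length strings) by an integer factor k.

    Assumption: each symbol is replaced by a k x k block of that symbol.
    """
    return ["".join(ch * k for ch in row) for row in pattern for _ in range(k)]


def naive_scaled_occurrences(text, pattern):
    """Return (row, col, k) for every position where the k-scaled pattern occurs.

    Brute force: tries every scale that fits and every top-left corner.
    """
    n_rows, n_cols = len(text), len(text[0])
    hits = []
    k = 1
    while k * len(pattern) <= n_rows and k * len(pattern[0]) <= n_cols:
        scaled = scale(pattern, k)
        h, w = len(scaled), len(scaled[0])
        for r in range(n_rows - h + 1):
            for c in range(n_cols - w + 1):
                if all(text[r + i][c:c + w] == scaled[i] for i in range(h)):
                    hits.append((r, c, k))
        k += 1
    return hits
```

The brute-force search re-examines the text once per scale, which is exactly the cost a linear-time method has to avoid.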
Two-Dimensional Periodicity in Rectangular Arrays
- Proc. of the 3rd ACM-SIAM Symposium on Discrete Algorithms
, 1992
Cited by 9 (1 self)
String matching is rich with a variety of algorithmic tools. In contrast, multidimensional matching has had a rather sparse set of techniques. This paper presents a new algorithmic technique for two-dimensional matching: periodicity analysis. Its strength appears to lie in the fact that it is inherently two-dimensional. Periodicity in strings has been used to solve string matching problems. Multidimensional periodicity, though, is not as simple as it is in strings and was not formally studied or used in pattern matching. In this paper, we define and analyze two-dimensional periodicity in rectangular arrays. One definition of string periodicity is that a periodic string can self-overlap in a particular way. An analogous concept holds in two dimensions. The self-overlap vectors of a rectangle generate a regular pattern of locations where the rectangle may originate. Based on this regularity, we define four categories of periodic arrays: non-periodic, lattice-periodic, line-periodic and...
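The notion of a rectangle overlapping itself can be illustrated with a small sketch. It assumes that a shift (dr, dc) counts as a self-overlap vector when the array and its shifted copy agree on every cell they share; the paper's formal definition may add further conditions, and the function names are hypothetical.

```python
def overlaps_itself(a, dr, dc):
    """True if array a agrees with a copy of itself shifted by (dr, dc)."""
    rows, cols = len(a), len(a[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            # Only cells that fall inside both copies are compared.
            if 0 <= r2 < rows and 0 <= c2 < cols and a[r][c] != a[r2][c2]:
                return False
    return True


def self_overlap_vectors(a):
    """List the nonzero shifts (dr >= 0) with a nonempty, consistent overlap."""
    rows, cols = len(a), len(a[0])
    vectors = []
    for dr in range(rows):
        for dc in range(-cols + 1, cols):
            if dr == 0 and dc <= 0:
                continue  # skip the zero shift and mirror images of row shifts
            if overlaps_itself(a, dr, dc):
                vectors.append((dr, dc))
    return vectors
```

A checkerboard-like array such as `["abab", "baba", "abab"]` overlaps itself under (1, 1) and (0, 2) but not under (0, 1).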
Two and Higher Dimensional Pattern Matching in Optimal Expected Time
, 1994
Cited by 9 (4 self)
Algorithms with optimal expected running time are presented for searching the occurrences of a two-dimensional m × m pattern P in a two-dimensional n × n text T over an alphabet of size c. The algorithms are based on placing in the text a static grid of test points, determined only by n, m and c (not dynamically by earlier test results). Using test strings read from the test points the algorithms eliminate as many potential occurrences of P as possible. The remaining potential occurrences are separately checked for actual occurrences. A suitable choice of the test point set leads to algorithms with expected running time O(n^2 log_c(m^2) / m^2) using the uniform Bernoulli model of randomness. This is shown to be optimal by a generalization of a one-dimensional lower bound result by Yao. Experimental results show that the algorithms are efficient in practice, too. The method is also generalized for the k mismatches problem. The resulting algorithm has expected running ti...
On the Comparison Complexity of the String Prefix-Matching Problem
- In Proc. 2nd European Symposium on Algorithms, number 855 in Lecture Notes in Computer Science
, 1995
Cited by 6 (0 self)
In this paper we study the exact comparison complexity of the string prefix-matching problem in the deterministic sequential comparison model with equality tests. We derive almost tight lower and upper bounds on the number of symbol comparisons required in the worst case by on-line prefix-matching algorithms for any fixed pattern and variable text. Unlike previous results on the comparison complexity of string-matching and prefix-matching algorithms, our bounds are almost tight for any particular pattern. We also consider the special case where the pattern and the text are the same string. This problem, which we call the string self-prefix problem, is similar to the pattern preprocessing step of the Knuth-Morris-Pratt string-matching algorithm that is used in several comparison efficient string-matching and prefix-matching algorithms, including in our new algorithm. We obtain roughly tight lower and upper bounds on the number of symbol comparisons required in the worst case...
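For orientation, the Knuth-Morris-Pratt preprocessing step that the abstract compares the self-prefix problem to can be sketched as follows. This is the textbook failure-function computation using only equality tests, not the comparison-efficient algorithm of the paper; the name `failure_function` is an illustrative choice.

```python
def failure_function(p):
    """Textbook KMP failure function.

    fail[i] is the length of the longest proper prefix of p[:i + 1]
    that is also a suffix of p[:i + 1].
    """
    fail = [0] * len(p)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(p)):
        # Fall back through shorter borders until p[i] extends one of them.
        while k > 0 and p[i] != p[k]:
            k = fail[k - 1]
        if p[i] == p[k]:
            k += 1
        fail[i] = k
    return fail
```

Each loop iteration performs symbol equality tests only, which is the cost measure the comparison model counts.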
Fast Parallel String Prefix-Matching
- Theoret. Comput. Sci
, 1992
Cited by 6 (2 self)
An O(log log m) time, (n log m / log log m)-processor CRCW-PRAM algorithm for the string prefix-matching problem over a general alphabet is presented. The algorithm can also be used to compute the KMP failure function in O(log log m) time on m log m / log log m processors. These results improve on the running time of the best previous algorithm for both problems, which was O(log m), while preserving the same number of operations.

1 Introduction

String matching is the problem of finding all occurrences of a short pattern string P[1..m] in a longer text string T[1..n]. The classical sequential algorithm of Knuth, Morris and Pratt [12] solves the string matching problem in time that is linear in the length of the input strings. The Knuth-Morris-Pratt [12] string matching algorithm can be easily generalized to find the longest pattern prefix that starts at each text position within the same time bound. We refer to this problem as string prefix-matching. In parallel, the string matching p...
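The string prefix-matching problem defined above can be solved sequentially in linear time in several ways. The sketch below uses the Z-array technique rather than the KMP generalization the text mentions, and it is unrelated to the parallel CRCW-PRAM algorithm; the function names and the `None` separator are assumptions made for illustration.

```python
def z_array(s):
    """z[i] = length of the longest common prefix of s and s[i:]."""
    n = len(s)
    z = [0] * n
    if n:
        z[0] = n
    left = right = 0  # s[left:right] is the rightmost match of a prefix of s
    for i in range(1, n):
        if i < right:
            z[i] = min(right - i, z[i - left])
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z


def prefix_match(pattern, text):
    """For each text position, the length of the longest pattern prefix starting there."""
    # None acts as a separator that equals no character of either string.
    combined = list(pattern) + [None] + list(text)
    return z_array(combined)[len(pattern) + 1:]
```

For example, `prefix_match("aba", "ababaac")` reports full matches at positions 0 and 2 and partial matches of length 1 at positions 4 and 5.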
Optimal exact and fast approximate two dimensional pattern matching allowing rotations
- In Proc. 13th Annual Symposium on Combinatorial Pattern Matching (CPM 2002), LNCS 2373
, 2002
Cited by 5 (2 self)
Abstract. We give fast filtering algorithms to search for a 2-dimensional pattern in a 2-dimensional text allowing any rotation of the pattern. We consider the cases of exact and approximate matching under several matching models, improving the previous results. For a text of size n × n characters and a pattern of size m × m characters, the exact matching takes average time O(n^2 log m / m^2), which is optimal. If we allow k mismatches of characters, then our best algorithm achieves O(n^2 k log m / m^2) average time, for reasonable k values. For large k, we obtain an O(n^2 k^(3/2) sqrt(log m) / m) average time algorithm. We generalize
Multidimensional Pattern Matching: A Survey
, 1992
Cited by 5 (0 self)
We review some recent algorithms motivated by computer vision. The problem inspiring this research is that of searching an aerial photograph for all appearances of some object. The issues we discuss are local errors, scaling, compression and dictionary matching. We review deterministic serial techniques that are used for multidimensional pattern matching and discuss their strengths and weaknesses.

College of Computing, Georgia Institute of Technology, Atlanta, Georgia 30332-0280. Partially supported by NSF grant IRI-9013055.

1 Motivation

String Matching is one of the most widely studied problems in computer science [Gal85]. Part of its appeal is in its direct applicability to "real world" problems. The Knuth-Morris-Pratt [KMP77] algorithm is directly implemented in the emacs "s" and UNIX "grep" commands. The longest common subsequence dynamic programming algorithm [CKK72] is implemented in the UNIX "diff" command. The largest overlap heuristic for finding the shortest common s...
Approximate parameterized matching
- In Proc. 12th European Symposium on Algorithms (ESA
, 2004
Cited by 4 (2 self)
Abstract. Two equal length strings s and s', over alphabets Σ_s and Σ_s', parameterize match if there exists a bijection π: Σ_s → Σ_s' such that π(s) = s', where π(s) is the renaming of each character of s via π. Parameterized matching is the problem of finding all parameterized matches of a pattern string p in a text t, and approximate parameterized matching is the problem of finding, at each location, a bijection π that maximizes the number of characters that are mapped from p to the appropriate |p|-length substring of t. Parameterized matching was introduced as a model for software duplication detection in software maintenance systems and also has applications in image processing and computational biology. For example, approximate parameterized matching models image searching with variable color maps in the presence of errors. We consider the problem for which an error threshold, k, is given and the goal is to find all locations in t for which there exists a bijection π which maps p into the appropriate |p|-length substring of t with at most k mismatched mapped elements. We show that (1) approximate parameterized matching, when |p| = |t|, is equivalent to the maximum matching problem on graphs, implying that (2) maximum matching is reducible to approximate parameterized matching with threshold k, up to an O(log |t|) factor (this can be achieved by reducing approximate parameterized matching to the problem by using a binary search on the k's). Given the best known maximum matching algorithms, an O(m^1.5) bound, where m = |p| = |t|, is implied for approximate parameterized matching. We show that (3) for the k threshold problem we can do this in O(m + k^1.5). Our main result (4) is an O(n k^1.5 + mk log m) time algorithm where m = |p| and n = |t|.

1 Introduction

In the traditional pattern matching model [11, 19], one seeks exact occurrences of a given pattern p in a text t, i.e. text locations where every text symbol is equal to its corresponding pattern symbol. For two equal length strings
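The definition of an exact parameterized match given in the abstract can be checked directly with two dictionaries. The sketch below covers only that exact case (no error threshold k) and takes the alphabets to be the symbols actually occurring in each string; the function name `parameterize_match` is an illustrative choice, not the paper's.

```python
def parameterize_match(s, t):
    """True if some bijection between the symbols of s and of t renames s into t."""
    if len(s) != len(t):
        return False
    forward, backward = {}, {}
    for a, b in zip(s, t):
        # Each symbol must keep a single image and a single preimage.
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True
```

Tracking both directions is what makes the renaming a bijection: `"ab"` does not match `"cc"` because two symbols would share one image.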
Sequential and indexed two-dimensional combinatorial template matching allowing rotations
- THEORETICAL COMPUTER SCIENCE A
, 2005
Cited by 3 (2 self)
We present new and faster algorithms to search for a 2-dimensional pattern in a 2-dimensional text allowing any rotation of the pattern. This has applications such as image databases and computational biology. We consider the cases of exact and approximate matching under several matching models, using a combinatorial approach that generalizes string matching techniques. We focus on sequential algorithms, where only the pattern can be preprocessed, as well as on indexed algorithms, where the text is preprocessed and an index built on it. On sequential searching we derive average-case lower bounds and then obtain optimal average-case algorithms for all the matching models. At the same time, these algorithms are worst-case optimal. On indexed searching we obtain search time polylogarithmic on the text size, as well as sublinear time in general for approximate searching.
Efficient String Matching on Coded Texts
- In Proceedings of Combinatorial Pattern Matching, 6th Annual Symposium (CPM'95
, 1994
Cited by 2 (1 self)
The so-called "four Russians technique" is often used to speed up algorithms by encoding several data items in a single memory cell. Given a sequence of n symbols over a constant size alphabet, one can encode the sequence into O(n/λ) memory cells in O(log λ) time using n / log λ processors. This paper presents an efficient CRCW-PRAM string-matching algorithm for coded texts that takes O(log log(m/λ)) time making only O(n/λ) operations, an improvement by a factor of λ = O(log n) on the number of operations used in previous algorithms. Using this string-matching algorithm one can test if a string is square-free and find all palindromes in a string in O(log log n) time using n / log log n processors.

1 Introduction

In the string-matching problem one is searching for occurrences of a pattern string P[1..m] in a text string T[1..n]. There exist several O(n + m) time sequential string-matching algorithms that are used in a large variety of applications. Galil [23] published the first efficient...

