Practical Algorithms for Transposition-Invariant String-Matching
Abstract

Cited by 10 (4 self)
We consider the problems of (1) longest common subsequence (LCS) of two given strings in the case where the first may be shifted by some constant (that is, transposed) to match the second, and (2) transposition-invariant text searching using indel distance. These problems have applications in music comparison and retrieval. We introduce two novel techniques to solve these problems efficiently. The first is based on the branch and bound method, the second on bit-parallelism. Our branch and bound algorithm computes the longest common transposition-invariant subsequence (LCTS) in time O((m² + log log σ) log σ) in the best case and O((m² + log σ)σ) in the worst case, where m and σ, respectively, are the length of the strings and the size of the alphabet. On the other hand, we show that the same problem can be solved by using bit-parallelism and thus obtain a speedup of O(w/log m) over the classical algorithms, where the computer word has w bits. The advantage of this latter algorithm over the present bit-parallel ones is that it allows the use of more complex distances, including general integer weights. Since our branch and bound method is very flexible, it can be further improved by combining it with other efficient algorithms such as our novel bit-parallel algorithm. We experiment on several combination possibilities and discuss which are the best settings for each of those combinations. Our algorithms are easily extended to other musically relevant cases, such as δ-matching and polyphony (where there are several parallel texts to be considered). We also show how our bit-parallel algorithm is adapted to text searching and illustrate its effectiveness in complex cases where the only known competing method is the use of brute force.
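To give a feel for what bit-parallelism means for LCS, here is a minimal Python sketch of a well-known bit-vector recurrence for the LCS length. It is not the algorithm introduced in the abstract above (which also handles more general weights); it only illustrates how a whole column of the dynamic-programming table can be updated with a few word operations. The function name `lcs_length_bitparallel` is a hypothetical choice, and Python's unbounded integers stand in for machine words of w bits.

```python
def lcs_length_bitparallel(a, b):
    """LCS length of sequences a and b using a bit-vector recurrence.

    Illustrative sketch only: Python integers play the role of
    (arbitrarily wide) machine words.
    """
    m = len(a)
    if m == 0:
        return 0
    mask = (1 << m) - 1

    # match[x] has bit i set exactly when a[i] == x.
    match = {}
    for i, x in enumerate(a):
        match[x] = match.get(x, 0) | (1 << i)

    v = mask  # all ones: no matches recorded yet
    for y in b:
        u = v & match.get(y, 0)
        # One column update of the LCS table in a constant number
        # of word operations (addition, subtraction, OR).
        v = ((v + u) | (v - u)) & mask

    # Each zero bit in v corresponds to one unit of LCS length.
    return m - bin(v).count("1")
```

For example, `lcs_length_bitparallel("ABCBDAB", "BDCABA")` can be compared against a plain O(mn) dynamic-programming implementation. The sequences may also be lists of integer pitches, since symbols are only used as dictionary keys.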
Rotation and lighting invariant template matching
In Proc. 6th Latin American Symposium on Theoretical Informatics (LATIN 2004), LNCS 2976, 2003
Abstract

Cited by 5 (3 self)
We address the problem of searching for a two-dimensional pattern in a two-dimensional text (or image), such that the pattern can be found even if it appears rotated and it is brighter or darker than its occurrence. Furthermore, we consider approximate matching under several tolerance models. We obtain algorithms that are almost optimal both in the worst and the average cases simultaneously. The complexities we obtain are very close to the best current results for the case where only rotations, but not lighting invariance, are supported. These are the first results for this problem under a combinatorial approach.
Bit-parallel branch & bound algorithm for transposition invariant LCS
In Proc. 11th International Symposium on String Processing and Information Retrieval (SPIRE’04), Lecture Notes in Comput. Sci., 2004
Sequential and indexed two-dimensional combinatorial template matching allowing rotations
Theoretical Computer Science A, 2005
Abstract

Cited by 3 (2 self)
We present new and faster algorithms to search for a 2-dimensional pattern in a 2-dimensional text allowing any rotation of the pattern. This has applications such as image databases and computational biology. We consider the cases of exact and approximate matching under several matching models, using a combinatorial approach that generalizes string matching techniques. We focus on sequential algorithms, where only the pattern can be preprocessed, as well as on indexed algorithms, where the text is preprocessed and an index built on it. On sequential searching we derive average-case lower bounds and then obtain optimal average-case algorithms for all the matching models. At the same time, these algorithms are worst-case optimal. On indexed searching we obtain search time polylogarithmic on the text size, as well as sublinear time in general for approximate searching.
Improved Time and Space Complexities for Transposition Invariant String Matching
2004
Abstract

Cited by 2 (1 self)
Given strings A = a_1a_2...a_m and B = b_1b_2...b_n over a finite alphabet Σ ⊂ Z of size O(σ), and a distance d() defined among strings, the transposition invariant version of d() is d^t(A,B) = min_{t∈Z} d(A+t, B), where A+t = (a_1+t)(a_2+t)...(a_m+t). Distances d() of most interest are Levenshtein distance and indel distance (the dual of the Longest Common Subsequence), which can be computed in O(mn) time. Recent algorithms compute d^t(A,B) in O(mn log log min(m,n)) time for those distances. In this paper we show how those complexities can be reduced to O(mn log log σ). Furthermore, we reduce the space requirements from O(mn) to O(σ² + min(m,n)).
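The definition of the transposition invariant distance can be made concrete with a brute-force Python sketch. It is only a reference implementation of the definition, not the improved algorithm of this paper: it tries every transposition t of the form b_j − a_i (any other t makes no symbol of A+t equal to a symbol of B, so it cannot beat these candidates for Levenshtein distance) and runs the standard O(mn) dynamic program for each. The function names are hypothetical.

```python
def levenshtein(a, b):
    """Standard O(mn) Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # match or substitute
        prev = curr
    return prev[-1]


def transposition_invariant_distance(a, b):
    """Brute-force min over t of levenshtein(A + t, B) for integer sequences."""
    # Only transpositions that align at least one pair of symbols can help;
    # fall back to t = 0 when either sequence is empty.
    candidates = {y - x for x in a for y in b} or {0}
    return min(levenshtein([x + t for x in a], b) for t in candidates)
```

For instance, with MIDI-like pitches, `transposition_invariant_distance([60, 62, 64], [65, 67, 69])` is 0, because shifting the first sequence by t = 5 reproduces the second.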
V.: A survey of query-by-humming similarity methods
In: Conf. on Pervasive Technologies Related to Assistive Environments (PETRA), 2012
Abstract

Cited by 1 (0 self)
Performing similarity search in large databases is a problem of particular interest in many communities, such as music, database, and data mining. Although several solutions have been proposed in the literature that perform well in many application domains, there is no best method to solve this kind of problem in a Query-By-Humming (QBH) application. In QBH the goal is to find the song(s) most similar to a hummed query in an efficient manner. In this paper, we focus on providing a brief overview of the representations to encode music pieces, and also on the methods that have been proposed for QBH or other similarly defined problems.
O(mn log σ) Time Transposition Invariant LCS Computation
Abstract
Abstract. Given strings A and B of lengths m and n over a finite alphabet Σ ⊂ Z of size O(σ), the length of the longest common transposition invariant subsequence is LCTS(A, B) = max_{t∈Z} LCS(A+t, B), where A+t = (a_1+t)(a_2+t)...(a_m+t) and LCS(A+t, B) is the length of the longest common subsequence between A+t and B. LCTS(A, B) can be computed naively in O(mnσ) time. We present a simple and easy-to-implement algorithm obtaining O(mn log σ) time. We also show that transposition invariant Levenshtein distance can be computed in O(mn√σ) time.
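The naive O(mnσ) computation mentioned in this abstract can be sketched directly from the definition of LCTS. The Python below is an illustrative baseline under that reading, not the O(mn log σ) algorithm of the paper: it runs the textbook LCS dynamic program once per candidate transposition t = b_j − a_i, of which there are O(σ) over an integer alphabet of size O(σ). The names `lcs_length` and `lcts_length` are hypothetical.

```python
def lcs_length(a, b):
    """Textbook O(mn) LCS length using a rolling row."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcts_length(a, b):
    """Naive LCTS(A, B): maximum over t of LCS(A + t, B) for integer sequences."""
    # A transposition that aligns no pair of symbols yields LCS 0,
    # so only differences b_j - a_i need to be tried.
    candidates = {y - x for x in a for y in b}
    return max((lcs_length([x + t for x in a], b) for t in candidates),
               default=0)
```

With `a = [1, 3, 5, 8]` and `b = [11, 13, 20, 15]`, the transposition t = 10 aligns 11, 13 and 15, so `lcts_length(a, b)` is 3.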
Practical Algorithms for Transposition-Invariant String-Matching
Abstract
We consider the problems of (1) longest common subsequence (LCS) of two given strings in the case where the first may be shifted by some constant (that is, transposed) to match the second, and (2) transposition-invariant text searching using indel distance. These problems have applications in music comparison and retrieval. We introduce two novel techniques to solve these problems efficiently. The first is based on the branch and bound method, the second on bit-parallelism. Our branch and bound algorithm computes the longest common transposition-invariant subsequence (LCTS) in time O((m² + log log σ) log σ) in the best case and O((m² + log σ)σ) in the worst case, where m and σ, respectively, are the length of the strings and the size of the alphabet. On the other hand, we show that the same problem can be solved by using bit-parallelism and thus obtain a speedup of O(w/log m) over the classical algorithms, where the computer word has w bits. The advantage of this latter algorithm over the present bit-parallel ones is that it allows the use of more complex distances, including general integer weights. Since our branch and bound ...
International Journal of Document Analysis (2005) DOI 10.1007/s1003200501476 REGULAR PAPER
2005
Abstract
Abstract. A significant portion of currently available documents exist in the form of images, for instance, as scanned documents. Electronic documents produced by scanning and OCR software contain recognition errors. This paper uses an automatic approach to examine the selection and the effectiveness of searching techniques for possible erroneous terms for query expansion. The proposed method consists of two basic steps. In the first step, confused characters in erroneous words are located and editing operations are applied to create a collection of erroneous errorgrams in the basic unit of the model. The second step uses query terms and errorgrams to generate additional query terms, identify appropriate matching terms, and determine the degree of relevance of retrieved document images to the user’s query, based on a vector space IR model. The proposed approach has been trained on 979 document images to construct about 2,822 errorgrams and tested on 100 scanned Web pages,