Results 11-20 of 22
Sequence Comparison: Some Theory and Some Practice
, 1988
Cited by 7 (0 self)
A brief survey of the theory and practice of sequence comparison is made, focusing on diff, the UNIX file difference utility. Sequence comparison is a deep and fascinating subject in Computer Science, both theoretical and practical. However, in our opinion, neither the theoretical nor the practical aspects of the problem are well understood, and we feel that their mastery is a true challenge for Computer Science. The central problem can be stated very easily: find an algorithm, as efficient and practical as possible, to compute a longest common subsequence (lcs for short) of two given sequences. As usual, a subsequence of a sequence is another sequence obtained from it by deleting some (not necessarily contiguous) terms. Thus, both en/pri and en/pai are longest common subsequences of sequence/comparison and theory/and/practice. Part of this work was done while the author was visiting the Université de Rouen, in 1987. That visit was partially supported...
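As a rough illustration of the problem stated in this abstract, the sketch below uses the textbook quadratic dynamic program to recover one longest common subsequence. It is not the algorithm used by diff or proposed in the paper; the function names `lcs` and `is_subsequence` are hypothetical, and the `/` characters are treated as ordinary symbols, exactly as they appear in the abstract's example.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (textbook DP)."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the bottom-right corner to recover one witness.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def is_subsequence(s: str, t: str) -> bool:
    """True if s can be obtained from t by deleting zero or more symbols."""
    remaining = iter(t)
    return all(ch in remaining for ch in s)


if __name__ == "__main__":
    first, second = "sequence/comparison", "theory/and/practice"
    print(lcs(first, second))
    print(is_subsequence("en/pri", first), is_subsequence("en/pri", second))
```

The helper `is_subsequence` makes it easy to check that a candidate such as en/pri really is common to both inputs; which particular longest common subsequence `lcs` returns depends on how ties are broken during the walk back.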
New Algorithms for the Longest Common Subsequence Problem
, 1994
Cited by 6 (0 self)
Given two sequences A = a_1 a_2 ... a_m and B = b_1 b_2 ... b_n, m ≤ n, over some alphabet Σ, a common subsequence C = c_1 c_2 ... c_l of A and B is a sequence that can be obtained from both A and B by deleting zero or more (not necessarily adjacent) symbols. Finding a common subsequence of maximal length is called the Longest Common Subsequence (LCS) Problem. Two new algorithms based on the well-known paradigm of computing minimal matches are presented. One runs in time O(ns + min{ds, pm}) and the other runs in time O(ns + min{p(n − p), pm}), where s = |Σ| is the alphabet size, p is the length of a longest common subsequence and d is the number of minimal matches. The ns term is charged by a standard preprocessing phase. When m n both algorithms are fast in situations when an LCS is expected to be short as well as in situations when an LCS is expected to be long. Further they show a much smaller degeneration in intermediate situations, especially the second al...
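To see why a bound of this shape is small both for short and for long LCSs, the following sketch simply evaluates the second expression, read here as ns + min{p(n − p), pm}, with all constant factors ignored. The function name `second_bound_estimate` and the sample values are assumptions chosen for illustration only; they do not come from the paper.

```python
def second_bound_estimate(m: int, n: int, s: int, p: int) -> int:
    """Evaluate n*s + min(p*(n - p), p*m), ignoring constant factors.

    m, n: sequence lengths (m <= n); s: alphabet size; p: LCS length.
    """
    return n * s + min(p * (n - p), p * m)


if __name__ == "__main__":
    m = n = 1000
    s = 4
    for p in (10, 500, 990):
        print(p, second_bound_estimate(m, n, s, p), m * n)
```

With these sample values the estimate is 13900 for p = 10 and for p = 990, and 254000 for p = 500, against mn = 1000000 for the plain quadratic table.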
Speeding-up Hirschberg and Hunt-Szymanski LCS Algorithms
, 2003
Cited by 5 (0 self)
Two algorithms are presented that solve the problem of recovering the longest common subsequence of two strings. The first algorithm is an improvement of Hirschberg’s divide-and-conquer algorithm. The second algorithm is an improvement of the Hunt-Szymanski algorithm based on an efficient computation of all dominant match points. These two algorithms use bit-vector operations and are shown to work very efficiently in practice.
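To give a feel for what "bit-vector operations" can mean in this setting, here is a minimal sketch of a well-known bit-parallel recurrence for the length of an LCS. It is not claimed to be either of the two algorithms from this paper, and it computes only the length, not the subsequence itself. The name `llcs_bitvector` is hypothetical, and Python's arbitrary-precision integers stand in for machine words.

```python
def llcs_bitvector(a: str, b: str) -> int:
    """Length of an LCS of a and b using one bit per position of a."""
    m = len(a)
    mask = (1 << m) - 1

    # match[c] has bit i set exactly when a[i] == c.
    match = {}
    for i, ch in enumerate(a):
        match[ch] = match.get(ch, 0) | (1 << i)

    v = mask  # all ones: nothing matched yet
    for ch in b:
        u = v & match.get(ch, 0)
        # One addition, one subtraction and one OR update a whole row at once.
        v = ((v + u) | (v - u)) & mask

    # Each zero bit among the low m bits of v contributes one to the LCS length.
    return m - bin(v).count("1")


if __name__ == "__main__":
    print(llcs_bitvector("abcbdab", "bdcaba"))
```

Each character of `b` costs a constant number of word operations per machine word of `a`, which is where the practical speed-up over a cell-by-cell table comes from.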
Bounds on the number of longest common subsequences
Cited by 4 (2 self)
This paper performs the analysis necessary to bound the running time of known, efficient algorithms for generating all longest common subsequences. That is, we bound the running time as a function of input size for algorithms with time essentially proportional to the output size. This paper considers both the case of computing all distinct LCSs and the case of computing all LCS embeddings. Also included is an analysis of how much better the efficient algorithms are than the standard method of generating LCS embeddings. A full analysis is carried out with running times measured as a function of the total number of input characters, and much of the analysis is also provided for cases in which the two input sequences are of the same specified length or of two independently specified lengths.
Fast and simple computation of all longest common subsequences
 Eprint arXiv:cs.DS/0211001, Comp. Sci. Res. Repository
, 2002
Cited by 3 (2 self)
This paper shows that a simple algorithm produces the all-prefixes-LCSs-graph in O(mn) time for two input sequences of size m and n. Given any prefix p of the first input sequence and any prefix q of the second input sequence, all longest common subsequences (LCSs) of p and q can be generated in time proportional to the output size, once the all-prefixes-LCSs-graph has been constructed. The problem can be solved in the context of generating all the distinct character strings that represent an LCS or in the context of generating all ways of embedding an LCS in the two input strings.
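The idea of precomputing information for all pairs of prefixes and then enumerating LCSs from it can be sketched with the standard length table and a memoised backward walk. This is the plain textbook method for listing distinct LCS strings, not the paper's graph construction, and it makes no claim to output-proportional running time. The name `all_lcs` is hypothetical.

```python
from functools import lru_cache


def all_lcs(x: str, y: str) -> set:
    """Return the set of distinct longest common subsequences of x and y."""
    m, n = len(x), len(y)
    # length[i][j] = LCS length of the prefixes x[:i] and y[:j], in O(mn) time.
    length = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                length[i][j] = length[i - 1][j - 1] + 1
            else:
                length[i][j] = max(length[i - 1][j], length[i][j - 1])

    @lru_cache(maxsize=None)
    def collect(i: int, j: int) -> frozenset:
        # All distinct LCS strings of x[:i] and y[:j].
        if i == 0 or j == 0:
            return frozenset({""})
        if x[i - 1] == y[j - 1]:
            return frozenset(s + x[i - 1] for s in collect(i - 1, j - 1))
        found = set()
        if length[i - 1][j] == length[i][j]:
            found |= collect(i - 1, j)
        if length[i][j - 1] == length[i][j]:
            found |= collect(i, j - 1)
        return frozenset(found)

    return set(collect(m, n))


if __name__ == "__main__":
    print(sorted(all_lcs("ab", "ba")))
```

For the inputs "ab" and "ba" the result is the set containing "a" and "b". Because `collect(i, j)` answers the question for every pair of prefixes, the same table serves any prefix p of the first input and prefix q of the second.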
A New Practical Linear Space Algorithm for the Longest Common Subsequence Problem
Cited by 2 (0 self)
This paper deals with a new practical method for solving the longest common subsequence (LCS) problem. Given two strings of lengths m and n, m ≤ n, on an alphabet of size s, we first present an algorithm which determines the length p of an LCS in O(ns + min{mp, p(n − p)}) time and O(ns) space.
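For comparison, the length p alone can already be computed in space linear in the shorter string by keeping only two rows of the classical table. The sketch below shows that baseline, which runs in O(mn) time; it is not the paper's algorithm and does not achieve the bound quoted above. The name `lcs_length_two_rows` is hypothetical.

```python
def lcs_length_two_rows(a: str, b: str) -> int:
    """LCS length using two rows of the classical table (O(min(m, n)) space)."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(lcs_length_two_rows("abcbdab", "bdcaba"))
```

Recovering an actual LCS, rather than just its length, within linear space needs an additional idea such as divide and conquer over the rows.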
A Scalable and Efficient Systolic Algorithm for the Longest Common Subsequence Problem
 Journal of Information Science and Engineering
, 2002
Cited by 1 (0 self)
In this paper, a scalable and efficient systolic algorithm is presented. For two given strings of length m and n, where m ≥ n, the algorithm can solve the LCS problem in m + 2r − 1 (respectively n + 2r − 1) time steps with r < n/2 (respectively r < m/2) processors. Experimental results show that the algorithm can be faster on multicomputers than all the previous systolic algorithms for the same problem.
Algorithms for Two Versions of LCS Problem for Indeterminate Strings
Cited by 1 (0 self)
We study the complexity of the longest common subsequence (LCS) problem from a new perspective. By an indeterminate string (i-string, in short) we mean a sequence X̃ = X̃[1] X̃[2] ... X̃[n], where X̃[i] ⊆ Σ for each i, and Σ is a given alphabet of potentially large size. A subsequence of X̃ is any usual string over Σ which is an element of the finite (but usually of exponential size) language X̃[i_1] X̃[i_2] ... X̃[i_p], where 1 ≤ i_1 < i_2 < i_3 ... < i_p ≤ n, p ≥ 0. Similarly, we define a supersequence of x. Our first version of the LCS problem is Problem ILCS: for given i-strings X̃ and Ỹ, find their longest common subsequence. From the complexity point of view, new parameters of the input correspond to |Σ| and the maximum size ℓ of the subsets in X̃ and Ỹ. There is also a third parameter R, which gives a measure of similarity between X̃ and Ỹ. The smaller R is, the less time is needed to solve Problem ILCS. Our second version of the LCS problem is Problem CILCS (constrained ILCS): for given i-strings X̃ and Ỹ and a plain string Z, find the longest
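Under the definition given in this abstract, a plain string is a common subsequence of two i-strings exactly when each of its symbols can be picked from a pair of positions whose symbol sets intersect. That observation suggests the straightforward baseline below, which reuses the classical table with "the sets intersect" in place of "the symbols are equal". It is only a sketch of the problem statement, not the paper's algorithm, and it does not exploit the parameters ℓ or R. The name `ilcs_length` and the representation of an i-string as a list of Python sets are assumptions.

```python
def ilcs_length(xs, ys) -> int:
    """LCS length of two indeterminate strings given as sequences of sets."""
    prev = [0] * (len(ys) + 1)
    for x_set in xs:
        curr = [0]
        for j, y_set in enumerate(ys, start=1):
            if x_set & y_set:
                # Some symbol can be chosen at both positions: they can match.
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    x = [{"a", "b"}, {"c"}]
    y = [{"b"}, {"c", "d"}]
    print(ilcs_length(x, y))
```

In the usage example the common subsequence "bc" has length 2: "b" is available at the first position of both i-strings and "c" at the second.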
String comparison by transposition networks
, 903
Cited by 1 (0 self)
Computing string or sequence alignments is a classical method of comparing strings and has applications in many areas of computing, such as signal processing and bioinformatics. Semi-local string alignment is a recent generalisation of this method, in which the alignment of a given string and all substrings of another string are computed simultaneously at no additional asymptotic cost. In this paper, we show that there is a close connection between semi-local string alignment and a certain class of traditional comparison networks known as transposition networks. The transposition network approach can be used to represent different string comparison algorithms in a unified form, and in some cases provides generalisations or improvements on existing algorithms. This approach allows us to obtain new algorithms for sparse semi-local string comparison and for comparison of highly similar and highly dissimilar strings, as well as of run-length compressed strings. We conclude that the transposition network method is a very general and flexible way of understanding and improving different string comparison algorithms, as well as their efficient implementation.
Using fuzzy linguistic summaries for the comparison of time series: an application to the analysis of investment fund quotations
 IFSA-EUSFLAT
, 2009
We propose a new, human-consistent method for the evaluation of similarity of time series that uses a fuzzy quantifier based aggregation of trends (segments), within the authors’ (cf. Kacprzyk, Wilbik, Zadrożny [1, 2, 3, 4, 5, 6] or Kacprzyk, Wilbik [7, 8, 9]) approach to the linguistic summarization of trends based on Zadeh’s protoforms and fuzzy logic with linguistic quantifiers. The results obtained are very intuitively appealing and justified by valuable outcomes of similarity analyses between quotations of an investment fund and the two main indexes of the Warsaw Stock Exchange.