Results 1 - 2 of 2
Sequential and Parallel Algorithms for the All-Substrings Longest Common Subsequence Problem
Abstract
Given two strings A and B of lengths n_a and n_b, respectively, the All-substrings Longest Common Subsequence (ALCS) problem obtains, for any substring B' of B, the length of the longest string that is a subsequence of both A and B'. The sequential algorithm takes O(n_a n_b) time and O(n_b) space. We present a parallel algorithm for the ALCS on the Coarse Grained Multicomputer (BSP/CGM) model with p < sqrt(n_a) processors, that takes O(n_a n_b / p) time and O(n_b sqrt(n_a)) space per processor, with O(log p) communication rounds. The proposed algorithm also solves the basic Longest Common Subsequence (LCS) Problem that finds the longest string (and not only its length) that is a subsequence of both A and B. To our knowledge, this is the best BSP/CGM algorithm for the LCS and ALCS problems in the literature.
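To make the ALCS problem statement concrete, here is a minimal brute-force sketch in Python. It is not the algorithm from the paper: it reruns the classic rolling-row LCS dynamic program once per starting position of B, so it takes O(n_a n_b^2) time rather than O(n_a n_b). The function names `lcs_row` and `alcs_naive` are hypothetical, chosen only for this illustration.

```python
def lcs_row(a: str, b: str) -> list[int]:
    """Return the last row of the LCS table of a against b.

    Entry k of the result is the LCS length of a and the prefix b[:k].
    Uses a rolling row, so extra space is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev


def alcs_naive(a: str, b: str) -> list[list[int]]:
    """Return table where table[i][j] is the LCS length of a and b[i:j].

    Only entries with 0 <= i <= j <= len(b) are meaningful; the rest stay 0.
    This is a reference implementation of the problem definition only.
    """
    n = len(b)
    table = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        # One DP pass against the suffix b[i:] covers every end position j.
        row = lcs_row(a, b[i:])
        for k, value in enumerate(row):
            table[i][i + k] = value
    return table


table = alcs_naive("abc", "bac")
print(table[0][3])  # LCS length of "abc" and the whole of "bac"
```

A table like this is useful mainly as a correctness check for faster ALCS implementations on small inputs.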
Efficient Dominant Point Algorithms for the Multiple Longest Common Subsequence (MLCS) Problem
Abstract
Finding the longest common subsequence of multiple strings is a classical computer science problem and has many applications in the areas of bioinformatics and computational genomics. In this paper, we present a new sequential algorithm for the general case of the MLCS problem, and its parallel realization. The algorithm is based on the dominant point approach and employs a fast divide-and-conquer technique to compute the dominant points. When applied to find an MLCS of 3 strings, our general algorithm is shown to exhibit the same performance as the best existing MLCS algorithm by Hakata and Imai, designed specifically for the case of 3 strings. Moreover, we show that for the general case of more than 3 strings, the algorithm is significantly faster than the best existing sequential approaches, by up to 2-3 orders of magnitude on large problems. Finally, we propose a parallel implementation of the algorithm. Evaluating the parallel algorithm on a benchmark set of both random and biological sequences reveals a near-linear speed-up with respect to the sequential algorithm.
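For readers unfamiliar with the MLCS problem, the sketch below computes the MLCS length of exactly three strings with the textbook cubic-time dynamic program. It is a baseline illustration only: it is neither the dominant point algorithm of this paper nor the Hakata and Imai algorithm, and the function name `mlcs3_length` is hypothetical.

```python
def mlcs3_length(x: str, y: str, z: str) -> int:
    """Length of a longest common subsequence of three strings.

    Textbook DP over all index triples: O(len(x)*len(y)*len(z)) time.
    Only two layers of the table are kept, so space is O(len(y)*len(z)).
    """
    ny, nz = len(y), len(z)
    prev = [[0] * (nz + 1) for _ in range(ny + 1)]
    for xi in x:
        curr = [[0] * (nz + 1) for _ in range(ny + 1)]
        for j in range(1, ny + 1):
            for k in range(1, nz + 1):
                if xi == y[j - 1] == z[k - 1]:
                    # All three strings match here: extend the diagonal.
                    curr[j][k] = prev[j - 1][k - 1] + 1
                else:
                    # Otherwise drop the last character of one string.
                    curr[j][k] = max(prev[j][k], curr[j - 1][k], curr[j][k - 1])
        prev = curr
    return prev[ny][nz]


print(mlcs3_length("abcd", "acbd", "abd"))
```

The table in this baseline grows exponentially with the number of strings, which is the cost that dominant point methods aim to avoid by working only with a sparse set of match positions.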

