Results 1-10 of 18
Parallel Dynamic Programming for Solving the String Editing Problem on a CGM/BSP
, 2002
Cited by 23 (7 self)
In this paper we present a coarse-grained parallel algorithm for solving the string edit distance problem for a string A and all substrings of a string C. Our method is based on a novel CGM/BSP parallel dynamic programming technique for computing all highest scoring paths in a weighted grid graph. The algorithm requires log p rounds/supersteps and O( p log m) local computation, where p is the number of processors, p n. To our knowledge, this is the first efficient CGM/BSP algorithm for the alignment of all substrings of C with A. Furthermore, the CGM/BSP parallel dynamic programming technique presented is of interest in its own right and we expect it to lead to other parallel dynamic programming methods for the CGM/BSP.
Exploring the Viability of the Cell Broadband Engine for Bioinformatics Applications
 In Proc. of the IEEE International Workshop on High Performance Computational Biology (HiCOMB’07)
, 2007
Cited by 22 (1 self)
This paper evaluates the performance of bioinformatics applications on the Cell Broadband Engine recently developed at IBM. In particular we focus on two highly popular bioinformatics applications – FASTA and ClustalW. The characteristics of these bioinformatics applications, such as small critical time-consuming code size, regular memory accesses, existing vectorized code and embarrassingly parallel computation, make them uniquely suitable for the Cell processing platform. The price and power advantages afforded by the Cell processor also make it an attractive alternative to general purpose processors. We report preliminary performance results for these applications, and contrast these results with the state-of-the-art hardware.
Serial Computations of Levenshtein Distances
, 1997
Cited by 19 (0 self)
sequence (LCS) of those strings. If D is the simple Levenshtein distance between two strings having lengths m and n, SES is the length of the shortest edit sequence between the strings, and L is the length of an LCS of the strings, then SES = D and L = (m + n - D)/2. We will focus on the problem of determining the length of an LCS and also on the related problem of recovering an LCS. Another related problem, which will be discussed in Chapter 7, is that of approximate string matching, in which it is desired to locate all positions within string y which begin an approximation to string x containing at most D errors (insertions or deletions).
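The relation between the simple Levenshtein distance D and the LCS length L stated in this abstract can be checked numerically. The sketch below is illustrative only and is not taken from the cited chapter: the function names `lcs_length` and `simple_levenshtein` are hypothetical, and both use the textbook quadratic tables, with "simple" taken to mean that only insertions and deletions are allowed.

```python
def lcs_length(x: str, y: str) -> int:
    """Length of a longest common subsequence, via the textbook O(mn) table."""
    m, n = len(x), len(y)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]


def simple_levenshtein(x: str, y: str) -> int:
    """Edit distance when only insertions and deletions are allowed."""
    m, n = len(x), len(y)
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i  # delete all of x[:i]
    for j in range(n + 1):
        dist[0][j] = j  # insert all of y[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                dist[i][j] = dist[i - 1][j - 1]
            else:
                dist[i][j] = 1 + min(dist[i - 1][j], dist[i][j - 1])
    return dist[m][n]


def relation_holds(x: str, y: str) -> bool:
    """Check L = (m + n - D) / 2 for one pair of strings."""
    d = simple_levenshtein(x, y)
    return 2 * lcs_length(x, y) == len(x) + len(y) - d
```

For example, `relation_holds("kitten", "sitting")` compares an LCS of length 4 against an insert/delete distance of 5, since 6 + 7 - 5 = 8 = 2 * 4.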
A Parallel Wavefront Algorithm for Efficient Biological Sequence Comparison
 In The 2003 International Conference on Computational Science and its Applications
, 2003
Cited by 15 (4 self)
In this paper we present a parallel wavefront algorithm for computing an alignment between two strings A and C, with |A| = m and |C| = n. On a distributed memory parallel computer of p processors each with O((m + n)/p) memory, the proposed algorithm requires O(p) communication rounds and O(mn/p) local computing time. The novelty of this algorithm is based on a compromise between the workload of each processor and the number of communication rounds required, expressed by a parameter called α. The proposed algorithm is expressed in terms of this parameter that can be tuned to obtain the best overall parallel time in a given implementation. We show very promising experimental results obtained on a 64-node Beowulf machine. A characteristic of the wavefront communication requirement is that each processor communicates with few other processors. This makes it very suitable as a potential application for grid computing.
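The wavefront idea itself can be illustrated with a short sequential sketch. This is not the paper's algorithm: it ignores the α parameter and the distribution over p processors, and it uses unit-cost edit distance as a stand-in for the alignment score. The function name `edit_distance_wavefront` is hypothetical. It only shows that all cells on one anti-diagonal of the dynamic programming table depend on earlier anti-diagonals alone, so they could be computed concurrently.

```python
def edit_distance_wavefront(a: str, c: str) -> int:
    """Unit-cost edit distance, filling the DP table one anti-diagonal at a time."""
    m, n = len(a), len(c)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j

    # Anti-diagonal k holds the interior cells (i, j) with i + j == k.
    for k in range(2, m + n + 1):
        lo = max(1, k - n)  # keeps j = k - i <= n
        hi = min(m, k - 1)  # keeps j = k - i >= 1
        # The cells in this loop are mutually independent: each reads only
        # anti-diagonals k - 1 and k - 2, so a parallel version could assign
        # them to different processors.
        for i in range(lo, hi + 1):
            j = k - i
            substitution = 0 if a[i - 1] == c[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,                 # deletion
                d[i][j - 1] + 1,                 # insertion
                d[i - 1][j - 1] + substitution,  # match or change
            )
    return d[m][n]
```

A coarse-grained version would group cells into blocks and sweep the wavefront over blocks instead of single cells, which is where a trade-off between workload per processor and number of communication rounds appears.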
Efficient Parallel Dynamic Programming
, 1994
Cited by 6 (1 self)
In 1983, Valiant, Skyum, Berkowitz and Rackoff showed that many problems with simple O(n^3) sequential dynamic programming solutions are in the class NC. They used straight line programs to show that these problems can be solved in O(lg^2 n) time with n^9 processors. In 1988, Rytter used pebbling games to show that these same problems can be solved on a CREW PRAM in O(lg^2 n) time with n^6/lg n processors. Recently, Huang, Liu and Viswanathan [23] and Galil and Park [15] give algorithms that improve this processor complexity by polylog factors. Using a graph structure that is analogous to the classical dynamic programming table, this paper improves these results. First, this graph characterization leads to a polylog time and n^6/lg n processor algorithm that solves these problems. Second, there follows a subpolylog time and sublinear processor parallel approximation algorithm for the matrix chain ordering problem. Finally, this paper presents a n^3/lg n processor and O...
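For reference, the matrix chain ordering problem mentioned in this abstract has a classical O(n^3) sequential dynamic programming solution. The sketch below shows that sequential baseline only, not any of the parallel algorithms discussed; the function name `matrix_chain_cost` and the `dims` convention (matrix i has shape dims[i] x dims[i + 1]) are assumptions made for illustration.

```python
from typing import Sequence


def matrix_chain_cost(dims: Sequence[int]) -> int:
    """Minimum number of scalar multiplications to multiply a chain of matrices.

    Matrix i (0-based) is assumed to have shape dims[i] x dims[i + 1].
    """
    n = len(dims) - 1  # number of matrices in the chain
    if n <= 1:
        return 0  # zero or one matrix needs no multiplication
    # cost[i][j] = cheapest way to multiply matrices i..j inclusive.
    cost = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):        # chain length
        for i in range(n - length + 1):   # chain start
            j = i + length - 1            # chain end
            cost[i][j] = min(
                cost[i][k] + cost[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)      # split point between k and k + 1
            )
    return cost[0][n - 1]
```

The three nested loops (length, start, split point) give the O(n^3) running time, and the dependency of each `cost[i][j]` on shorter chains is the table structure that the parallel results reason about.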
A CGM/BSP Parallel Similarity Algorithm
, 2002
Cited by 6 (3 self)
We present a CGM/BSP algorithm for computing an alignment (or string editing) between two strings A and C, with |A| = m and |C| = n. The algorithm requires O(p) communication rounds and O(nm/p) local computing time, on a distributed memory parallel computer of p processors each with O(nm/p) memory. We also present implementation results obtained on a Beowulf machine with 64 nodes.
Efficient Algorithms for Approximate String Matching with Swaps
 in LNCS 1264, Combinatorial Pattern Matching
, 1999
Cited by 6 (0 self)
In this paper we include the swap operation that interchanges two adjacent characters into the set of allowable edit operations, and we present an O(t min(m, n))-time algorithm for the extended edit distance problem, where t is the edit distance between the given strings, and an O(kn)-time algorithm for the extended k-differences problem. That is, we add swaps into the set of edit operations without increasing the time complexities of previous algorithms that consider only changes, insertions, and deletions for the edit distance and k-differences problems. © 1999 Academic Press 1. INTRODUCTION Given two strings A[1..m] and B[1..n] over an alphabet Σ, the edit distance between A and
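To make the extended edit operation set concrete, here is a minimal quadratic-time sketch. It is not the paper's O(t min(m, n)) algorithm. It assumes unit costs and the restricted "optimal string alignment" reading of a swap, in which two adjacent characters are exchanged and not edited further; the paper's exact definition may differ. The function name `edit_distance_with_swaps` is hypothetical.

```python
def edit_distance_with_swaps(a: str, b: str) -> int:
    """Unit-cost edit distance with changes, insertions, deletions and
    adjacent swaps (restricted variant: swapped characters are not re-edited)."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            change = 0 if a[i - 1] == b[j - 1] else 1
            best = min(
                d[i - 1][j] + 1,           # deletion
                d[i][j - 1] + 1,           # insertion
                d[i - 1][j - 1] + change,  # match or change
            )
            # Swap: the last two characters of a[:i] appear reversed in b[:j].
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                best = min(best, d[i - 2][j - 2] + 1)
            d[i][j] = best
    return d[m][n]
```

With this recurrence, "ab" and "ba" are at distance 1 (one swap) instead of 2, while pairs that contain no adjacent transposition keep their ordinary edit distance.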
Optimal Speedup on a Low-Degree Multi-Core Parallel Architecture (LoPRAM)
, 2008
Cited by 4 (0 self)
Over the last five years, major microprocessor manufacturers have released plans for a rapidly increasing number of cores per microprocessor, with upwards of 64 cores by 2015. In this setting, a sequential RAM computer will no longer accurately reflect the architecture on which algorithms are being executed. In this paper we propose a model of low degree parallelism (LoPRAM) which builds upon the RAM and PRAM models yet better reflects recent advances in parallel (multi-core) architectures. This model supports a high level of abstraction that simplifies the design and analysis of parallel programs. More importantly we show that in many instances it naturally leads to work-optimal parallel algorithms via simple modifications to sequential algorithms.
Efficient Algorithms for Sequence Analysis
 Proc. Second Workshop on Sequences: Combinatorics, Compression, Security
, 1991
Cited by 1 (0 self)
We consider new algorithms for the solution of many dynamic programming recurrences for sequence comparison and for RNA secondary structure prediction. The techniques upon which the algorithms are based effectively exploit the physical constraints of the problem to derive more efficient methods for sequence analysis. 1. INTRODUCTION In this paper we consider algorithms for two problems in sequence analysis. The first problem is sequence alignment, and the second is the prediction of RNA structure. Although the two problems seem quite different from each other, their solutions share a common structure, which can be expressed as a system of dynamic programming recurrence equations. These equations also can be applied to other problems, including text formatting and data storage optimization. We use a number of well motivated assumptions about the problems in order to provide efficient algorithms. The primary assumption is that of concavity or convexity. The recurrence relations for bo...
Parallel Longest Common Subsequence using Graphics Hardware
We present an algorithm for solving the Longest Common Subsequence problem using graphics hardware acceleration. We identify a parallel memory access pattern which enables us to run efficiently on multiple layers of parallel hardware by matching each layer to the best sub-algorithm, which is determined using a mix of theoretical and experimental data including knowledge of the specific hardware and memory structure of each layer. We implement a linear-space, cache-coherent algorithm on the CPU, using a two-level algorithm on the GPU to compute subproblems quickly. The combination of all three running on a CPU/GPU pair is a fast, flexible and scalable solution to the Longest Common Subsequence problem. Our design method is applicable to other algorithms in the Gaussian Elimination Paradigm, and can be generalized to more levels of parallel computation such as GPU clusters.
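As a rough illustration of what "linear-space" means for LCS, the sketch below computes only the LCS length while keeping two rows of the dynamic programming table. It is a plain CPU example under that assumption, not the CPU/GPU algorithm described in the abstract, and the function name `lcs_length_linear_space` is hypothetical.

```python
def lcs_length_linear_space(x: str, y: str) -> int:
    """LCS length using O(min(len(x), len(y))) extra space."""
    if len(y) > len(x):
        x, y = y, x  # keep the shorter string along the row
    prev = [0] * (len(y) + 1)  # row for the previous character of x
    for ch in x:
        curr = [0]  # column 0: LCS with the empty prefix of y is 0
        for j, other in enumerate(y, start=1):
            if ch == other:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[len(y)]
```

Recovering an actual subsequence in linear space needs more machinery (for example a divide-and-conquer scheme), which this sketch leaves out.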