Results 1 - 7 of 7
A novel method for multiple alignment of sequences with repeated and shuffled elements
, 2004
In-place differential file compression
The Computer Journal
, 2005
Abstract

Cited by 2 (0 self)
We present algorithms for in-place differential file compression, where a target file T of size n is compressed with respect to a source file S of size m using no space beyond that used to replace S by T; that is, it is possible to encode using m + n + O(1) space and decode using MAX(m, n) + O(1) space (so that, when decoding, the source file is overwritten by the decompressed target file). From a theoretical point of view, an optimal solution (best possible compression) to this problem is known to be NP-hard, and in previous work we presented a factor-of-4 approximation algorithm (not in-place) based on a sliding-window approach. Here we consider practical in-place algorithms based on sliding-window compression, where our focus is on decoding; that is, although in-place encoding is possible, we allow O(m + n) space for the encoder so as to improve its speed, and present very fast decoding with only MAX(m, n) + O(1) space. Although NP-hardness implies that these algorithms cannot always be optimal, the asymptotic optimality of sliding-window methods, along with their ability to achieve a constant-factor approximation, is evidence that they should work well for this problem in practice. We introduce the IPSW algorithm (In-Place Sliding Window) and present experiments indicating that it compares favorably with traditional practical approaches, even those that do not decode in-place, while at the same time having low encoding complexity and extremely low decoding complexity. IPSW is most effective when S and T are reasonably well aligned (most large common substrings occur in approximately the same order). We also present a preprocessing step for string alignment that can be employed when the encoder determines significant gains will be achieved.
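The sliding-window idea in this abstract can be illustrated with a minimal (not in-place) delta-encoding sketch: substrings of the target that occur in the source are emitted as copy commands, everything else as literals. The function names, greedy matching policy, and `min_match` threshold are illustrative assumptions, not the authors' IPSW implementation, which additionally decodes in place over the source buffer.

```python
# Minimal delta-encoding sketch: compress target T against source S by
# greedily emitting ("copy", offset, length) commands for substrings of S,
# and literal bytes otherwise. Illustrative only; IPSW's decoder would
# overwrite S with T while applying the commands.

def delta_encode(source: bytes, target: bytes, min_match: int = 3):
    ops = []
    i = 0
    while i < len(target):
        best_len, best_off = 0, 0
        # Greedy search for the longest match of target[i:] inside source.
        for off in range(len(source)):
            l = 0
            while (off + l < len(source) and i + l < len(target)
                   and source[off + l] == target[i + l]):
                l += 1
            if l > best_len:
                best_len, best_off = l, off
        if best_len >= min_match:
            ops.append(("copy", best_off, best_len))
            i += best_len
        else:
            ops.append(("lit", target[i]))
            i += 1
    return ops

def delta_decode(source: bytes, ops) -> bytes:
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, off, length = op
            out += source[off:off + length]
        else:
            out.append(op[1])
    return bytes(out)
```

When S and T are well aligned, as the abstract notes, most of T is covered by a few long copy commands, which is exactly the regime where this scheme compresses well.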
de Rougemont. Property and equivalence testing on strings
, 2004
Abstract

Cited by 1 (0 self)
We investigate property testing and related questions where, instead of the usual Hamming and edit distances between input strings, we consider the more relaxed edit distance with moves. Using a statistical embedding of words which has similarities with the Parikh mapping, we first construct a tolerant tester for the equality of two words, whose complexity is independent of the string size, and we derive an approximation algorithm for the normalized edit distance with moves. We then consider the question of testing whether a string is a member of a given language. We develop a method to compute, in time polynomial in the representation, a geometric approximate description of a regular language by a finite union of polytopes. As an application, we obtain a new tester for regular languages given by their non-deterministic finite automaton (or regular expressions), whose complexity does not depend on the automaton, except for a polynomial-time preprocessing step. Furthermore, this method allows us to compare languages and validates the new notion of equivalence testing that we introduce. Using the geometric embedding, we can distinguish between a pair of automata that compute the same language and a pair of automata whose languages are not ε-equivalent in an appropriate sense. Our equivalence tester is deterministic and has polynomial time complexity, whereas the non-approximated version is PSPACE-complete. Finally, we extend the geometric embedding, and hence the tester algorithms, to infinite regular languages and to context-free grammars as well. For context-free grammars the equivalence test has exponential time complexity, but in comparison, the non-approximated version is not even recursively decidable.
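The flavor of a Parikh-style statistical embedding can be sketched as follows: map each word to the frequency vector of its k-grams and compare two words by the normalized L1 distance between those vectors. The parameter k and the exact statistic here are illustrative assumptions, not the paper's construction, but they show why block moves are "cheap" under such an embedding: relocating a block disturbs only the few k-grams at its boundaries.

```python
# Sketch of a statistical word embedding: a word maps to the normalized
# frequency vector of its k-grams; two words are compared by the L1
# distance between their vectors. Block moves change only boundary
# k-grams, so moved-around words stay close under this distance.

from collections import Counter

def kgram_embedding(word: str, k: int = 2) -> Counter:
    grams = Counter(word[i:i + k] for i in range(len(word) - k + 1))
    total = sum(grams.values())
    return Counter({g: c / total for g, c in grams.items()})

def embedding_distance(u: str, v: str, k: int = 2) -> float:
    eu, ev = kgram_embedding(u, k), kgram_embedding(v, k)
    keys = set(eu) | set(ev)
    # Total variation style distance in [0, 1].
    return sum(abs(eu[g] - ev[g]) for g in keys) / 2.0
```

For example, "abcdef" and "defabc" differ by one block move and share all but two 2-grams, so their embedding distance is small, whereas two unrelated words of the same length sit near the maximum distance of 1.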
Efficient algorithms for the block-edit problems
Information and Computation
Abstract
Methods
, 2006
Abstract
A novel method for multiple alignment of sequences with repeated and shuffled elements
PARALLELIZATION OF WEIGHTED SEQUENCE COMPARISON BY USING EBWT
Abstract
In this paper, we describe the design of a high-performance extended Burrows-Wheeler transform (EBWT) based weighted sequence comparison algorithm for many-core GPUs, taking advantage of the full programmability offered by the Compute Unified Device Architecture (CUDA) and its standard library, Thrust. Our Thrust-based CUDA implementation is the fastest implementation of the weighted sequence comparison algorithm to date: it is on average 56.3X faster than our previous EBWT-based implementation without the Thrust library. Moreover, our current implementation is also competitive with CPU implementations, being up to 2.9X faster than a comparable routine on a 2.99 GHz Intel Pentium 4 CPU with 3 GB RAM.
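The Burrows-Wheeler transform at the core of this comparison algorithm can be sketched with the textbook CPU construction below: sort all rotations of the sentinel-terminated input and take the last column. This naive O(n^2 log n) sketch is an assumption for illustration only; the paper's extended, weighted, GPU/Thrust variant is substantially more involved.

```python
# Naive Burrows-Wheeler transform: append a unique '$' sentinel, sort all
# rotations of the string, and read off the last column. The transform is
# invertible, which is what makes it useful for compression and
# sequence-comparison pipelines.

def bwt(s: str) -> str:
    s = s + "$"                      # unique end-of-string sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)

def inverse_bwt(last: str) -> str:
    # Repeatedly prepend the last column and re-sort to rebuild the
    # sorted rotation table, then pick the row ending in the sentinel.
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(c + row for c, row in zip(last, table))
    original = next(row for row in table if row.endswith("$"))
    return original[:-1]
```

For example, `bwt("banana")` yields `"annb$aa"`, which clusters equal characters together; a real implementation would build the transform via a suffix array rather than explicit rotations.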