SECURE OUTSOURCING OF SEQUENCE COMPARISONS
Cited by 20 (5 self)
Large-scale problems in the physical and life sciences are being revolutionized by Internet computing technologies, like grid computing, that make possible the massive cooperative sharing of computational power, bandwidth, storage, and data. A weak computational device, once connected to such a grid, is no longer limited by its slow speed, small amounts of local storage, and limited bandwidth: it can avail itself of the abundance of these resources that is available elsewhere on the network. An impediment to the use of “computational outsourcing” is that the data in question is often sensitive, e.g., of national security importance, or proprietary and containing commercial secrets, or to be kept private for legal requirements such as the HIPAA legislation, Gramm-Leach-Bliley, or similar laws. This motivates the design of techniques for computational outsourcing in a privacy-preserving manner, i.e., without revealing to the remote agents whose computational power is being used either one’s data or the outcome of the computation on the data. This paper investigates such secure outsourcing for widely applicable sequence comparison problems, and gives an efficient protocol for a ...
An effective algorithm for string correction using generalized edit distances: II. Computational complexity of the algorithm and some applications
 Information Sci
Cited by 18 (10 self)
This paper deals with the problem of estimating a transmitted string X_s from the corresponding received string Y, which is a noisy version of X_s. We assume that Y contains any number of substitution, insertion, and deletion errors, and that no two consecutive symbols of X_s were deleted in transmission. We have shown that for channels which cause independent errors, and whose error probabilities exceed those of noisy strings studied in the literature [12], at least 99.5% of the erroneous strings will not contain two consecutive deletion errors. The best estimate X+ of X_s is defined as that element of H which minimizes the generalized Levenshtein distance D(X/Y) between X and Y. Using dynamic programming principles, an algorithm is presented which yields X+ without computing individually the distances between every word of H and Y. Though this algorithm requires more memory, it can be shown that it is, in general, computationally less complex than all other existing algorithms which perform the same task.
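For orientation, the naive baseline that this abstract contrasts itself with can be sketched as follows: compute the distance between the received string and every dictionary word, then keep the minimizer. This is not the paper's algorithm. The sketch assumes unit costs for substitution, insertion, and deletion (the paper uses a generalized, weighted distance), and the names `levenshtein` and `best_estimate` are hypothetical.

```python
from typing import Iterable


def levenshtein(x: str, y: str) -> int:
    """Unit-cost edit distance (assumption: all three operations cost 1)."""
    prev = list(range(len(y) + 1))  # distances from the empty prefix of x
    for i, cx in enumerate(x, start=1):
        curr = [i]  # turning x[:i] into the empty string takes i deletions
        for j, cy in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1,               # delete cx
                curr[j - 1] + 1,           # insert cy
                prev[j - 1] + (cx != cy),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def best_estimate(received: str, dictionary: Iterable[str]) -> str:
    """Naive baseline: one full distance computation per dictionary word."""
    return min(dictionary, key=lambda word: levenshtein(word, received))


print(best_estimate("patern", ["pattern", "string", "distance"]))
```

The baseline costs one dynamic-programming table per dictionary word, which is the per-word work the abstract says its algorithm avoids.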
Pattern Recognition of Strings With Substitutions, Insertions, Deletions and Generalized Transpositions
 Pattern Recognition
Cited by 12 (2 self)
We study the problem of recognizing a string Y which is the noisy version of some unknown string X* chosen from a finite dictionary, H. The traditional case which has been extensively studied in the literature is the one in which Y contains substitution, insertion and deletion (SID) errors. Although some work has been done to extend the traditional set of edit operations to include the straightforward transposition of adjacent characters [14], the problem is unsolved when the transposed characters are themselves subsequently substituted, as is typical in cursive and typewritten script, in molecular biology and in noisy chain-coded boundaries. In this paper we present the first reported solution to the analytic problem of editing one string X to another, Y, using these four edit operations. A scheme for obtaining the optimal edit operations has also been given. Both these solutions are optimal for the infinite alphabet case. Using these algorithms we present a syntactic pattern rec...
Algorithms for the constrained longest common subsequence problems
 J. Found. Comput. Sci
, 2005
Cited by 12 (0 self)
Given strings S1, S2, and P, the constrained longest common subsequence problem for S1 and S2 with respect to P is to find a longest common subsequence lcs of S1 and S2 such that P is a subsequence of this lcs. We present an algorithm which improves the time complexity of the problem from the previously known O(rn^2 m^2) to O(rnm), where r, n, and m are the lengths of P, S1, and S2, respectively. As a generalization of this, we extend the definition of the problem so that the lcs sought contains a subsequence whose edit distance from P is less than a given parameter d. For the latter problem, we propose an algorithm whose time complexity is O(drnm).
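A minimal sketch of one well-known O(rnm) dynamic program for the constrained problem follows. It is not necessarily the paper's exact formulation. It assumes the usual three-index recurrence, where layer k holds the best length of a common subsequence of two prefixes that contains P[:k] as a subsequence. The function name `constrained_lcs_length` is hypothetical, and only two layers are kept in memory.

```python
def constrained_lcs_length(s1: str, s2: str, p: str):
    """Length of a longest common subsequence of s1 and s2 that has p as a
    subsequence, or None when no such common subsequence exists."""
    n, m, r = len(s1), len(s2), len(p)
    neg = float("-inf")  # marks prefix pairs that cannot embed p[:k]
    prev_layer = None
    for k in range(r + 1):
        layer = [[neg] * (m + 1) for _ in range(n + 1)]
        if k == 0:
            # With nothing of p required, empty prefixes give length 0.
            for i in range(n + 1):
                layer[i][0] = 0
            for j in range(m + 1):
                layer[0][j] = 0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                # Skip the last symbol of s1 or of s2.
                best = max(layer[i - 1][j], layer[i][j - 1])
                if s1[i - 1] == s2[j - 1]:
                    # Use the match without advancing in p.
                    best = max(best, layer[i - 1][j - 1] + 1)
                    if k > 0 and s1[i - 1] == p[k - 1]:
                        # Use the match to cover p[k - 1].
                        best = max(best, prev_layer[i - 1][j - 1] + 1)
                layer[i][j] = best
        prev_layer = layer
    result = prev_layer[n][m]
    return None if result == neg else int(result)


print(constrained_lcs_length("xaby", "yabx", ""))   # unconstrained LCS length
print(constrained_lcs_length("xaby", "yabx", "x"))  # the constraint shortens it
```

The three nested loops give the O(rnm) running time quoted in the abstract. Recovering the subsequence itself would require keeping all layers or storing back-pointers.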
Exploiting symmetry on parallel architectures
, 1995
Cited by 10 (1 self)
This thesis describes techniques for the design of parallel programs that solve well-structured problems with inherent symmetry. Part I demonstrates the reduction of such problems to generalized matrix multiplication by a group-equivariant matrix. Fast techniques for this multiplication are described, including factorization, orbit decomposition, and Fourier transforms over finite groups. Our algorithms entail interaction between two symmetry groups: one arising at the software level from the problem's symmetry and the other arising at the hardware level from the processors' communication network. Part II illustrates the applicability of our symmetry-exploitation techniques by presenting a series of case studies of the design and implementation of parallel programs. First, a parallel program that solves chess endgames by factorization of an associated dihedral group-equivariant matrix is described. This code runs faster than previous serial programs, and discovered a number of results. Second, parallel algorithms for Fourier transforms for finite groups are developed, and preliminary parallel implementations for group transforms of dihedral and of symmetric groups are described. Applications in learning, vision, pattern recognition, and statistics are proposed. Third, parallel implementations solving several computational science problems are described, including the direct n-body problem, convolutions arising from molecular biology, and some communication primitives such as broadcast and reduce. Some of our implementations ran orders of magnitude faster than previous techniques, and were used in the investigation of various physical phenomena.
On the safety and efficiency of firewall policy deployment
 Proc. of IEEE Symposium on Security and Privacy
, 2007
Cited by 9 (0 self)
Firewall policy management is challenging and error-prone. While ample research has led to tools for policy specification, correctness analysis, and optimization, few researchers have paid attention to firewall policy deployment: the process where a management tool edits a firewall’s configuration to make it run the policies specified in the tool. In this paper, we provide the first formal definition and theoretical analysis of safety in firewall policy deployment. We show that naive deployment approaches can easily create a temporary security hole by permitting illegal traffic, or interrupt service by rejecting legal traffic during the deployment. We define safe and most-efficient deployments, and introduce the shuffling theorem as a formal basis for constructing deployment algorithms and proving their safety. We present efficient algorithms for constructing most-efficient deployments in popular policy editing languages. We show that in certain widely installed policy editing languages, a safe deployment is not always possible. We also show how to leverage existing diff algorithms to guarantee a safe, most-efficient, and monotonic deployment in other editing languages.
Measuring the Accuracy of Page-Reading Systems
 PH.D. DISSERTATION, UNLV, LAS VEGAS
, 1996
Cited by 9 (3 self)
Given a bitmapped image of a page from any document, a page-reading system identifies the characters on the page and stores them in a text file. This “OCR-generated” text is represented by a string and compared with the correct string to determine the accuracy of this process. The string editing problem is applied to find an optimal correspondence of these strings using an appropriate cost function. The ISRI annual test of page-reading systems utilizes the following performance measures, which are defined in terms of this correspondence and the string edit distance: character accuracy, throughput, accuracy by character class, marked character efficiency, word accuracy, non-stopword accuracy, and phrase accuracy. It is shown that the universe of cost functions is divided into equivalence classes, and the cost functions related to the longest common subsequence (LCS) are identified. The computation of an LCS can be made faster by a linear-time preprocessing step.
Bit-Parallel LCS-length Computation Revisited
 In Proc. 15th Australasian Workshop on Combinatorial Algorithms (AWOCA)
, 2004
Cited by 8 (4 self)
The longest common subsequence (LCS) is a classic and well-studied measure of similarity between two strings A and B. This problem has two variants: determining the length of the LCS (LLCS), and recovering an LCS itself. In this paper we address the first of these two. Let m and n denote the lengths of the strings A and B, respectively, and w denote the computer word size. First we give a slightly improved formula for the bit-parallel O(⌈m/w⌉n) LLCS algorithm of Crochemore et al. [4]. Then we discuss the relative performance of the bit-parallel algorithms and compare our variant against one of the best conventional LLCS algorithms. Finally we propose and evaluate an O(⌈d/w⌉n) version of the algorithm, where d is the simple (indel) edit distance between A and B.
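The bit-parallel idea can be sketched in a few lines. The update used here, `(v + u) | (v - u)` with `u = v & match[c]`, is the form usually attributed to this line of work, but treat it as an assumption rather than a transcription of the paper. Python's arbitrary-precision integers stand in for the ⌈m/w⌉ machine words, so the sketch illustrates the logic, not the performance. The names `llcs_bit_parallel` and `indel_distance` are hypothetical.

```python
def llcs_bit_parallel(a: str, b: str) -> int:
    """Length of the LCS of a and b using one bit-vector of len(a) bits."""
    m = len(a)
    mask = (1 << m) - 1  # keeps the vector at exactly m bits
    match = {}           # match[c] has bit i set iff a[i] == c
    for i, ch in enumerate(a):
        match[ch] = match.get(ch, 0) | (1 << i)
    v = mask             # all ones: no LCS progress recorded yet
    for ch in b:
        u = v & match.get(ch, 0)
        v = ((v + u) | (v - u)) & mask
    # Each zero bit in v corresponds to one unit of LCS length.
    return m - bin(v).count("1")


def indel_distance(a: str, b: str) -> int:
    """Simple (insertion/deletion only) edit distance, via d = m + n - 2*LLCS."""
    return len(a) + len(b) - 2 * llcs_bit_parallel(a, b)


print(llcs_bit_parallel("abcbdab", "bdcaba"), indel_distance("abcbdab", "bdcaba"))
```

Each character of B costs a constant number of word operations per machine word of the vector, which is where the O(⌈m/w⌉n) bound in the abstract comes from.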
Sequence Comparison: Some Theory and Some Practice
, 1988
Cited by 7 (0 self)
A brief survey of the theory and practice of sequence comparison is made focusing on diff, the UNIX file difference utility. Sequence comparison is a deep and fascinating subject in Computer Science, both theoretical and practical. However, in our opinion, neither the theoretical nor the practical aspects of the problem are well understood and we feel that their mastery is a true challenge for Computer Science. The central problem can be stated very easily: find an algorithm, as efficient and practical as possible, to compute a longest common subsequence (lcs for short) of two given sequences. As usual, a subsequence of a sequence is another sequence obtained from it by deleting some (not necessarily contiguous) terms. Thus, both en/pri and en/pai are longest common subsequences of sequence/comparison and theory/and/practice. Part of this work was done while the author was visiting the Université de Rouen, in 1987. That visit was partially supported...
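The definition above can be illustrated with the textbook quadratic dynamic program. This is a sketch for illustration only, not the algorithm used by diff. The helper names `is_subsequence` and `longest_common_subsequence` are hypothetical, and the strings are taken from the abstract as printed, including the `/` characters.

```python
def is_subsequence(sub: str, seq: str) -> bool:
    """True if sub can be obtained from seq by deleting some terms."""
    it = iter(seq)
    return all(ch in it for ch in sub)  # each `in` consumes the iterator


def longest_common_subsequence(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (O(len(a)*len(b)))."""
    n, m = len(a), len(b)
    # table[i][j] = LCS length of the suffixes a[i:] and b[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk forward through the table to read off one optimal subsequence.
    out = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return "".join(out)


x, y = "sequence/comparison", "theory/and/practice"
print(longest_common_subsequence(x, y))
print(is_subsequence("en/pri", x) and is_subsequence("en/pri", y))
```

An LCS is generally not unique, as the abstract's two examples show, so the string returned depends on how ties are broken in the walk.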
Semi-local string comparison: Algorithmic techniques and applications
 Mathematics in Computer Science 1(4) (2008) 571–603. See also arXiv:0707.3619
Cited by 7 (1 self)
The longest common subsequence (LCS) problem is a classical problem in computer science. The semi-local LCS problem is a generalisation of the LCS problem, arising naturally in the context of string comparison. In this work, we present a number of algorithmic techniques related to the semi-local LCS problem, and give a number of algorithmic applications of these techniques. Summarising the presented results, we conclude that semi-local string comparison turns out to be a useful algorithmic plug-in, which unifies, and often improves on, a number of previous approaches to various substring- and subsequence-related problems.