An O(ND) Difference Algorithm and Its Variations
- Algorithmica, 1986
- Cited by 133 (4 self)
The problems of finding a longest common subsequence of two sequences A and B and a shortest edit script for transforming A into B have long been known to be dual problems. In this paper, they are shown to be equivalent to finding a shortest/longest path in an edit graph. Using this perspective, a simple O(ND) time and space algorithm is developed where N is the sum of the lengths of A and B and D is the size of the minimum edit script for A and B. The algorithm performs well when differences are small (sequences are similar) and is consequently fast in typical applications. The algorithm is shown to have O(N + D²) expected-time performance under a basic stochastic model. A refinement of the algorithm requires only O(N) space, and the use of suffix trees leads to an O(N lg N + D²) time variation.
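The edit-graph view in the abstract above can be illustrated with a minimal Python sketch of a greedy furthest-reaching search. It is a sketch under stated assumptions, not the paper's own code: the function name `shortest_edit_length` is hypothetical, only insertions and deletions are counted, and it returns just the script length D rather than the script itself.

```python
def shortest_edit_length(a, b):
    """Return D, the number of insertions plus deletions in a shortest
    edit script turning sequence a into sequence b (illustrative sketch)."""
    n, m = len(a), len(b)
    # v[k] holds the furthest x reached so far on diagonal k = x - y.
    # The sentinel v[1] = 0 lets the first step start from (0, 0).
    v = {1: 0}
    for d in range(n + m + 1):
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]        # step down: insert an element of b
            else:
                x = v[k - 1] + 1    # step right: delete an element of a
            y = x - k
            # Follow the diagonal ("snake") while elements match; free moves.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return n + m


if __name__ == "__main__":
    print(shortest_edit_length("ABCABBA", "CBABAC"))
```

The outer loop runs D + 1 times and each pass touches at most D + 1 diagonals, so the work outside the diagonal runs grows with D rather than with the full N-by-N table, which is why similar inputs are handled quickly.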
Extracting Usability Information from User Interface Events
- ACM Computing Surveys, 1999
- Cited by 93 (6 self)
Modern window-based user interface systems generate user interface events as natural products of their normal operation. Because such events can be automatically captured and because they indicate user behavior with respect to an application's user interface, they have long been regarded as a potentially fruitful source of information regarding application usage and usability. However, because user interface events are typically voluminous and rich in detail, automated support is generally required to extract information at a level of abstraction that is useful to investigators interested in analyzing application usage or evaluating usability. This survey examines computer-aided techniques used by HCI practitioners and researchers to extract usability-related information from user interface events. A framework is presented to help HCI practitioners and researchers categorize and compare the approaches that have been, or might fruitfully be, applied to this problem. Because many of the techniques in the research literature have not been evaluated in practice, this survey provides a conceptual evaluation to help identify some of the relative merits and drawbacks of the various classes of approaches. Ideas for future research in this area are also presented. This survey addresses the following questions: How might user interface events be used in evaluating usability? How are user interface events related to other forms of usability data? What are the key challenges faced by investigators wishing to exploit this data? What approaches have been brought to bear on this problem and how do they compare to one another? What are some of the important open research questions in this area?
Identifying Syntactic Differences Between Two Programs
- Software: Practice and Experience, 1991
- Cited by 64 (0 self)
This paper is organized into five sections, as follows. The internal form of a program, which is a variant of a parse tree, is discussed in the next section. Then the tree-matching algorithm and the synchronous pretty-printing technique are described. Experience with the comparator for the C language and some performance measurements are also presented. The last section discusses related work and concludes this paper.
A file comparison program
- Software: Practice and Experience, 1985
- Cited by 51 (3 self)
This paper presents a simple method for computing a shortest sequence of insertion and deletion commands that converts one given file to another. The method is particularly efficient when the difference between the two files is small compared to the files' lengths. In experiments performed on typical files, the program often ran four times faster than the UNIX diff command. KEY WORDS: Edit distance; Edit script; File comparison
A Memory-Efficient Dynamic Programming Algorithm for Optimal Alignment of a Sequence to an RNA Secondary Structure
- 2002
- Cited by 51 (1 self)
Background: Covariance models (CMs) are probabilistic models of RNA secondary structure, analogous to profile hidden Markov models of linear sequence. The dynamic programming algorithm for aligning a CM to an RNA sequence of length N is O(N³) in memory. This is only practical for small RNAs. Results:...
A Sub-quadratic Sequence Alignment Algorithm for Unrestricted Cost Matrices
- 2002
- Cited by 46 (3 self)
The classical algorithm for computing the similarity between two sequences [36, 39] uses a dynamic programming matrix, and compares two strings of size n in O(n²) time. We address the challenge of computing the similarity of two strings in sub-quadratic time, for metrics which use a scoring matrix of unrestricted weights. Our algorithm applies to both local and global alignment computations. The speed-up is achieved by dividing the dynamic programming matrix into variable sized blocks, as induced by Lempel-Ziv parsing of both strings, and utilizing the inherent periodic nature of both strings. This leads to an O(n² / log n) algorithm for an input of constant alphabet size. For most texts, the time complexity is actually O(hn² / log n) where h ≤ 1 is the entropy of the text.
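For context, the classical quadratic baseline that the abstract above improves on can be sketched as follows. This is the textbook dynamic programming recurrence for global alignment, not the sub-quadratic block method of the paper; the function name `global_alignment_score`, the `score` callback, and the linear `gap` penalty are illustrative assumptions.

```python
def global_alignment_score(s, t, score, gap):
    """Global similarity of s and t by the classical dynamic programming
    recurrence, in O(len(s) * len(t)) time and O(len(t)) space.

    score(x, y) is an arbitrary substitution weight (assumed callback);
    gap is the weight added for each inserted or deleted symbol.
    """
    # Row 0: aligning the empty prefix of s against prefixes of t.
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        curr = [i * gap] + [0] * len(t)
        for j in range(1, len(t) + 1):
            curr[j] = max(
                prev[j - 1] + score(s[i - 1], t[j - 1]),  # align s[i-1] with t[j-1]
                prev[j] + gap,                            # gap in t
                curr[j - 1] + gap,                        # gap in s
            )
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    unit = lambda x, y: 1 if x == y else -1
    print(global_alignment_score("GATTACA", "GATTACA", unit, -1))
```

Every cell of the table is visited once, which is where the n² term comes from; the paper's contribution is to avoid filling in each cell individually.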
Selecting Tests and Identifying Test Coverage Requirements for Modified Software
- In Proceedings of the 1994 International Symposium on Software Testing and Analysis (ISSTA 94), 1994
- Cited by 45 (21 self)
Regression testing is performed on modified software to provide confidence that changed and affected portions of the code behave correctly. We present an approach to regression testing that handles two important tasks: selecting tests from the existing test suite that should be rerun, and identifying portions of the code that must be covered by tests. Both tasks are performed by traversing graphs for the program and its modified version. We first apply our technique to single procedures and then show how our technique is applied at the interprocedural level. Our approach has several advantages over previous work. First, our test selection technique is safe, selecting every test that may produce different output in the modified program. However, our selection technique chooses smaller test sets than other safe approaches. Second, our approach is the first safe approach to identify coverage requirements, and the first safe approach to do so interprocedurally. Third, our approach handles ...
Divide-and-conquer frontier search applied to optimal sequence alignment
- In National Conference on Artificial Intelligence (AAAI), 2000
- Cited by 41 (4 self)
We present a new algorithm that reduces the space complexity of heuristic search. It is most effective for problem spaces that grow polynomially with problem size, but contain large numbers of short cycles. For example, the problem of finding an optimal global alignment of several DNA or amino-acid sequences can be solved by finding a lowest-cost corner-to-corner path in a d-dimensional grid. A previous algorithm, called divide-and-conquer bidirectional search (Korf 1999), saves memory by storing only the Open lists and not the Closed lists. We show that this idea can be applied in a unidirectional search as well. This extends the technique to problems where bidirectional search is not applicable, and is more efficient in both time and space than the bidirectional version. If n is the length of the strings, and d is the number of strings, this algorithm can reduce the memory requirement from O(n^d) to O(n^(d-1)). While our current implementation of DCFS is somewhat slower than existing dynamic programming approaches for optimal alignment of multiple gene sequences, DCFS is a more general algorithm.
Reconstructing a History of Recombinations From a Set of Sequences
- Discrete Appl. Math., 1998
- Cited by 35 (6 self)
One of the classic problems in computational biology is the reconstruction of evolutionary history. A recent trend in the area is to increase the explanatory power of the models that are considered by incorporating higher-order evolutionary events that more accurately reflect the mechanisms of mutation at the level of the chromosome. We take a step in this direction by considering the problem of reconstructing an evolutionary history for a set of genetic sequences that have evolved by recombination. Recombination is a non-tree-like event that produces a child sequence by crossing two parent sequences. We present polynomial-time algorithms for reconstructing a parsimonious history of such events for several models of recombination when all sequences, including those of ancestors, are present in the input. We also show that these models appear to be near the limit of what can be solved in polynomial time, in that several natural generalizations are NP-complete. Keywords Computational bio...
A Survey on Software Clone Detection Research
- School of Computing TR 2007-541, Queen's University, 2007
- Cited by 32 (7 self)
Code duplication, or copying a code fragment and then reusing it by pasting with or without modifications, is a well-known code smell in software maintenance. Several studies show that about 5% to 20% of a software system can contain duplicated code, which is basically the result of copying existing code fragments and using them by pasting with or without minor modifications. One of the major shortcomings of such duplicated fragments is that if a bug is detected in a code fragment, all the other fragments similar to it should be investigated to check for the possible existence of the same bug. Refactoring of the duplicated code is another prime issue in software maintenance, although several studies claim that refactoring of certain clones is not desirable and there is a risk of removing them. However, it is also widely agreed that clones should at least be detected. In this paper, we survey the state of the art in clone detection research. First, we describe the clone terms commonly used in the literature along with their corresponding mappings to the commonly used clone types. Second, we provide a review of the existing

