Results 1 - 10
of
67
Specification Matching of Software Components
- ACM Transactions on Software Engineering and Methodology
, 1996
"... Specification matching is a way to compare two software components based on descriptions of the components' behaviors. In the context of software reuse and library retrieval, it can help determine whether one component can be substituted for another or how one can be modified to fit the requireme ..."
Abstract
-
Cited by 252 (4 self)
- Add to MetaCart
Specification matching is a way to compare two software components based on descriptions of the components' behaviors. In the context of software reuse and library retrieval, it can help determine whether one component can be substituted for another or how one can be modified to fit the requirements of the other. In the context of object-oriented programming, it can help determine when one type is a behavioral subtype of another. We use formal specifications to describe the behavior of software components, and hence, to determine whether two components match. We give precise definitions of not just exact match, but more relevantly, various flavors of relaxed match. These definitions capture the notions of generalization, specialization, and substitutability of software components. Since our formal specifications are pre- and post-conditions written as predicates in firstorder logic, we rely on theorem proving to determine match and mismatch. We give examples from our impleme...
A Language Independent Approach for Detecting Duplicated Code
, 1999
"... Code duplication is one of the factors that severely complicates the maintenance and evolution of large software systems. Techniques for detecting duplicated code exist but rely mostly on parsers, technology that has proven to be brittle in the face of different languages and dialects. In this paper ..."
Abstract
-
Cited by 154 (19 self)
- Add to MetaCart
Code duplication is one of the factors that severely complicates the maintenance and evolution of large software systems. Techniques for detecting duplicated code exist but rely mostly on parsers, technology that has proven to be brittle in the face of different languages and dialects. In this paper we show that is possible to circumvent this hindrance by applying a language independent and visual approach, i.e. a tool that requires no parsing, yet is able to detect a significant amount of code duplication. We validate our approach on a number of case studies, involving four different implementation languages and ranging from 256 K up to 13Mb of source code size. Keywords: Software maintenance, code duplication detection, code visualization 1. Code Duplication Detection Duplicated code is a phenomenon that occurs frequently in large systems. The reasons why programmers duplicate code are manifold (see [9, 2] for a thorough discussion) and include the following reasons: (a) Making a ...
Signature Matching: a Tool for Using Software Libraries
- ACM Transactions on Software Engineering and Methodology
, 1995
"... this document are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of Wright Laboratory or the U. S. Government. The U. S. Government is authorized to reproduce and distribute reprints for Government pu ..."
Abstract
-
Cited by 106 (2 self)
- Add to MetaCart
this document are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of Wright Laboratory or the U. S. Government. The U. S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon. This manuscript is submitted for publication with the understanding that the U. S. Government is authorized to reproduce and distribute reprints for Governmental purposes. Authors' address: School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213; email: amy;wing@cs.cmu.edu. To appear, ACM Transactions on Software Engineering and Methodology (TOSEM), April 1995. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. c fl 1995 ACM xxxx-xxxx/xx/xxxx-xxxx $xx.xx 2 \Delta A. Moormann Zaremski and J. M. Wing ware libraries successfully, especially as libraries increase in size, is the availability of good tools to organize, navigate through, and retrieve from libraries. Currently many libraries use the file system for their only organization (directories and files) and file system and editor commands for navigation and retrieval. For example, the local ML library is organized with categories of components as directories (e.g., local/lib/Container/, local/lib/Threads/); one uses ls and
Generating Robust Parsers using Island Grammars
- IN PROCEEDINGS OF THE 8TH WORKING CONFERENCE ON REVERSE ENGINEERING
, 2001
"... Source model extraction---the automated extraction of information from system artifacts---is a common phase in reverse engineering tools. One of the major challenges of this phase is creating extractors that can deal with irregularities in the artifacts that are typical for the reverse engineering d ..."
Abstract
-
Cited by 78 (5 self)
- Add to MetaCart
Source model extraction---the automated extraction of information from system artifacts---is a common phase in reverse engineering tools. One of the major challenges of this phase is creating extractors that can deal with irregularities in the artifacts that are typical for the reverse engineering domain (for example, syntactic errors, incomplete source code, language dialects and embedded languages). This paper
Pattern matching for clone and concept detection
- Journal of Automated Software Engineering
, 1996
"... A legacy system is an operational, large-scale software system that is maintained beyond its first generation of programmers. It typically represents a massive economic investment and is critical to the mission of the organization it serves. As such systems age, they become increasingly complex and ..."
Abstract
-
Cited by 66 (14 self)
- Add to MetaCart
A legacy system is an operational, large-scale software system that is maintained beyond its first generation of programmers. It typically represents a massive economic investment and is critical to the mission of the organization it serves. As such systems age, they become increasingly complex and brittle, and hence harder to maintain. They also become even more critical to the survival of their organization because the business rules encoded within the system are seldom documented elsewhere. Our research is concerned with developing a suite of tools to aid the maintainers of legacy systems in recovering the knowledge embodied within the system. The activities, known collectively as “program understanding”, are essential preludes for several key processes, including maintenance and design recovery for reengineering. In this paper we present three pattern-matching techniques: source code metrics, a dynamic programming algorithm for finding the best alignment between two code fragments, and a statistical matching algorithm between abstract code descriptions represented in an abstract language and actual source code. The methods are applied to detect instances of code cloning in several moderately-sized production systems including tcsh, bash, and CLIPS. The programmer’s skill and experience are essential elements of our approach. Selection of particular tools and analysis methods depends on the needs of the particular task to be accomplished. Integration of the tools provides opportunities for synergy, allowing the programmer to select the most appropriate tool for a given task.
An Empirical Analysis of C Preprocessor Use
- IEEE TRANSACTIONS ON SOFTWARE ENGINEERING
, 2002
"... This is the first empirical study of the use of the C macro preprocessor, Cpp. To determine how the preprocessor is used in practice, this paper analyzes 26 packages comprising 1.4 million lines of publicly available C code. We determine the incidence of C preprocessor usage---whether in macro def ..."
Abstract
-
Cited by 55 (3 self)
- Add to MetaCart
This is the first empirical study of the use of the C macro preprocessor, Cpp. To determine how the preprocessor is used in practice, this paper analyzes 26 packages comprising 1.4 million lines of publicly available C code. We determine the incidence of C preprocessor usage---whether in macro definitions, macro uses, or dependences upon macros---that is complex, potentially problematic, or inexpressible in terms of other C or C++ language features. We taxonomize these various aspects of preprocessor use and particularly note data that are material to the development of tools for C or C++, including translating from C to C++ to reduce preprocessor usage. Our results show that, while most Cpp usage follows fairly simple patterns, an effective program analysis tool must address the preprocessor. The intimate connection between the C programming language and Cpp, and Cpp's unstructured transformations of token streams often hinder both programmer understanding of C programs and tools built to engineer C programs, such as compilers, debuggers, call graph extractors, and translators. Most tools make no attempt to analyze macro usage, but simply preprocess their input, which results in a number of negative consequences; an analysis that takes Cpp into account is preferable, but building such tools requires an understanding of actual usage. Differences between the semantics of Cpp and those of C can lead to subtle bugs stemming from the use of the preprocessor, but there are no previous reports of the prevalence of such errors. Use of C++ can reduce some preprocessor usage, but such usage has not been previously measured. Our data and analyses shed light on these issues and others related to practical understanding or manipulation of real C programs. The results a...
Towards Pattern-Based Design Recovery
"... A method and a corresponding tool is described which assist design recovery and program understanding by recognising instances of design patterns semi-automatically. The approach taken is specifically designed to overcome the existing scalability problems caused by many design and implementation var ..."
Abstract
-
Cited by 49 (7 self)
- Add to MetaCart
A method and a corresponding tool is described which assist design recovery and program understanding by recognising instances of design patterns semi-automatically. The approach taken is specifically designed to overcome the existing scalability problems caused by many design and implementation variants of design pattern instances. Our approach is based on a new recognition algorithm which works incrementally rather than trying to analyse a possibly large software system in one pass without any human intervention. The new algorithm exploits domain and context knowledge given by a reverse engineer and by a special underlying syntax graph. Finally, the paper gives some results and experiences about the application of the approach to the Java AWT and JGL libraries in comparison to other approaches.
Identifying Refactoring Opportunities Using Logic Meta Programming
, 2003
"... In this paper, we show how automated support can be provided for identifying refactoring opportunities, e.g., when an application's design should be refactored and which refactoring (s) in particular should be applied. Such support is achieved by using the technique of logic meta programming to dete ..."
Abstract
-
Cited by 40 (8 self)
- Add to MetaCart
In this paper, we show how automated support can be provided for identifying refactoring opportunities, e.g., when an application's design should be refactored and which refactoring (s) in particular should be applied. Such support is achieved by using the technique of logic meta programming to detect so-called bad smells and by defining a framework that uses this information to propose adequate refactorings. We report on some initial but promising experiments that were applied using the proposed techniques.
Core Technologies for System Renovation
, 1996
"... . Renovation of business-critical software is becoming increasingly important. We identify fundamental notions and techniques to aid in system renovation and sketch some basic techniques: generic language technology to build analysis tools, a knowledge retrieval system to aid in program understandin ..."
Abstract
-
Cited by 32 (21 self)
- Add to MetaCart
. Renovation of business-critical software is becoming increasingly important. We identify fundamental notions and techniques to aid in system renovation and sketch some basic techniques: generic language technology to build analysis tools, a knowledge retrieval system to aid in program understanding, and a coordination architecture that is useful to restructure monolithic systems thus enabling their renovation. We argue that these techniques are not only essential for the renovation of old software but that they can also play an important role during the development and maintenance of new software systems. Categories and Subject Description: D.2.6 [Software Engineering]: Programming Environments---Interactive; D.2.7 [Software Engineering]: Distribution and Maintenance ---Restructuring; D.2.m [Software Engineering]: Miscellaneous---Rapid prototyping; D.3.2 [Programming Languages]: Language Classifications---Specialized application languages; E.2 [Data]: Data Storage Representations---C...
A Survey on Software Clone Detection Research
- SCHOOL OF COMPUTING TR 2007-541, QUEEN’S UNIVERSITY
, 2007
"... Code duplication or copying a code fragment and then reuse by pasting with or without any modifications is a well known code smell in software maintenance. Several studies show that about 5 % to 20 % of a software systems can contain duplicated code, which is basically the results of copying existin ..."
Abstract
-
Cited by 32 (7 self)
- Add to MetaCart
Code duplication or copying a code fragment and then reuse by pasting with or without any modifications is a well known code smell in software maintenance. Several studies show that about 5 % to 20 % of a software systems can contain duplicated code, which is basically the results of copying existing code fragments and using then by pasting with or without minor modifications. One of the major shortcomings of such duplicated fragments is that if a bug is detected in a code fragment, all the other fragments similar to it should be investigated to check the possible existence of the same bug in the similar fragments. Refactoring of the duplicated code is another prime issue in software maintenance although several studies claim that refactoring of certain clones are not desirable and there is a risk of removing them. However, it is also widely agreed that clones should at least be detected. In this paper, we survey the state of the art in clone detection research. First, we describe the clone terms commonly used in the literature along with their corresponding mappings to the commonly used clone types. Second, we provide a review of the existing

