Supporting Program Comprehension Using Semantic and Structural Information
, 2001
Cited by 50 (13 self)
The paper focuses on investigating the combined use of semantic and structural information of programs to support the comprehension tasks involved in the maintenance and reengineering of software systems. Here, semantic refers to the domain-specific issues (both problem and development domains) of a software system. The other dimension, structural, refers to issues such as the actual syntactic structure of the program along with the control and data flow that it represents. An advanced information retrieval method, latent semantic indexing, is used to define a semantic similarity measure between software components. Components within a software system are then clustered together using this similarity measure. Simple structural information (i.e., file organization) of the software system is then used to assess the semantic cohesion of the clusters and files, with respect to each other. The measures are formally defined for general application. A set of experiments is presented which demonstrates how these measures can assist in the understanding of a nontrivial software system, namely a version of NCSA Mosaic.
Identification of High-Level Concept Clones in Source Code
, 2001
Cited by 46 (9 self)
Source code duplication occurs frequently within large software systems. Pieces of source code, functions, and data types are often duplicated in part, or in whole, for a variety of reasons. Programmers may simply be reusing a piece of code via copy and paste or they may be "reinventing the wheel".
Clustering Software Artifacts Based on Frequent Common Changes
- In Proc. IWPC
, 2005
Cited by 39 (9 self)
Changes of software systems are less expensive and less error-prone if they affect only one subsystem. Thus, clusters of artifacts that are frequently changed together are subsystem candidates. We introduce a two-step method for identifying such clusters. First, a model of common changes of software artifacts, called co-change graph, is extracted from the version control repository of the software system. Second, a layout of the co-change graph is computed that reveals clusters of frequently co-changed artifacts. We derive requirements for such layouts, and introduce an energy model for producing layouts that fulfill these requirements. We evaluate the method by applying it to three example systems, and comparing the resulting layouts to authoritative decompositions.
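As a rough illustration of the first step described above, the sketch below builds one simple form of a co-change graph: a weighted graph whose edge weights count the commits in which two artifacts changed together. This is an assumption made for illustration only; the paper's own graph definition and its energy-based layout step may differ, and the function name `co_change_graph` and the commit data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_change_graph(commits):
    """Count, for every pair of artifacts, the commits that changed both."""
    weights = Counter()
    for changed in commits:
        # Sort so each unordered pair is always stored under the same key.
        for a, b in combinations(sorted(set(changed)), 2):
            weights[(a, b)] += 1
    return weights


# Hypothetical commit history: each entry lists the files changed together.
commits = [
    ["parser.c", "lexer.c"],
    ["parser.c", "lexer.c", "ast.h"],
    ["gui.c", "widgets.c"],
]
graph = co_change_graph(commits)
```

Pairs with high weights, such as `parser.c` and `lexer.c` here, are the kind of frequently co-changed artifacts that a clustering or layout step would then try to place close together.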
Equipping the Reflexion Method with Automated Clustering
- Working Conference on Reverse Engineering
, 2005
Cited by 15 (1 self)
A significant aspect in applying the Reflexion Method is the mapping of components found in the source code onto the conceptual components defined in the hypothesized architecture. To date, this mapping is established manually, which requires a lot of work for large software systems. In this paper, we present a new approach, in which clustering techniques are applied to support the user in the mapping activity. The result is a semi-automated mapping technique that accommodates the automatic clustering of the source model with the user’s hypothesized knowledge about the system’s architecture. This paper describes also a case study in which our semi-automated mapping technique has been applied successfully to extend a partial map of a real-world software application.
An Incremental Semi-Automatic Method for Component Recovery
- In Working Conference on Reverse Engineering
, 1999
Cited by 14 (0 self)
Atomic components are sets of related variables, types, and subprograms, e.g., abstract data types and objects. Many techniques exist to detect them automatically. However, as an evaluation has shown, none of them has the precision needed [9]. One approach to achieve a higher precision is to integrate the user into the detection cycle. This paper describes a method in which computer and human work together to find atomic components. Furthermore, it discusses how the techniques can be enhanced to work incrementally, which is needed if they are to be integrated with this method. Moreover, it proposes ways of combining the techniques within this interactive method.
1. Introduction
Architecture recovery comprises detection of components (the computational parts) and connectors (the means and points of communication) of systems. The most primitive components consist of subprograms, types, and global variables. Groupings of these kinds of declarations are, for example, objects, abstract d...
Mining Co-Change Clusters from Version Repositories
, 2005
Cited by 8 (4 self)
Clusters of software artifacts that are frequently changed together are subsystem candidates, because one of the main goals of software design is to make changes local. The contribution of this paper is a visualization-based method that supports the identification of such clusters. First, we define the co-change graph as a simple but powerful model of common changes of software artifacts, and describe how to extract the graph from version control repositories. Second, we introduce an energy model for computing force-directed layouts of co-change graphs. The resulting layouts have a well-defined interpretation in terms of the structure of the visualized graph, and clearly reveal groups of frequently co-changed artifacts. We evaluate our method by comparing the layouts for three example projects with authoritative subsystem decompositions.
Coupling and Cohesion as Modularization Drivers: Are we being over-persuaded?
- PROCEEDINGS OF THE 5TH EUROPEAN CONFERENCE ON SOFTWARE MAINTENANCE AND REENGINEERING (CSMR’2001)
, 2001
Cited by 8 (1 self)
For around three decades Software Engineering gurus have "sold" us the ideal of minimal coupling and maximal cohesion at all levels of abstraction as a way to reduce the effort to understand and maintain software systems. The object-oriented paradigm brought a new design philosophy and encapsulation mechanisms that apparently would help us to achieve that desideratum. However, after a decade where this paradigm has emerged as the dominant one, we are faced with practitioners’ reality: coupling and cohesion do not seem to be the dominant driving forces when it comes to modularization. This conclusion was based on a relatively large sample of heterogeneous systems. We describe an environment that allows not only assessing this reality but also deriving better modularization solutions in what concerns coupling and cohesion. These solutions are generated by means of cluster analysis techniques and partially preserve the original modularization criteria. We believe this approach can be of great help in reengineering actions of object-oriented legacy systems.
Software architecture recovery for distributed systems
, 1999
Cited by 6 (0 self)
The design and evaluation of appropriate software architectures is key to the effective development, management, evolution, and reuse of software systems. However, current software engineering practice has generally led to architectural designs that are informal, ad hoc, and difficult to analyse and maintain. One consequence is that most existing systems have little or no documented architectural information, and the information that does exist is often an inaccurate representation of the implemented architecture. All too often, architectural information about an unfamiliar system needs to be extracted directly from the implemented software artifacts. This is a very demanding process commonly referred to as architecture recovery. Although architecture recovery can be significantly facilitated with the help of current reverse engineering techniques and tools, many issues remain to be properly addressed, particularly regarding recovery of the runtime abstractions (client, servers, communication protocols, etc.) that are typical to the design of distributed systems. This dissertation presents a static reverse engineering approach, called X-ray, to
A Comparison of Abstract Data Type and Objects Recovery Techniques
- JOURNAL SCIENCE OF COMPUTER PROGRAMMING
, 2000
Cited by 6 (1 self)
In the context of the authors' research on architectural features recovery, abstract data types (ADT) and abstract data objects (ADO, also called objects) have been identified as two of the smallest components which are useful for building a significant architectural overview of the system. The authors have named these the atomic components (AC) of an architecture.
This article compares six published techniques which extract ADTs and ADOs from source code without extensive data flow analysis. A prototype tool implementing each technique has been developed and applied to three medium-size systems written in C (each over 30 Kloc). The results from each approach are compared with the atomic components identified by hand by a group of software engineers.
This article extends previous papers by discussing how the software engineers' AC identification was validated and by analyzing the false positives, i.e., the atomic components identified by automatic approaches which were not identified by software engineers.
A Concept Formation Based Approach to Object Identification in Procedural Code
- Automated Software Engineering, 6:387–410
, 1999
Cited by 4 (0 self)
Legacy software systems present a high level of entropy combined with imprecise documentation. This makes their maintenance more difficult, more time consuming, and costlier. In order to address these issues, many organizations have been migrating their legacy systems to emerging technologies. In this paper, we describe a computer-supported approach aimed at supporting the migration of procedural software systems to the object-oriented (OO) technology. Our approach is based on the automatic formation of concepts, and uses information extracted directly from code to identify objects. The approach tends, thus, to minimize the need for domain application experts.

