Using Automatic Clustering to Produce High-Level System Organizations of Source Code
- In Proc. 6th Intl. Workshop on Program Comprehension, 1998
Abstract - Cited by 103 (24 self)
This paper describes a collection of algorithms that we developed and implemented to facilitate the automatic recovery of the modular structure of a software system from its source code.
Bunch: A clustering tool for the recovery and maintenance of software system structures
- In Proceedings; IEEE International Conference on Software Maintenance, 1999
Abstract - Cited by 80 (17 self)
Software systems are typically modified in order to extend or change their functionality, improve their performance, port them to different platforms, and so on. For developers, it is crucial to understand the structure of a system before attempting to modify it. The structure of a system, however, may not be apparent to new developers, because the design documentation is non-existent or, worse, inconsistent with the implementation. This problem could be alleviated if developers were somehow able to produce high-level system decomposition descriptions from the low-level structures present in the source code. We have developed a clustering tool called Bunch that creates a system decomposition automatically by treating clustering as an optimization problem. This paper describes the extensions made to Bunch in response to feedback we received from users. The most important extension, in terms of the quality of results and execution efficiency, is a feature that enables the integration of designer knowledge about the system structure into an otherwise fully automatic clustering process. We use a case study to show how our new features simplified the task of extracting the subsystem structure of a medium-size program, while exposing an interesting design flaw in the process.
Identifying objects using cluster and concept analysis
- In 21st International Conference on Software Engineering, ICSE-99, 1999
Abstract - Cited by 77 (14 self)
and their applications. SMC is sponsored by the Netherlands Organization for Scientific Research (NWO). CWI is a member of
Supporting Program Comprehension Using Semantic and Structural Information
2001
Abstract - Cited by 50 (13 self)
The paper focuses on investigating the combined use of semantic and structural information of programs to support the comprehension tasks involved in the maintenance and reengineering of software systems. Here, semantic refers to the domain specific issues (both problem and development domains) of a software system. The other dimension, structural, refers to issues such as the actual syntactic structure of the program along with the control and data flow that it represents. An advanced information retrieval method, latent semantic indexing, is used to define a semantic similarity measure between software components. Components within a software system are then clustered together using this similarity measure. Simple structural information (i.e., file organization) of the software system is then used to assess the semantic cohesion of the clusters and files, with respect to each other. The measures are formally defined for general application. A set of experiments is presented which demonstrates how these measures can assist in the understanding of a nontrivial software system, namely a version of NCSA Mosaic.
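To make the idea of a similarity measure between software components concrete, here is a minimal sketch. It deliberately swaps in plain term-frequency cosine similarity rather than latent semantic indexing (LSI additionally projects the term vectors into a lower-dimensional space before comparing them). The component names and their text are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter

def term_vector(source_text):
    """Bag of lowercase word-like tokens found in a source fragment."""
    return Counter(re.findall(r"[a-z]+", source_text.lower()))

def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical components, each described by the words in its identifiers/comments.
components = {
    "http_get": "open socket send request read response close socket",
    "http_post": "open socket send request body read response",
    "render_page": "parse html layout boxes paint window",
}
vectors = {name: term_vector(text) for name, text in components.items()}

sim_network = cosine(vectors["http_get"], vectors["http_post"])
sim_mixed = cosine(vectors["http_get"], vectors["render_page"])
print(sim_network > sim_mixed)
```

Components with a high pairwise score would be candidates for the same cluster; any clustering step built on top of this is left out of the sketch.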
Automatic Clustering of Software Systems using a Genetic Algorithm
- In Proceedings of Software Technology and Engineering Practice, 1998
Abstract - Cited by 47 (15 self)
Large software systems tend to have a rich and complex structure. Designers typically depict the structure of software systems as one or more directed graphs. For example, a directed graph can be used to describe the modules (or classes) of a system and their static interrelationships using nodes and directed edges, respectively. We call such graphs module dependency graphs (MDGs). MDGs can be large and complex graphs. One way of making them more accessible is to partition them, separating their nodes (i.e., modules) into clusters (i.e., subsystems). In this paper, we describe a technique for finding "good" MDG partitions. Good partitions feature relatively independent subsystems that contain modules which are highly inter-dependent. Our technique treats finding a good partition as an optimization problem, and uses a Genetic Algorithm (GA) to search the extraordinarily large solution space of all possible MDG partitions. The effectiveness of our technique is demonstrated by applying it...
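As an illustration of what separates a "good" partition of a module dependency graph from a poor one, the sketch below counts edges that stay inside a cluster versus edges that cross cluster boundaries. This is only an assumed, simplified measure: the abstract does not state the paper's objective function, and the search itself (the genetic algorithm) is not shown. Module and cluster names are hypothetical.

```python
def edge_counts(edges, cluster_of):
    """Return (intra, inter): edges within a cluster vs. across clusters."""
    intra = sum(1 for src, dst in edges if cluster_of[src] == cluster_of[dst])
    return intra, len(edges) - intra

# Hypothetical MDG: nodes are modules, directed edges are static dependencies.
edges = [
    ("parser", "lexer"),
    ("parser", "ast"),
    ("lexer", "tokens"),
    ("codegen", "emit"),
    ("codegen", "regalloc"),
    ("parser", "codegen"),
]

# Two candidate partitions into the same two subsystems.
good = {"parser": "front", "lexer": "front", "ast": "front", "tokens": "front",
        "codegen": "back", "emit": "back", "regalloc": "back"}
poor = {"parser": "front", "emit": "front", "tokens": "front", "regalloc": "front",
        "lexer": "back", "ast": "back", "codegen": "back"}

print(edge_counts(edges, good))  # mostly intra-cluster edges
print(edge_counts(edges, poor))  # every edge crosses a cluster boundary
```

A search procedure would compare candidate partitions by a score derived from such counts; note that raw intra-edge counts alone would favor a single giant cluster, so a practical objective must also penalize that.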
Identification of High-Level Concept Clones in Source Code
2001
Abstract - Cited by 46 (9 self)
Source code duplication occurs frequently within large software systems. Pieces of source code, functions, and data types are often duplicated in part, or in whole, for a variety of reasons. Programmers may simply be reusing a piece of code via copy and paste or they may be "reinventing the wheel".
ACDC : An Algorithm for Comprehension-Driven Clustering
- In Proceedings of the Seventh Working Conference on Reverse Engineering, 2000
Abstract - Cited by 37 (2 self)
The software clustering literature contains many different approaches that attempt to automatically decompose software systems. These approaches commonly utilize criteria or measures based on principles such as high cohesion and low coupling, information hiding etc. In this paper, we present an algorithm that subscribes to a philosophy targeted towards program comprehension and based on subsystem patterns. We discuss the algorithm's implementation and describe experiments that demonstrate its usefulness. 1 Introduction A common approach that researchers in various disciplines use in order to deal with large data sets is to develop a taxonomy, i.e. create categories of objects that exhibit similar features or properties. Such categories (commonly referred to as clusters) can be discovered through a variety of techniques that have been proposed in the literature. Research on the effectiveness and behaviour of these techniques has given rise to the field of cluster analysis. Software ...
Software botryology: Automatic clustering of software systems
- In Proceedings of the International Workshop on Large-Scale Software Composition, 1998
Abstract - Cited by 25 (1 self)
It has long been recognized that the decomposition of a large software system into "meaningful" subsystems is essential for both the development and maintenance phases of a software project. We introduce the term Software Botryology for the area of research that attempts to automatically cluster a software system. In this paper, we survey approaches to the clustering problem from researchers in the software engineering community. We also present clustering techniques used in other disciplines, and argue that their utilization in a software context could lead to better solutions to the software clustering problem. Finally, we outline research challenges and open problems of interest.
A scalable approach to user-session based testing of web applications through concept analysis
- In Proceedings of the Automated Software Engineering Conference, 2004
Abstract - Cited by 24 (13 self)
The continuous use of the web for daily operations by businesses, consumers, and government has created a great demand for reliable web applications. One promising approach to testing the functionality of web applications leverages user session data collected by web servers. This approach automatically generates test cases based on real user profiles. The key contribution of this paper is the application of concept analysis for clustering user sessions for test suite reduction. Existing incremental concept analysis algorithms can be exploited to avoid collecting large user session data sets and thus provide scalability. We have completely automated the process from user session collection and reduction through replay. Our incremental test suite update algorithm coupled with our experimental study indicate that concept analysis provides a promising means for incrementally updating reduced test suites in response to newly captured user sessions, with some loss in fault detection capability and practically no coverage loss.
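For readers unfamiliar with concept analysis, the following sketch shows how user sessions (objects) and the URLs they request (attributes) give rise to formal concepts: pairs of a session set and the URL set shared by exactly those sessions. It is a brute-force enumeration for tiny inputs, not the incremental algorithm the abstract refers to, and the session data is hypothetical.

```python
from itertools import combinations

def concepts(context):
    """Enumerate all formal concepts (extent, intent) of a small context.

    context maps each object (session id) to its set of attributes (URLs).
    Exponential in the number of objects; intended only for illustration.
    """
    objects = sorted(context)
    all_attrs = set().union(*context.values())
    found = set()
    for size in range(len(objects) + 1):
        for group in combinations(objects, size):
            # Intent: attributes common to every object in the group.
            intent = set(all_attrs)
            for obj in group:
                intent &= context[obj]
            # Extent: every object that has all attributes of the intent.
            extent = frozenset(o for o in objects if intent <= context[o])
            found.add((extent, frozenset(intent)))
    return found

# Hypothetical user sessions and the URLs each one requested.
sessions = {
    "s1": {"/login", "/cart"},
    "s2": {"/login", "/cart", "/pay"},
    "s3": {"/login", "/search"},
}

result = concepts(sessions)
for extent, intent in sorted(result, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

Sessions that fall in the same concept share the same set of requested URLs, which is the kind of grouping a test suite reduction step could exploit; how representatives are chosen from each group is not specified here.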
Architectural Design Recovery using Data Mining Techniques
2000
Abstract - Cited by 18 (3 self)
This paper presents a technique for recovering the high level design of legacy software systems according to user defined architectural plans. Architectural plans are represented using a description language and specify system components and their interfaces. Such descriptions are viewed as queries that are applied on a large data base which stores information extracted from the source code of the subject legacy system. Data mining techniques and a modified branch and bound search algorithm are used to control the matching process, by which the query is satisfied and query variables are instantiated. The matching process allows the alternative results to be ranked according to data mining associations and clustering techniques and, finally, be presented to the user. 1 Introduction Software maintenance constitutes a major part of the software life-cycle. Most maintenance tasks require a decomposition of the legacy system into modules and functional units. One approach to architectura...

