Representing Concerns in Source Code
, 2003
Cited by 33 (6 self)
Many program evolution tasks involve source code that is not modularized as a single unit. Furthermore, the source code relevant to a change task often implements different concerns, or high-level concepts that a developer must consider. Finding and understanding concerns scattered in source code is a difficult task that accounts for a large proportion of the effort of performing program evolution. One possibility to mitigate this problem is to produce textual documentation that describes scattered concerns. However, this approach is impractical because it is costly, and because, as a program evolves, the documentation becomes inconsistent with the source code. The thesis of this dissertation is that a description of concerns, representing program structures and linked to source code, that can be produced cost-effectively during program investigation activities, can help developers perform software evolution tasks more systematically, and on different versions of a system. To validate the claims of this thesis, we have developed a model for a structure, called concern graph, that describes concerns in source code in terms of relations between program elements. The model also defines precisely the notion of inconsistency between a concern graph and the
Combining Probabilistic Ranking and Latent Semantic Indexing for Feature Identification
- in Proceedings of 14th IEEE International Conference on Program Comprehension (ICPC'06)
, 2006
Cited by 19 (4 self)
The paper recasts the problem of feature location in source code as a decision-making problem in the presence of uncertainty. The main contribution consists in the combination of two existing techniques for feature location in source code. Both techniques provide a set of ranked facts from the software as a result for the feature identification problem. One of the techniques is based on a scenario-based probabilistic ranking of events observed while executing a program under given scenarios. The other technique is defined as an information retrieval task, based on the Latent Semantic Indexing of the source code. We show the viability and effectiveness of the combined technique with two case studies. The first case study is a replication of feature identification in Mozilla, which allows us to directly compare the results with previously published data. The other case study is a bug location problem in Mozilla. The results show that the combined technique improves feature identification significantly with respect to each technique used independently.
Dynamic feature traces: Finding features in unfamiliar code
- In Proceedings of the 21st IEEE International Conference on Software Maintenance. 337
"... as conforming ..."
Feature location using probabilistic ranking of methods based on execution scenarios and information retrieval
- IEEE Trans. Software Eng
, 2007
Cited by 15 (7 self)
This paper recasts the problem of feature location in source code as a decision-making problem in the presence of uncertainty. The solution to the problem is formulated as a combination of the opinions of different experts. The experts in this work are two existing techniques for feature location: a scenario-based probabilistic ranking of events and an information retrieval-based technique that uses latent semantic indexing. The combination of these two experts is empirically evaluated through several case studies, which use the source code of the Mozilla Web browser and the Eclipse integrated development environment. The results show that the combination of experts significantly improves the effectiveness of feature location when compared to each of the experts used independently. Index Terms: program understanding, feature identification, concept location, dynamic and static analyses, information retrieval, Latent Semantic Indexing, scenario-based probabilistic ranking, open source software.
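The idea of combining the opinions of two experts can be illustrated with a minimal sketch. It is not the formulation used in the paper: the function name `combine_experts`, the min-max normalization, and the single `weight` parameter are all assumptions made for illustration. Each expert is assumed to return a score per method, where higher means more relevant.

```python
def combine_experts(scores_a, scores_b, weight=0.5):
    """Rank methods by a weighted sum of two experts' normalized scores.

    Hypothetical helper: `weight` is the trust placed in expert A,
    and (1 - weight) is the trust placed in expert B.
    """
    def normalize(scores):
        # Rescale to [0, 1] so scores from different experts are comparable.
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        return {k: (v - lo) / span if span else 0.0 for k, v in scores.items()}

    norm_a, norm_b = normalize(scores_a), normalize(scores_b)
    combined = {
        # A method unknown to one expert contributes 0 from that expert.
        m: weight * norm_a.get(m, 0.0) + (1 - weight) * norm_b.get(m, 0.0)
        for m in set(norm_a) | set(norm_b)
    }
    # Highest combined score first; ties are broken by name for stable output.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))
```

With equal weights, a method that both experts rate moderately well can outrank a method that only one expert rates highly, which is the intuition behind combining them.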
Exploring the neighborhood with Dora to expedite software maintenance
- In 22nd IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE/ACM
, 2007
Cited by 13 (3 self)
Completing software maintenance and evolution tasks for today’s large, complex software systems can be difficult, often requiring considerable time to understand the system well enough to make correct changes. Despite evidence that successful programmers use program structure as well as identifier names to explore software, most existing program exploration techniques use either structural or lexical identifier information. By using only one type of information, automated tools ignore valuable clues about a developer’s intentions—clues critical to the human program comprehension process. In this paper, we present and evaluate a technique that exploits both program structure and lexical information to help programmers more effectively explore programs. Our approach uses structural information to focus automated program exploration and lexical information to prune irrelevant structure edges from consideration. For the important program exploration step of expanding from a seed, our experimental results demonstrate that an integrated lexical- and structural-based approach is significantly more effective than a state-of-the-art structural program exploration technique.
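As a rough illustration of expanding from a seed while using lexical information to prune structural edges, the sketch below walks a call graph breadth-first and follows an edge only when the callee's identifier shares a term with the query. This is an assumed simplification, not the scoring used by Dora; `split_identifier`, `explore_from_seed`, and the `max_depth` cutoff are hypothetical names and choices.

```python
import re
from collections import deque

def split_identifier(name):
    """Split a camelCase identifier into a set of lowercase terms."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    return {t.lower() for t in re.findall(r"[A-Za-z]+", spaced)}

def explore_from_seed(call_graph, seed, query_terms, max_depth=2):
    """Return the seed plus the callees reachable through lexically relevant edges.

    call_graph maps a method name to the methods it calls (assumed input shape).
    """
    query = {t.lower() for t in query_terms}
    visited = {seed}
    queue = deque([(seed, 0)])
    while queue:
        method, depth = queue.popleft()
        if depth == max_depth:
            continue  # structural focus: do not expand beyond max_depth edges
        for callee in call_graph.get(method, ()):
            if callee in visited:
                continue
            # Lexical pruning: skip callees whose name shares no query term.
            if split_identifier(callee) & query:
                visited.add(callee)
                queue.append((callee, depth + 1))
    return visited
```

A pruned method also hides everything reachable only through it, so this sketch trades recall for a smaller neighborhood to inspect.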
IRiSS - A Source Code Exploration Tool
- in Industrial and Tool Proceedings of 21st IEEE International Conference on Software Maintenance (ICSM'05)
, 2005
Cited by 12 (6 self)
IRiSS (Information Retrieval based Software Search) is a software exploration tool that uses an indexing engine based on an information retrieval method. IRiSS is implemented as an add-in to the Visual Studio .NET development environment and it allows the user to search a C++ project for the implementation of concepts formulated as natural language queries. The results of the query are presented as a ranked list of software methods or classes, ordered by their similarity to the user query. A second component of IRiSS provides another searching method based on regular expression matching. This method is based on the existing “find” feature from the Visual Studio environment and it has an improved format for the display of the search results.
Feature Location via Information Retrieval based Filtering of a Single Scenario Execution Trace
- in Automated Software Engineering (ASE 2007)
, 2007
Cited by 9 (0 self)
The paper presents a semi-automated technique for feature location in source code. The technique is based on combining information from two different sources: an execution trace on the one hand, and the comments and identifiers from the source code on the other. Users execute a single partial scenario, which exercises the desired feature, and all executed methods are identified based on the collected trace. The source code is indexed using Latent Semantic Indexing, an Information Retrieval method, which allows users to write queries relevant to the desired feature and rank all the executed methods based on their textual similarity to the query. Two case studies on open source software (JEdit and Eclipse) indicate that the new technique has high accuracy, comparable with previously published approaches, and that it is easy to use, as it considerably simplifies the dynamic analysis.
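The filtering step described above (keep only executed methods, then rank them by textual similarity to a query) can be sketched as follows. For brevity this sketch swaps Latent Semantic Indexing for plain term-frequency cosine similarity, so it only illustrates the pipeline shape and not the paper's indexing method. The names `tokenize`, `cosine`, and `rank_executed_methods`, and the input shapes, are assumptions.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Split comments and camelCase identifiers into lowercase word tokens."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [t.lower() for t in re.findall(r"[A-Za-z]+", spaced)]

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_executed_methods(query, method_text, executed):
    """Rank only the methods seen in the trace by similarity to the query.

    method_text maps a method name to its identifiers and comments (assumed);
    executed is the list of method names collected from one scenario trace.
    """
    query_vec = Counter(tokenize(query))
    scored = [
        (name, cosine(query_vec, Counter(tokenize(method_text[name]))))
        for name in executed
        if name in method_text
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Methods that match the query textually but never ran in the scenario are excluded before ranking, which is what lets a single trace narrow the search space.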
Topology analysis of software dependencies
- ACM Transactions on Software Engineering and Methodology
Cited by 9 (3 self)
Before performing a modification task, a developer usually has to investigate the source code of a system to understand how to carry out the task. Discovering the code relevant to a change task is costly because it is a human activity whose success depends on a large number of unpredictable factors, such as intuition and luck. Although studies have shown that effective developers tend to explore a program by following structural dependencies, no methodology is available to guide their navigation through the thousands of dependency paths found in a nontrivial program. We describe a technique to automatically propose and rank program elements that are potentially interesting to a developer investigating source code. Our technique is based on an analysis of the topology of structural dependencies in a program. It takes as input a set of program elements of interest to a developer and produces a fuzzy set describing other elements of potential interest. Empirical evaluation of our technique indicates that it can help developers quickly select program elements worthy of investigation while avoiding less interesting ones.
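To make the input and output of such a technique concrete, here is a minimal sketch that takes a set of elements of interest and returns a fuzzy set (membership degrees in (0, 1]) over their structural neighbors. The membership formula, the fraction of a candidate's neighbors that are already of interest, is an assumption chosen for illustration and is not the analysis defined in the paper; `propose_elements` and the dependency map shape are likewise hypothetical.

```python
def propose_elements(dependencies, interest):
    """Suggest neighbors of the interest set with a degree of potential interest.

    dependencies maps an element to the set of elements it depends on.
    Edges are treated as undirected when collecting neighbors.
    """
    neighbors = {}
    for source, targets in dependencies.items():
        for target in targets:
            neighbors.setdefault(source, set()).add(target)
            neighbors.setdefault(target, set()).add(source)

    interest = set(interest)
    fuzzy = {}
    for element, adjacent in neighbors.items():
        if element in interest:
            continue
        shared = adjacent & interest
        if shared:
            # Assumed heuristic: elements tied mostly to the interest set
            # score higher than widely used utilities.
            fuzzy[element] = len(shared) / len(adjacent)
    # Highest degree first; ties are broken by name for stable output.
    return dict(sorted(fuzzy.items(), key=lambda kv: (-kv[1], kv[0])))
```

Under this heuristic a helper called only from the interest set gets degree 1.0, while a logging utility called from everywhere gets a low degree and falls to the bottom of the list.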
Inferring Structural Patterns for Concern Traceability in Evolving Software
- ASE'07
, 2007
Cited by 8 (1 self)
As part of the evolution of software systems, effort is often invested to discover in what parts of the source code a feature (or other concern) is implemented. Unfortunately, knowledge about a concern’s implementation can become invalid as the system evolves. We propose to mitigate this problem by automatically inferring structural patterns among the elements identified as relevant to a concern’s implementation. We then document the inferred patterns as rules that can be checked as the source code evolves. Checking whether structural patterns hold across different versions of a system enables the automatic identification of new elements related to a documented concern. We implemented our technique for Java in an Eclipse plug-in called ISIS4J and applied it to a number of concerns. With a case study spanning 34 versions of the development history of an open-source system, we show how our approach supports the tracking of a concern’s implementation through modifications such as extensions and refactorings.
Mining Business Topics in Source Code using Latent Dirichlet Allocation
Cited by 7 (0 self)
One of the difficulties in maintaining a large software system is the absence of documented business domain topics and of a correlation between these domain topics and the source code. Without such a correlation, people without any prior application knowledge would find it hard to comprehend the functionality of the system. Latent Dirichlet Allocation (LDA), a statistical model, has emerged as a popular technique for discovering topics in large text document corpora. But its applicability in extracting business domain topics from source code has not been explored so far. This paper investigates LDA in the context of comprehending large software systems and proposes a human-assisted approach based on LDA for extracting domain topics from source code. This method has been applied to a number of open source and proprietary systems. Preliminary results indicate that LDA is able to identify some of the domain topics and is a satisfactory starting point for further manual refinement of topics.

