Results 1 - 10
of
31
NICAD: Accurate Detection of Near-Miss Intentional Clones Using Flexible Pretty-Printing and Code Normalization
, 2008
"... This paper examines the effectiveness of a new languagespecific parser-based but lightweight clone detection approach. Exploiting a novel application of a source transformation system, the method accurately finds near-miss clones using an efficient text line comparison technique. The transformation ..."
Abstract
-
Cited by 13 (6 self)
- Add to MetaCart
This paper examines the effectiveness of a new languagespecific parser-based but lightweight clone detection approach. Exploiting a novel application of a source transformation system, the method accurately finds near-miss clones using an efficient text line comparison technique. The transformation system assists the method in three ways. First, using agile parsing it provides user-specified flexible pretty- printing to remove noise, standardize formatting and break program statements into parts such that potential changes can be detected as simple linewise text differences. Second, it provides efficient flexible extraction of potential clones to be compared using island grammars and agile parsing to select granularities and enumerate potential clones. Third, using transformation rules it provides flexible code normalization to allow for local editing differences between similar code segments and filtering out of uninteresting parts of potential clones. In this paper we introduce the theory and practice of the framework and demonstrate its use in finding function clones in C code. Early experiments indicate that the method is capable of finding near-miss clones with high precision and recall, and with reasonable performance.
Do Code Clones Matter?
"... Code cloning is not only assumed to inflate maintenance costs but also considered defect-prone as inconsistent changes to code duplicates can lead to unexpected behavior. Consequently, the identification of duplicated code, clone detection, has been a very active area of research in recent years. Up ..."
Abstract
-
Cited by 13 (3 self)
- Add to MetaCart
Code cloning is not only assumed to inflate maintenance costs but also considered defect-prone as inconsistent changes to code duplicates can lead to unexpected behavior. Consequently, the identification of duplicated code, clone detection, has been a very active area of research in recent years. Up to now, however, no substantial investigation of the consequences of code cloning on program correctness has been carried out. To remedy this shortcoming, this paper presents the results of a large-scale case study that was undertaken to find out if inconsistent changes to cloned code can indicate faults. For the analyzed commercial and open source systems we not only found that inconsistent changes to clones are very frequent but also identified a significant number of faults induced by such changes. The clone detection
Duplicate code detection using anti-unification
"... Abstract—This paper describes a new algorithm for finding software clones. It is conceptually independent of the source language of the analyzed programs, working at the level of abstract syntax trees. The algorithm considers that two sequences of statements form a clone if one of them can be obtain ..."
Abstract
-
Cited by 7 (2 self)
- Add to MetaCart
Abstract—This paper describes a new algorithm for finding software clones. It is conceptually independent of the source language of the analyzed programs, working at the level of abstract syntax trees. The algorithm considers that two sequences of statements form a clone if one of them can be obtained from the other by replacing some subtrees. To our knowledge this notion was not previously employed in the literature. It allows to take into account all information on the syntactic structure of a program. We have implemented this algorithm in the tool Clone Digger. It currently supports the Python and Java languages. Clone Digger is free and provided under the GPL license. I.
Analyzing the discipline of preprocessor annotations in 30 million lines of c code
- In Proceeding of the 10th International Conference on Aspect Oriented Software Development (AOSD’11
, 2011
"... The C preprocessor cpp is a widely used tool for implementing variable software. It enables programmers to express variable code (which may even crosscut the entire implementation) with conditional compilation. The C preprocessor relies on simple text processing and is independent of the host langua ..."
Abstract
-
Cited by 6 (3 self)
- Add to MetaCart
The C preprocessor cpp is a widely used tool for implementing variable software. It enables programmers to express variable code (which may even crosscut the entire implementation) with conditional compilation. The C preprocessor relies on simple text processing and is independent of the host language (C, C++, Java, and so on). Languageindependent text processing is powerful and expressive— programmers can make all kinds of annotations in the form of #ifdefs—but can render unpreprocessed code difficult to process automatically by tools, such as refactoring, concern management, and variability-aware type checking. We distinguish between disciplined annotations, which align with the underlying source-code structure, and undisciplined annotations, which do not align with the structure and hence complicate tool development. This distinction raises the question of how frequently programmers use undisciplined annotations and whether it is feasible to change them to disciplined annotations to simplify tool development and to enable programmers to use a wide variety of tools in the first place. By means of an analysis of 40 medium-sized to large-sized C programs, we show empirically that programmers use cpp mostly in a disciplined way: about 84% of all annotations respect the underlying source-code structure. Furthermore, we analyze the remaining undisciplined annotations, identify patterns, and discuss how to transform them into a disciplined form.
Similar Code Detection and Elimination for Erlang Programs
- Twelfth International Symposium on Practical Aspects of Declarative Languages
, 2010
"... Abstract. A well-known bad code smell in refactoring and software maintenance is duplicated code, that is the existence of code clones, which are code fragments that are identical or similar to one another. Unjustified code clones increase code size, make maintenance and comprehension more difficult ..."
Abstract
-
Cited by 5 (5 self)
- Add to MetaCart
Abstract. A well-known bad code smell in refactoring and software maintenance is duplicated code, that is the existence of code clones, which are code fragments that are identical or similar to one another. Unjustified code clones increase code size, make maintenance and comprehension more difficult, and also indicate design problems such as a lack of encapsulation or abstraction. This paper describes an approach to detecting ‘similar ’ code based on the notion of anti-unification, or least-general common abstraction. This mechanism is used for detecting code clones in Erlang programs, and is supplemented by a collection of refactorings to support user-controlled automatic clone removal. The similar code detection algorithm and refactorings are integrated within Wrangler, a tool developed at the University of Kent for interactive refactoring of Erlang programs. We conclude with a report on case studies and comparisons with other tools.
CloneDetective – A workbench for clone detection research
- In ICSE’09
, 2009
"... The area of clone detection has considerably evolved over the last decade, leading to approaches with better results, but at the same time using more elaborate algorithms and tool chains. In our opinion a level has been reached, where the initial investment required to setup a clone detection tool c ..."
Abstract
-
Cited by 5 (3 self)
- Add to MetaCart
The area of clone detection has considerably evolved over the last decade, leading to approaches with better results, but at the same time using more elaborate algorithms and tool chains. In our opinion a level has been reached, where the initial investment required to setup a clone detection tool chain and the code infrastructure required for experimenting with new heuristics and algorithms seriously hampers the exploration of novel solutions or specific case studies. As a solution, this paper presents CloneDetective, an open source framework and tool chain for clone detection, which is especially geared towards configurability and extendability and thus supports the preparation and conduction of clone detection research. 1. Customizable Clone-Detection
Towards a Refactoring Guideline Using Code Clone Classification
"... Evolution of software often decreases desired properties like readability and maintainability of the evolved code. The process of refactoring aims at increasing the same desired properties by restructuring the code. New paradigms like AOP allow aspect-oriented refactorings as counterparts of object- ..."
Abstract
-
Cited by 4 (2 self)
- Add to MetaCart
Evolution of software often decreases desired properties like readability and maintainability of the evolved code. The process of refactoring aims at increasing the same desired properties by restructuring the code. New paradigms like AOP allow aspect-oriented refactorings as counterparts of object-oriented refactoring with the same aim. However, it is not obvious to the user, when to use which paradigm for achieving certain goals. In this paper we present an approach of code clone classification, which advises the developer when to use a respective refactoring technique or concept.
A Mutation / Injection-based Automatic Framework for Evaluating Clone Detection Tools
- In Mutation’09
, 2009
"... In recent years many methods and tools for software clone detection have been proposed. While some work has been done on assessing and comparing performance of these tools, very little empirical evaluation has been done. In particular, accuracy measures such as precision and recall have only been ro ..."
Abstract
-
Cited by 4 (3 self)
- Add to MetaCart
In recent years many methods and tools for software clone detection have been proposed. While some work has been done on assessing and comparing performance of these tools, very little empirical evaluation has been done. In particular, accuracy measures such as precision and recall have only been roughly estimated, due both to problems in creating a validated clone benchmark against which tools can be compared, and to the manual effort required to hand check large numbers of candidate clones. In this paper we propose an automated method for empirically evaluating clone detection tools that leverages mutation-based techniques to overcome these limitations by automatically synthesizing large numbers of known clones based on an editing theory of clone creation. Our framework is effective in measuring recall and precision of clone detection tools for various types of fine-grained clones in real systems without manual intervention. 1.
An Empirical Study of Function Clones in Open Source Software
, 2008
"... The new hybrid clone detection tool NICAD combines the strengths and overcomes the limitations of both textbased and AST-based clone detection techniques to yield highly accurate identification of cloned code in software systems. In this paper, we present a first empirical study of function clones i ..."
Abstract
-
Cited by 3 (2 self)
- Add to MetaCart
The new hybrid clone detection tool NICAD combines the strengths and overcomes the limitations of both textbased and AST-based clone detection techniques to yield highly accurate identification of cloned code in software systems. In this paper, we present a first empirical study of function clones in open source software using NICAD. We examine more than 15 open source C and Java systems, including the entire Linux Kernel and Apache
Towards a Mutation-Based Automatic Framework for Evaluating Code Clone Detection Tools
, 2008
"... In the last decade, a great many code clone detection tools have been proposed. Such a large number of tools calls for a quantitative comparison, and there have been several attempts to empirically evaluate and compare many of the state-of-the-art tools. However, a recent study shows that there are ..."
Abstract
-
Cited by 3 (2 self)
- Add to MetaCart
In the last decade, a great many code clone detection tools have been proposed. Such a large number of tools calls for a quantitative comparison, and there have been several attempts to empirically evaluate and compare many of the state-of-the-art tools. However, a recent study shows that there are several factors that could influence the the validity of the results of such comparisons. In order to overcome the effects of such factors (at least in part), in this student poster paper we outline a mutation-based controlled framework for evaluating clone detection tools using edit-based mutation operators that model cloning actions. While the framework is not yet completely implemented and as yet we do not have experimental data, we anticipate that such a framework will provide a useful contribution to the community by providing a more solid objective foundation for tool evaluation.

