Comparison and Evaluation of Code Clone Detection Techniques and Tools: A Qualitative Approach
- Science of Computer Programming, 2009
Cited by 13 (0 self)
Over the last decade many techniques and tools for software clone detection have been proposed. In this paper, we provide a qualitative comparison and evaluation of the current state-of-the-art in clone detection techniques and tools, and organize the large amount of information into a coherent conceptual framework. We begin with background concepts, a generic clone detection process and an overall taxonomy of current techniques and tools. We then classify, compare and evaluate the techniques and tools in two different dimensions. First, we classify and compare approaches based on a number of facets, each of which has a set of (possibly overlapping) attributes. Second, we qualitatively evaluate the classified techniques and tools with respect to a taxonomy of editing scenarios designed to model the creation of Type-1, Type-2, Type-3 and Type-4 clones. Finally, we provide examples of how one might use the results of this study to choose the most appropriate clone detection tool or technique in the context of a particular set of goals and constraints. The primary contributions of this paper are: (1) a schema for classifying clone detection techniques and tools and a classification of current clone detectors based on this schema, and (2) a taxonomy of editing scenarios that produce different clone types and a qualitative evaluation of current clone detectors based on this taxonomy.
Refactoring for Parameterizing Java Classes
Cited by 12 (5 self)
Type safety and expressiveness of many existing Java libraries and their client applications would improve if the libraries were upgraded to define generic classes. Efficient and accurate tools exist to assist client applications to use generic libraries, but so far the libraries themselves must be parameterized manually, which is a tedious, time-consuming, and error-prone task. We present a type-constraint-based algorithm for converting non-generic libraries to add type parameters. The algorithm handles the full Java language and preserves backward compatibility, thus making it safe for existing clients. Among other features, it is capable of inferring wildcard types and introducing type parameters for mutually-dependent classes. We have implemented the algorithm as a fully automatic refactoring in Eclipse. We evaluated our work in two ways. First, our tool parameterized code that was lacking type parameters. We contacted the developers of several of these applications, and in all cases they confirmed that the resulting parameterizations were correct and useful. Second, to better quantify its effectiveness, our tool parameterized classes from already-generic libraries, and we compared the results to those that were created by the libraries' authors. Our tool performed the refactoring accurately: in 87% of cases the results were as good as those created manually by a human expert, in 9% of cases the tool results were better, and in 4% of cases the tool results were worse.
On the side-effects of code abstraction
- In Proceedings of the 2003 ACM SIGPLAN Conference on Languages, Compilers and Tools for Embedded Systems, 2003
Cited by 6 (2 self)
More and more devices contain computers with limited amounts of memory. As a result, code compaction techniques are gaining popularity, especially when they also improve performance and power consumption, or at least do not degrade them. This paper quantifies the side-effects of code abstraction on performance using extensive measurements and simulations on the SPECint2000 benchmark suite and some additional C++ programs. We show how to use profile information in order to obtain almost all the code size reduction benefits of code abstraction, yet experience almost none of its disadvantages. Categories and Subject Descriptors: D.3.4 [Programming Languages]: Processors - code generation; compilers; optimization; E.4 [Coding and Information
Bosschere. Automated reduction of the memory footprint of the Linux kernel
- Trans. on Embedded Computing Sys
Cited by 5 (1 self)
The limited built-in configurability of Linux can lead to expensive code size overhead when it is used in the embedded market. To overcome this problem, we propose the application of link-time compaction and specialization techniques that exploit the a priori known, fixed runtime environment of many embedded systems. In experimental setups based on the ARM XScale and i386 platforms, the proposed techniques are able to reduce the kernel memory footprint by over 16%. We also show how relatively simple additions to existing binary rewriters can implement the proposed techniques for a complex, very unconventional program, such as the Linux kernel. We note that even after specialization, a lot of seemingly unnecessary code remains in the kernel and propose to reduce the footprint of this code by applying code-compression techniques. This technique, combined with the previous ones, reduces the memory footprint by over 23% for the i386 platform and 28% for the ARM platform. Finally, we pinpoint an important code size growth problem when compaction and compression techniques are combined on the ARM platform.
Link-time binary rewriting techniques for program compaction
- ACM Transactions on Programming Languages and Systems, 2005
Cited by 4 (4 self)
Small program size is an important requirement for embedded systems with limited amounts of memory. We describe how link-time compaction through binary rewriting can achieve code size reductions of up to 62% for statically bound languages such as C, C++, and Fortran, without compromising on performance. We demonstrate how the limited amount of information about a program at link time can be exploited to overcome overhead resulting from separate compilation. This is done with scalable, cost-effective, whole-program analyses, optimizations, and duplicate code and data elimination techniques. The discussed techniques are evaluated and their cost-effectiveness is quantified with SQUEEZE++, a prototype link-time compactor. Categories and Subject Descriptors: D.3.4 [Programming Languages]: Processors - Code generation; compilers; optimization; E.4 [Coding and Information Theory]: Data compaction and compression
Steganography for Executables and Code Transformation Signatures
- In Proc. 7th International Conference on Information Security and Cryptology, 2005
Cited by 3 (2 self)
Steganography embeds a secret message in an innocuous cover-object. This paper identifies three cover-specific redundancies of executable programs and presents steganographic techniques to exploit these redundancies. A general framework to evaluate the stealth of the proposed techniques is introduced and applied on an implementation for the IA-32 architecture. This evaluation proves that, whereas existing tools such as Hydan [1] are insecure, significant encoding rates can in fact be achieved at a high security level.
Link-time compaction and optimization of ARM executables
- 2007
Cited by 3 (2 self)
The overhead in terms of code size, power consumption, and execution time caused by the use of precompiled libraries and separate compilation is often unacceptable in the embedded world, where real-time constraints, battery lifetime, and production costs are of critical importance. In this paper, we present our link-time optimizer for the ARM architecture. We discuss how we can deal with the peculiarities of the ARM architecture related to its visible program counter and how the introduced overhead can to a large extent be eliminated. Our link-time optimizer is evaluated with four tool chains, two proprietary ones from ARM and two open ones based on GNU GCC. When used with the proprietary tool chains from ARM Ltd., our link-time optimizer achieved average code size reductions of 16.0% and 18.5%, while the programs became 12.8% and 12.3% faster, and 10.7% and 10.1% more energy efficient. Finally, we show how the incorporation of link-time optimization in tool chains may influence library interface design.
A Dictionary Construction Technique for Code Compression Systems with Echo Instructions
Dictionary compression mechanisms identify redundant sequences of instructions that occur in a program. The sequences are extracted and copied to a dictionary. Each sequence is then replaced with a codeword that acts as an index into the dictionary, thereby enabling decompression of the program at runtime. The problem of optimally organizing a dictionary consisting solely of redundant sequences in order to maximize compression has long been known to be NP-complete [23]. This paper addresses the problem of dictionary construction when redundant code fragments are represented as Data Flow Graphs (DFGs) rather than linear sequences of instructions. Since there are generally multiple legal schedules for a given DFG G, a compiler must determine a schedule for G so that other DFGs that are subgraphs of G can reference some substring of G's final code sequence. This reduces the size of the dictionary, and in turn, the size of the compressed program. Our experiments with 10 MediaBench [18] applications yielded reductions in dictionary size ranging from 21.14% to 29.76% compared to a naïve approach.
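The basic dictionary mechanism described in the first sentences of this abstract can be illustrated with a minimal Python sketch. This is not the paper's DFG-based construction: it is a naive greedy pass over linear instruction sequences, and the function names `compress` and `decompress`, the fixed window length `seq_len`, the `"CODEWORD"` tag, and the sample instruction strings are all assumptions made for illustration.

```python
from collections import Counter


def compress(program, seq_len=2):
    """Naive dictionary compression over fixed-length instruction windows.

    Illustrative only: real systems choose sequences of varying length and
    (in the paper above) work on data flow graphs, not linear windows.
    """
    # Count every window of seq_len consecutive instructions.
    counts = Counter(
        tuple(program[i:i + seq_len])
        for i in range(len(program) - seq_len + 1)
    )
    # A window is a dictionary candidate if it occurs more than once.
    repeated = {seq for seq, n in counts.items() if n > 1}

    dictionary = []   # extracted sequences, addressed by position
    index = {}        # sequence -> position in dictionary
    compressed = []
    i = 0
    while i < len(program):
        window = tuple(program[i:i + seq_len])
        if window in repeated:
            # First use of a sequence copies it into the dictionary.
            if window not in index:
                index[window] = len(dictionary)
                dictionary.append(window)
            # Replace the sequence with a codeword indexing the dictionary.
            compressed.append(("CODEWORD", index[window]))
            i += seq_len
        else:
            compressed.append(program[i])
            i += 1
    return compressed, dictionary


def decompress(compressed, dictionary):
    """Expand codewords back into the instruction sequences they index."""
    program = []
    for item in compressed:
        if isinstance(item, tuple):   # codewords are tuples, instructions are strings
            program.extend(dictionary[item[1]])
        else:
            program.append(item)
    return program


if __name__ == "__main__":
    # Hypothetical instruction stream, not taken from any benchmark.
    program = ["ld r1", "add r1,r2", "st r1", "ld r1", "add r1,r2", "mul r3"]
    compressed, dictionary = compress(program)
    print(compressed)
    print(dictionary)
    print(decompress(compressed, dictionary) == program)
```

Because the greedy scan skips overlapping windows, a sequence counted twice may end up replaced only once; the round trip through `decompress` still reproduces the original program, but such an entry costs more space than it saves. Choosing dictionary entries well is the hard part the abstract refers to.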
Link-time Optimization of a Linux Kernel for Space
- 2003
As opposed to general purpose computing systems, where execution speed is the first and often only concern of the designer, embedded systems usually demand a much greater emphasis on low power usage and low memory and storage space usage. This is due to the more stringent design constraints of embedded systems: they are often battery powered, featuring only a small amount of storage and memory space. Over the last decade, a lot of research has been done into addressing the memory and storage space usage problem. Techniques have been developed that allow for the static analysis and transformation of whole programs. These techniques, typically applied at link time, can be used to generate more compact programs [1, 2, 3, 4]. Our approach is to complement these developments with the compaction of the one part of the system that has until now been overlooked: the operating system kernel. The standard whole-program optimization techniques can also be applied to an operating s
Instruction Set Limitation in Support of Software Diversity
This paper proposes a novel technique, called instruction set limitation, to strengthen the resilience of software diversification against collusion attacks. Such attacks require a tool to match corresponding program fragments in different, diversified program versions. The proposed technique limits the types of instructions occurring in a program to the most frequently occurring types, by replacing the infrequently used types as much as possible by more frequently used ones. As such, this technique, when combined with diversification techniques, reduces the number of easily matched code fragments. The proposed technique is evaluated against a powerful diversification tool for Intel's x86 and an optimized matching process on a number of SPEC 2006 benchmarks. Key words: diversity, binary rewriting, code fragment matching, software protection

