Sifting out the Mud: Low Level C++ Code Reuse, 2002
Cited by 15 (8 self)
... where the available amount of memory is limited. This contrasts with the increasing need for additional functionality and the need for rapid application development. While object-oriented programming languages, providing mechanisms such as inheritance and templates, allow fast development of complex applications, they have a detrimental effect on program size. This paper introduces new techniques to reuse the code of whole procedures at the binary level and a supporting technique for data reuse. These techniques benefit specifically from program properties originating from the use of templates and inheritance. Together with our previous work on code abstraction at lower levels of granularity, they achieve additional code size reductions of up to 38% on already highly optimized and compacted binaries, without sacrificing execution speed. We have incorporated these techniques in Squeeze++, a prototype link-time binary rewriter for the Alpha architecture, and extensively evaluate them on a suite of 8 real-life C++ applications. The total code size reductions achieved post link-time (i.e. without requiring any change to the compiler) range from 27 to 70%, averaging at around 43%.
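As a rough illustration of whole-procedure reuse, the sketch below keeps one copy of each distinct procedure body and records an alias for every duplicate, as happens when several template instantiations compile to the same instructions. It is not the algorithm used in Squeeze++: the function name `fold_identical_procedures`, the string encoding of instructions, and the example procedure names are all assumptions made for this example, and a real link-time rewriter would also have to reason about relocations and address-taken procedures.

```python
def fold_identical_procedures(procedures):
    """Keep one copy of each distinct procedure body.

    procedures: dict mapping a procedure name to a list of instruction
    strings (a hypothetical encoding, chosen only for this sketch).
    Returns (kept, alias): the surviving procedures, and a map from each
    removed duplicate to the procedure that now stands in for it.
    """
    representative = {}  # body (as a tuple) -> first name seen with that body
    kept = {}
    alias = {}
    for name, body in procedures.items():
        key = tuple(body)
        if key in representative:
            alias[name] = representative[key]   # duplicate: reuse earlier copy
        else:
            representative[key] = name
            kept[name] = list(body)
    return kept, alias


# Hypothetical example: two template instantiations with identical code.
procs = {
    "List<int*>::size":  ["load r1, 8(r0)", "ret"],
    "List<char*>::size": ["load r1, 8(r0)", "ret"],
    "List<int*>::clear": ["store zero, 8(r0)", "ret"],
}
kept, alias = fold_identical_procedures(procs)
```

Calls to an aliased procedure would then be redirected to its representative, so the duplicate body can be dropped from the binary.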
Suffix Trees and their Applications in String Algorithms, 1993
Cited by 14 (0 self)
The suffix tree is a compacted trie that stores all suffixes of a given text string. This data structure has been intensively employed in pattern matching on strings and trees, with a wide range of applications, such as molecular biology, data processing, text editing, term rewriting, interpreter design, information retrieval, abstract data types and many others. In this paper, we survey some applications of suffix trees and some algorithmic techniques for their construction. Special emphasis is given to the most recent developments in this area, such as parallel algorithms for suffix tree construction and generalizations of suffix trees to higher dimensions, which are important in multidimensional pattern matching. Work partially supported by the ESPRIT BRA ALCOM II under contract no. 7141 and by the Italian MURST Project "Algoritmi, Modelli di Calcolo e Strutture Informative". Part of this work was done while the author was visiting AT&T Bell Laboratories. Email: grossi@di.uni...
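To make the "compacted trie of all suffixes" definition concrete, here is a minimal sketch that builds a suffix tree by inserting every suffix and splitting edges where labels diverge. It is a naive quadratic-time construction, not one of the efficient or parallel algorithms the survey covers. The names `Node`, `build_suffix_tree`, and `contains`, and the use of `$` as an end marker (assumed not to occur in the text), are choices made for this example.

```python
class Node:
    def __init__(self):
        # first character of an edge label -> (edge label, child node)
        self.children = {}


def _insert(node, suffix):
    """Insert one suffix, splitting an edge where the labels diverge."""
    while suffix:
        first = suffix[0]
        if first not in node.children:
            node.children[first] = (suffix, Node())  # new leaf edge
            return
        label, child = node.children[first]
        k = 0  # length of the common prefix of the edge label and the suffix
        while k < len(label) and k < len(suffix) and label[k] == suffix[k]:
            k += 1
        if k < len(label):
            # Split the edge: node -[label[:k]]-> mid -[label[k:]]-> child
            mid = Node()
            mid.children[label[k]] = (label[k:], child)
            node.children[first] = (label[:k], mid)
            child = mid
        node, suffix = child, suffix[k:]


def build_suffix_tree(text, terminator="$"):
    """Naive O(n^2) construction; assumes terminator does not occur in text."""
    text += terminator
    root = Node()
    for i in range(len(text)):
        _insert(root, text[i:])
    return root


def contains(root, pattern):
    """A pattern occurs in the text iff it spells a path from the root."""
    node = root
    while pattern:
        entry = node.children.get(pattern[0])
        if entry is None:
            return False
        label, child = entry
        n = min(len(label), len(pattern))
        if label[:n] != pattern[:n]:
            return False
        node, pattern = child, pattern[n:]
    return True


tree = build_suffix_tree("banana")
found = contains(tree, "ana")      # True: "ana" is a substring of "banana"
missing = contains(tree, "nab")    # False
```

Because every substring is a prefix of some suffix, a substring query only has to walk one root-to-node path, which is what makes the structure useful for pattern matching.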
Improving program efficiency by packing instructions into registers
In Proceedings of the 2005 ACM/IEEE International Symposium on Computer Architecture, 2005
Cited by 14 (5 self)
New processors, both embedded and general purpose, often have conflicting design requirements involving space, power, and performance. Architectural features and compiler optimizations often target one or more design goals at the expense of the others. This paper presents a novel architectural and compiler approach to simultaneously reduce power requirements, decrease code size, and improve performance by integrating an instruction register file (IRF) into the architecture. Frequently occurring instructions are placed in the IRF. Multiple entries in the IRF can be referenced by a single packed instruction in ROM or L1 instruction cache. Unlike conventional code compression, our approach allows the frequent instructions to be referenced in arbitrary combinations. The experimental results show significant improvements in space and power, as well as some improvement in execution time when using only 32 entries. These advantages make packing instructions into registers an effective approach for improving overall efficiency.
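The packing idea can be sketched in software as follows: the most frequent instructions are placed in a small table standing in for the IRF, and runs of consecutive table-resident instructions are replaced by one packed record holding their indices. This is only a toy model under stated assumptions. The function names, the tuple encoding, and the `max_refs` limit on references per packed instruction are all hypothetical, and the paper's actual instruction formats and selection heuristics are not reproduced here.

```python
from collections import Counter


def build_irf(program, entries=32):
    """Pick the most frequent instructions as the IRF contents."""
    return [ins for ins, _ in Counter(program).most_common(entries)]


def pack(program, irf, max_refs=5):
    """Replace runs of IRF-resident instructions with packed records.

    max_refs (references per packed instruction) is an assumed parameter.
    """
    index = {ins: i for i, ins in enumerate(irf)}
    packed, run = [], []

    def flush():
        if run:
            packed.append(("packed", tuple(run)))
            run.clear()

    for ins in program:
        if ins in index:
            run.append(index[ins])
            if len(run) == max_refs:
                flush()
        else:
            flush()
            packed.append(("plain", ins))
    flush()
    return packed


def unpack(packed, irf):
    """Expand packed records back into the original instruction stream."""
    out = []
    for kind, payload in packed:
        if kind == "packed":
            out.extend(irf[i] for i in payload)
        else:
            out.append(payload)
    return out


program = ["add", "load", "add", "load", "mul", "add", "load", "add"]
irf = build_irf(program, entries=2)
packed = pack(program, irf, max_refs=3)
```

Since a packed record stores arbitrary IRF indices, any combination of frequent instructions can share one fetch slot, which is the property the abstract contrasts with dictionary-style code compression.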
Comparison and Evaluation of Code Clone Detection Techniques and Tools: A Qualitative Approach
Science of Computer Programming, 2009
Cited by 13 (0 self)
Over the last decade many techniques and tools for software clone detection have been proposed. In this paper, we provide a qualitative comparison and evaluation of the current state-of-the-art in clone detection techniques and tools, and organize the large amount of information into a coherent conceptual framework. We begin with background concepts, a generic clone detection process and an overall taxonomy of current techniques and tools. We then classify, compare and evaluate the techniques and tools in two different dimensions. First, we classify and compare approaches based on a number of facets, each of which has a set of (possibly overlapping) attributes. Second, we qualitatively evaluate the classified techniques and tools with respect to a taxonomy of editing scenarios designed to model the creation of Type-1, Type-2, Type-3 and Type-4 clones. Finally, we provide examples of how one might use the results of this study to choose the most appropriate clone detection tool or technique in the context of a particular set of goals and constraints. The primary contributions of this paper are: (1) a schema for classifying clone detection techniques and tools and a classification of current clone detectors based on this schema, and (2) a taxonomy of editing scenarios that produce different clone types and a qualitative evaluation of current clone detectors based on this taxonomy.
Combining global code and data compaction, 2001
Cited by 9 (7 self)
Computers are increasingly being incorporated in devices with a limited amount of available memory. As a result, research is increasingly focusing on the automated reduction of program size. Existing literature focuses on either data or code compaction or on highly language dependent techniques. This paper shows how combined code and data compaction can be achieved using a link-time code compaction system that reasons about the use of both code and data addresses. The analyses proposed rely only on fundamental properties of linked code and are therefore generally applicable. The combined code and data compaction is implemented in Squeeze, a link-time program compaction system, and evaluated on SPEC2000, MediaBench and C++ programs, resulting in total binary program size reductions of 23.6% to 46.6%. This compaction involves no speed tradeoff, as the compacted programs are on average about 8% faster.
Instruction Merging and Specialization in the SICStus Prolog Virtual Machine
Cited by 8 (0 self)
Wanting to improve execution speed and reduce code size of SICStus Prolog programs, we embarked on a project whose aim was to systematically investigate combination and specialization of WAM instructions. Various variants of the SICStus Prolog virtual machine instruction set were designed, implemented, and their performance was evaluated against standard benchmarks and on big Prolog programs. In this paper, we describe our methodology in finding appropriate candidates for instruction merging and specialization, discuss related trade-offs, present detailed statistics and performance measurements that we gathered, and report on our experiences from our involvement in this feat. In short, our experience is positive: the speedup of performing instruction merging and specialization in the context of the SICStus emulator is approximately 10%, while the bytecode size reduction is about 15%.
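One simple way to look for merging candidates is to count adjacent instruction pairs and fuse the most frequent pair into a single combined opcode. The sketch below does this for a flat opcode list. It is an assumption-laden toy, not the methodology of the paper: the opcode names are invented, operands are ignored, and branch targets that land between the two halves of a pair are not handled.

```python
from collections import Counter


def most_frequent_pair(code):
    """Return the adjacent opcode pair that occurs most often."""
    return Counter(zip(code, code[1:])).most_common(1)[0][0]


def merge_pair(code, pair):
    """Replace non-overlapping occurrences of pair with one merged opcode."""
    merged_name = pair[0] + "+" + pair[1]
    out, i = [], 0
    while i < len(code):
        if i + 1 < len(code) and (code[i], code[i + 1]) == pair:
            out.append(merged_name)
            i += 2
        else:
            out.append(code[i])
            i += 1
    return out


# Hypothetical opcode stream; the names are made up for illustration.
code = ["get_x", "unify", "get_x", "unify", "call", "get_x", "unify"]
pair = most_frequent_pair(code)
merged = merge_pair(code, pair)
```

Each fused pair saves one opcode in the bytecode and one dispatch in the emulator, which is where both the size and the speed gains of merging come from.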
Reducing instruction fetch cost by packing instructions into register windows
In Proceedings of the 38th annual ACM/IEEE International Symposium on Microarchitecture (November 2005), IEEE Computer Society
Cited by 7 (3 self)
Instruction packing is a combination compiler/architectural approach that allows for decreased code size, reduced power consumption and improved performance. The packing is obtained by placing frequently occurring instructions into an Instruction Register File (IRF). Multiple IRF entries can then be accessed using special packed instructions. Previous IRF efforts focused on using a single 32-entry register file for the duration of an application. This paper presents software and hardware extensions to the IRF supporting multiple instruction register windows to allow a greater number of relevant instructions to be available for packing in each function. Windows are shared among similar functions to reduce the overall costs involved in such an approach. The results indicate that significant improvements in instruction fetch cost can be obtained by using this simple architectural enhancement. We also show that using an IRF with a loop cache, which is also used to reduce energy consumption, results in much less energy consumption than using either feature in isolation.
An Instruction for Direct Interpretation of LZ77-compressed Programs
Cited by 7 (0 self)
A new instruction adapts LZ77 compression for use inside running programs. The instruction economically references and reuses code fragments that are too small to package as conventional subroutines. The compressed code is interpreted directly, with neither prior nor on-the-fly decompression. Hardware implementations seem plausible and could benefit both memory-constrained and more conventional systems. The method is extremely simple. It has been added to a pre-existing, bytecoded instruction set, and it added only ten lines of C to the bytecode interpreter. It typically cuts code size by a third; that is, typical compression ratios are roughly 0.67x. More ambitious compressors are available, but they are more complex, which retards adoption. The current method offers a useful trade-off to these more complex systems.
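The flavor of such an instruction can be shown with a toy stack interpreter in which a back-reference operation re-executes a short run of earlier instructions in place, so the repeated fragment is never stored twice and never expanded. Everything specific here is an assumption for illustration: the operation name `echo`, its `(offset, length)` operands counted in instructions, the tiny `push`/`add`/`mul` instruction set, and the restriction to straight-line code.

```python
def run(code):
    """Interpret a toy stack bytecode that includes a back-reference op.

    ("echo", offset, length) executes `length` instructions starting
    `offset` instructions before the echo itself, then resumes after it.
    The op name and operand encoding are assumptions for this sketch.
    """
    stack = []

    def execute(start, count):
        for pc in range(start, start + count):
            op = code[pc]
            if op[0] == "push":
                stack.append(op[1])
            elif op[0] == "add":
                b, a = stack.pop(), stack.pop()
                stack.append(a + b)
            elif op[0] == "mul":
                b, a = stack.pop(), stack.pop()
                stack.append(a * b)
            elif op[0] == "echo":
                _, offset, length = op
                execute(pc - offset, length)  # reuse earlier code in place
            else:
                raise ValueError(f"unknown op {op[0]!r}")

    execute(0, len(code))
    return stack


# (2 + 3) * (2 + 3): the second "2 + 3" is a reference to the first.
plain = [("push", 2), ("push", 3), ("add",),
         ("push", 2), ("push", 3), ("add",), ("mul",)]
compressed = [("push", 2), ("push", 3), ("add",),
              ("echo", 3, 3), ("mul",)]
result_plain = run(plain)
result_compressed = run(compressed)
```

Both programs leave the same value on the stack, but the compressed form is five instructions instead of seven, and the interpreter needed only one extra case to support it.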
Code Compaction of Matching Single-Entry Multiple-Exit Regions
In Proceedings of the 10th Annual International Static Analysis Symposium (SAS’03), 2003
Cited by 7 (0 self)
With the proliferation of embedded devices and systems, there is renewed interest in the generation of compact binaries. Code compaction techniques identify code sequences that repeatedly appear in a program and replace them by a single copy of the recurring sequence. In existing techniques such sequences are typically restricted to single-entry single-exit regions in the control flow graph. We have observed that in many applications recurring code sequences form single-entry multiple-exit (SEME) regions. In this paper we propose a generalized algorithm for code compaction that first decomposes a control flow graph into a hierarchy of SEME regions, computes signatures of SEME regions, and then uses the signatures to find pairs of matching SEME regions. Maximal sized matching SEME regions are found and transformed to achieve code compaction. Our transformation is able to compact matching SEME regions whose exits may lead to a combination of identical and differing targets. Our experiments show that this transformation can lead to substantial reduction in code size for many embedded applications.
Decreasing process memory requirements by overlapping program portions
In Proceedings of the Hawaii International Conference on System Sciences, 1998
Cited by 6 (0 self)
Most compiler optimizations focus on saving time and sometimes occur at the expense of increasing size. Yet processor speeds continue to increase at a faster rate than main memory and disk access times. Processors are now frequently being used in embedded systems that often have strict limitations on the size of programs they can execute. Also, reducing the size of a program may result in improved memory hierarchy performance. This paper describes general techniques for decreasing the memory requirements for a process by automatically overlapping portions of a program. Live range analysis, similar to the analysis used for allocating variables to registers, is used to determine which program portions conflict. Nonconflicting portions are assigned overlapping memory locations. The results show an average decrease of over 10% in process size for a variety of programs with minimal or no dynamic instruction increases.
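The conflict test and the overlapping assignment can be sketched as an interval problem: two portions conflict when their live ranges intersect, and a portion may be placed at any address range not occupied by a portion it conflicts with. The sketch below uses a greedy first-fit placement, largest portion first. The `Portion` type, its fields, the half-open live ranges, and the greedy order are assumptions for illustration and are not taken from the paper.

```python
from typing import NamedTuple


class Portion(NamedTuple):
    name: str
    size: int    # bytes needed
    start: int   # first program point where the portion is live
    end: int     # one past the last live program point (half-open range)


def conflicts(a, b):
    """Two portions conflict when their live ranges intersect."""
    return a.start < b.end and b.start < a.end


def assign_offsets(portions):
    """Greedy first-fit: nonconflicting portions may share addresses."""
    offsets = {}
    placed = []
    for p in sorted(portions, key=lambda q: -q.size):
        # Address ranges already taken by portions live at the same time.
        busy = sorted((offsets[q.name], offsets[q.name] + q.size)
                      for q in placed if conflicts(p, q))
        offset = 0
        for lo, hi in busy:
            if offset + p.size <= lo:
                break                 # p fits in the gap before this range
            offset = max(offset, hi)  # otherwise skip past it
        offsets[p.name] = offset
        placed.append(p)
    return offsets


# Hypothetical portions: "init" and "report" are never live together.
portions = [
    Portion("init", 100, 0, 10),
    Portion("parse", 60, 5, 20),
    Portion("report", 80, 20, 30),
]
offsets = assign_offsets(portions)
footprint = max(offsets[p.name] + p.size for p in portions)
```

In this example `init` and `report` share the same addresses, so the footprint is 160 bytes instead of the 240 bytes needed when every portion gets its own space.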

