Compiler Transformations for High-Performance Computing - ACM Computing Surveys, 1994
Cited by 332 (4 self)
In the last three decades, a large number of compiler transformations for optimizing programs have been implemented. Most optimizations for uniprocessors reduce the number of instructions executed by the program using transformations based on the analysis of scalar quantities and data-flow techniques. In contrast, optimizations for ...
Improving Code Density Using Compression Techniques - Proceedings of the 30th Annual International Symposium on Microarchitecture, 1997
Cited by 90 (4 self)
We propose a method for compressing programs in embedded processors where instruction memory size dominates cost. A post-compilation analyzer examines a program and replaces common sequences of instructions with a single instruction codeword. A microprocessor executes the compressed instruction sequences by fetching codewords from the instruction memory, expanding them back to the original sequence of instructions in the decode stage, and issuing them to the execution stages. We apply our technique to the PowerPC instruction set and achieve 30% to 50% reduction in size for SPEC CINT95 programs.
Keywords: Compression, Code Density, Code Space Optimization, Embedded Systems
1 Introduction
According to a recent prediction by In-Stat Inc., the merchant processor market is set to exceed $60 billion by 1999, and nearly half of that will be for embedded processors. However, by unit count, embedded processors will exceed the number of g...
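The dictionary idea in this abstract can be illustrated with a minimal sketch. It is not the paper's analyzer or hardware decoder: the instruction strings, the `("CW", index)` codeword form, and the function names `build_dictionary`, `compress`, and `expand` are all assumptions made for illustration, and the greedy fixed-length matching is a simplification.

```python
from collections import Counter

def build_dictionary(program, seq_len=2, max_entries=4):
    """Collect the most frequent instruction sequences of length seq_len."""
    counts = Counter(
        tuple(program[i:i + seq_len])
        for i in range(len(program) - seq_len + 1)
    )
    # Only sequences that occur more than once are worth a dictionary entry.
    return [seq for seq, n in counts.most_common(max_entries) if n > 1]

def compress(program, dictionary):
    """Greedily replace dictionary sequences with ("CW", index) codewords."""
    out, i = [], 0
    while i < len(program):
        for index, seq in enumerate(dictionary):
            if tuple(program[i:i + len(seq)]) == seq:
                out.append(("CW", index))
                i += len(seq)
                break
        else:
            # No dictionary entry matches here: keep the instruction as-is.
            out.append(program[i])
            i += 1
    return out

def expand(compressed, dictionary):
    """Model of the decode stage: turn codewords back into instructions."""
    out = []
    for item in compressed:
        if isinstance(item, tuple) and item[0] == "CW":
            out.extend(dictionary[item[1]])
        else:
            out.append(item)
    return out

# Hypothetical instruction stream (PowerPC-like mnemonics, for illustration).
program = [
    "lwz r3,0(r1)", "addi r3,r3,1", "stw r3,0(r1)",
    "lwz r3,0(r1)", "addi r3,r3,1", "stw r3,0(r1)",
    "blr",
]
dictionary = build_dictionary(program)
compressed = compress(program, dictionary)
restored = expand(compressed, dictionary)
```

In this toy model the size saving depends on how the codewords and the dictionary itself are encoded, which the sketch does not account for.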
Enhanced Code Compression for Embedded RISC Processors, 1999
Cited by 89 (2 self)
This paper explores compiler techniques for reducing the memory needed to load and run program executables. In embedded systems, where economic incentives to reduce both RAM and ROM are strong, the size of compiled code is increasingly important. Similarly, in mobile and network computing, the need to transmit an executable before running it places a premium on code size. Our work focuses on reducing the size of a program's code segment, using pattern-matching techniques to identify and coalesce repeated instruction sequences. In contrast to other methods, our framework preserves the ability to run program executables directly, without an intervening decompression stage. Our compression framework is integrated into an industrial-strength optimizing compiler, which allows us to explore the interaction between code compression and classical code optimization techniques, and requires that we contend with the difficulties of compressing previously optimized code. The specific contributions in this paper include a comprehensive experimental evaluation of code compression for a RISC-like architecture, a more powerful pattern-matching scheme for improved identification of repeated code fragments, and a new form of profile-driven code compression that reduces the speed penalty arising from compression.
Specifying Representations of Machine Instructions - ACM Transactions on Programming Languages and Systems, 1997
A DISE Implementation of Dynamic Code Decompression - In Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), 2003
Cited by 22 (4 self)
Code compression coupled with dynamic decompression is an important technique for both embedded and general-purpose microprocessors. Post-fetch decompression, in which decompression is performed after the compressed instructions have been fetched, allows the instruction cache to store compressed code but requires a highly efficient decompression implementation. We propose implementing post-fetch decompression using dynamic instruction stream editing (DISE), a programmable decoder (similar in structure to those in many IA32 processors) that is used to add functionality to an application by injecting custom code snippets into its fetched instruction stream. A DISE implementation of post-fetch decompression naturally supports customized program-specific decompression dictionaries, enables parameterized decompression allowing similar instruction sequences to share dictionary entries, and uses no decompression-specific hardware. Cycle-level simulation of DISE decompression shows that it can reduce static program size by 35% and execution time by 20%. Parameterized decompression, a feature unique to DISE, accounts for 20% of the code size reduction by making more effective use of the dictionary and allowing PC-relative branches to be included in compressed sequences. DISE-based compression can reduce total energy consumption by 10% and the energy-delay product by as much as 20%.
Categories and Subject Descriptors: B.3 [Hardware]: Memory Structures
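Parameterized decompression, as described in this abstract, lets similar instruction sequences share one dictionary entry. The sketch below only illustrates that sharing idea with string templates. It does not model DISE's decoder, and the template syntax, the mnemonics, and the name `expand_codeword` are hypothetical.

```python
def expand_codeword(dictionary, index, params):
    """Instantiate a dictionary entry by filling its slots with params."""
    return [template.format(*params) for template in dictionary[index]]

# One shared entry with two parameter slots: {0} is a data register,
# {1} is a base register. The mnemonics are illustrative only.
dictionary = [
    ["lwz {0},0({1})", "addi {0},{0},1", "stw {0},0({1})"],
]

# Two sequences that differ only in register names reuse the same entry.
first = expand_codeword(dictionary, 0, ("r3", "r1"))
second = expand_codeword(dictionary, 0, ("r5", "r2"))
```

Without parameters, each of the two sequences would need its own dictionary entry.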
TraceBack: first fault diagnosis by reconstruction of distributed control flow - In ACM Conference on Programming Language Design and Implementation, 2005
Cited by 17 (2 self)
Faults that occur in production systems are the most important faults to fix, but most production systems lack the debugging facilities present in development environments. TraceBack provides debugging information for production systems by providing execution history data about program problems (such as crashes, hangs, and exceptions). TraceBack supports features commonly found in production environments such as multiple threads, dynamically loaded modules, multiple source languages (e.g., Java applications running with JNI modules written in C++), and distributed execution across multiple computers. TraceBack supports first fault diagnosis: discovering what went wrong the first time a fault is encountered. The user can see how the program reached the fault state without having to re-run the computation, in effect enabling a limited form of a debugger in production code. TraceBack uses static, binary program analysis to inject low-overhead runtime instrumentation at control-flow block granularity. Post-facto reconstruction of the records written by the instrumentation code produces a source-statement trace for user diagnosis. The trace shows the dynamic instruction sequence leading up to the fault state, even when the program took exceptions or terminated abruptly (e.g., kill -9). We have implemented TraceBack on a variety of architectures and operating systems, and present examples from a variety of platforms. Performance overhead is variable, from 5% for Apache running SPECweb99, to 16%–25% for the Java SPECJbb benchmark, to 60% on average for SPECint2000. We show examples of TraceBack's cross-language and cross-machine abilities, and report its use in diagnosing problems in production software.
DISE: Dynamic instruction stream editing, 2002
Cited by 10 (2 self)
Many people deserve thanks for helping me navigate through my PhD. First and foremost, I must thank my wife, Stephanie, for her loving support without which I certainly would not have succeeded. She is a wonderful companion, and I feel like the luckiest man on the planet to be married to her. I thank her for her patience through my many long work days, and for helping me stay sane through my many deadlines. My parents, Art and Nancy, were also extremely supportive throughout my six years in graduate school. I greatly appreciated their loving phone calls, emails, and visits. They have always been there for me. I also must thank my brother, Ryan, my grandmother, Barbara, as well as Stephanie's family. Their encouragement and loving support certainly helped me through my PhD. My advisor, E Christopher Lewis, is chiefly responsible for my academic and professional development. I have benefitted profusely from his guidance and support. I learned from E what it means to deeply understand a research problem, and to always consider the broader impact of my research. E is also an incredible teacher, breaking the most complicated concepts down into simple manageable pieces. I will try to emulate these skills
Relocating Machine Instructions by Currying - ACM SIGPLAN '96 Conference on Programming Language Design and Implementation, in SIGPLAN Notices, 1996
Cited by 7 (3 self)
Relocation adjusts machine instructions to account for changes in the locations of the instructions themselves or of external symbols to which they refer. Standard linkers implement a finite set of relocation transformations, suitable for a single architecture. These transformations are enumerated, named, and engraved in a machine-dependent object-file format, and linkers must recognize them by name. These names and their associated transformations are an unnecessary source of machine-dependence. The New Jersey Machine-Code Toolkit is an application generator. It helps programmers create applications that manipulate machine code, including linkers. Guided by a short instruction-set specification, the toolkit generates the bit-manipulating code. Instructions are described by constructors, which denote functions mapping lists of operands to instructions' binary representations. Any operand can be designated as "relocatable," meaning that the operand's value need not be known at the time ...
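The idea of treating a relocatable instruction as a function that still awaits an operand can be sketched with nested closures. This is not the New Jersey Machine-Code Toolkit's API. The constructor name `branch` and the 32-bit layout (6-bit opcode, 26-bit signed word offset) are hypothetical and do not correspond to any real instruction set.

```python
def branch(opcode):
    """Curried constructor: opcode, then instruction address, then target."""
    def with_pc(pc):
        def with_target(target):
            # Hypothetical encoding: PC-relative offset counted in words,
            # truncated to 26 bits (two's complement for backward branches).
            offset = (target - pc) // 4
            return ((opcode & 0x3F) << 26) | (offset & 0x03FFFFFF)
        return with_target
    return with_pc

# The assembler knows the opcode and the instruction's own address, but
# not yet the target: the partially applied closure is the pending
# relocation.
pending = branch(0x12)(0x1000)

# Later, a linker-like step supplies the resolved address and obtains
# the final instruction word.
forward = pending(0x1040)
backward = pending(0x0FF0)
```

In this sketch no named relocation type is needed: applying the closure to the address performs the transformation.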
Potkonjak, "Location Discovery using data-driven statistical error modelling" - IEEE Conference on Computer Communication, 2006
Cited by 5 (2 self)
We have developed statistical error modeling techniques for acoustic signal detection-based ranging measurements in the framework of wireless ad-hoc sensor networks (WASNs). The models are used as the basis for solving the location discovery problem in sensor networks. We first demonstrate that the major difficulty in location discovery is how to treat errors by proving that location discovery in the presence of noisy measurements is an NP-complete problem, even in one-dimensional space. Consequently, we formulate the location discovery as an instance of nonlinear function minimization that optimizes each of the empirically derived statistical error models. The minimization problem is then solved using a conjugate gradient-based nonlinear function optimization solver. We validate the efficiency of the approach by conducting comprehensive experiments on both deployed and simulated WASNs. The results indicate that the statistical model-based approach significantly improves the location accuracy compared with the approaches using the traditional optimization objectives. In addition, the localized version of our location discovery algorithm is capable of finding competitive solutions using significantly lower communication cost.
Keywords: Statistical error modeling; Location discovery
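Formulating location discovery as nonlinear function minimization can be illustrated with a small sketch. It makes two loud simplifications relative to the abstract: the objective is a plain sum of squared range errors, not an empirically derived statistical error model, and the solver is fixed-step gradient descent, not conjugate gradient. The anchor positions, step size, and function names are assumptions.

```python
import math

def objective_gradient(pos, anchors, ranges):
    """Gradient of sum_i (|pos - anchor_i| - range_i)^2 with respect to pos."""
    gx = gy = 0.0
    for (ax, ay), measured in zip(anchors, ranges):
        dx, dy = pos[0] - ax, pos[1] - ay
        dist = math.hypot(dx, dy) or 1e-12  # avoid division by zero
        residual = dist - measured
        gx += 2.0 * residual * dx / dist
        gy += 2.0 * residual * dy / dist
    return gx, gy

def locate(anchors, ranges, start, step=0.1, iterations=500):
    """Fixed-step gradient descent on the squared range-error objective."""
    x, y = start
    for _ in range(iterations):
        gx, gy = objective_gradient((x, y), anchors, ranges)
        x -= step * gx
        y -= step * gy
    return x, y

# Hypothetical setup: three anchors with known positions and noise-free
# range measurements to a node whose true position is (3, 4).
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_position = (3.0, 4.0)
ranges = [math.dist(true_position, a) for a in anchors]

# Start from the centroid of the anchors.
start = (sum(a[0] for a in anchors) / 3, sum(a[1] for a in anchors) / 3)
estimate = locate(anchors, ranges, start)
```

With noisy ranges the minimizer generally no longer coincides with the true position, which is where the choice of error model in the objective matters.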

