Register Allocation via Graph Coloring, 1992
Cited by 133 (4 self)
Chaitin and his colleagues at IBM in Yorktown Heights built the first global register allocator based on graph coloring. This thesis describes a series of improvements and extensions to the Yorktown allocator. There are four primary results:
Optimistic coloring: Chaitin's coloring heuristic pessimistically assumes any node of high degree will not be colored and must therefore be spilled. By optimistically assuming that nodes of high degree will receive colors, I often achieve lower spill costs and faster code; my results are never worse.
Coloring pairs: The pessimism of Chaitin's coloring heuristic is emphasized when trying to color register pairs. My heuristic handles pairs as a natural consequence of its optimism.
Rematerialization: Chaitin et al. introduced the idea of rematerialization to avoid the expense of spilling and reloading certain simple values. By propagating rematerialization information around the SSA graph using a simple variation of Wegman and Zadeck's constant propag...
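The optimistic-coloring idea in the abstract above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `optimistic_color`, the dictionary-based graph, and the choice of "highest remaining degree" as the spill candidate are all assumptions made for illustration (a real allocator would typically weigh spill costs and would insert spill code and repeat).

```python
def optimistic_color(graph, k):
    """Color an interference graph with k registers, deferring spill decisions.

    graph: dict mapping each node to the set of its neighbours (hypothetical format).
    Returns (colors, spilled).
    """
    remaining = {n: set(adj) for n, adj in graph.items()}
    stack = []

    # Simplify phase: remove nodes one at a time and push them on a stack.
    while remaining:
        # Prefer a node that is trivially colorable (fewer than k neighbours left).
        node = next((n for n in sorted(remaining) if len(remaining[n]) < k), None)
        if node is None:
            # No such node. A pessimistic allocator would spill here; the
            # optimistic variant pushes the candidate anyway and decides later.
            # Assumption: pick the node with the highest remaining degree.
            node = max(sorted(remaining), key=lambda n: len(remaining[n]))
        for neighbour in remaining.pop(node):
            remaining[neighbour].discard(node)
        stack.append(node)

    # Select phase: pop nodes and give each the lowest color its neighbours lack.
    colors, spilled = {}, []
    while stack:
        node = stack.pop()
        used = {colors[m] for m in graph[node] if m in colors}
        free = [c for c in range(k) if c not in used]
        if free:
            colors[node] = free[0]
        else:
            spilled.append(node)  # only now is the node actually spilled
    return colors, spilled


# A four-node cycle: every node has degree 2, yet two colors suffice.
square = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
colors, spilled = optimistic_color(square, 2)
print(colors, spilled)
```

With two registers, no node of the four-node cycle ever has degree below `k`, so a heuristic that spills whenever simplification gets stuck would spill a node; the deferred decision in this sketch finds a two-coloring and spills nothing.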

A Universal Technique for Fast and Flexible Instruction-Set Architecture Simulation, 2002
Cited by 29 (4 self)
In the last decade, instruction-set simulators have become an essential development tool for the design of new programmable architectures. Consequently, simulator performance is a key factor in overall design efficiency. Motivated by the extremely poor performance of commonly used interpretive simulators, research on fast compiled instruction-set simulation began ten years ago. However, because of the restrictiveness of the compiled technique, it has not gained acceptance in commercial products. This paper presents a new retargetable simulation technique which combines the performance of traditional compiled simulators with the flexibility of interpretive simulation. This technique is not limited to any class of architectures or applications and can be used from architecture exploration up to end-user software development. The work-flow and the applicability of the so-called just-in-time cache compiled simulation (JIT-CCS) technique are demonstrated using state-of-the-art real-world architectures.
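The abstract above does not spell out the mechanism, so the following is only one plausible reading of "just-in-time cache compiled" simulation, sketched under stated assumptions: each instruction is decoded the first time it is executed, the decoded operation is cached by address, and the cached entry is reused as long as the instruction word at that address is unchanged. The toy two-instruction ISA and the names `decode` and `run_program` are hypothetical and do not come from the paper.

```python
def decode(word):
    """Translate one toy instruction word into a Python closure (hypothetical ISA)."""
    op, arg = word
    if op == "add":
        def run(state):
            state["acc"] += arg
            state["pc"] += 1
    elif op == "jump_if_below":
        limit, target = arg
        def run(state):
            state["pc"] = target if state["acc"] < limit else state["pc"] + 1
    else:
        raise ValueError(f"unknown opcode: {op}")
    return run


def run_program(memory):
    """Simulate until a 'halt' word, decoding each address at most once per word."""
    cache = {}  # address -> (instruction word, decoded operation)
    state = {"pc": 0, "acc": 0}
    stats = {"steps": 0, "decodes": 0}
    while memory[state["pc"]][0] != "halt":
        pc = state["pc"]
        word = memory[pc]
        entry = cache.get(pc)
        # Decode just in time; re-decode if the word changed (e.g. self-modifying code).
        if entry is None or entry[0] != word:
            entry = (word, decode(word))
            cache[pc] = entry
            stats["decodes"] += 1
        entry[1](state)
        stats["steps"] += 1
    return state, stats


program = [("add", 1), ("jump_if_below", (5, 0)), ("halt", None)]
state, stats = run_program(program)
print(state, stats)
```

The loop body executes many times, but each of its two instructions is decoded only once, which is the cost an interpretive simulator would otherwise pay on every step; checking the cached word against memory keeps the flexibility to handle code that changes at run time.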

The Many Faces of Introspection, 1992
Cited by 14 (9 self)
Introspection, or the ability to observe one's own behavior, is one of the most powerful capabilities of human intelligence; it is the basis for understanding and improving one's behavior, and for human progress. Similarly, introspective computer systems, introduced in this thesis, examine, reason about, and change their own behavior in powerful new ways. Because the complexity of computers is rapidly increasing, yet is restricted by limited human resources, the most attractive quality of introspective computers is their ability to manage this growing complexity themselves. Self-managing computer systems would greatly expand the rational power and complexity of computer systems that can be successfully built. The main difficulty in constructing introspective computer systems is enabling the system to obtain a description of its complete behavior in a dynamic and unobtrusive way. This thesis proposes the partition of the system into two threads of control. The first thread performs the...

Partial Translation, 1993
Cited by 8 (3 self)
Traditional simulation of a target architecture by interpreting object code can be improved by translating the object code to an intermediate format. This approach is called interpretive translation. Despite a substantial performance improvement over traditional interpretation, a large part of the overhead is unnecessary. An alternative approach is block translation, where one or more simulated instructions are translated to directly executable code. This approach has several drawbacks. We discuss the problems with block translation, analyse the overhead of interpretive translation, and describe a hybrid approach, partial translation, that combines the benefits of both approaches. Partial translation implements an intermediate format that supports the addition of run-time generated code whenever appropriate. The performance limit (slowdown) of interpretive translation is around 15, and real implementations have achieved 20-30. Partial translation will perform considerably better. Fi...

Ultra fast cycle-accurate compiled emulation of inorder pipelined architectures - SAMOS 2005, LNCS 3553, 2005
Cited by 2 (2 self)
Emulation of one architecture on another is useful when the architecture is under design, when software must be ported to a new platform or is being developed for systems which are still under development, or for embedded systems that have insufficient resources to support the software development process. Emulation using an interpreter is typically slower than normal execution by up to 3 orders of magnitude. Our approach instead translates the program from the original architecture to another architecture while faithfully preserving its semantics at the lowest level. The emulation speeds are comparable to, and often faster than, programs running on the original architecture. Partial evaluation of architectural features is used to achieve such impressive performance, while permitting accurate statistics collection. Accuracy is at the level of the number of clock cycles spent executing each instruction (hence the description cycle-accurate). Key words: instruction set emulator, interpreter, compiled emulation, pipelined VLIW architecture

Prototyping Compiler and Simulation Tools with PCCTS
This paper describes our experiences using PCCTS in our optimizing compiler and simulator development effort. The tools are used in our C front end, a code scheduler, a linker, an instruction level simulator, and a detailed cycle-level simulator. One of the Antlr grammars is used twice in the compiler and in two different simulators. This type of reuse and flexibility is a strength of PCCTS. A working knowledge of PCCTS made the rapid implementation of a variety of compilers and simulators possible with very little manpower.
Keywords: PCCTS, parsing, grammar, front-end, instruction level simulation.
1 This work was supported by contract No. DAAL02-89-C-0038 between the Army Research Office and the University of Minnesota for the Army High Performance Computing Research Center, Office of Naval Research Grant No. N00014-93-1-0426, and National Science Foundation Grant NSF CCR-9110261.
1 Introduction
The advanced architectures research group at the University of Minnesota is in...

An Energy-oriented Retargetable Simulator for Instruction-Set Architecture
Retargetability is typically achieved by providing target machine information, written in an architecture description language (ADL), as input. ADLs are used to specify processor and memory architectures and to generate a software toolkit including a compiler, simulator, assembler, profiler, and debugger. Simulators are critical components of the exploration and software design toolkit for the system designer. The instruction-set architecture (ISA) simulator is an integral part of today's processor and software design. In this paper, we design and implement an energy-oriented simulation environment that reduces the time and cost of simulator development through a retargetable technique for ISAs. First, we describe the energy consumption estimation and monitoring information in an ADL based on EXPRESSION. Second, we generate the energy estimation and monitoring simulation library and then construct the simulator. Last, we present the energy estimation results for a MIPS R4000 ADL description. This work contributes to efficient architecture development and prompt SDK generation through programmable experiments in the field of mobile software development.

