MINT: A Front End for Efficient Simulation of Shared-Memory Multiprocessors
, 1994
Cited by 170 (1 self)
Mint is a software package designed to ease the process of constructing event-driven memory hierarchy simulators for multiprocessors. It provides a set of simulated processors that run standard Unix executable files compiled for a MIPS R3000 based multiprocessor. These generate multiple streams of memory reference events that drive a user-provided memory system simulator. Mint uses a novel hybrid technique that exploits the best aspects of native execution and software interpretation to minimize the overhead of processor simulation. Combined with related techniques to improve performance, this approach makes simulation on uniprocessor hosts extremely efficient.
1 Introduction
The simulation of high-performance computer systems is computationally expensive. Each unit of simulated processor execution time requires many units of simulator time. If the target is a parallel computer, simulator time necessarily grows in proportion to the total amount of work done in the simulated parallel ...
Trace-Driven Memory Simulation: A Survey
- ACM Computing Surveys
, 2004
Cited by 134 (0 self)
This article surveys and analyzes these developments by establishing criteria for evaluating trace-driven methods, and then applies these criteria to describe, categorize, and compare over 50 trace-driven simulation tools. We discuss the strengths and weaknesses of different approaches and show that no single method is best when all criteria, including accuracy, speed, memory, flexibility, portability, expense, and ease of use, are considered. In a concluding section, we examine fundamental limitations to trace-driven simulation, and survey some recent developments in memory simulation that may overcome these bottlenecks.
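As a rough illustration of what "trace-driven" means in the abstract above, the following minimal sketch replays a list of memory addresses through a direct-mapped cache model and counts hits and misses. It is not taken from the survey or from any of the tools it compares; the function name `simulate_direct_mapped`, the cache geometry, and the sample trace are all assumptions chosen for illustration.

```python
def simulate_direct_mapped(trace, num_lines=4, line_size=16):
    """Replay an address trace through a direct-mapped cache model.

    Hypothetical helper for illustration only. Returns (hits, misses).
    """
    tags = [None] * num_lines          # one stored tag per cache line
    hits = misses = 0
    for addr in trace:
        block = addr // line_size      # which memory block the address is in
        index = block % num_lines      # cache line the block maps to
        tag = block // num_lines       # identifies the block within that line
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag          # fill the line on a miss
    return hits, misses


# A made-up trace: 0x00 and 0x40 map to the same line and evict each other.
trace = [0x00, 0x04, 0x10, 0x00, 0x40, 0x00]
hits, misses = simulate_direct_mapped(trace)
print(f"hits={hits} misses={misses} miss ratio={misses / len(trace):.2f}")
```

A model this simple trades accuracy for speed and ease of use, which are among the criteria the abstract lists for comparing trace-driven methods.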
MINT Tutorial and User Manual
, 1994
Cited by 65 (1 self)
This document describes Mint, a software package designed to ease the process of constructing event-driven memory hierarchy simulators for multiprocessors. It provides a set of simulated processors that run standard Unix executable files compiled for a MIPS R3000 based multiprocessor. These generate multiple streams of memory reference events that drive a user-provided memory system simulator. Mint uses a novel hybrid technique that exploits the best aspects of native execution and software interpretation to minimize the overhead of processor simulation. Combined with related techniques to improve performance, this approach makes simulation on uniprocessor hosts extremely efficient. This material is based upon work supported by the National Science Foundation under Grant number CDA-8822724. The U.S. Government has certain rights in this material.
Contents: 1 Introduction; 1.1 Trace-driven simulation; 1.2 Program-driven s...
Software Support for Speculative Loads
- Proceedings of the Fifth International Conference on Architectural Support for Programming Languages and Operating Systems
, 1992
Cited by 62 (3 self)
This paper describes a simple hardware mechanism and related compiler support for software-controlled speculative loads. The compiler issues speculative load instructions based on anticipated data references and the ability of the memory system to hide memory latency in high-performance processors. The architectural support for such a mechanism is simple and minimal, yet handles faults gracefully. We have simulated the speculative load mechanism based on a MIPS processor and a detailed memory system. The results of scientific kernel loops indicate that our speculative load technique is an effective approach to hiding memory latency.
1 Introduction
The performance gap between processors and memory has widened in the last few years. In the last decade, microprocessor speeds have increased at a rate of 50% to 100% each year whereas DRAM speeds have increased at a rate of 10% or less each year [13]. As the performance gap becomes wider, high-performance processors become more sensitive...
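The growth rates quoted in the excerpt above compound from year to year, which is why the gap widens so quickly. The short sketch below works through that arithmetic; the helper name `speed_gap` and the ten-year horizon are assumptions for illustration, and the rates are only the bounds stated in the excerpt, not measured data.

```python
def speed_gap(cpu_rate, dram_rate, years):
    """Relative processor/DRAM speed ratio after compounding annual growth.

    Hypothetical helper: rates are fractions per year (0.5 means 50%).
    """
    return ((1 + cpu_rate) / (1 + dram_rate)) ** years


# Bounds from the excerpt: processors 50% to 100% per year, DRAM 10% per year.
low = speed_gap(0.5, 0.1, 10)
high = speed_gap(1.0, 0.1, 10)
print(f"gap after 10 years: between {low:.0f}x and {high:.0f}x")
```

Under these assumed rates the ratio grows by a factor of roughly 22 to 395 over a decade. Since the excerpt says DRAM improved by 10% "or less", the true ratio for a given processor rate would be at least this large.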
Instruction Fetching: Coping with Code Bloat
- In Proceedings of the 22nd Annual International Symposium on Computer Architecture
, 1995
Cited by 62 (9 self)
Previous research has shown that the SPEC benchmarks achieve low miss ratios in relatively small instruction caches. This paper presents evidence that current software-development practices produce applications that exhibit substantially higher instruction-cache miss ratios than do the SPEC benchmarks. To represent these trends, we have assembled a collection of applications, called the Instruction Benchmark Suite (IBS), that provides a better test of instruction-cache performance. We discuss the rationale behind the design of IBS and characterize its behavior relative to the SPEC benchmark suite. Our analysis is based on trace-driven and trap-driven simulations and takes into full account both the application and operating-system components of the workloads. This paper then reexamines a collection of previously-proposed hardware mechanisms for improving instruction-fetch performance
Machine descriptions to build tools for embedded systems
- In ACM SIGPLAN Workshop on Languages, Compilers, and Tools for Embedded Systems (LCTES’98), volume 1474 of LNCS
, 1998
Cited by 45 (16 self)
CSDL should support a variety of machine-level tools while remaining independent of any one in particular.
Automatic Generation of Microarchitecture Simulators
- In IEEE International Conference on Computer Languages
, 1998
Cited by 32 (9 self)
In this paper we describe the UPFAST system that automatically generates a cycle level simulator, an assembler and a disassembler from a microarchitecture specification written in a domain specific language called the Architecture Description Language (ADL). Using the UPFAST system it is easy to retarget a simulator for an existing architecture to a modified architecture since one has to simply modify the input specification and the new simulator is generated automatically. UPFAST also allows porting of simulators to different platforms with minimal effort. We have been able to develop three simulators ranging from simple pipelined processors to complicated out-of-order issue processors over a short period of three months. While the specifications of the architectures varied from 5000 to 6000 lines of ADL code, the sizes of automatically generated software varied from 20,000 to 30,000 lines of C++ code. The automatically generated simulators are less than 2 times slower than hand coded...
Fast Specification of Cycle-Accurate Processor Models
- In Proc. ICCD
, 2001
Cited by 7 (0 self)
This paper introduces a new specification style for processor microarchitectures. Our goal is to produce very simple, compact, but cycle-accurate descriptions, in order to enable early exploration of different microarchitectures and their performance. The key idea behind our approach is that we can derive the difficult-to-design forwarding and stall logic completely automatically. We have implemented a specification language for pipelined processors, along with an automatic translator that creates cycle-accurate software simulators from the specifications. We have specified a pipelined MIPS integer core in our language. The entire specification is less than 300 lines long and implements all user-mode instructions except for coprocessor support. The resulting, automatically-generated, cycle-accurate simulator achieves over 240,000 instructions per second simulating MIPS machine code. This performance is within an order of magnitude of large, hand-crafted, cycle-accurate simulators, but our specification is far easier to create, read, and modify.
A Methodology for Compilation of High-Integrity Real-Time Programs
, 1997
Cited by 6 (2 self)
A practical methodology for compilation of trustworthy real-time programs is introduced. It combines new program development and timing analysis techniques with traditional compilation and assembly technologies.
Keywords and phrases: Real-time programming; compilation; timing analysis.
1 Introduction
High-integrity real-time programs must always meet all their 'hard' deadlines. Real-time code must exhibit not only correct functional behaviour, but predictable timing behaviour as well. Programming real-time systems in a high-level language is difficult because it is the machine code generated by the compiler and assembler, not the high-level source program, that ultimately determines timing correctness. Contemporary compilers make no attempt to generate code with predictable timing characteristics [30, 28], undermining their value for real-time applications. Consequently, safety-critical real-time programs are typically written directly in assembler language, forsaking the well-est...

