DiST: A Simple, Reliable, and Scalable Method to Significantly Reduce Processor Architecture Simulation Time
- Proc. Joint Int’l Conf. Measurement and Modeling of Computer Systems
, 2003
Cited by 24 (1 self)
While architecture simulation is often treated as a methodology issue, it is at the core of most processor architecture research, and simulation speed is often the bottleneck of the typical trial-and-error research process. To speed up simulation during this process and get trends faster, researchers usually reduce the trace size. More sophisticated techniques such as trace sampling or distributed simulation are scarcely used because they are considered unreliable and complex due to their impact on accuracy and the associated warm-up issues. In this article, we present DiST, a practical distributed simulation scheme where, unlike in other simulation techniques that trade accuracy for speed, the user is relieved from most accuracy issues thanks to an automatic and dynamic mechanism for adjusting the warm-up interval size. Moreover, the mechanism is designed to always favor accuracy over speedup. The speedup scales with the amount of available computing resources, bringing an average speedup of 7.35 on 10 machines with an average IPC error of 1.81% and a maximum IPC error of 5.06%. Besides proposing a solution to the warm-up issues in distributed simulation, we experimentally show that our technique is significantly more accurate than trace size reduction or trace sampling for identical speedups. We also show not only that the error always remains small for IPC and other metrics, but also that a researcher can reliably base research decisions on DiST simulation results. Finally, we explain how the DiST tool is designed to be easily pluggable into existing architecture simulators with very few modifications.
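To make the idea of distributed simulation with warm-up concrete, the following is a minimal sketch, not DiST's actual algorithm. It assumes a fixed warm-up length (DiST adjusts the warm-up interval dynamically, which is not modeled here), and the names `split_with_warmup` and `parallel_efficiency` are hypothetical. Each machine simulates one chunk of the trace but starts a little earlier so that caches and predictors are warm before statistics are recorded.

```python
def split_with_warmup(trace_len, n_chunks, warmup):
    """Split a trace of trace_len instructions into n_chunks contiguous chunks.

    Returns one (warm_start, start, end) tuple per chunk. Statistics would be
    measured on [start, end); instructions in [warm_start, start) are only
    simulated to warm up state. A fixed warm-up length is an assumption of
    this sketch.
    """
    base, extra = divmod(trace_len, n_chunks)
    chunks = []
    start = 0
    for i in range(n_chunks):
        # Spread the remainder over the first `extra` chunks.
        end = start + base + (1 if i < extra else 0)
        warm_start = max(0, start - warmup)  # the first chunk has no warm-up
        chunks.append((warm_start, start, end))
        start = end
    return chunks


def parallel_efficiency(speedup, n_machines):
    """Fraction of the ideal linear speedup that was achieved."""
    return speedup / n_machines


chunks = split_with_warmup(1_000_000, 10, 5_000)
efficiency = parallel_efficiency(7.35, 10)  # figures quoted in the abstract
```

With the figures quoted in the abstract, a speedup of 7.35 on 10 machines corresponds to a parallel efficiency of about 73.5%. Part of the gap to linear speedup comes from the extra warm-up instructions each machine must simulate.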
A Novel Methodology for the Design of Application-Specific Instruction-Set Processors (ASIPs) Using a Machine Description Language
- UNIVERSITY OF DORTMUND
, 2001
Cited by 13 (0 self)
The development of application-specific instruction-set processors (ASIPs) is currently the exclusive domain of the semiconductor houses and core vendors. This is because building such an architecture is a difficult task that requires expertise in different domains: application software development tools, processor hardware implementation, and system integration and verification. This article presents a retargetable framework for ASIP design which is based on machine descriptions in the LISA language. From these descriptions, software development tools can be generated automatically, including a high-level language C compiler, assembler, linker, simulator, and debugger frontend. Moreover, for architecture implementation, synthesizable hardware description language code can be derived, which can then be processed by standard synthesis tools. Implementation results are given for a low-power ASIP for digital video broadcasting terrestrial acquisition and tracking algorithms designed with the presented methodology. To show the quality of the generated software development tools, they are compared in speed and functionality with commercially available tools of state-of-the-art digital signal processor and µC architectures.
Retargetable compiled simulation of embedded processors using a machine description language
- ACM Transactions on Design Automation of Electronic Systems
, 2000
Cited by 7 (0 self)
Fast processor simulators are needed for the software development of embedded processors, for HW/SW cosimulation systems, and for profiling and design of application-specific processors. Such fast simulators can be generated based on the machine description language LISA. Using this language to model processor architectures enables the generation of compiled simulators on various abstraction levels, assemblers, and compiler back-ends. The article discusses the requirements of software development tools on processor models and presents the approach based on the LISA language. Furthermore, the implementation of a retargetable environment consisting of compiled simulator, debugger, and assembler is presented. Measurements for a verified, cycle-based LISA model of the TI TMS320C62x DSP show that this approach achieves between 37x and 170x higher simulation speed compared to a commercial simulator using a standard technique and the same accuracy level.
A Framework for Memory Subsystem Exploration
, 2002
Cited by 3 (3 self)
Memory represents a major bottleneck in modern embedded systems in terms of cost, power and performance. Traditionally, memory organizations for programmable systems assume a fixed cache hierarchy. With the widening processor-memory gap, more aggressive memory technologies and organizations have appeared, allowing customization of a heterogeneous memory architecture tuned for the application. However, such a processor-memory co-exploration approach critically needs the ability to explicitly capture heterogeneous memory architectures. We present in this paper a language-based approach to explicitly capture the memory subsystem configuration, and perform exploration of the memory architecture to meet diverse requirements: low power, better performance, smaller die size etc. We present a set of experiments using our Memory-Aware Architectural Description Language to drive the exploration of the memory subsystem for the TI C6211 processor architecture, demonstrating a range of cost, performance, and energy attributes.
Using Static Scheduling Techniques for the Retargeting of High Speed, Compiled Simulators for Embedded Processors from an Abstract . . .
- In Proc. of the Int. Symposium on System Synthesis
, 2001
Cited by 2 (1 self)
Machine Description
Gunnar Braun, Andreas Hoffmann, Achim Nohl, Heinrich Meyr. Aachen University of Technology (RWTH), Institute for Integrated Signal Processing Systems, Templergraben 55, 52056 Aachen, Germany. braun[hoffmann,nohl,meyr]@iss.rwth-aachen.de
Instruction set simulators are indispensable tools for both the design of programmable architectures and software development. However, due to a constantly increasing processor complexity and the frequent demand for cycle-accurate models, such simulators have become defectively slow. The principle of compiled simulation addresses this shortcoming. Compiled simulators make use of a priori knowledge to accelerate simulation, with the highest efficiency achieved by employing static scheduling techniques.
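The difference between interpretive and compiled simulation can be illustrated with a toy example. This is only a sketch under stated assumptions: the four-field instruction format, the `li`/`add`/`mul` opcodes, and all function names are hypothetical and are not taken from the paper or from LISA. The interpretive loop decodes every instruction each time it executes, while the compiled variant uses a priori knowledge of the program to decode once, before simulation starts.

```python
# Toy ISA (hypothetical): each instruction is (opcode, dst, a, b).
PROGRAM = [("li", 0, 5, 0), ("li", 1, 7, 0), ("add", 2, 0, 1), ("mul", 3, 2, 2)]


def run_interpretive(program, nregs=4):
    """Decode and execute each instruction at simulation time."""
    regs = [0] * nregs
    for op, dst, a, b in program:
        if op == "li":        # load immediate: a is the constant
            regs[dst] = a
        elif op == "add":
            regs[dst] = regs[a] + regs[b]
        elif op == "mul":
            regs[dst] = regs[a] * regs[b]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs


def compile_program(program):
    """Decode once, ahead of simulation, into a list of Python closures."""
    def make(op, dst, a, b):
        if op == "li":
            def step(regs):
                regs[dst] = a
        elif op == "add":
            def step(regs):
                regs[dst] = regs[a] + regs[b]
        elif op == "mul":
            def step(regs):
                regs[dst] = regs[a] * regs[b]
        else:
            raise ValueError(f"unknown opcode: {op}")
        return step
    return [make(*ins) for ins in program]


def run_compiled(steps, nregs=4):
    """Execute pre-decoded steps; no opcode dispatch happens in this loop."""
    regs = [0] * nregs
    for step in steps:
        step(regs)
    return regs


interp_regs = run_interpretive(PROGRAM)
compiled_regs = run_compiled(compile_program(PROGRAM))
```

Both variants leave the registers in the same final state. The compiled variant pays the decoding cost once, which matters when the same instructions are executed many times, as in loops.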
Retargetable Cache Simulation Using High Level Processor Models
- In Proceedings of the 6th Australasian Computer Systems Architecture Conference, Gold
, 2001
Cited by 2 (0 self)
During processor design, it is often necessary to evaluate multiple cache configurations. This paper describes the design and implementation of a retargetable on-line cache simulator. The cache simulator has been implemented using a retargetable instruction set simulator from the Sim-nML [9] processor description language. The retargetability helps in cache simulation and evaluation well before the actual processor design.
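Evaluating multiple cache configurations against the same address stream can be sketched as follows. This is an illustrative direct-mapped model only, not the Sim-nML-based simulator the paper describes. The class `DirectMappedCache`, the helper `miss_rate`, and the sample trace are assumptions made for the example, and the trace is assumed to be non-empty.

```python
class DirectMappedCache:
    """Minimal direct-mapped cache model that only tracks hits and misses."""

    def __init__(self, num_lines, line_size):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines  # one stored tag per cache line
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Simulate one memory access; return True on a hit."""
        block = addr // self.line_size   # memory block containing addr
        index = block % self.num_lines   # cache line the block maps to
        tag = block // self.num_lines    # identifies the block within that line
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag           # replace whatever was there
        self.misses += 1
        return False


def miss_rate(trace, num_lines, line_size):
    """Replay an address trace (non-empty) through a fresh cache."""
    cache = DirectMappedCache(num_lines, line_size)
    for addr in trace:
        cache.access(addr)
    return cache.misses / len(trace)


# Hypothetical byte-address trace, e.g. as emitted by an instruction set simulator.
trace = [0, 4, 8, 64, 0, 128, 0]
small = miss_rate(trace, num_lines=4, line_size=16)
large = miss_rate(trace, num_lines=16, line_size=16)
```

In this trace, addresses 0, 64, and 128 all map to the same line of the 4-line cache and keep evicting each other, whereas the 16-line cache holds them in separate lines. Replaying one trace through several configurations is the kind of comparison the abstract refers to.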
Processor-memory co-exploration using an architecture description language
- ACM Transactions on Embedded Computing Systems (TECS)
Cited by 2 (2 self)
Memory represents a major bottleneck in modern embedded systems in terms of cost, power, and performance. Traditionally, memory organizations for programmable embedded systems assume a fixed cache hierarchy. With the widening processor–memory gap, more aggressive memory technologies and organizations have appeared, allowing customization of a heterogeneous memory architecture tuned for specific target applications. However, such a processor–memory coexploration approach critically needs the ability to explicitly capture heterogeneous memory architectures. We present in this paper a language-based approach to explicitly capture the memory subsystem configuration, generate a memory-aware software toolkit, and perform coexploration of the processor–memory architectures. We present a set of experiments using our memory-aware architectural description language (ADL) to drive the exploration of the memory subsystem for the TI C6211 processor architecture, demonstrating cost, performance, and energy trade-offs.
RTL Processor Synthesis for Architecture Exploration and Implementation
- in DATE 2004: Conference on Design, Automation & Test in Europe, 2004. [Online]. Available: citeseer.ist.psu.edu/schliebusch04rtl.html
, 2004
Cited by 2 (1 self)
Architecture description languages are widely used to perform architecture exploration for application-driven designs, whereas the RT-level is the commonly accepted level for hardware implementation. For this reason, design parameters such as timing, area or power consumption cannot be taken into consideration accurately during design space exploration. Design automation tools currently used to bridge this gap are either limited in the flexibility provided or only generate fragments of the architecture. This paper presents a synthesis tool which preserves the full flexibility of the architecture description language LISA, while being able to generate the complete architecture on RT-level using SystemC. This paper also presents two real-world architecture case studies to prove the feasibility of our approach.
Generation of GCC Backend from Sim-nML Processor Description
, 2001
Cited by 2 (0 self)
The increasing importance of software in embedded systems has led to the paradigm of hardware-software codesign, which advocates early integration of hardware and software, even before the hardware design is complete. To support this paradigm, a set of tools is needed that can simulate the build and execution environment of the hardware. In the approach developed in our group, a high-level specification of the hardware is written, from which tools such as the assembler, linker, compiler, simulator, and high-level synthesizer are generated automatically.
A methodology and tooling enabling application specific processor design
- Proc. 18th Int. Conf. VLSI Design, 2005, pp. 399–404. Proc. Intl. Symp. on Systems-on-Chip (SoC 2006), Tampere, Nov. 2006
Cited by 1 (0 self)
This paper presents a highly efficient processor design methodology based on the LISA 2.0 language. Typically, the architecture design phase is dominated by an iterative processor model refinement based on the results of hardware and software simulation and profiling. Thus, huge teams of hardware and software experts have traditionally been required to design new programmable architectures. The proposed design flow reduces the design time and enables even non-experts in processor design to overcome the typical design challenges. The presented design methodology is based on a workbench that automates the generation of all required software tools and furthermore closes the gap between high-level modeling and hardware implementation via automatic generation of a Register Transfer Level (RTL) model for the target processor. A case study demonstrates the design approach by discussing the application-specific instruction-set processor (ASIP) design for a Fast Fourier Transform (FFT) algorithm. Several processor types such as SIMD and VLIW with various characteristics have been explored to find an optimal processor implementation for this algorithm.

