Measuring Experimental Error in Microprocessor Simulation
, 2001
Cited by 105 (28 self)
We measure the experimental error that arises from the use of non-validated simulators in computer architecture research, with the goal of increasing the rigor of simulation-based studies. We describe the methodology that we used to validate a microprocessor simulator against a Compaq DS-10L workstation, which contains an Alpha 21264 processor. Our evaluation suite consists of a set of 21 microbenchmarks that stress different aspects of the 21264 microarchitecture. Using the microbenchmark suite as the set of workloads, we describe how we reduced our simulator error to an arithmetic mean of 2%, and include details about the specific aspects of the pipeline that required extra care to reduce the error. We show how these low-level optimizations reduce average error from 40% to less than 20% on macrobenchmarks drawn from the SPEC2000 suite. Finally, we examine the degree to which performance optimizations are stable across different simulators, showing that researchers would draw differ...
Microarchitectural Exploration with Liberty
- IN PROCEEDINGS OF THE 35TH INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE
, 2002
Cited by 80 (27 self)
To find the best designs, architects must rapidly simulate many design alternatives and have confidence in the results. Unfortunately, the most prevalent simulator construction methodology, hand-writing monolithic simulators in sequential programming languages, yields simulators that are hard to retarget, limiting the number of designs explored, and hard to understand, instilling little confidence in the model. Simulator construction tools have been developed to address these problems, but analysis reveals that they do not address the root cause, the error-prone mapping between the concurrent, structural hardware domain and the sequential, functional software domain. This paper presents an analysis of these problems and their solution, the Liberty Simulation Environment (LSE). LSE automatically constructs a simulator from a machine description that closely resembles the hardware, ensuring fidelity in the model. Furthermore, through a strict but general component communication contract, LSE enables the creation of highly reusable component libraries, easing the task of rapidly exploring ever more exotic designs.
Modeling Application Performance by Convolving Machine Signatures with Application Profiles
, 2001
Cited by 34 (5 self)
This paper presents a performance modeling methodology that is faster than traditional cycle-accurate simulation, more sophisticated than performance estimation based on system peak-performance metrics, and is shown to be effective on a class of High Performance Computing benchmarks. The method yields insight into the factors that affect performance on single-processor and parallel computers.
A performance prediction framework for scientific applications
- ICCS Workshop on Performance Modeling and Analysis (PMA03)
, 2003
Cited by 32 (8 self)
This work presents a performance modeling framework, developed by the Performance Modeling and Characterization (PMaC) Lab at the San Diego Supercomputer Center, that is faster than traditional cycle-accurate simulation, more sophisticated than performance estimation based on system peak-performance metrics, and is shown to be effective on the LINPACK benchmark and a synthetic version of an ocean modeling application (NLOM). The LINPACK benchmark is further used to investigate methods to reduce the time required to make accurate performance predictions with the framework. These methods are applied to the predictions of the synthetic NLOM application.
A Framework for Performance Modeling and Prediction
- IN SC 2002
, 2002
Cited by 30 (5 self)
Cycle-accurate simulation is far too slow for modeling the expected performance of full parallel applications on large HPC systems. And just running an application on a system and observing wallclock time tells you nothing about why the application performs as it does (and is anyway impossible on yet-to-be-built systems). Here we present a framework for performance modeling and prediction that is faster than cycle-accurate simulation, more informative than simple benchmarking, and is shown useful for performance investigations in several dimensions.
Rapid development of a flexible validated processor model
- In Proceedings of the 2005 Workshop on Modeling, Benchmarking, and Simulation
, 2005
Cited by 22 (11 self)
For a variety of reasons, most architectural evaluations use simulation models. An accurate baseline model validated against existing hardware provides confidence in the results of these evaluations. Meanwhile, a meaningful exploration of the design space requires a wide range of quickly-obtainable variations of the baseline. Unfortunately, these two goals are generally considered to be at odds; the set of validated models is considered exclusive of the set of easily malleable models. Vachharajani et al. challenge this belief and propose a modeling methodology they claim allows rapid construction of flexible validated models. Unfortunately, they only present anecdotal and secondary evidence to support their claims. In this paper, we present our experience using this methodology to construct a validated flexible model of Intel’s Itanium 2 processor. Our practical experience lends support to the above claims. Our initial model was constructed by a single researcher in only 11 weeks and predicts processor cycles-per-instruction (CPI) to within 7.9% on average for the entire SPEC CINT2000 benchmark suite. Our experience with this model showed us that aggregate accuracy for a metric like CPI is not sufficient. Aggregate measures like CPI may conceal remaining internal “offsetting errors” which can adversely affect conclusions drawn from the model. Using this as our motivation, we explore the flexibility of the model by modifying it to target specific error constituents, such as front-end stall errors. In 2 1/2 person-weeks, average CPI error was reduced to 5.4%. The targeted error constituents were reduced more dramatically; front-end stall errors were reduced from 5.6% to 1.6%. The swift implementation of significant new architectural features on this model further demonstrated its flexibility.
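The point about offsetting errors can be illustrated with a small sketch. The component names and CPI values below are hypothetical, chosen only to show the effect; they are not taken from the paper, and the breakdown into three constituents is an assumption made for illustration.

```python
# Illustrative sketch: an aggregate CPI error can look small while
# individual error constituents are large and cancel each other out.
# All numbers and component names are hypothetical, not from the paper.

# CPI contribution of each constituent as measured on (assumed) hardware.
hardware = {"front_end_stalls": 0.40, "back_end_stalls": 0.60, "useful_work": 1.00}
# CPI contribution of each constituent as predicted by an (assumed) model.
model = {"front_end_stalls": 0.55, "back_end_stalls": 0.46, "useful_work": 1.00}


def percent_error(predicted, measured):
    """Signed percent error of a prediction relative to a measurement."""
    return 100.0 * (predicted - measured) / measured


def aggregate_cpi_error(model_cpi, hardware_cpi):
    """Percent error of total CPI (sum of all constituents)."""
    return percent_error(sum(model_cpi.values()), sum(hardware_cpi.values()))


def component_errors(model_cpi, hardware_cpi):
    """Percent error of each constituent taken separately."""
    return {
        name: percent_error(model_cpi[name], hardware_cpi[name])
        for name in hardware_cpi
    }


if __name__ == "__main__":
    print(f"aggregate CPI error: {aggregate_cpi_error(model, hardware):+.2f}%")
    for name, err in component_errors(model, hardware).items():
        print(f"{name}: {err:+.2f}%")
```

With these assumed values the total CPI is nearly identical in model and hardware, yet the front-end and back-end constituents are each off by tens of percent in opposite directions, which is the kind of hidden discrepancy the abstract warns about.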
On Hardware and Hardware Models for Embedded Real-Time Systems
, 2001
Cited by 12 (2 self)
When building an embedded real-time system, the choice of hardware platform is very important to create an analyzable and predictable system. Also, the quality of the models of the hardware used in software tools is very important to the correctness of timing analysis and the integrity of the system.
Applying an automated framework to produce accurate blind performance predictions of full-scale hpc applications
- In Proceedings of the 2004 Department of Defense Users Group Conference. IEEE Computer
, 2004
Cited by 11 (6 self)
This work builds on an existing performance modeling framework that has been proven effective on a variety of HPC systems. This paper will illustrate the framework’s power by creating blind predictions for three systems as well as establishing sensitivity studies to advance understanding of observed and anticipated performance of both architecture and application. The predictions are termed blind because the results were completed without any knowledge of the real runtime of the applications; the real performance was then ascertained independently by a third party. Two applications, Cobalt60 and HYCOM, were predicted to illustrate the framework’s accuracy and functionalities.
The Liberty Simulation Environment: A deliberate approach to high-level system modeling
- ACM Transactions on Computer Systems
, 2004
Cited by 10 (3 self)
In digital hardware system design, the quality of the product is directly related to the number of meaningful design alternatives properly considered. Unfortunately, existing modeling methodologies and tools have properties which make them less than ideal for rapid and accurate design-space exploration. This article identifies and evaluates the shortcomings of existing methods to motivate the Liberty Simulation Environment (LSE). LSE is a high-level modeling tool engineered to address these limitations, allowing for the rapid construction of accurate high-level simulation models. LSE simplifies model specification with low-overhead component-based reuse techniques and an abstraction for timing control. As part of a detailed description of LSE, this article presents these features, their impact on model specification effort, their implementation, and optimizations created to mitigate their otherwise deleterious impact on simulator execution
Microarchitecture Modeling for Design-Space Exploration
, 2004
Cited by 9 (2 self)
To identify the best processor designs, designers explore a vast design space. To assess the quality of candidate designs, designers construct and use simulators. Unfortunately, simulator construction is a bottleneck in this design-space exploration because existing simulator construction methodologies lead to long simulator development times. This bottleneck limits exploration to a small set of designs, potentially diminishing quality of the final design.

