Results 1 -
4 of
4
Analysis of Benchmark Characteristics and Benchmark Performance Prediction
- ACM Transactions on Computer Systems
, 1992
"... Standard benchmarking provides the run times for given programs on given machines, but fails to provide insight as to why those results were obtained (either in terms of machine or program characteristics), and fails to provide run times for that program on some other machine, or some other programs ..."
Abstract
-
Cited by 99 (4 self)
- Add to MetaCart
Standard benchmarking provides the run times for given programs on given machines, but fails to provide insight as to why those results were obtained (either in terms of machine or program characteristics), and fails to provide run times for that program on some other machine, or some other programs on that machine. We have developed a machineindependent model of program execution to characterize both machine performance and program execution. By merging these machine and program characterizations, we can estimate execution time for arbitrary machine/program combinations. Our technique allows us to identify those operations, either on the machine or in the programs, which dominate the benchmark results. This information helps designers in improving the performance of future machines, and users in tuning their applications to better utilize the performance of existing machines. Here we apply our methodology to characterize benchmarks and predict their execution times. We present extensi...
Performance Characterization of Optimizing Compilers
- IEEE TRANSACTIONS ON SOFTWARE ENGINEERING
, 1992
"... Optimizing compilers have become an essential component in achieving high levels of performance. Various simple and sophisticated optimizations are implemented at different stages of compilation to yield significant improvements, but little work has been done in characterizing the effectiveness of o ..."
Abstract
-
Cited by 15 (0 self)
- Add to MetaCart
Optimizing compilers have become an essential component in achieving high levels of performance. Various simple and sophisticated optimizations are implemented at different stages of compilation to yield significant improvements, but little work has been done in characterizing the effectiveness of optimizers, or in understanding where most of this improvement comes from. In this paper we study the performance impact of optimization in the context of our methodology for CPU performance characterization based on the abstract machine model. The abstract machine model considers all machines to be different implementations of the same high level language machine; in previous research, we have used this model as a basis to analyze machine and benchmark performance. In this paper, we: 1) show that our model can be extended to characterize the performance improvement provided by optimizers and to predict the run time of optimized programs; 2) measure the effectiveness of several optimizing com...
Evaluation of Mechanisms for Fine-Grained Parallel Programs in the J-Machine and the CM-5
, 1993
"... This paper uses an abstract machine approach to compare the mechanisms of two parallel machines: the J-Machine and the CM-5. High-level parallel programs are translated by a single optimizing compiler to a finegrained abstract parallel machine, TAM. A final compilation step is unique to each machine ..."
Abstract
-
Cited by 13 (1 self)
- Add to MetaCart
This paper uses an abstract machine approach to compare the mechanisms of two parallel machines: the J-Machine and the CM-5. High-level parallel programs are translated by a single optimizing compiler to a finegrained abstract parallel machine, TAM. A final compilation step is unique to each machine and optimizes for specifics of the architecture. By determining the cost of the primitives and weighting them by their dynamic frequency in parallel programs, we quantify the effectiveness of the followingmechanisms individuallyand in combination. Efficient processor/network coupling proves valuable. Message dispatch is found to be less valuable without atomic operations that allow the scheduling levels to cooperate. Multiple hardware contexts are of small value when the contexts cooperate and the compiler can partition the register set. Tagged memory provides little gain. Finally, the performance of the overall system is strongly influenced by the performance of the memory system and the f...
Measuring Cache and TLB Performance and Their Effect on Benchmark Run Times
- IEEE Transactions on Computers
, 1993
"... In previous research, we have developed and presented a model for measuring machines and analyzing programs, and for accurately predicting the running time of any analyzed program on any measured machine. That work is extended here by: (a) developing a high level program to measure the design and pe ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
In previous research, we have developed and presented a model for measuring machines and analyzing programs, and for accurately predicting the running time of any analyzed program on any measured machine. That work is extended here by: (a) developing a high level program to measure the design and performance of the cache and TLB for any machine; (b) using those measurements, along with published miss ratio data, to improve the accuracy of our run time predictions; (c) using our analysis tools and measurements to study and compare the design of several machines, with particular reference to their cache and TLB performance. As part of this work, we describe the design and performance of the cache and TLB for ten machines. The work presented in this paper extends a powerful technique for the evaluation and analysis of both computer systems and their workloads; this methodology is valuable both to computer users and computer system designers. 1. Introduction The performance of a computer ...

