Results 1 - 2 of 2
Performance Analysis Using the MIPS R10000 Performance Counters, 1996
Abstract - Cited by 91 (0 self)
Tuning supercomputer application performance often requires analyzing the interaction of the application and the underlying architecture. In this paper, we describe support in the MIPS R10000 for non-intrusively monitoring a variety of processor events -- support that is particularly useful for characterizing the dynamic behavior of multi-level memory hierarchies, hardware-based cache coherence, and speculative execution. We first explain how performance data is collected using an integrated set of hardware mechanisms, operating system abstractions, and performance tools. We then describe several examples drawn from scientific applications, which illustrate how the counters and profiling tools provide information that helps developers analyze and tune applications.
Keywords: performance analysis, profiling tools, hardware performance counters, MIPS R10000, SGI Power Challenge
1. Introduction
A fundamental question asked by HPC application developers is: "Where is the time spent?"...
Workload Analysis of Computation Intensive Tasks: Case Study on SPEC CPU95 Benchmarks
In Proc. Euro-Par, 1997
Abstract - Cited by 4 (0 self)
Several performance analysis tools have been developed, but they suffer from either dedicated hardware requirements or computationally intensive simulation. Modern microprocessors, with hardware support for counting system hardware events, now make possible universal software tools for the performance analysis of complex application programs such as the SPEC benchmarks. In this paper, we present a new method to determine the system resource utilization (cache miss ratios, CPI values, branch mispredictions) of arbitrary programs, based on a sampling technique combined with access to processor-internal event counter registers. We present the sprof tool set, which is based on this method and also enables detailed analysis of individual subroutines of a program as they are executed over time. The high accuracy and negligible overhead of the tool set are demonstrated. We used the SPEC95 benchmark suite, consisting of 8 integer and 10 floating-point intensive non-trivial pro...
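The metrics named in this abstract (CPI, cache miss ratio, branch misprediction ratio) can be derived from differences between successive counter readings. The following is a minimal sketch of that arithmetic only, assuming counters are read as cumulative totals at each sample point. The `CounterSample` fields and function names are hypothetical and are not sprof's actual interface, which the abstract does not describe.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CounterSample:
    """Cumulative event counts at one sample point (hypothetical layout)."""
    cycles: int
    instructions: int
    cache_accesses: int
    cache_misses: int
    branches: int
    branch_mispredictions: int


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    # Return None instead of dividing by zero when an interval saw no events.
    return numerator / denominator if denominator else None


def interval_metrics(prev: CounterSample, curr: CounterSample) -> dict:
    """Derive per-interval metrics from two successive cumulative samples."""
    return {
        "cpi": _ratio(curr.cycles - prev.cycles,
                      curr.instructions - prev.instructions),
        "cache_miss_ratio": _ratio(curr.cache_misses - prev.cache_misses,
                                   curr.cache_accesses - prev.cache_accesses),
        "branch_mispredict_ratio": _ratio(
            curr.branch_mispredictions - prev.branch_mispredictions,
            curr.branches - prev.branches),
    }


def profile(samples: list) -> list:
    """Metrics for each interval between consecutive samples, in time order."""
    return [interval_metrics(a, b) for a, b in zip(samples, samples[1:])]
```

Calling `profile` on a list of samples taken over a run yields one metrics dictionary per sampling interval, which is one way to observe how a program's behavior changes over time.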

