The Paradyn Parallel Performance Measurement Tools
IEEE Computer, 1995
Cited by 353 (28 self)
Paradyn is a performance measurement tool for parallel and distributed programs. Paradyn uses several novel technologies so that it scales to long-running programs (hours or days) and large (thousand-node) systems, and automates much of the search for performance bottlenecks. It can provide precise performance data down to the procedure and statement level. Paradyn is based on a dynamic notion of performance instrumentation and measurement. Unmodified executable files are placed into execution and then performance instrumentation is inserted into the application program and modified during execution. The instrumentation is controlled by the Performance Consultant module, which automatically directs the placement of instrumentation. The Performance Consultant has a well-defined notion of performance bottlenecks and program structure, so that it can associate bottlenecks with specific causes and specific parts of a program. Paradyn controls its instrumentation overhead by monitoring the cost of its data collection, limiting its instrumentation to a (user-controllable) threshold. The instrumentation in Paradyn can easily be configured to accept new operating system, hardware, and application-specific performance data. It also provides an open interface for performance visualization, and a simple programming library to allow these visualizations to interface to Paradyn. Paradyn can gather and present performance data in terms of high-level parallel languages (such as data parallel Fortran) and can measure programs on massively parallel computers, workstation clusters, and heterogeneous combinations of these systems.
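The cost-limiting idea in this abstract (admit instrumentation only while its predicted overhead stays under a user-set threshold) can be illustrated with a small sketch. This is not Paradyn's implementation or API: the class `OverheadBudget`, its methods, and the probe names are hypothetical, and costs are assumed to be given as predicted fractions of run time.

```python
from dataclasses import dataclass, field


@dataclass
class OverheadBudget:
    """Hypothetical gate that admits probes while predicted cost stays under a cap."""
    threshold: float                              # max tolerated overhead (fraction of run time)
    active: dict = field(default_factory=dict)    # probe name -> predicted cost fraction

    def current_cost(self) -> float:
        """Total predicted cost of all currently enabled probes."""
        return sum(self.active.values())

    def request(self, probe: str, predicted_cost: float) -> bool:
        """Enable the probe only if the total predicted cost stays within the threshold."""
        if self.current_cost() + predicted_cost > self.threshold:
            return False                          # caller may retry after a release
        self.active[probe] = predicted_cost
        return True

    def release(self, probe: str) -> None:
        """Disable a probe, freeing its share of the budget."""
        self.active.pop(probe, None)
```

With a threshold of 0.05, two probes predicted at 0.02 each would be admitted and a third deferred until one of the first two is released.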
An Integrated Compilation and Performance Analysis Environment for Data Parallel Programs
2000
Cited by 58 (8 self)
Supporting source-level performance analysis of programs written in data-parallel languages requires a unique degree of integration between compilers and performance analysis tools. Compilers for languages such as High Performance Fortran infer parallelism and communication from data distribution directives; thus, performance tools cannot meaningfully relate measurements about these key aspects of execution performance to source-level constructs without substantial compiler support. This paper describes an integrated system for performance analysis of data-parallel programs based on the Rice Fortran 77D compiler and the Illinois Pablo performance analysis toolkit. During code generation, the Fortran D compiler records mapping information and semantic analysis results describing the relationship between performance instrumentation and the original source program. An integrated performance analysis system based on the Pablo toolkit uses this information to correlate the program's d...
A Callgraph-Based Search Strategy for Automated Performance Diagnosis
2000
Cited by 21 (7 self)
We introduce a new technique for automated performance diagnosis, using the program's callgraph. We discuss our implementation of this diagnosis technique in the Paradyn Performance Consultant. Our implementation includes the new search strategy and new dynamic instrumentation to resolve pointer-based dynamic call sites at run-time. We compare the effectiveness of our new technique to the previous version of the Performance Consultant for several sequential and parallel applications. Our results show that the new search method performs its search while inserting dramatically less instrumentation into the application, resulting in reduced application perturbation and consequently a higher degree of diagnosis accuracy.
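A top-down search over a call graph, as this abstract describes in general terms, might look like the sketch below. It is an illustration only, not the Performance Consultant's algorithm: the function `callgraph_search`, the example graph, and the time fractions are all hypothetical, and the graph is assumed to be known statically (the paper additionally resolves pointer-based call sites at run time, which is not modeled here).

```python
def callgraph_search(callgraph, measure, root, threshold):
    """Search top-down; descend only into callees of functions that exceed threshold.

    Returns (bottlenecks, instrumented_count).
    """
    bottlenecks = []
    instrumented = 0
    visited = set()
    frontier = [root]
    while frontier:
        func = frontier.pop()
        if func in visited:
            continue
        visited.add(func)
        instrumented += 1                      # one probe per function examined
        if measure(func) >= threshold:
            bottlenecks.append(func)
            frontier.extend(callgraph.get(func, []))
    return bottlenecks, instrumented


# Hypothetical program: caller -> callees, and inclusive time fractions.
CALLGRAPH = {
    "main": ["solve", "report"],
    "solve": ["exchange", "compute"],
    "report": ["format"],
}
TIME_FRACTION = {
    "main": 1.0, "solve": 0.8, "report": 0.05,
    "exchange": 0.6, "compute": 0.15, "format": 0.04,
}

found, probes = callgraph_search(CALLGRAPH, TIME_FRACTION.get, "main", 0.2)
```

In this example `format` is never instrumented because its caller `report` falls below the threshold, which is the kind of saving in inserted instrumentation the abstract reports.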
Improving Online Performance Diagnosis by the Use of Historical Performance Data
In Proc. SC'99, 1999
Cited by 14 (3 self)
Accurate performance diagnosis of parallel and distributed programs is a difficult and time-consuming task. We describe a new technique that uses historical performance data, gathered in previous executions of an application, to increase the effectiveness of automated performance diagnosis. We incorporate several different types of historical knowledge about the application's performance into an existing profiling tool, the Paradyn Parallel Performance Tool. We gather performance and structural data from previous executions of the same program, extract knowledge useful for diagnosis from this collection of data in the form of search directives, then input the directives to an enhanced version of Paradyn, which conducts a directed online diagnosis. Compared to existing approaches, incorporating historical data shortens the time required to identify bottlenecks, decreases the amount of unhelpful instrumentation, and improves the usefulness of the information obtained from a diagnostic se...
Experiment Management Support for Performance Tuning
Proceedings of the SC'97 Conference, 1997
Cited by 14 (2 self)
The development of a high-performance parallel system or application is an evolutionary process -- both the code and the environment go through many changes during a program's lifetime -- and at each change, a key question for developers is: how and how much did the performance change? No existing performance tool provides the necessary functionality to answer this question. This paper reports on the design and preliminary implementation of a tool which views each execution as a scientific experiment and provides the functionality to answer questions about a program's performance which span more than a single execution or environment. We report results of using our tool with an actual performance tuning study and with a scientific application run in changing environments. Our goal is to use historic program performance data to develop techniques for parallel program performance diagnosis.
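The question posed in this abstract ("how and how much did the performance change?") can be made concrete with a minimal sketch that compares the metrics recorded for two executions. The function `compare_runs` and the metric names are hypothetical and are not taken from the tool the paper describes.

```python
def compare_runs(baseline, candidate):
    """Return {metric: (absolute_change, relative_change)} for metrics present in both runs.

    relative_change is None when the baseline value is zero.
    """
    changes = {}
    for metric in sorted(baseline.keys() & candidate.keys()):
        old, new = baseline[metric], candidate[metric]
        relative = (new - old) / old if old else None
        changes[metric] = (new - old, relative)
    return changes


# Hypothetical measurements from two executions of the same program.
before = {"wall_time_s": 120.0, "msg_wait_s": 30.0}
after = {"wall_time_s": 90.0, "msg_wait_s": 33.0, "io_s": 2.0}
delta = compare_runs(before, after)
```

Metrics recorded in only one run (here `io_s`) are skipped, since no change can be computed for them.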
FINESSE: A Prototype Feedback-guided Performance Enhancement System
2000
Cited by 12 (6 self)
FINESSE is a prototype environment designed to support rapid development of parallel programs for single-address-space computers by both expert and non-expert programmers. The environment provides semi-automatic support for systematic, feedback-based reduction of the various classes of overhead associated with parallel execution. The characterisation of parallel performance by overhead analysis is first reviewed, and then the functionality provided by FINESSE is described. The utility of this environment is demonstrated by using it to develop parallel implementations, for an SGI Origin 2000 platform, of Tred2, a well-known benchmark for automatic parallelising compilers.
Automatic Overheads Profiler for OpenMP Codes
2000
Cited by 6 (0 self)
Developing a good parallel implementation requires understanding where run time is spent and comparing this to some realistic best possible time.
LBF: A Performance Metric for Program Reorganization
1998
Cited by 4 (4 self)
We introduce a new performance metric, called Load Balancing Factor (LBF), to assist programmers with evaluating different tuning alternatives. The LBF metric differs from traditional performance metrics since it is intended to measure the performance implications of a specific tuning alternative rather than quantifying where time is spent in the current version of the program. A second unique aspect of the metric is that it provides guidance about moving work within a distributed or parallel program rather than reducing it. A variation of the LBF metric can also be used to predict the performance impact of changing the underlying network. The LBF metric can be computed incrementally and online during the execution of the program to be tuned. We also present a case study that shows that our metric can predict the actual performance gains accurately for a test suite of six programs.
Finding Bottlenecks In Large Scale Parallel Programs
1994
Cited by 4 (0 self)
This thesis addresses the problem of trying to locate the source of performance bottlenecks in large-scale parallel and distributed applications. Performance monitoring creates a dilemma: identifying a bottleneck necessitates collecting detailed information, yet collecting all this data can introduce serious data collection bottlenecks. At the same time, users are being inundated with volumes of complex graphs and tables that require a performance expert to interpret. I have developed a new approach that addresses both these problems by combining dynamic on-the-fly selection of what performance data to collect with decision support to assist users with the selection and presentation of performance data. The approach is called the W³ Search Model. To make it possible to implement the W³ Search Model, I have developed a new monitoring technique for parallel programs called Dynamic Instrumentation. The premise of my work is that not only is it possible to do on-line performance debu...
Experiment Management Support for Parallel Performance Tuning
University of Wisconsin-Madison, 1999

