A Callgraph-Based Search Strategy for Automated Performance Diagnosis
, 2000
Cited by 21 (7 self)
We introduce a new technique for automated performance diagnosis, using the program's callgraph. We discuss our implementation of this diagnosis technique in the Paradyn Performance Consultant. Our implementation includes the new search strategy and new dynamic instrumentation to resolve pointer-based dynamic call sites at run-time. We compare the effectiveness of our new technique to the previous version of the Performance Consultant for several sequential and parallel applications. Our results show that the new search method performs its search while inserting dramatically less instrumentation into the application, resulting in reduced application perturbation and consequently a higher degree of diagnosis accuracy.
The Search for Lost Cycles: A New Approach to Parallel Program Performance Evaluation
- In Proceedings of Supercomputing '94
, 1993
Cited by 21 (2 self)
Traditional performance debugging and tuning of parallel programs is based on the "measure-modify" approach, in which detailed measurements of program executions are used to guide incremental changes to the program that result in better performance. Unfortunately, the performance of a parallel algorithm is often related to its implementation, input data, and machine characteristics in surprising ways, and the "measure-modify" approach is unsuited to exploring these relationships fully: it is too heavily dependent on experimentation and measurement, which is impractical for studying the large number of variables that can affect parallel program performance. In this paper we argue that the problem of selecting the best implementation of a parallel algorithm requires a new approach to parallel program performance evaluation, one with a greater balance between measurement and modeling. We first present examples that demonstrate that different parallelizations of a program may be necessary ...
Experiment Management Support for Performance Tuning
- PROCEEDINGS OF THE SC’97 CONFERENCE
, 1997
Cited by 14 (2 self)
The development of a high-performance parallel system or application is an evolutionary process -- both the code and the environment go through many changes during a program's lifetime -- and at each change, a key question for developers is: how and how much did the performance change? No existing performance tool provides the necessary functionality to answer this question. This paper reports on the design and preliminary implementation of a tool which views each execution as a scientific experiment and provides the functionality to answer questions about a program's performance which span more than a single execution or environment. We report results of using our tool with an actual performance tuning study and with a scientific application run in changing environments. Our goal is to use historic program performance data to develop techniques for parallel program performance diagnosis.
Speedy: An Integrated Performance Extrapolation Tool for pC++ Programs
- In Quantitative Evaluation of Computing and Communication Systems: Proceedings of the 8th International Conference on Modelling Techniques and Tools for Computer Performance Evaluation, volume 977 of Lecture Notes in Computer Science
, 1995
Cited by 8 (1 self)
Performance extrapolation is the process of evaluating the performance of a parallel program in a target execution environment using performance information obtained for the same program in a different environment. Performance extrapolation techniques are suited for rapid performance tuning of parallel programs, particularly when the target environment is unavailable. This paper describes one such technique that was developed for data-parallel C++ programs written in the pC++ language. In pC++, the programmer can distribute a collection of objects to various processors and can have methods invoked on those objects execute in parallel. Using performance extrapolation in the development of pC++ applications allows tuning decisions to be made in advance of detailed execution measurements. The pC++ language system includes τ, an integrated environment for analyzing and tuning the performance of pC++ programs. This paper presents speedy, a new addition to τ, that predicts the performa...
Finding Bottlenecks In Large Scale Parallel Programs
, 1994
Cited by 4 (0 self)
This thesis addresses the problem of trying to locate the source of performance bottlenecks in large-scale parallel and distributed applications. Performance monitoring creates a dilemma: identifying a bottleneck necessitates collecting detailed information, yet collecting all this data can introduce serious data collection bottlenecks. At the same time, users are being inundated with volumes of complex graphs and tables that require a performance expert to interpret. I have developed a new approach that addresses both these problems by combining dynamic on-the-fly selection of what performance data to collect with decision support to assist users with the selection and presentation of performance data. The approach is called the W³ Search Model. To make it possible to implement the W³ Search Model, I have developed a new monitoring technique for parallel programs called Dynamic Instrumentation. The premise of my work is that not only is it possible to do on-line performance debu...
Experiment Management Support for Parallel Performance Tuning
- UNIVERSITY OF WISCONSIN - MADISON
, 1999
Automating Runtime Optimizations For Parallel Object-Oriented Programming
, 1996
Cited by 1 (0 self)
Software development for parallel computers has been recognized as one of the bottlenecks preventing their widespread use. In this thesis we examine two complementary approaches for addressing the challenges of high performance and enhanced programmability in parallel programs: automated optimizations and object-orientation. We have developed the parallel object-oriented language Charm++ (an extension of C++), which enables the benefits of object-orientation to be applied to the problems of parallel programming. In order to improve parallel program performance without extra effort, we explore the use of automated optimizations. In particular, we have developed techniques for automating run-time optimizations for parallel object-oriented languages. These techniques have been embodied in the Paradise post-mortem analysis tool which automates several run-time optimizations without programmer intervention. Paradise builds a program representation from traces, analyzes characteristics, choos...
PDRS: A Performance Data Representation System
- in Proc. of 5th IEEE International Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS2000)
Cited by 1 (1 self)
We present the design and development of a Performance Data Representation System (PDRS) for scalable parallel computing. PDRS provides decision support that helps users find the right data to understand their programs' performance and to select appropriate ways to display and analyze it. PDRS is an attempt to provide appropriate assistance to help programmers identify performance bottlenecks and optimize their programs.
Near-Critical Path Analysis Of Parallel Program Performance: The Statistical Perspective
evaluation of the effectiveness of the Maximum Benefit metric. ACKNOWLEDGEMENTS: First, I would like to extend sincere thanks and appreciation to Dr. Donna Reese and Dr. Cedell Alexander for their part in this research. I am grateful to Dr. Reese for giving me the opportunity to work for her, and I am thankful for her patience and guidance. I am indebted to Dr. Alexander both for the willingness to share his research efforts and for the direction he has given to mine. Secondly, I would like to thank Dr. Susan Bridges and Dr. Rainey Little for their time and efforts as members of my graduate committee. Finally, I would like to thank the members of the System Software thrust at the MSU/NSF Engineering Research Center and the members of the Computer Science department faculty in general for all of the knowledge that they have passed on to me.

