Breakpoints and Halting in Distributed Programs, 1988
Cited by 69 (0 self)
Interactive debugging requires that the programmer be able to halt a program at interesting points in its execution. This paper presents an algorithm for halting a distributed program in a consistent state, and presents a definition of distributed breakpoints with an algorithm for implementing the detection of these breakpoints. The Halting Algorithm extends Chandy and Lamport's algorithm for recording global state and solves the problem of processes that are not fully connected or frequently communicating. The definition of distributed breakpoints is based on those events that can be detected in a distributed system. Events that can be partially ordered are detectable and form the basis for the breakpoint predicates, and from the breakpoint definition comes the description of an algorithm that can be used in a distributed debugger to detect these breakpoints.
Index Items - Distributed Programming, Distributed Debugging, Halting Algorithm, Distributed Breakpoints.
Dynamic Control of Performance Monitoring on Large Scale Parallel Systems, 1993
Cited by 53 (10 self)
Performance monitoring of large scale parallel computers creates a dilemma: we need to collect detailed information to find performance bottlenecks, yet collecting all this data can introduce serious data collection bottlenecks. At the same time, users are being inundated with volumes of complex graphs and tables that require a performance expert to interpret. We present a new approach called the W³ Search Model, that addresses both these problems by combining dynamic on-the-fly selection of what performance data to collect with decision support to assist users with the selection and presentation of performance data. We present a case study describing how a prototype implementation of our technique was able to identify the bottlenecks in three real programs. In addition, we were able to reduce the amount of performance data collected by a factor ranging from 13 to 700 compared to traditional sampling and trace based instrumentation techniques.
Finding Bottlenecks In Large Scale Parallel Programs, 1994
Cited by 4 (0 self)
This thesis addresses the problem of trying to locate the source of performance bottlenecks in large-scale parallel and distributed applications. Performance monitoring creates a dilemma: identifying a bottleneck necessitates collecting detailed information, yet collecting all this data can introduce serious data collection bottlenecks. At the same time, users are being inundated with volumes of complex graphs and tables that require a performance expert to interpret. I have developed a new approach that addresses both these problems by combining dynamic on-the-fly selection of what performance data to collect with decision support to assist users with the selection and presentation of performance data. The approach is called the W³ Search Model. To make it possible to implement the W³ Search Model, I have developed a new monitoring technique for parallel programs called Dynamic Instrumentation. The premise of my work is that not only is it possible to do on-line performance debu...

