Critical Path Analysis of TCP Transactions
IEEE/ACM Transactions on Networking, 2000
Cited by 66 (2 self)
Improving the performance of data transfers in the Internet (such as Web transfers) requires a detailed understanding of when and how delays are introduced. Unfortunately, the complexity of data transfers like those using HTTP is great enough that identifying the precise causes of delays is difficult. In this paper we describe a method for pinpointing where delays are introduced into applications like HTTP by using critical path analysis. By constructing and profiling the critical path, it is possible to determine what fraction of total transfer latency is due to packet propagation, network variation (e.g., queuing at routers or route fluctuation), packet losses, and delays at the server and at the client. We have implemented our technique in a tool called tcpeval that automates critical path analysis for Web transactions. We show that our analysis method is robust enough to analyze traces taken for two different TCP implementations (Linux and FreeBSD). To demonstrate the utility of our approach, we present the results of critical path analysis for a set of Web transactions taken over 14 days under a variety of server and network conditions. The results show that critical path analysis can shed considerable light on the causes of delays in Web transfers, and can expose subtleties in the behavior of the entire end-to-end system.
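The latency breakdown described in this abstract can be illustrated with a small sketch. It assumes the critical path has already been extracted as a list of segments, each labeled with one of the causes the abstract names. The segment list, the durations, and the function name `latency_fractions` are hypothetical and are not taken from tcpeval.

```python
from collections import Counter

# Hypothetical critical path for one Web transfer: (cause, seconds) per segment.
# The cause labels follow the abstract; the durations are made up for illustration.
CRITICAL_PATH = [
    ("propagation", 0.25),
    ("server", 0.5),
    ("propagation", 0.25),
    ("network variation", 0.5),
    ("packet loss", 1.5),
    ("client", 1.0),
]


def latency_fractions(path):
    """Return the share of total critical-path time attributed to each cause."""
    totals = Counter()
    for cause, seconds in path:
        totals[cause] += seconds
    total = sum(totals.values())
    return {cause: seconds / total for cause, seconds in totals.items()}


if __name__ == "__main__":
    for cause, share in latency_fractions(CRITICAL_PATH).items():
        print(f"{cause}: {share:.1%}")
```

Only segments on the critical path are counted, so the shares sum to the full transfer latency. Delays that overlap with the critical path but are not on it do not appear in the breakdown.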
Dynamic Control of Performance Monitoring on Large Scale Parallel Systems
1993
Cited by 53 (10 self)
Performance monitoring of large scale parallel computers creates a dilemma: we need to collect detailed information to find performance bottlenecks, yet collecting all this data can introduce serious data collection bottlenecks. At the same time, users are being inundated with volumes of complex graphs and tables that require a performance expert to interpret. We present a new approach, called the W³ Search Model, that addresses both these problems by combining dynamic on-the-fly selection of what performance data to collect with decision support to assist users with the selection and presentation of performance data. We present a case study describing how a prototype implementation of our technique was able to identify the bottlenecks in three real programs. In addition, we were able to reduce the amount of performance data collected by a factor ranging from 13 to 700 compared to traditional sampling and trace based instrumentation techniques. 1. Introduction Performance monitorin...
Critical Path Profiling of Message Passing and Shared-Memory Programs
IEEE Transactions on Parallel and Distributed Systems, 1998
Cited by 15 (4 self)
In this paper, we introduce a runtime, nontrace-based algorithm to compute the critical path profile of the execution of message passing and shared-memory parallel programs. Our algorithm permits starting or stopping the critical path computation during program execution and reporting intermediate values. We also present an online algorithm to compute a variant of critical path, called critical path zeroing, that measures the reduction in application execution time that improving a selected procedure will have. Finally, we present a brief case study to quantify the runtime overhead of our algorithm and to show that online critical path profiling can be used to find program bottlenecks. Index Terms: Parallel and distributed processing, measurement, tools, program tuning, on-line evaluation. 1 Introduction In performance tuning parallel programs, simple sums of sequential metrics, such as CPU utilization, do not ...
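As a rough illustration of the two quantities this abstract names, the sketch below computes a critical path profile and a critical path zeroing value on a small, fully recorded activity graph. This is an offline recomputation, not the paper's runtime, nontrace-based algorithm. The graph, the procedure names, and the functions `critical_path` and `zeroing_gain` are hypothetical.

```python
from collections import defaultdict

# Hypothetical activity graph for two processes that join at node 3.
# Each edge is (src, dst, duration, procedure); nodes are assumed to be
# numbered in topological order, so src < dst for every edge.
EDGES = [
    (0, 1, 4.0, "compute"),
    (1, 3, 2.0, "send"),
    (0, 2, 2.0, "io"),
    (2, 3, 5.0, "compute"),
]


def critical_path(edges):
    """Return (length, profile): the longest path length and the time each
    procedure contributes along that path."""
    best = defaultdict(float)  # longest distance found so far to each node
    pred = {}                  # edge that ends the longest path to a node
    for edge in sorted(edges, key=lambda e: e[0]):
        src, dst, duration, _ = edge
        if best[src] + duration > best[dst]:
            best[dst] = best[src] + duration
            pred[dst] = edge
    end = max(best, key=best.get)
    profile = defaultdict(float)
    node = end
    while node in pred:  # walk the critical path backwards
        src, _, duration, procedure = pred[node]
        profile[procedure] += duration
        node = src
    return best[end], dict(profile)


def zeroing_gain(edges, procedure):
    """Reduction in critical path length if `procedure` took zero time."""
    base, _ = critical_path(edges)
    zeroed = [(s, d, 0.0 if p == procedure else w, p) for s, d, w, p in edges]
    new, _ = critical_path(zeroed)
    return base - new


if __name__ == "__main__":
    print(critical_path(EDGES))
    print({p: zeroing_gain(EDGES, p) for p in ("compute", "io", "send")})
```

In this made-up graph, "io" accounts for 2.0 time units of the critical path, yet zeroing it shortens execution by only 1.0, because the other process's path then becomes critical. That gap between the profile share and the achievable reduction is what the zeroing variant is meant to expose.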
Experimental Analysis of Parallel Systems: Techniques and Open Problems
Lect. Notes in Comp. Sci. 794, 1994
Cited by 13 (0 self)
Massively parallel systems pose daunting performance instrumentation and data analysis problems. Balancing instrumentation detail, application perturbation, data reduction costs, and presentation complexity requires a mix of science, engineering, and art. This paper surveys current techniques for performance instrumentation and data presentation, illustrates one approach to tool extensibility, and discusses the implications of massive parallelism for performance analysis environments. 1 Introduction "The most constant difficulty in contriving the engine has arisen from the desire to reduce the time in which the calculations were executed to the shortest which is possible." (Charles Babbage) In the past one hundred and fifty years, little has changed since Babbage's remark. Performance optimization remains a difficult and elusive goal. And as we move from vector supercomputers to parallel systems that scale from tens to thousands of processors, many of the performance instrumentation, d...
Modeling, Measurement And Performance Of World Wide Web Transactions
2001
Cited by 9 (0 self)
The size, diversity and continued growth of the World Wide Web combine to make its understanding difficult even at the most basic levels. The focus of our work is on developing novel methods for measuring and analyzing the Web which lead to a deeper understanding of its performance. We describe a methodology and a distributed infrastructure for taking measurements in both the network and end-hosts. The first unique characteristic of the infrastructure is our ability to generate requests at our Web server which closely imitate actual users. This ability is based on detailed analysis of Web client behavior and the creation of the Scalable URL Request Generator (SURGE) tool. SURGE provides us with the flexibility to test different aspects of Web performance. We demonstrate this flexibility in an evaluation of the 1.0 and 1.1 versions of the Hypertext Transfer Protocol. The second unique aspect of our approach is that we analyze the details of Web transactions by applying critical path analysis (CPA). CPA enables us to precisely decompose latency in Web transactions into propagation delay, network variation, server delay, client delay and packet loss delay. We present analysis of performance data collected in our infrastructure. Our results show that our methods can expose surprising behavior in Web servers, and can yield considerable insight into the causes of delay variability in Web transactions.
An Online Computation of Critical Path Profiling
SPDT'96: SIGMETRICS Symposium on Parallel and Distributed Tools, May 1996
Cited by 8 (3 self)
In this paper we introduce a runtime, non-trace based algorithm to compute the critical path profile of the execution of a message passing parallel program. Our algorithm permits starting or stopping the critical path computation during program execution and reporting intermediate values. We also present an online algorithm to compute a variant of critical path, called critical path zeroing, that measures the reduction in application execution time that improving a selected procedure will have. Finally, we present a brief case study to quantify the runtime overhead of our algorithm and to show that online critical path profiling can
Synthetic-Perturbation Techniques For Screening Shared Memory Programs
Software: Practice and Experience, 1994
Cited by 8 (2 self)
this paper is to explain the general approach and to extend it to address specific features that are the main source of poor performance on the shared memory programming model. These include performance degradation due to load imbalance and insufficient parallelism, and overhead introduced by synchronizations and by accessing shared data structures. We illustrate the practicality of SPS by demonstrating its use on two very different case studies: a large image understanding benchmark and a parallel quicksort
PET: A Parallel Performance Estimation Tool
Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, 1995
Cited by 6 (4 self)
In this paper, we present a functional description of the Performance Estimation Tool (PET), which estimates the performance of parallel scientific applications portrayed in a simple language called Portrayal Specification Language (PSL). With relatively small effort, one can write PSL programs to express the execution-rate-determining segments, the control flow, and the data flow of an application, as well as the parameters of a parallel execution environment. From PSL programs, PET estimates the performance of applications written for existing and future architectures. We present some preliminary results that compare the performance predicted by PET with the timing results observed on the IBM SP1. 1 Introduction Performance analysis and estimation play a central role in the design and development of parallel application software. In parallel environments, the parameter space that affects program performance is much larger than that in the sequential case and hence both performance analysis and p...
Multi-Application Support in a Parallel Program Performance Tool
1993
Cited by 5 (2 self)
Program performance measurement tools have proven to be useful for tuning single, isolated, parallel and distributed applications. However, large-scale parallel machines and heterogeneous networks often do not allow for such isolated execution, much less isolated measurement. Performance measurement tools should allow users to study workload scheduling policies, resource competition among application programs, client/server interactions in distributed systems, and comparisons of application programs running on multiple hardware platforms. To enable and encourage such studies, we have extended the IPS-2 parallel program measurement tools to support the analysis of multiple applications (and multiple runs of the same application) in a single measurement session. This multi-application support allows the user to study each application as a logically separate entity, study groupings of the applications based on their physical location, or study the entire collection of applications. We use...

