The Tau Parallel Performance System
The International Journal of High Performance Computing Applications, 2006
Cited by 97 (14 self)
The ability of performance technology to keep pace with the growing complexity of parallel and distributed systems depends on robust performance frameworks that can at once provide system-specific performance capabilities and support high-level performance problem solving. Flexibility and portability in empirical methods and processes are influenced primarily by the strategies available for instrumentation and measurement, and how effectively they are integrated and composed. This paper presents the TAU (Tuning and Analysis Utilities) parallel performance system and describes how it addresses diverse requirements for performance observation and analysis.
Vertical Profiling: Understanding the Behavior of Object-Oriented Applications
Cited by 47 (14 self)
Object-oriented programming languages provide a rich set of features that provide significant software engineering benefits. The increased productivity provided by these features comes at a justifiable cost in a more sophisticated runtime system whose responsibility is to implement these features efficiently. However, the virtualization introduced by this sophistication provides a significant challenge to understanding complete system performance, not found in traditionally compiled languages, such as C or C++. Thus, understanding system performance of such a system requires profiling that spans all levels of the execution stack, such as the hardware, operating system, virtual machine, and application.
An Algebra for Cross-Experiment Performance Analysis
In Proc. of the International Conference on Parallel Processing (ICPP), 2004
Cited by 35 (19 self)
Performance tuning of parallel applications usually involves multiple experiments to compare the effects of different optimization strategies. This article describes an algebra that can be used to compare, integrate, and summarize performance data from multiple sources. The algebra consists of a data model to represent the data in a platform-independent fashion plus arithmetic operations to merge, subtract, and average the data from different experiments. A distinctive feature of this approach is its closure property, which allows processing and viewing all instances of the data model in the same way, regardless of whether they represent original or derived data, in addition to an arbitrary and easy composition of operations.
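The closure property described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Experiment` class, the `(metric, call path, location)` key layout, the function names, and the choice to treat a missing key as 0.0 are all assumptions made for illustration. The point is only that every operation takes experiments and returns an experiment, so derived data can be fed back into further operations.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# Hypothetical key layout: (metric, call path, location).
Key = Tuple[str, str, str]


@dataclass
class Experiment:
    """One instance of the assumed data model: a value per key."""
    values: Dict[Key, float] = field(default_factory=dict)


def _combine(a: Experiment, b: Experiment,
             op: Callable[[float, float], float]) -> Experiment:
    # Assumption: a key missing from one experiment counts as 0.0.
    keys = set(a.values) | set(b.values)
    return Experiment({k: op(a.values.get(k, 0.0), b.values.get(k, 0.0))
                       for k in keys})


def difference(a: Experiment, b: Experiment) -> Experiment:
    """Subtract b from a, key by key."""
    return _combine(a, b, lambda x, y: x - y)


def merge(a: Experiment, b: Experiment) -> Experiment:
    """Union of both experiments; on a shared key the value from a wins."""
    return Experiment({**b.values, **a.values})


def mean(*experiments: Experiment) -> Experiment:
    """Average any number of experiments, key by key."""
    keys = set().union(*(e.values for e in experiments))
    n = len(experiments)
    return Experiment({k: sum(e.values.get(k, 0.0) for e in experiments) / n
                       for k in keys})


before = Experiment({("time", "main", "rank0"): 10.0,
                     ("time", "main/solve", "rank0"): 6.0})
after = Experiment({("time", "main", "rank0"): 8.0,
                    ("time", "main/solve", "rank0"): 3.0})

saved = difference(before, after)   # derived data, still an Experiment
combined = mean(saved, before)      # closure: derived data is a valid operand
```

Because `saved` has the same type as `before` and `after`, any viewer or further operation written against `Experiment` works on it unchanged, which is the property the abstract highlights.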
Modeling Application Performance by Convolving Machine Signatures with Application Profiles
2001
Cited by 34 (5 self)
This paper presents a performance modeling methodology that is faster than traditional cycle-accurate simulation, more sophisticated than performance estimation based on system peak-performance metrics, and is shown to be effective on a class of High Performance Computing benchmarks. The method yields insight into the factors that affect performance on single-processor and parallel computers.
Using Hardware Performance Monitors to Understand the Behavior of Java Applications
In Proc. of the Third USENIX Virtual Machine Research and Technology Symp., 2004
Cited by 33 (8 self)
Modern Java programs, such as middleware and application servers, include many complex software components. Improving the performance of these Java applications requires a better understanding of the interactions between the application, virtual machine, operating system, and architecture. Hardware performance monitors, which are available on most modern processors, provide facilities to obtain detailed performance measurements of long-running applications in real time. However, interpreting the data collected using hardware performance monitors is difficult because of the low-level nature of the data. We have
A Framework for Performance Modeling and Prediction
In SC 2002, 2002
Cited by 30 (5 self)
Cycle-accurate simulation is far too slow for modeling the expected performance of full parallel applications on large HPC systems. And just running an application on a system and observing wallclock time tells you nothing about why the application performs as it does (and is anyway impossible on yet-to-be-built systems). Here we present a framework for performance modeling and prediction that is faster than cycle-accurate simulation, more informative than simple benchmarking, and is shown useful for performance investigations in several dimensions.
A Framework for Collecting Provenance in Data-Centric Scientific Workflows
In ICWS, 2006
Cited by 26 (3 self)
The increasing ability for the earth sciences to sense the world around us is resulting in a growing need for data-driven applications that are under the control of data-centric workflows composed of grid and web services. The focus of our work is on provenance collection for these workflows, necessary to validate the workflow and to determine the quality of generated data products. The challenge we address is to record uniform and usable provenance metadata that meets the domain needs while minimizing the modification burden on the service authors and the performance overhead on the workflow engine and the services. The framework, based on a loosely-coupled publish-subscribe architecture for propagating provenance activities, satisfies the needs of detailed provenance collection, while a performance evaluation of a prototype finds a minimal performance overhead (in the range of 1% for an eight-service workflow using 271 data products).
Automating Vertical Profiling
2005
Cited by 16 (4 self)
Last year at OOPSLA we presented a methodology, vertical profiling, for understanding the performance of object-oriented programs. The key insight behind this methodology is that modern programs run on top of many layers (virtual machine, middleware, etc.) and thus we need to collect and combine information from all layers in order to understand system performance. Although our methodology was able to explain previously unexplained performance phenomena, it was extremely labor intensive. In this paper we describe and evaluate techniques for automating two significant activities of vertical profiling: trace alignment and correlation. Trace alignment aligns traces obtained from separate runs so that one can reason across the traces. We are not aware of any prior approach that effectively and automatically aligns traces. Correlation sifts through hundreds of metrics to find ones that have a bearing on a performance anomaly of interest. In prior work we found that statistical correlation was only sometimes effective. We have identified highly-effective approaches for both activities. For aligning traces we explore dynamic time warping, and for correlation we explore eight correlators based on statistical correlation, distance measures, and piecewise linear segmentation. Although we explore these activities in the context of vertical profiling, both activities are widely applicable in the performance analysis area.
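The trace-alignment idea named in this abstract can be sketched with textbook dynamic time warping. The Python below is a generic DTW with backtracking, not the authors' tool: the function name `dtw_align`, the absolute-difference cost, and the sample traces are assumptions chosen for illustration. It returns the total warping cost and the list of index pairs that line up samples from two runs.

```python
from typing import List, Sequence, Tuple


def dtw_align(a: Sequence[float],
              b: Sequence[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Classic O(len(a) * len(b)) dynamic time warping.

    Returns (total cost, alignment path), where the path is a list of
    (index in a, index in b) pairs. Assumes both traces are non-empty.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = d + min(cost[i - 1][j - 1],  # advance both traces
                                 cost[i - 1][j],      # advance a only
                                 cost[i][j - 1])      # advance b only

    # Walk back from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


# Hypothetical metric traces from two runs; run_b repeats its first sample.
run_a = [1.0, 2.0, 3.0, 4.0]
run_b = [1.0, 1.0, 2.0, 3.0, 4.0]
distance, alignment = dtw_align(run_a, run_b)
```

In this example the extra sample in `run_b` is absorbed by mapping index 0 of `run_a` to both of the first two indices of `run_b`, so the remaining samples line up one to one.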
Performance and environment monitoring for whole-system characterization and optimization
In Proc. of the 2nd IBM Watson Conference on Interaction between Architecture, Circuits, and Compilers (PAC), Yorktown Heights, 2004
Active Monitoring In Grid Environments Using Mobile Agent Technology
In 2nd Workshop on Active Middleware Services (AMS'00) in HPDC-9, 2000
Cited by 13 (0 self)
Monitoring distributed computational resources effectively is a crucial factor for high-performance distributed computation. Performance analysis and tuning, scheduling strategies, and fault detection are only some of the activities that require monitoring facilities. In this paper we present a mobile agent-based monitoring architecture. After explaining the reasons why this technology is adequate to cope with Grid systems' heterogeneity, a description of the basic components of the system designed is provided. We also present some considerations on the high degree of flexibility that can be reached with the proposed approach.
Keywords: Grid, Mobile agents, Monitoring, Java, Distributed Management.
1. INTRODUCTION
Grid environments have recently emerged as integrating infrastructure for distributed high-performance scientific applications [2]. Several scientific applications of different domains such as high energy physics, earth sciences, biology, require a large amount of computing and...

