Design and Implementation of a Parallel Performance Data Management Framework
- IN: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON PARALLEL COMPUTING
, 2005
Cited by 22 (13 self)
Empirical performance evaluation of parallel systems and applications can generate significant amounts of performance data and analysis results from multiple experiments as performance is investigated and problems diagnosed. Hence, the management of performance information is a core component of performance analysis tools. To better support tool integration, portability, and reuse, there is a strong motivation to develop performance data management technology that can provide a common foundation for performance data storage, access, merging, and analysis. This paper presents the design and implementation of the Performance Data Management Framework (PerfDMF). PerfDMF addresses objectives of performance tool integration, interoperation, and reuse by providing common data storage, access, and analysis infrastructure for parallel performance profiles. PerfDMF includes an extensible parallel profile data schema and relational database schema, a profile query and analysis programming interface, and an extensible toolkit for profile import/export and standard analysis. We describe the PerfDMF objectives and architecture, give a detailed explanation of the major components, and show examples of PerfDMF application.
PerfExplorer: A Performance Data Mining Framework for Large-Scale Parallel Computing
- In Proceedings of SC 2005 conference, ACM
, 2005
Cited by 17 (8 self)
Parallel applications running on high-end computer systems manifest a complexity of performance phenomena. Tools to observe parallel performance attempt to capture these phenomena in measurement datasets rich with information relating multiple performance metrics to execution dynamics and parameters specific to the application-system experiment. However, the potential size of datasets and the need to assimilate results from multiple experiments make it a daunting challenge to not only process the information, but discover and understand performance insights. In this paper, we present PerfExplorer, a framework for parallel performance data mining and knowledge discovery. The framework architecture enables the development and integration of data mining operations that will be applied to large-scale parallel performance profiles. PerfExplorer operates as a client-server system and is built on a robust parallel performance database (PerfDMF) to access the parallel profiles and save its analysis results. Examples are given demonstrating these techniques for performance analysis of ASCI applications.
Characterization of Computational Grid Resources Using Low-level Benchmarks
- In: Second IEEE International Conference on e-Science and Grid Computing (e-Science’06)
, 2004
Cited by 7 (3 self)
An important factor that needs to be taken into account by end-users and systems (schedulers, resource brokers, policy brokers) when mapping applications to the Grid is the performance capacity of hardware resources attached to the Grid and made available through its Virtual Organizations (VOs). In this paper, we examine the problem of characterizing the performance capacity of Grid resources using benchmarking. We examine the conditions under which such characterization experiments can be implemented in a Grid setting and present the challenges that arise in this context. We specify a small number of performance metrics and propose a suite of micro-benchmarks to estimate these metrics for clusters that belong to large Virtual Organizations. We describe GridBench, a tool developed to administer benchmarking experiments, publish their results, and produce graphical representations of their metrics. We describe benchmarking experiments conducted with, and published through, GridBench, and show how they can help end-users assess the performance capacity of resources that belong to a target Virtual Organization. Finally, we examine the advantages of this approach over solutions currently implemented in existing Grid infrastructures. We conclude that it is essential to provide benchmarking services in the Grid infrastructure, in order to enable the attachment of performance-related metadata to resources belonging to Virtual Organizations and the retrieval of such metadata by end-users and other Grid systems.
Performance Tool Support for MPI-2 on Linux
, 2004
Cited by 6 (0 self)
Programmers of message-passing codes for clusters of workstations face a daunting challenge in understanding the performance bottlenecks of their applications. This is largely due to the vast amount of performance data that is collected, and the time and expertise necessary to use traditional parallel performance tools to analyze that data. This paper reports on our recent efforts developing a performance tool for MPI applications on Linux clusters. Our target MPI implementations were LAM/MPI and MPICH2, both of which support portions of the MPI-2 Standard. We started with an existing performance tool and added support for non-shared file systems, MPI-2 one-sided communications, dynamic process creation, and MPI Object naming. We present results using the enhanced version of the tool to examine the performance of several applications. We describe a new performance tool benchmark suite we have developed, PPerfMark, and present results for the benchmark using the enhanced tool.
Integrating Database Technology with Comparison-based Parallel Performance Diagnosis: The PerfTrack Performance Experiment Management Tool
- PROCEEDINGS OF SC’05, NOV. 2005
, 2005
Cited by 5 (3 self)
PerfTrack is a data store and interface for managing performance data from large-scale parallel applications. Data collected in different locations and formats can be compared and viewed in a single performance analysis session. The underlying data store used in PerfTrack is implemented with a database management system (DBMS). PerfTrack includes interfaces to the data store and scripts for automatically collecting data describing each experiment, such as build and platform details. We have implemented a prototype of PerfTrack that can use Oracle or PostgreSQL for the data store. We demonstrate the prototype's functionality with three case studies: the first is a comparative study of an ASC Purple benchmark on high-end Linux and AIX platforms; the second is a parameter study conducted at Lawrence Livermore National Laboratory (LLNL) on two high-end platforms, a 128-node cluster of IBM Power 4 processors and BlueGene/L; the third demonstrates incorporating performance data from the Paradyn Parallel Performance Tool into an existing PerfTrack data store.
A Case Study Using Automatic Performance Tuning for Large-Scale Scientific Programs
- Proceedings of the International Symposium on High Performance Distributed Computing (HPDC)
, 2006
Cited by 5 (0 self)
Active Harmony is an automated runtime performance tuning system. In this paper we describe several case studies of using Active Harmony to improve the performance of scientific libraries and applications. We improved the tuning mechanism so it can work iteratively with benchmarking runs. By tuning the computation and data distribution, Active Harmony helps applications that use the PETSc library achieve better load balance and reduce execution time by up to 18%. For the climate simulation application POP using 480 processors, the tuning results show that changing the block size and parameter values reduces the execution time by up to 16.7%. Active Harmony is able to make GS2, a plasma physics code, up to 5.1 times faster. The experimental results show that the Active Harmony system is a feasible and useful tool for automated performance tuning of scientific libraries and applications.
Decreasing end-to-end job execution times by increasing resource utilization using predictive scheduling in the Grid
, 2005
Cited by 4 (3 self)
The Grid has the potential to grow significantly over the course of the next decade and therefore the mechanisms that make the Grid possible need to become more efficient in order for the Grid to scale. One of these mechanisms revolves around resource management; ultimately, there will be so many resources in the Grid, that if they are not managed properly, only a very small fraction of those resources will be utilized. While good resource utilization is very important, it is also a hard problem due to widely distributed dynamic environments normally found in the Grid. It is important to develop an experimental methodology for automatically characterizing grid software in a manner that allows accurate evaluation of the software’s behavior and performance before deployment in order to make better informed resource management decisions. Many Grid services and software are designed and characterized today largely based on the designer’s intuition and on ad hoc experimentation; having the capability to automatically map complex, multi-dimensional requirements and performance data among resource providers and consumers is a necessary step to ensure consistent good resource utilization in the Grid. This automatic matching between the software characterization and a set of raw or logical resources is a much needed functionality that is currently lacking in today’s Grid resource management infrastructure. Ultimately, my proposed work, which addresses performance modeling with the goal to improve resource management, could ensure that the efficiency of
The web-based Prophesy automated performance modeling system
- IASTED Int’l Conf. on Web Technologies, Applications and Services (WTAS2006)
, 2006
Cited by 4 (3 self)
The Prophesy system is a performance analysis and modeling infrastructure that allows users to record many different parameters relevant to an application’s performance. A key component of the Prophesy system is the web-based automated performance modeling system, which allows a developer to quickly gain insight into the performance of an application code or functions within a code such that the developer can cut down on the time required for developing efficient applications, locate potential bottlenecks, and approximate the runtime on different systems. In this paper, we present the design and implementation of the web-based automated performance modeling system, and use parallel matrix-matrix multiplication and NAS parallel benchmark BT as examples to illustrate how to automatically model these parallel applications using online web-based interfaces.
Using performance prediction to allocate grid resources
, 2004
Cited by 4 (0 self)
Large-scale applications often require computational grids to obtain the needed compute power for execution. Generally, users are given access to a collection of resources that can be used for execution. The collection of resources can be dynamic, and the users must decide which collection of heterogeneous and distributed resources to use. However, many users often do not have knowledge of all of the resource performance characteristics on which to make informed decisions and therefore need automated tools to perform the mapping of jobs to the available resources. In this paper, we present a resource planner system that uses performance prediction, based upon historical data, to identify the appropriate resources. This system is used with a gravitational-wave physics experiment, LIGO, for which the initial results indicate an average of 24% reduction in execution time using the performance prediction versus a random selection of resources.
Performance Prediction-based versus Load-based Site Selection: Quantifying the Difference
- in Proc. of the 18th International Conference on Parallel and Distributed Computing Systems, 2005
Cited by 3 (0 self)
Distributed systems are available and provide vast compute and data resources to users. With the availability of multiple resources, one of the major issues to be addressed is site selection. Users have access to many resource sites from which to select for execution of applications. In this paper, we quantify the advantages of using performance prediction to select sites as compared to using load information, which is the widely used method. The quantification is based upon two case studies. The first case study involves a large-scale scientific application, called GEO LIGO, for which the experimental results indicate an average of 33% performance improvement as compared to a load-based method. The second case study involves a web-based, educational application, called AADMLSS, for which the results indicate an average of 10% performance improvement as compared to a load-based method.

