Results 1 - 10 of 15
Integrating Database Technology with Comparison-based Parallel Performance Diagnosis: The PerfTrack Performance Experiment Management Tool
- Proceedings of SC’05, Nov. 2005
, 2005
"... PerfTrack is a data store and interface for managing performance data from large-scale parallel applications. Data collected in different locations and formats can be compared and viewed in a single performance analysis session. The underlying data store used in PerfTrack is implemented with a dat ..."
Abstract
-
Cited by 12 (6 self)
- Add to MetaCart
PerfTrack is a data store and interface for managing performance data from large-scale parallel applications. Data collected in different locations and formats can be compared and viewed in a single performance analysis session. The underlying data store used in PerfTrack is implemented with a database management system (DBMS). PerfTrack includes interfaces to the data store and scripts for automatically collecting data describing each experiment, such as build and platform details. We have implemented a prototype of PerfTrack that can use Oracle or PostgreSQL for the data store. We demonstrate the prototype's functionality with three case studies: the first is a comparative study of an ASC Purple benchmark on high-end Linux and AIX platforms; the second is a parameter study conducted at Lawrence Livermore National Laboratory (LLNL) on two high-end platforms, a 128-node cluster of IBM Power 4 processors and BlueGene/L; the third demonstrates incorporating performance data from the Paradyn Parallel Performance Tool into an existing PerfTrack data store.
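
To make the DBMS-backed design concrete, the following is a minimal sketch in Python, with SQLite standing in for the Oracle/PostgreSQL back ends the prototype supports; the table and column names are illustrative assumptions, not PerfTrack's actual schema.

# Hypothetical sketch of a DBMS-backed performance-data store in the spirit of
# PerfTrack; table names, fields, and values are illustrative only.
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for Oracle/PostgreSQL
conn.executescript("""
    CREATE TABLE execution (
        id INTEGER PRIMARY KEY,
        application TEXT,          -- e.g. a benchmark code
        platform TEXT,             -- e.g. 'Linux cluster', 'AIX'
        build TEXT                 -- compiler/flags captured by collection scripts
    );
    CREATE TABLE measurement (
        execution_id INTEGER REFERENCES execution(id),
        metric TEXT,               -- e.g. 'wall_time_s'
        value REAL
    );
""")

conn.executemany(
    "INSERT INTO execution (id, application, platform, build) VALUES (?, ?, ?, ?)",
    [(1, "benchmark_x", "Linux", "gcc -O3"), (2, "benchmark_x", "AIX", "xlc -O3")],
)
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?)",
    [(1, "wall_time_s", 41.2), (2, "wall_time_s", 55.7)],
)

# Comparative query across platforms, as in a single analysis session.
for row in conn.execute("""
        SELECT e.platform, m.value
        FROM execution e JOIN measurement m ON m.execution_id = e.id
        WHERE e.application = 'benchmark_x' AND m.metric = 'wall_time_s'
        ORDER BY m.value"""):
    print(row)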
ZEN: A Directive-based Language for Automatic Experiment Management of Distributed and Parallel Programs
, 2002
"... Performance-oriented code development, software testing, performance analysis and parameter studies for distributed and parallel systems commonly require to conduct a large number of executions. Every execution of an application can be viewed as a scientific experiment. So far there exists very litt ..."
Abstract
-
Cited by 8 (5 self)
- Add to MetaCart
Performance-oriented code development, software testing, performance analysis, and parameter studies for distributed and parallel systems commonly require conducting a large number of executions. Every execution of an application can be viewed as a scientific experiment. So far, very little support exists for specifying and controlling the execution of a large number of experiments. Various problems must be addressed, such as which input files to read, where to store the program's output, which performance metrics to measure, and what range of problem parameters to observe. This paper describes ZEN, a directive-based language to support automatic experiment management for a wide variety of parallel and distributed architectures. It is used to specify arbitrarily complex program executions in the context of performance analysis and tuning, parameter studies, and software testing. ZEN introduces directives to substitute strings and insert assignment statements inside arbitrary files, such as program, input, script, or makefiles. This enables the programmer to invoke experiments for arbitrary value ranges of any problem parameter, including program variables, file names, compiler options, target machines, machine sizes, scheduling strategies, data distributions, etc. The number of experiments can be controlled through ZEN constraint directives. Finally, the programmer may request a large set of performance metrics to be computed for any code region of interest. The scope of ZEN directives can be restricted to arbitrary file or code regions.
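
The following Python sketch illustrates the general idea of generating experiments by substituting parameter values into arbitrary files and pruning combinations with a constraint; the placeholder syntax, file names, and constraint are hypothetical and are not actual ZEN directives.

# Generic illustration of ZEN-style experiment generation: substitute parameter
# values into text files (programs, inputs, scripts, makefiles) to enumerate
# experiments.  All names and values here are made up for illustration.
import itertools
from pathlib import Path

template = "np = {np}\nblock_size = {block}\ncompiler_flags = {flags}\n"

parameters = {
    "np": [4, 8, 16],              # machine sizes
    "block": [64, 128],            # a program variable / data distribution knob
    "flags": ["-O2", "-O3"],       # compiler options
}

# In the spirit of ZEN constraint directives: prune invalid combinations.
def satisfies_constraints(cfg):
    return not (cfg["np"] == 4 and cfg["block"] == 128)

outdir = Path("experiments")
outdir.mkdir(exist_ok=True)

keys = list(parameters)
count = 0
for values in itertools.product(*(parameters[k] for k in keys)):
    cfg = dict(zip(keys, values))
    if not satisfies_constraints(cfg):
        continue
    count += 1
    (outdir / f"input_{count}.txt").write_text(template.format(**cfg))

print(f"generated {count} experiment input files")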
On Using ZENTURIO for Performance and Parameter Studies on Cluster and Grid Architectures
- Proc. 11th Euromicro Conference on Parallel, Distributed and Network-based Processing (PDP2003), Genoa
, 2003
"... Over the last decade, a dramatic increase has been observed in the need for generating and organising data in the course of large parameter studies, performance analysis, and software testings. We have developed the ZENTURIO experiment management tool for performance and parameter studies on cluster ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
(Show Context)
Over the last decade, a dramatic increase has been observed in the need for generating and organising data in the course of large parameter studies, performance analysis, and software testing. We have developed the ZENTURIO experiment management tool for performance and parameter studies on cluster and Grid architectures. In this paper we describe our experience with using ZENTURIO for performance and parameter studies of a material science kernel, a three-dimensional particle-in-cell simulation, a fast Fourier transform, and a financial modeling application. Experiments have been conducted on an SMP cluster with Fast Ethernet and Myrinet communication networks, using PBS (Portable Batch System) and GRAM (Globus Resource Allocation Manager) as job managers.
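
As a rough illustration of driving such a parameter study through a job manager, the sketch below enumerates network/machine-size combinations and submits each via PBS's qsub; the job script name and parameter names are assumptions, not ZENTURIO's interface.

# Hypothetical sketch of submitting a parameter sweep through PBS; the job
# script 'run_kernel.pbs' and the study parameters are invented for illustration.
import itertools
import shutil
import subprocess

networks = ["fast_ethernet", "myrinet"]
node_counts = [1, 2, 4, 8]

for network, nodes in itertools.product(networks, node_counts):
    cmd = [
        "qsub",
        "-l", f"nodes={nodes}",
        "-v", f"NETWORK={network}",   # pass the study parameter to the job script
        "run_kernel.pbs",             # hypothetical job script
    ]
    if shutil.which("qsub"):          # only submit where PBS is installed
        subprocess.run(cmd, check=True)
    else:
        print("would submit:", " ".join(cmd))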
Toward an Experiment Engine for Lightweight Grids
"... This paper presents a case study conducted on the Grid’5000 platform, a lightweight grid. The goal was to make a rather simple experiment, and study how difficult it was to carry out correctly. This means it had to be correct, reproducible and efficient. The paper shows that despite the precautions ..."
Abstract
-
Cited by 4 (2 self)
- Add to MetaCart
(Show Context)
This paper presents a case study conducted on the Grid’5000 platform, a lightweight grid. The goal was to run a rather simple experiment and study how difficult it was to carry out properly, meaning it had to be correct, reproducible, and efficient. The paper shows that despite the precautions taken, many parameters that could have an effect on the result were at first overlooked. It also shows that benchmarking plays a key role in making an experiment correct and reproducible. The process is in the end extremely tedious, which stresses the need for new tools to help users. The contribution of this work is to present a methodology for obtaining correct results on grid architectures, to identify relevant problems, and to propose an infrastructure that addresses part of the problems encountered during experiments. Additionally, pieces of this infrastructure have been built and are also presented.
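
One practical step suggested by this experience is recording the experimental environment alongside each result so that overlooked parameters can at least be inspected afterwards; the sketch below shows a minimal, assumed form of such a record, not the paper's infrastructure.

# Minimal sketch of saving an environment snapshot next to a measurement; the
# fields captured and the file name are illustrative assumptions.
import json
import platform
import time

def environment_snapshot():
    return {
        "hostname": platform.node(),
        "os": platform.platform(),
        "python": platform.python_version(),
        "timestamp": time.time(),
    }

def record_run(result_value, path="run_record.json"):
    record = {"environment": environment_snapshot(), "result": result_value}
    with open(path, "w") as f:
        json.dump(record, f, indent=2)

record_run(result_value=12.34)  # placeholder measurement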
A Web Service-based Experiment Management System for the Grids
- In 17th International Parallel and Distributed Processing Symposium (IPDPS 2003)
, 2002
"... We have developed ZENTURIO, which is an experiment management system for performance and parameter studies as well as software testing for cluster and Grid architectures. In this paper we describe our experience with developing ZENTURIO as a collection of Web services. A directivebased language call ..."
Abstract
-
Cited by 4 (1 self)
- Add to MetaCart
We have developed ZENTURIO, an experiment management system for performance and parameter studies as well as software testing for cluster and Grid architectures. In this paper we describe our experience with developing ZENTURIO as a collection of Web services. A directive-based language called ZEN is used to annotate arbitrary files and specify arbitrary application parameters. An Experiment Generator Web service parses annotated application files and generates appropriate codes for experiments. An Experiment Executor Web service compiles, executes, and monitors experiments on a single machine or a set of local machines on the Grid. Factory and Registry services are employed to create and register Web services, respectively. An event infrastructure has been customised to support high-level events under ZENTURIO in order to avoid expensive polling and to detect important system and application status information. A graphical user portal allows the user to generate, control, and monitor experiments. We compare our design with the Open Grid Services Architecture (OGSA) and highlight similarities and differences. We report results of using ZENTURIO to conduct performance analysis of a material science code that executes on the Grid under the Globus Grid infrastructure.
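
The sketch below is a plain-Python mock of the Experiment Generator / Experiment Executor split described above, intended only to show the division of responsibilities; the real ZENTURIO components are Web services, and the class and method names here are invented for illustration.

# Hypothetical mock of the generator/executor split; not ZENTURIO's actual
# Web service interfaces.
class ExperimentGenerator:
    def generate(self, annotated_files):
        # parse annotated application files and emit one concrete experiment
        # per parameter combination (details elided)
        return [{"id": i, "files": annotated_files} for i in range(3)]

class ExperimentExecutor:
    def run(self, experiment):
        # compile, execute, and monitor one experiment on a target machine
        print(f"running experiment {experiment['id']}")
        return {"id": experiment["id"], "status": "finished"}

generator = ExperimentGenerator()
executor = ExperimentExecutor()
results = [executor.run(e) for e in generator.generate(["app.f90", "input.dat"])]
print(results)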
PerfTrack: Scalable Application Performance Diagnosis for Linux Clusters
"... Abstract. PerfTrack is a data store and interface for managing performance data from large-scale parallel applications. Data collected in different locations and formats can be compared and viewed in a single performance analysis session. The underlying data store used in PerfTrack is implemented wi ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
(Show Context)
PerfTrack is a data store and interface for managing performance data from large-scale parallel applications. Data collected in different locations and formats can be compared and viewed in a single performance analysis session. The underlying data store used in PerfTrack is implemented with a database management system (DBMS). PerfTrack includes interfaces to the data store and scripts for automatically collecting data describing each experiment, such as build and platform details. In this paper, we describe recent work to extend PerfTrack to automatically collect machine description data, to perform aggregation of measured values and executions, and to conduct richer forms of data exploration.
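
The extensions described here (automatic machine description, aggregation of measured values) might look roughly like the following sketch; the attributes collected and the aggregation operators are assumptions for illustration, not PerfTrack's actual implementation.

# Sketch of machine-description collection and value aggregation; attribute
# names and the sample timings are illustrative only.
import os
import platform
import statistics

def machine_description():
    return {
        "hostname": platform.node(),
        "processor": platform.processor(),
        "cpu_count": os.cpu_count(),
        "system": platform.system(),
    }

def aggregate(values):
    # aggregate measured values across executions (e.g. per-process timings)
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
    }

print(machine_description())
print(aggregate([41.2, 43.8, 40.9, 42.5]))  # hypothetical per-run wall times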
Using business workflows to improve quality of experiments in distributed systems research
- in "SC12 - SuperComputing 2012 (poster session)", Salt Lake City, United States
, 2012
"... Abstract—Distributed systems pose many difficult problems to researchers. Due to their large-scale complexity, their numerous constituents (e.g., computing nodes, network links) tend to fail in unpredictable ways. This particular fragility of experiment execution threatens reproducibility, often con ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
Distributed systems pose many difficult problems to researchers. Due to their large-scale complexity, their numerous constituents (e.g., computing nodes, network links) tend to fail in unpredictable ways. This particular fragility of experiment execution threatens reproducibility, often considered to be a foundation of experimental science. Our poster presents a new approach to the description and execution of experiments involving large-scale computer installations. The main idea consists in describing the experiment as a workflow and using the achievements of Business Workflow Management to execute it reliably and efficiently. Moreover, to facilitate the design process, the framework provides abstractions that hide unnecessary complexity from the user.
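
A minimal sketch of the underlying idea, assuming a linear workflow of steps with per-step retries to absorb the unpredictable failures mentioned above; the step names and retry policy are hypothetical, not the poster's framework.

# Illustrative workflow of experiment steps with retries; failure behaviour is
# simulated with a random error to stand in for flaky infrastructure.
import random

def reserve_nodes():
    if random.random() < 0.3:            # nodes occasionally fail to come up
        raise RuntimeError("reservation failed")
    return "nodes reserved"

def deploy_software():
    return "software deployed"

def run_benchmark():
    return "benchmark finished"

WORKFLOW = [reserve_nodes, deploy_software, run_benchmark]

def execute(workflow, max_retries=3):
    for step in workflow:
        for attempt in range(1, max_retries + 1):
            try:
                print(f"{step.__name__}: {step()}")
                break
            except RuntimeError as exc:
                print(f"{step.__name__} attempt {attempt} failed: {exc}")
        else:
            raise RuntimeError(f"step {step.__name__} exhausted retries")

execute(WORKFLOW)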
Initial Design of a Test Suite for Automatic Performance Analysis Tools
, 2002
"... Automatic performance tools must of course be tested as to whether they perform their task correctly. Because performance tools are meta-programs, tool testing is more complex than ordinary program testing, and comprises at least three aspects. First, it must be ensured that the tools do neither alt ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
Automatic performance tools must of course be tested as to whether they perform their task correctly. Because performance tools are meta-programs, tool testing is more complex than ordinary program testing and comprises at least three aspects. First, it must be ensured that the tools neither alter the semantics nor distort the runtime behavior of the application under investigation. Next, it must be verified that the tools collect the correct performance data as required by their specification. Finally, it must be checked that the tools indeed perform their intended tasks: for badly performing applications, relevant performance problems must be detected and reported, and, on the other hand, tools should not diagnose performance problems for well-tuned programs without such problems. In short, performance tools should be semantics-preserving, complete, and correct. Focusing on the correctness aspect, testing can be done by using synthetic test functions with controllable performance properties and/or real-world applications with known performance behavior. A systematic test suite can be built from such functions and other components, possibly with the help of tools to assist the user in putting the pieces together into executable test programs.
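
A synthetic test function with a controllable performance property might look like the sketch below: the duration of the work grows with a simulated rank, giving a known load imbalance that a correct tool should detect and report; the imbalance knob and timings are illustrative assumptions.

# Synthetic function with a controllable load imbalance; ranks are simulated
# sequentially just to show the controllable property.
import time

def synthetic_imbalanced_work(rank, n_ranks, base_seconds=0.01, imbalance=2.0):
    """Busy-wait whose duration grows with rank, yielding a known imbalance."""
    duration = base_seconds * (1.0 + imbalance * rank / max(n_ranks - 1, 1))
    end = time.perf_counter() + duration
    while time.perf_counter() < end:
        pass
    return duration

expected = [synthetic_imbalanced_work(r, 4) for r in range(4)]
print("expected per-rank durations:", [round(d, 4) for d in expected])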
An Automated Approach to Create, Store, and Analyze Large-scale Experimental Data in Clouds
"... Abstract—The flexibility and scalability of computing clouds make them an attractive application migration target; yet, the cloud remains a black-box for the most part. In particular, their opacity impedes the efficient but necessary testing and tuning prior to moving new applications into the cloud ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
The flexibility and scalability of computing clouds make them an attractive application migration target; yet, the cloud remains a black box for the most part. In particular, its opacity impedes the efficient but necessary testing and tuning prior to moving new applications into the cloud. A natural and presumably unbiased approach to reveal the cloud's complexity is to collect significant performance data by conducting more experimental studies. However, conducting large-scale system experiments is particularly challenging because of the practical difficulties that arise during experimental deployment, configuration, execution, and data processing. In this paper we address some of these challenges through Expertus, a flexible automation framework we have developed to create, store, and analyze large-scale experimental measurement data. We create performance data by automating the measurement processes for large-scale experimentation, including the application deployment, configuration, workload execution, and data collection processes. We have automated the processing of heterogeneous data as well as its storage in a data warehouse, which we have specifically designed for housing measurement data. Finally, we have developed a rich web portal to navigate, statistically analyze, and visualize the collected data. Expertus combines template-driven code generation techniques with aspect-oriented programming concepts to generate the necessary resources to fully automate the experiment measurement process. In Expertus, a researcher provides only a high-level description of the experiment, and the framework does everything else. At the end, the researcher can graphically navigate and process the data in the web portal.
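
The template-driven generation at the core of Expertus can be illustrated with the following sketch, which expands a small experiment specification into per-experiment deployment files; the template text, placeholders, and file names are hypothetical, not the framework's actual artifacts.

# Illustrative template-driven generation of experiment artifacts; all names
# and values are invented for this example.
from string import Template
from pathlib import Path

deploy_template = Template(
    "app: $app\n"
    "instances: $instances\n"
    "workload: $workload_clients clients\n"
)

experiments = [
    {"app": "webshop", "instances": 2, "workload_clients": 100},
    {"app": "webshop", "instances": 4, "workload_clients": 400},
]

outdir = Path("generated")
outdir.mkdir(exist_ok=True)
for i, spec in enumerate(experiments, start=1):
    (outdir / f"deployment_{i}.yaml").write_text(deploy_template.substitute(spec))
    print(f"wrote deployment_{i}.yaml for {spec}")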