Scalable parallel trace-based performance analysis
In Proc. 13th European PVM/MPI Conference, 2006
Cited by 38 (22 self)
Automatic trace analysis is an effective method for identifying complex performance phenomena in parallel applications. However, as the size of parallel systems and the number of processors used by individual applications continue to grow, the traditional approach of analyzing a single global trace file, as done by KOJAK’s EXPERT trace analyzer, becomes increasingly constrained by the large number of events. In this article, we present a scalable version of the EXPERT analysis based on analyzing separate local trace files with a parallel tool which ‘replays’ the target application’s communication behavior. We describe the new parallel analyzer architecture and discuss first empirical results.
An Algebra for Cross-Experiment Performance Analysis
In Proc. of the International Conference on Parallel Processing (ICPP), 2004
Cited by 35 (19 self)
Performance tuning of parallel applications usually involves multiple experiments to compare the effects of different optimization strategies. This article describes an algebra that can be used to compare, integrate, and summarize performance data from multiple sources. The algebra consists of a data model to represent the data in a platform-independent fashion plus arithmetic operations to merge, subtract, and average the data from different experiments. A distinctive feature of this approach is its closure property, which allows processing and viewing all instances of the data model in the same way, regardless of whether they represent original or derived data, in addition to an arbitrary and easy composition of operations.
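The closure property described in this abstract can be illustrated with a minimal sketch. It assumes an experiment is simply a mapping from (metric, call path, process) to a value; the `Experiment` class, its method names, and the rule that missing entries count as zero are hypothetical choices made for illustration and are not taken from the paper or its tools.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Hypothetical data model: one value per (metric, call path, process).
Key = Tuple[str, str, int]


@dataclass
class Experiment:
    values: Dict[Key, float] = field(default_factory=dict)

    def difference(self, other: "Experiment") -> "Experiment":
        # Assumption: an entry missing from one operand counts as 0.0.
        keys = self.values.keys() | other.values.keys()
        return Experiment(
            {k: self.values.get(k, 0.0) - other.values.get(k, 0.0) for k in keys}
        )

    def merge(self, other: "Experiment") -> "Experiment":
        # Assumption: when both operands define a key, the left one wins.
        return Experiment({**other.values, **self.values})


def mean(experiments) -> Experiment:
    experiments = list(experiments)
    keys = set().union(*(e.values.keys() for e in experiments))
    n = len(experiments)
    return Experiment(
        {k: sum(e.values.get(k, 0.0) for e in experiments) / n for k in keys}
    )


before = Experiment({("time", "main/solve", 0): 4.0, ("time", "main/io", 0): 1.0})
after = Experiment({("time", "main/solve", 0): 3.0, ("time", "main/io", 0): 1.5})

saved = before.difference(after)     # derived data, still an Experiment
averaged = mean([before, after])     # so is the average
combined = mean([saved, averaged])   # operators compose on derived data
```

Because every operator returns an `Experiment`, derived results such as `saved` can be passed to any other operator or to the same viewer as the original measurements.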
Efficient Pattern Search in Large Traces through Successive Refinement
In Proc. of the European Conference on Parallel Computing (EuroPar), 2004
Cited by 17 (11 self)
Event tracing is a well-accepted technique for post-mortem performance analysis of parallel applications. The EXPERT tool supports the analysis of large traces by automatically searching them for execution patterns that indicate inefficient behavior. However, the current search algorithm works with independent pattern specifications and ignores the specialization hierarchy existing between them, resulting in a long analysis time caused by repeated matching attempts as well as in replicated code. This article describes an optimized design that takes advantage of specialization relationships and leads to a significant runtime improvement as well as to more compact pattern specifications.
Modeling and Detecting Performance Problems for Distributed and Parallel Programs with JavaPSL
In Proc. of the Conference on Supercomputers (SC2001), 2002
Cited by 11 (2 self)
In this paper we present JavaPSL, a Performance Specification Language that can be used for a systematic and portable specification of large classes of experiment-related data and performance properties for distributed and parallel programs. Performance properties are described in a generic and normalized way, which considerably eases their interpretation and comparison. Moreover, JavaPSL provides meta-properties in order to describe new properties based on existing ones and to relate properties to each other.
Formalizing OpenMP Performance Properties with ASL
1999
Cited by 9 (5 self)
Performance analysis is an important step in tuning performance-critical applications. It is a cyclic process of measuring and analyzing performance data which is driven by the programmer's hypotheses on potential performance problems. Currently this process is controlled manually by the programmer. We believe that the implicit knowledge applied in this cyclic process should be formalized in order to provide automatic performance analysis for a wider class of programming paradigms and target architectures. This article describes the performance property specification language (ASL) developed in the APART Esprit IV working group, which allows specifying performance-related data by an object-oriented model and performance properties by functions and constraints defined over performance-related data. Performance problems and bottlenecks can then be identified based on user- or tool-defined thresholds. In order to demonstrate the usefulness of ASL we apply it to OpenMP by successfully formal...
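The idea of a property as a function over performance-related data, checked against a threshold, can be sketched as follows. This is written in Python and not in ASL syntax; the `RegionSummary` record, the `sync_cost_severity` property, and the threshold value are all hypothetical and serve only to illustrate the general scheme the abstract describes.

```python
from dataclasses import dataclass


@dataclass
class RegionSummary:
    # Hypothetical performance-related data for one code region.
    name: str
    exec_time: float     # seconds spent in the region
    barrier_time: float  # seconds of that time spent waiting at barriers


def sync_cost_severity(region: RegionSummary, total_time: float) -> float:
    """Hypothetical property: fraction of total run time lost at barriers."""
    return region.barrier_time / total_time if total_time > 0 else 0.0


def problems(regions, total_time: float, threshold: float):
    """Return (region name, severity) pairs above the threshold, worst first."""
    rated = [(r.name, sync_cost_severity(r, total_time)) for r in regions]
    flagged = [p for p in rated if p[1] > threshold]
    return sorted(flagged, key=lambda p: p[1], reverse=True)


regions = [
    RegionSummary("loop_a", exec_time=6.0, barrier_time=3.0),
    RegionSummary("loop_b", exec_time=4.0, barrier_time=0.2),
]
flagged = problems(regions, total_time=10.0, threshold=0.05)
```

Here only `loop_a` exceeds the threshold, so it is the only region reported as a performance problem.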
Distributed application monitoring for clustered SMP architectures
Proceedings of the 9th International Euro-Par Conference on Parallel Processing, 2003
Cited by 9 (2 self)
Performance analysis for terascale computing requires a combination of new concepts including distribution, on-line processing and automation. As a foundation for tools realizing these concepts, we present a distributed monitoring approach for clustered SMP architectures that tries to minimize the perturbation of the target application while retaining flexibility with respect to filtering and processing of performance data. We achieve this goal by dividing the monitor into a passive monitoring library linked to the application and an active component called runtime information producer (RIP) that provides performance data (metric- and event-based) for individual nodes. Instead of adding an additional layer in the monitoring system that integrates performance data from the individual RIPs, we include a directory service as a third component in our approach. By querying this directory service, tools discover which RIPs provide the data they need.
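The lookup role of the directory service can be sketched in a few lines. This is an in-process toy under stated assumptions: the `Directory` class, the `register` and `lookup` methods, and the RIP and node identifiers are hypothetical, and the real system's interfaces and transport are not described in the abstract.

```python
from collections import defaultdict


class Directory:
    """Hypothetical registry mapping (node, metric) to the RIPs that serve it."""

    def __init__(self):
        self._providers = defaultdict(set)

    def register(self, rip_id, node, metrics):
        # Each RIP announces which data it can provide for its node.
        for metric in metrics:
            self._providers[(node, metric)].add(rip_id)

    def lookup(self, node, metric):
        # Tools ask the directory first, then contact the matching RIPs directly.
        return sorted(self._providers.get((node, metric), ()))


directory = Directory()
directory.register("rip-0", node="node0", metrics=["cache_misses", "mpi_events"])
directory.register("rip-1", node="node1", metrics=["cache_misses"])

providers = directory.lookup("node1", "cache_misses")
missing = directory.lookup("node1", "mpi_events")
```

The directory only answers "who has this data"; the performance data itself stays with the RIPs, which is what lets the design avoid a central integration layer.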
Gerndt: The EP-Cache Automatic Monitoring System
International Conference on Parallel and Distributed Systems (PDCS), 2005
Cited by 5 (4 self)
In this paper we present an automatic monitoring system consisting of a monitoring infrastructure and an automatic performance analyzer. The monitoring infrastructure supports different monitoring resources (CPU counters, simulation) and monitors the utilization of cache hierarchies in serial and OpenMP programs. A special feature of our system is the restriction of monitoring to single data structures. Our ASL[2]-based automatic analyzer called AMEBA is able to search for predefined performance bottlenecks in code regions using a provided set of search and refinement strategies.
Experiment Management Support for Parallel Performance Tuning
University of Wisconsin-Madison, 1999
Performance Analysis, Data Sharing and Tools Integration in Grids: New Approach based on Ontology
In Proceedings of International Conference on Computational Science (ICCS 2004), LNCS 3038, 2004
Cited by 2 (2 self)
In this paper, we propose a new approach to performance analysis, data sharing and tools integration in Grids that is based on ontology. We devise a novel ontology for describing the semantics of monitoring and performance data that can be used by performance monitoring and measurement tools. We introduce an architecture for an ontology-based model for performance analysis, data sharing and tools integration. At the core of this architecture is a Grid service which offers facilities for other services to archive and access ontology models along with collected performance data, and to conduct searches and perform reasoning on that data. Using an approach based on ontology, performance data will be easily shared and processed by automated tools, services and human users, thus helping to leverage the data sharing and tools integration, and increasing the degree of automation of performance analysis.
Larsson-Traeff: Evaluating OpenMP Performance Analysis Tools with the APART Test Suite
Fifth European Workshop on OpenMP (EWOMP ’03), RWTH Aachen, 2003
Cited by 2 (1 self)
This paper outlines the design of ATS (the APART Test Suite) for evaluating (automatic) performance analysis tools with respect to their effectiveness in detecting actual performance problems, with a focus on the ATS test programs related to OpenMP. It reports on results from applying two OpenMP performance analysis tools to the test cases generated from ATS.

