On Utilizing Experiment Data Repository for Performance Analysis of Parallel Applications
In: 9th International Euro-Par Conference (Euro-Par 2003), Lecture Notes in Computer Science, 2003
"... Performance data usually must be archived for various performance analysis and optimization tasks such as multi-experiment analysis, performance comparison, automated performance diagnosis. However, little eort has been done to employ data repositories to organize and store performance data. Thi ..."
Cited by 6 (3 self)
Performance data usually must be archived for various performance analysis and optimization tasks such as multi-experiment analysis, performance comparison, and automated performance diagnosis. However, little effort has been made to employ data repositories to organize and store performance data. This lack of systematic organization of data has hindered several aspects of performance analysis tools, such as performance comparison, performance data sharing, and tool integration. In this paper we describe our approach to exploiting a relational experiment data repository in SCALEA, which is a performance instrumentation, measurement, analysis, and visualization tool for parallel programs.
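As a rough illustration of what such a repository enables, the sketch below stores per-region measurements from multiple runs in a relational database and queries them for multi-experiment comparison. The schema, table names, and metrics are illustrative assumptions, not SCALEA's actual design.

```python
# Minimal sketch of a relational experiment data repository in the spirit
# of the approach above; schema and column names are assumptions.
import sqlite3

conn = sqlite3.connect("experiments.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS experiment (
    id INTEGER PRIMARY KEY,
    application TEXT,
    num_procs INTEGER,
    started_at TEXT
);
CREATE TABLE IF NOT EXISTS measurement (
    experiment_id INTEGER REFERENCES experiment(id),
    code_region TEXT,          -- e.g. a loop or subroutine name
    metric TEXT,               -- e.g. 'wall_time', 'L2_misses'
    value REAL
);
""")

# Store one experiment and a few of its measurements.
cur = conn.execute(
    "INSERT INTO experiment (application, num_procs, started_at) "
    "VALUES (?, ?, ?)",
    ("lu_decomp", 16, "2003-05-01T10:00:00"))
exp_id = cur.lastrowid
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?, ?)",
    [(exp_id, "main_loop", "wall_time", 12.4),
     (exp_id, "main_loop", "L2_misses", 1.8e7)])
conn.commit()

# Multi-experiment analysis: compare wall time of a region across runs.
for row in conn.execute("""
    SELECT e.num_procs, m.value
    FROM measurement m JOIN experiment e ON e.id = m.experiment_id
    WHERE m.code_region = 'main_loop' AND m.metric = 'wall_time'
    ORDER BY e.num_procs"""):
    print(row)
```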
Model-based performance diagnosis of master-worker parallel computations
In: Euro-Par, 2006
"... Parallel performance tuning naturally involves a diagnosis process to locate and explain sources of program inefficiency. Proposed is an approach that exploits parallel computation patterns (models) for diagnosis discovery. Knowledge of performance problems and inference rules for hypothesis search ..."
Cited by 3 (0 self)
Parallel performance tuning naturally involves a diagnosis process to locate and explain sources of program inefficiency. We propose an approach that exploits parallel computation patterns (models) for diagnosis discovery. Knowledge of performance problems and inference rules for hypothesis search are engineered from model semantics and analysis expertise. In this manner, the performance diagnosis process can be automated as well as adapted to parallel model variations. We demonstrate the implementation of model-based performance diagnosis on the classic Master-Worker pattern. Our results suggest that pattern-based performance knowledge can provide effective guidance for locating and explaining performance bugs at a high level of program abstraction.
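To make the idea of model-derived inference rules concrete, here is a minimal sketch of two diagnosis rules one might engineer from Master-Worker semantics. The metric names and thresholds are invented for illustration; they are not the paper's actual rules.

```python
# Sketch of pattern-based diagnosis rules for the Master-Worker model;
# thresholds and metric names are illustrative assumptions.
def diagnose_master_worker(worker_busy, worker_wait, master_busy, wall_time,
                           imbalance_tol=0.10):
    """Return hypotheses explaining inefficiency from per-process timings."""
    hypotheses = []
    mean_busy = sum(worker_busy) / len(worker_busy)
    # Rule 1: load imbalance -- some workers compute much longer than others.
    if max(worker_busy) - min(worker_busy) > imbalance_tol * mean_busy:
        hypotheses.append("load imbalance across workers")
    # Rule 2: master bottleneck -- workers spend much of the run waiting
    # while the master is nearly saturated handing out tasks.
    if (sum(worker_wait) / len(worker_wait) > 0.2 * wall_time
            and master_busy > 0.9 * wall_time):
        hypotheses.append("master is a serial bottleneck")
    return hypotheses or ["no problem found at this level of abstraction"]

print(diagnose_master_worker(
    worker_busy=[9.0, 9.1, 4.5, 9.2], worker_wait=[0.5, 0.4, 5.0, 0.3],
    master_busy=3.0, wall_time=10.0))
```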
Causal Performance Models of Computer Systems: Definition and Learning Algorithms
"... Causal models are proposed for the representation of relational information of a performance analysis of computer systems. Performance models fulfill many requirements. Causal models offer a formalization and unification of the properties that we expect. Causal structure learning algorithms attempt ..."
Causal models are proposed for the representation of relational information in a performance analysis of computer systems. Performance models must fulfill many requirements; causal models offer a formalization and unification of the properties that we expect. Causal structure learning algorithms attempt to construct such models from experimental data. Existing algorithms have been extended to incorporate the variety of variable types and relations encountered in real performance data. To handle a combination of continuous and discrete variables with possibly non-linear relations, we use the more general conditional independence test based on the mutual information between probabilistic variables. The underlying probability distribution of experimental data is estimated by kernel density estimation, where the estimate is constructed by centering a scaled kernel at each observation. Deterministic relations imply that variables contain equivalent information about other variables and cannot be represented by a faithful model. To handle this, the complexity of the relations is used as a criterion to decide which of the equivalent variables directly relates to the target, and conditional independence is redefined to re-establish the faithfulness of the graphs. Experiments with sequential and parallel programs show that accurate models are inferred. They provide insight into how each variable affects overall performance measures, and the analysis can be used to validate independence assumptions and find potential explanations for outliers.
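The following sketch shows the core of such a test: a Gaussian kernel is centered at each observation to estimate the densities, and the mutual information I(X;Y) is estimated by resubstitution. The Gaussian kernel and the bandwidth rule (Silverman's rule of thumb) are illustrative assumptions; an actual independence test would also compare the estimate against a significance threshold.

```python
# Kernel-density-based mutual information estimate, as a sketch of the
# conditional independence test described above.
import numpy as np

def kde(points, centers, h):
    """Gaussian kernel density estimate at `points` from `centers`."""
    d = (points[:, None] - centers[None, :]) / h
    return np.exp(-0.5 * d**2).sum(axis=1) / (len(centers) * h * np.sqrt(2*np.pi))

def kde2(x, y, cx, cy, hx, hy):
    """Joint density estimate with a product Gaussian kernel."""
    dx = (x[:, None] - cx[None, :]) / hx
    dy = (y[:, None] - cy[None, :]) / hy
    k = np.exp(-0.5 * (dx**2 + dy**2))
    return k.sum(axis=1) / (len(cx) * hx * hy * 2*np.pi)

def mutual_information(x, y):
    """Resubstitution estimate of I(X;Y) >= 0; near 0 iff independent."""
    n = len(x)
    hx = 1.06 * x.std() * n ** (-1/5)   # Silverman's rule of thumb
    hy = 1.06 * y.std() * n ** (-1/5)
    pxy = kde2(x, y, x, y, hx, hy)
    px, py = kde(x, x, hx), kde(y, y, hy)
    return np.mean(np.log(pxy / (px * py)))

rng = np.random.default_rng(0)
x = rng.normal(size=500)
print(mutual_information(x, 2*x + rng.normal(size=500)))  # clearly > 0
print(mutual_information(x, rng.normal(size=500)))        # close to 0
```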
PerWiz: A What-If Prediction Tool for Tuning Message Passing Programs
"... Abstract. This paper presents PerWiz, a performance prediction tool for improving the performance of message passing programs. PerWiz focuses on locating where a significant improvement can be achieved. To locate this, PerWiz performs a post-mortem analysis based on a realistic parallel computationa ..."
This paper presents PerWiz, a performance prediction tool for improving the performance of message passing programs. PerWiz focuses on locating where a significant improvement can be achieved. To do this, PerWiz performs a post-mortem analysis based on a realistic parallel computational model, LogGPS, so that it can predict what performance will be achieved if the program is modified according to typical tuning techniques, such as load balancing for a better workload distribution and message scheduling for a shorter waiting time. We also show two case studies where PerWiz played an important role in improving the performance of regular applications. Our results indicate that PerWiz is useful for application developers to assess the potential reduction in execution time that would be derived from program modification.
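A toy version of such a what-if estimate is sketched below: a simplified LogGPS-style cost for a point-to-point message, combined with per-rank compute times, predicts the payoff of load balancing. The parameter values and the simplified cost formula are illustrative assumptions, not PerWiz's actual analysis.

```python
# What-if prediction sketch under a simplified LogGPS-style cost model;
# all parameter values below are invented for illustration.
L, o, G, S = 5e-6, 2e-6, 1e-9, 16 * 1024
# L: latency, o: per-message overhead, G: gap per byte,
# S: threshold (bytes) above which messages use a rendezvous protocol

def send_time(nbytes):
    """Predicted point-to-point time for one message."""
    t = o + (nbytes - 1) * G + L + o          # eager: LogGP-style cost
    if nbytes > S:                            # large messages synchronize,
        t += L + 2 * o                        # paying an extra handshake
    return t

def predict_phase(work_per_rank, msg_bytes):
    """Phase time = slowest rank's compute plus its communication."""
    return max(work_per_rank) + send_time(msg_bytes)

work = [1.00, 1.00, 1.40, 1.00]               # seconds of compute per rank
baseline = predict_phase(work, 64 * 1024)
balanced = predict_phase([sum(work) / len(work)] * len(work), 64 * 1024)
print(f"baseline {baseline:.3f}s, what-if balanced {balanced:.3f}s")
```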
A Framework for Comparative Performance Analysis of MPI Applications
"... Parallel application developers are facing a myriad of parameters when trying to understand the performance behavior of their code. Even within a single hardware configuration, the performance of any application will depend among others on factors such as the MPI library or some application level in ..."
Parallel application developers face a myriad of parameters when trying to understand the performance behavior of their code. Even within a single hardware configuration, the performance of any application will depend, among other factors, on the MPI library or on application-level input parameters. This paper deals with the problem of determining the cause of performance variations of an application. Based on trace files of the application generated for several scenarios, together with documentation of the parameters used for each run, the PERDAC performance analysis tool calculates statistical properties of the gathered performance data, such as average, standard deviation, maximum, and minimum across the different runs. In the current implementation, the performance data comprises the total execution time of each MPI function on a process, the number of occurrences of each MPI function, and optionally hardware performance counters such as cache hits or cache misses. In a second step, the results of the statistical analysis are traversed in search of parameters that show non-uniform behavior in the analyzed execution scenarios.
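As a rough illustration of this two-step analysis, the sketch below computes per-metric statistics across runs and flags metrics whose spread suggests non-uniform behavior. The data layout, metric names, and the 10% tolerance are assumptions for illustration, not PERDAC's implementation.

```python
# Comparative-analysis sketch: summary statistics per metric across runs,
# then a search for scenarios that behave non-uniformly.
from statistics import mean, stdev

# runs[scenario] maps a per-MPI-function metric to its measured value.
runs = {
    "mpich/opt=O2":   {"MPI_Allreduce.time": 4.1, "MPI_Allreduce.calls": 1000},
    "mpich/opt=O3":   {"MPI_Allreduce.time": 4.0, "MPI_Allreduce.calls": 1000},
    "openmpi/opt=O2": {"MPI_Allreduce.time": 6.9, "MPI_Allreduce.calls": 1000},
}

for metric in runs[next(iter(runs))]:
    values = [r[metric] for r in runs.values()]
    mu, sd = mean(values), stdev(values)
    print(f"{metric}: avg={mu:.2f} std={sd:.2f} "
          f"min={min(values)} max={max(values)}")
    if sd > 0.10 * mu:  # non-uniform behaviour across scenarios
        suspects = [name for name, r in runs.items()
                    if abs(r[metric] - mu) > sd]
        print(f"  non-uniform; inspect parameters of runs: {suspects}")
```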
SUMMARY
"... Performance engineering of parallel and distributed applications is a complex task that iterates through various phases, ranging from modeling and prediction, to performance measurement, experiment management, data collection, and bottleneck analysis. There is no evidence so far that all of these ph ..."
Performance engineering of parallel and distributed applications is a complex task that iterates through various phases, ranging from modeling and prediction to performance measurement, experiment management, data collection, and bottleneck analysis. There is no evidence so far that all of these phases should or can be integrated into a single monolithic tool. Moreover, the emergence of computational Grids as a common wide-area platform for high-performance computing raises the idea of providing performance tools and other tools as interacting Grid services that share resources, support interoperability among different users and tools, and, most importantly, provide omnipresent performance functionality over the Grid. We have developed the ASKALON tool set [18] to support performance-oriented development of parallel and distributed (Grid) applications. ASKALON comprises four tools, coherently integrated into a Grid service-based distributed architecture. SCALEA is a performance instrumentation, measurement, and analysis tool for parallel and distributed applications. ZENTURIO is a general-purpose experiment management tool with advanced support for multi-experiment performance analysis and parameter studies. AKSUM provides semi-automatic high-level performance bottleneck detection through a special-purpose performance property specification language. The PerformanceProphet enables the user to model and predict the performance of parallel applications at early development stages.
A Decentralized Utility-based Scheduling Algorithm for Grids
"... Abstract. Grid systems have gain tremendous importance in past years since application requirements increased drastically. The heterogeneity and geographic dispersion of grid resources and applications places some difficult challenges such as job scheduling. A scheduling algorithm tries to find a re ..."
Grid systems have gained tremendous importance in recent years as application requirements have increased drastically. The heterogeneity and geographic dispersion of grid resources and applications pose difficult challenges such as job scheduling. A scheduling algorithm tries to find a resource for a job that fulfills the job's requirements while optimizing a given objective function. Utility is a measure of a user's satisfaction that can be seen as an objective function that a scheduler tries to maximize. Many scheduling algorithms that use utility as an objective have been proposed. However, the proposed algorithms do not consider partial requirement satisfaction, since they award utility based only on the total fulfillment of a requirement, and they follow centralized or hierarchical approaches that suffer from problems concerning scalability and fault tolerance. Our solution proposes a decentralized scheduling architecture with a utility-based scheduling algorithm that considers partial requirement satisfaction, thereby overcoming the shortcomings of existing solutions.
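To illustrate the difference partial satisfaction makes, here is a minimal sketch of a utility function that gives proportional credit when a requirement is only partly met, so near-matching resources score above zero. The linear scoring, requirement names, and weights are invented for illustration, not the paper's formulation.

```python
# Utility with partial requirement satisfaction; an all-or-nothing scheme
# would assign both sites below a utility of zero.
def requirement_utility(required, offered):
    """1.0 if fully met, proportional credit if partially met."""
    return min(offered / required, 1.0) if required > 0 else 1.0

def job_utility(job, resource, weights):
    """Weighted average of per-requirement utilities, in [0, 1]."""
    return sum(w * requirement_utility(job[k], resource.get(k, 0))
               for k, w in weights.items()) / sum(weights.values())

job = {"cpus": 8, "mem_gb": 32, "bandwidth_mbps": 100}
weights = {"cpus": 0.5, "mem_gb": 0.3, "bandwidth_mbps": 0.2}
resources = {
    "site-A": {"cpus": 8, "mem_gb": 16, "bandwidth_mbps": 100},
    "site-B": {"cpus": 4, "mem_gb": 64, "bandwidth_mbps": 100},
}

# In a decentralized design, each site's local scheduler can rank jobs by
# this score without consulting a central broker.
scores = {r: round(job_utility(job, res, weights))
          if False else round(job_utility(job, res, weights), 2)
          for r, res in resources.items()}
best = max(scores, key=scores.get)
print(best, scores)
```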