Results 1 - 7 of 7
Scalable communication . . .
"... Performance analysis and prediction for parallel applications is important for the design and development of scientific applications, and for the construction and procurement of highperformance computing (HPC) systems. As one of the most important approaches, application tracing is widely used for t ..."
Abstract
Performance analysis and prediction for parallel applications is important for the design and development of scientific applications, and for the construction and procurement of high-performance computing (HPC) systems. As one of the most important approaches, application tracing is widely used for this purpose because it provides the computation and communication details of an application. Recent progress in communication tracing has tremendously improved the scalability of tracing tools and reduced the size of the trace file, and thereby opened up novel opportunities for trace-based performance analysis for parallel applications. This work focuses on domain-specific trace compression methodology and puts forth fundamentally new approaches to improve communication tracing techniques. Facilitated by the advances in this area, novel algorithms are further designed to address the hard problem of performance analysis, prediction, and benchmarking at scale. Specifically, this work makes the following contributions: 1. This work contributes ScalaExtrap, a fundamentally novel performance modeling scheme and tool. With ScalaExtrap, we synthetically generate the application trace for large
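To illustrate the kind of domain-specific compression such tools exploit: communication traces of iterative codes repeat the same event pattern many times, so consecutive identical records can be folded into a single record plus a repeat count. The sketch below is a simplified, hypothetical illustration of that idea; the record layout is invented for the example and is far coarser than the actual compressed trace formats of these tools.

/* trace_rle.c: fold repeated communication events into (event, count) pairs.
 * Simplified illustration of loop-level trace compression; the record layout
 * here is hypothetical, not the tools' actual on-disk format. */
#include <stdio.h>

typedef struct { int op; int peer; int bytes; } event_t;     /* one MPI event   */
typedef struct { event_t ev; int repeat; } rle_record_t;     /* compressed form */

static int same(event_t a, event_t b) {
    return a.op == b.op && a.peer == b.peer && a.bytes == b.bytes;
}

/* Compress n raw events into out[]; returns the number of compressed records. */
static size_t compress_trace(const event_t *ev, size_t n, rle_record_t *out) {
    size_t m = 0;
    for (size_t i = 0; i < n; i++) {
        if (m > 0 && same(out[m - 1].ev, ev[i]))
            out[m - 1].repeat++;                     /* same event again: fold it */
        else
            out[m++] = (rle_record_t){ ev[i], 1 };   /* new pattern: new record   */
    }
    return m;
}

int main(void) {
    /* 1000 iterations of the same send pattern collapse into one record. */
    event_t trace[1000];
    for (int i = 0; i < 1000; i++) trace[i] = (event_t){ 1 /* send */, 3, 4096 };
    rle_record_t out[1000];
    size_t m = compress_trace(trace, 1000, out);
    printf("1000 raw events -> %zu compressed records\n", m);
    return 0;
}

The same folding applied across ranks and across loop nests is what keeps trace sizes near-constant as the application scales, which in turn makes the extrapolation described above tractable.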
BIOGRAPHY
, 2013
"... Analysis. (Under the direction of Frank Mueller.) The next few years have been projected to usher in the wake of the exascale era where systems are expected to be comprised of several million cores. Applications that are scaled to run on these systems can generate extensive amounts of data, and expe ..."
Abstract
Analysis. (Under the direction of Frank Mueller.) The next few years have been projected to usher in the exascale era, where systems are expected to be comprised of several million cores. Applications that are scaled to run on these systems can generate extensive amounts of data, and experience with current petascale systems shows that developers are struggling to keep pace with this increase in scale. A large number of problems surface at high scale. Root cause diagnosis of such problems often fails because tools, specifically trace-based ones, cannot afford to record the entire set of metrics they measure owing to the prohibitive cost of instrumentation. We propose to address these tool scalability problems by combining customized tracing with support for in-situ data analysis. To this end, we have developed ScalaJack, a framework that supports dynamic customizable instrumentation and pluggable extension capabilities through which a user can instrument the interfaces that are pertinent to the problem at hand and also perform in-situ data analysis at specific points of execution, thus achieving scalable trace sizes. The framework also allows users to eliminate cross-cutting concerns by factoring code into modular aspects, thus achieving better maintainability. We evaluate the viability of ScalaJack with several case studies of traditional HPC applications. © Copyright 2013 by Srinath Krishna Ananthakrishnan
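A minimal sketch of the general mechanism a framework like this can build on: intercepting MPI calls through the standard PMPI profiling interface and invoking a user-supplied analysis callback in situ, so only aggregated results need to be written out rather than every event. The hook type, register_insitu_hook, and the byte counter below are hypothetical illustrations, not ScalaJack's actual API.

/* pmpi_insitu.c: sketch of PMPI-based interception with a pluggable in-situ hook.
 * The hook and aggregate are illustrative only; they are not ScalaJack's API. */
#include <mpi.h>
#include <stddef.h>

/* User-pluggable analysis hook, called at every intercepted send. */
typedef void (*insitu_hook_t)(int dest, int bytes);

static insitu_hook_t g_hook = NULL;
static long long g_bytes_sent = 0;   /* in-situ aggregate instead of a full trace */

void register_insitu_hook(insitu_hook_t hook) { g_hook = hook; }

/* Interpose on MPI_Send; the real operation is forwarded via PMPI_Send. */
int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
    int size = 0;
    MPI_Type_size(datatype, &size);
    g_bytes_sent += (long long)count * size;
    if (g_hook)
        g_hook(dest, count * size);      /* analyze in place, record nothing */
    return PMPI_Send(buf, count, datatype, dest, tag, comm);
}

Linked ahead of the MPI library, such a wrapper observes every send of the application while emitting only the aggregate, which is the essence of keeping trace sizes scalable.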
Acknowledgements
, 2006
"... This thesis is submitted as the full requirement for the degree "Doctor of Philosophy in Chemical Engineering". To the best of my knowledge, all of the work in this thesis is original, except where due reference is made in the text. It has not (in whole or in part) been previously submitte ..."
Abstract
This thesis is submitted as the full requirement for the degree "Doctor of Philosophy in Chemical Engineering". To the best of my knowledge, all of the work in this thesis is original, except where due reference is made in the text. It has not (in whole or in part) been previously submitted to any tertiary institution as part of a degree.
Scalable Energy Efficiency with Resilience for High Performance Computing Systems: A Quantitative Methodology
"... Ever-growing performance of supercomputers nowadays brings demanding requirements of energy efficiency and resilience, due to rapidly expanding size and duration in use of the large-scale computing systems. Many application/architecture-dependent parameters that determine energy efficiency and resil ..."
Abstract
Ever-growing performance of supercomputers brings demanding requirements for energy efficiency and resilience, due to the rapidly expanding size and duration of use of large-scale computing systems. Many application/architecture-dependent parameters that determine energy efficiency and resilience individually have causal effects on each other, which directly affect the trade-offs among performance, energy efficiency, and resilience at scale. To enable high-efficiency management of today's large-scale High Performance Computing (HPC) systems, a quantitative understanding of the entangled effects among performance, energy efficiency, and resilience is thus required. While previous work focuses on exploring energy saving and resilience enhancing opportunities separately, little has been done to theoretically and empirically investigate the interplay between energy efficiency and resilience at scale. In this paper, by extending Amdahl's Law and the Karp-Flatt Metric to take resilience into consideration, we quantitatively model the integrated energy efficiency in terms of performance per Watt, and showcase the trade-offs among typical HPC parameters, such as number of cores, frequency/voltage, and failure rates. Experimental results for a wide spectrum of HPC benchmarks on two HPC systems show that the proposed models are accurate in extrapolating resilience-aware performance and energy efficiency, and capable of capturing the interplay among various energy saving and resilience factors. Moreover, the models can help find the optimal HPC configuration for the highest integrated energy efficiency in the presence of failures and applied resilience techniques. CCS Concepts: Computer systems organization → Distributed architectures; Hardware → Power estimation and optimization; Fault tolerance
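For reference, the two baseline results the model extends are Amdahl's Law, giving the speedup of a code with parallel fraction p on N cores, and the Karp-Flatt metric, which recovers the experimentally determined serial fraction e from a measured speedup S:

S(N) = \frac{1}{(1 - p) + p/N}, \qquad e = \frac{1/S(N) - 1/N}{1 - 1/N}.

Performance per Watt then follows by dividing modeled performance by modeled power draw; the paper's resilience-aware terms (failure rates, recovery overhead, frequency/voltage scaling) extend these baselines and are not reproduced here.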
Benchmark Generation and Simulation at Extreme Scale
, 2014
"... LAGADAPATI, MAHESH. Benchmark Generation and Simulation at Extreme Scale. (Under the direction of Frank Mueller.) The coming years are projected to usher in the era of exascale high performance computing (HPC) systems. The architecture of a HPC system at this scale is determined by many factors: per ..."
Abstract
LAGADAPATI, MAHESH. Benchmark Generation and Simulation at Extreme Scale. (Under the direction of Frank Mueller.) The coming years are projected to usher in the era of exascale high performance computing (HPC) systems. The architecture of an HPC system at this scale is determined by many factors: performance, power consumption, fault tolerance, data transfer rate, etc. Characterizing and tuning the performance of existing parallel applications for a given architectural choice is an important facet in exploiting exascale capabilities and requires hardware/software co-design. Simulations using models of future HPC systems and communication traces from applications running on existing HPC systems can offer insight into the performance of future architectures. This work targets technology developed for scalable application tracing of communication events. It focuses on extreme-scale simulation of HPC applications and their communication behavior via lightweight parallel discrete event simulation for performance estimation and evaluation. Instead of simply replaying a trace within a simulator, this work promotes the generation of a benchmark from traces. This benchmark is subsequently exposed to simulation using models to reflect the performance characteristics of future-generation HPC systems. This technique provides a number of benefits, such as eliminating the data-intensive trace replay and enabling simulations at different scales. The presented work features novel software co-design aspects, combining the ScalaTrace II tool to generate scalable trace files, the ScalaBenchGen II tool to generate the benchmark, and the xSim tool to assess the benchmark characteristics within a simulator.
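A hypothetical sketch of what a benchmark generated from a compressed trace might look like: the recorded computation time between communication events becomes a timed busy-wait, and the communication events themselves become explicit MPI calls, so the program reproduces the application's communication behavior without its data or numerics. The structure below is illustrative only and is not actual ScalaBenchGen II output; the loop count, delay, and message sizes are invented placeholders.

/* bench_skeleton.c: illustrative shape of a trace-derived benchmark
 * (not ScalaBenchGen II output): compute phases become delays,
 * communication events become literal MPI calls. */
#include <mpi.h>

static void compute_for(double seconds) {
    double end = MPI_Wtime() + seconds;
    while (MPI_Wtime() < end) { }   /* busy-wait stand-in for the compute phase */
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    char sendbuf[4096] = {0}, recvbuf[4096];
    double val = (double)rank;
    for (int iter = 0; iter < 100; iter++) {   /* loop structure recovered from the trace */
        compute_for(0.002);                    /* recorded compute time between events    */
        int right = (rank + 1) % size;
        int left  = (rank - 1 + size) % size;
        MPI_Sendrecv(sendbuf, (int)sizeof sendbuf, MPI_BYTE, right, 0,
                     recvbuf, (int)sizeof recvbuf, MPI_BYTE, left,  0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Allreduce(MPI_IN_PLACE, &val, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}

Because such a skeleton carries no application data, it can be handed to a simulator and scaled to rank counts far beyond the system the trace was collected on.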
ScalaJack: Customized Scalable Tracing with In-situ Data Analysis
"... Abstract. Root cause diagnosis of large-scale HPC applications often fails because tools, specifically trace-based ones, can no longer record all metrics they measure. We address this problems by combining cus-tomized tracing and providing support for in-situ data analysis via Scala-Jack, a framewor ..."
Abstract
Root cause diagnosis of large-scale HPC applications often fails because tools, specifically trace-based ones, can no longer record all metrics they measure. We address these problems by combining customized tracing with support for in-situ data analysis via ScalaJack, a framework with customizable instrumentation and pluggable extension capabilities for problem-directed instrumentation and in-situ data analysis. We further eliminate cross-cutting concerns by code refactoring for aspect orientation and evaluate these capabilities in case studies within and beyond the scope of tracing.
Tools for Simulation and Benchmark Generation at Exascale
"... Abstract. The path to exascale high-performance computing (HPC) poses several challenges related to power, performance, resilience, productivity, programmability, data movement, and data management. Investigating the performance of parallel applications at scale on future architectures and the perfo ..."
Abstract
The path to exascale high-performance computing (HPC) poses several challenges related to power, performance, resilience, productivity, programmability, data movement, and data management. Investigating the performance of parallel applications at scale on future architectures and the performance impact of different architecture choices is an important component of HPC hardware/software co-design. Simulations using models of future HPC systems and communication traces from applications running on existing HPC systems can offer insight into the performance of future architectures. This work targets technology developed for scalable application tracing of communication events and memory profiles, but can be extended to other areas, such as I/O, control flow, and data flow. It further focuses on extreme-scale simulation of millions of Message Passing Interface (MPI) ranks using a lightweight parallel discrete event simulation (PDES) toolkit for performance evaluation. Instead of simply replaying a trace within a simulation, the approach is to generate a benchmark from it and to run this benchmark within a simulation using models to reflect the performance characteristics of future-generation HPC systems. This provides a number of benefits, such as eliminating the data-intensive trace replay and enabling simulations at different scales. The presented work utilizes the ScalaTrace tool to generate scalable trace files, the ScalaBenchGen tool to generate the benchmark, and the xSim tool to run the benchmark within a simulation.
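To make the simulation side concrete, here is a deliberately simplified, sequential sketch of the discrete event idea: communication events from the generated benchmark become timestamped events, and each message is charged a cost from a latency/bandwidth model of the target machine. The model constants and event layout are assumptions chosen for the example; this is a generic illustration, not xSim's interface or its parallel (PDES) engine.

/* des_sketch.c: toy sequential discrete event loop charging each message a
 * latency + size/bandwidth cost under an assumed machine model.
 * Illustrative only; xSim's actual PDES engine and API are not shown. */
#include <stdio.h>

#define LATENCY_S    1.0e-6   /* assumed per-message latency of the target network */
#define BANDWIDTH_BS 1.0e11   /* assumed bandwidth in bytes/s                      */

typedef struct { int src; int dst; long bytes; } msg_event_t;

/* Advance per-rank clocks by the modeled cost of each message, in trace order. */
static double simulate(const msg_event_t *ev, int n, double *clock) {
    double makespan = 0.0;
    for (int i = 0; i < n; i++) {
        double cost  = LATENCY_S + (double)ev[i].bytes / BANDWIDTH_BS;
        double start = clock[ev[i].src] > clock[ev[i].dst]
                     ? clock[ev[i].src] : clock[ev[i].dst];   /* both sides ready */
        double done  = start + cost;
        clock[ev[i].src] = done;
        clock[ev[i].dst] = done;
        if (done > makespan) makespan = done;
    }
    return makespan;
}

int main(void) {
    msg_event_t trace[] = { {0, 1, 4096}, {1, 2, 4096}, {2, 3, 1 << 20} };
    double clock[4] = {0};
    printf("modeled time: %.9f s\n", simulate(trace, 3, clock));
    return 0;
}

Swapping in a different latency/bandwidth pair (or a richer network model) changes the modeled makespan without touching the benchmark, which is the co-design loop the abstract describes.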