Results 1 - 10 of 178
Nitro: A Framework for Adaptive Code Variant Tuning
"... Abstract—Autotuning systems intelligently navigate a search space of possible implementations of a computation to find the implementation(s) that best meets a specific optimization criteria, usually performance. This paper describes Nitro, a programmer-directed autotuning framework that facilitates ..."
Cited by 1 (0 self)
tuning of code variants, or alternative implementations of the same computation. Nitro provides a library interface that permits programmers to express code variants along with meta-information that aids the system in selecting among the set of variants at run time. Machine learning is employed to build
Empirical Auto-tuning Code Generator for FFT and Trigonometric Transforms
"... Abstract—We present an automatic, empirically tuned code generator for Real/Complex FFT and Trigonometric Transforms. The code generator is part of an adaptive and portable FFT computation framework, UHFFT. Performance portability over varying architectures is achieved by generating highly optimize ..."
Automatic Performance Tuning of Sparse Matrix Kernels, 2003
"... This dissertation presents an automated system to generate highly efficient, platform-adapted implementations of sparse matrix kernels. These computational kernels lie at the heart of diverse applications in scientific computing, engineering, economic modeling, and information retrieval, to name a ..."
Cited by 76 (7 self)
like sparse matrix-vector multiply (SpMV) have historically run at 10% or less of peak machine speed on cache-based superscalar architectures. Our implementations of SpMV, automatically tuned using a methodology based on empirical-search, can by contrast achieve up to 31% of peak machine speed, and can
Predictive Performance and Scalability Modeling of a Large-Scale Application
- In Supercomputing 2001
"... In this work we present an analytical model that encompasses the performance and scaling characteristics of an important ASCI application. SAGE (SAIC's Adaptive Grid Eulerian hydrocode) is a multidimensional hydrodynamics code with adaptive mesh refinement. The model is validated against mea ..."
Cited by 135 (29 self)
Online adaptive code generation and tuning
- In Proceedings of the 25th IEEE International Parallel And Distributed Computing Symposium (IPDPS, 2011
"... Abstract—In this paper, we present a runtime compilation and tuning framework for parallel programs. We extend our prior work on our auto-tuner, Active Harmony, for tunable parameters that require code generation (for example, different unroll factors). For such parameters, our auto-tuner generates ..."
Cited by 4 (0 self)
An adaptive performance modeling tool for GPU architectures
- In PPoPP, 2010
"... This paper presents an analytical model to predict the performance of general-purpose applications on a GPU architecture. The model is designed to provide performance information to an auto-tuning compiler and assist it in narrowing down the search to the more promising implementations. It can also ..."
Cited by 54 (1 self)
REFACTORING AND AUTOMATED PERFORMANCE TUNING OF COMPUTATIONAL CHEMISTRY APPLICATION CODES
"... Computational chemistry codes such as GAMESS and MPQC have been under development for several years and are constantly evolving to include new science and adapt to new high performance computing (HPC) systems. Our work with these codes has given rise to two needs. One is to refactor the codes so tha ..."
Tuning the M-coder to improve Dirac’s Entropy Coding
"... Abstract: The Dirac codec is a new prototype video coding algorithm from BBC R&D based on wavelet technology. Compression-wise the algorithm is broadly competitive with the state-of-the-art video codecs but computational-wise the execution time is currently poor. One of the largest bottlenecks i ..."
Cited by 3 (0 self)
wisely profiled Dirac’s entropy coding statistics and tuned the initialisation and adaptation parameters the impact on the compression is very limited. Key-Words: Dirac video codec, arithmetic coding, M-coder
Automatic Tuning Matrix Multiplication Performance on Graphics Hardware
- in Proceedings of the Fourteenth International Conference on Parallel Architecture and Compilation Techniques (PACT, 2005
"... In order to utilize the tremendous computing power of graphics hardware and to automatically adapt to the fast and frequent changes in its architecture and performance characteristics, this paper implements an automatic tuning system to generate high-performance matrix-multiplication implementation ..."
Cited by 21 (0 self)
AUTOMATIC PERFORMANCE TUNING FOR FAST FOURIER TRANSFORMS
"... In this paper we discuss architecture-specific performance tuning for fast Fourier transforms (FFTs) implemented in the UHFFT library. The UHFFT library is an adaptive and portable software library for FFTs developed by the authors. We present the optimization methods used at different levels, start ..."
Cited by 1 (0 self)