Results 1 - 10 of 178

Nitro: A Framework for Adaptive Code Variant Tuning

by Saurav Muralidharan, Manu Shantharam, Mary Hall
"... Abstract—Autotuning systems intelligently navigate a search space of possible implementations of a computation to find the implementation(s) that best meets a specific optimization criterion, usually performance. This paper describes Nitro, a programmer-directed autotuning framework that facilitates tuning of code variants, or alternative implementations of the same computation. Nitro provides a library interface that permits programmers to express code variants along with meta-information that aids the system in selecting among the set of variants at run time. Machine learning is employed to build ..."
Cited by 1 (0 self)

Empirical Auto-tuning Code Generator for FFT and Trigonometric Transforms

by Ayaz Ali, Lennart Johnsson, Dragan Mirkovic
"... Abstract—We present an automatic, empirically tuned code generator for Real/Complex FFT and Trigonometric Transforms. The code generator is part of an adaptive and portable FFT computation framework, UHFFT. Performance portability over varying architectures is achieved by generating highly optimize ..."

Automatic Performance Tuning of Sparse Matrix Kernels

by Richard Wilson Vuduc, 2003
"... This dissertation presents an automated system to generate highly efficient, platform-adapted implementations of sparse matrix kernels. These computational kernels lie at the heart of diverse applications in scientific computing, engineering, economic modeling, and information retrieval, to name a ..."
Cited by 76 (7 self)
... like sparse matrix-vector multiply (SpMV) have historically run at 10% or less of peak machine speed on cache-based superscalar architectures. Our implementations of SpMV, automatically tuned using a methodology based on empirical search, can by contrast achieve up to 31% of peak machine speed, and can ...

Predictive Performance and Scalability Modeling of a Large-Scale Application

by Darren J. Kerbyson, Hank J. Alme, Adolfy Hoisie, Fabrizio Petrini, Harvey J. Wasserman, Michael Gittings - In Supercomputing 2001
"... In this work we present an analytical model that encompasses the performance and scaling characteristics of an important ASCI application. SAGE (SAIC's Adaptive Grid Eulerian hydrocode) is a multidimensional hydrodynamics code with adaptive mesh refinement. The model is validated against mea ..."
Cited by 135 (29 self)

Online adaptive code generation and tuning

by Ananta Tiwari, Jeffrey K. Hollingsworth - In Proceedings of the 25th IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2011
"... Abstract—In this paper, we present a runtime compilation and tuning framework for parallel programs. We extend our prior work on our auto-tuner, Active Harmony, for tunable parameters that require code generation (for example, different unroll factors). For such parameters, our auto-tuner generates ..."
Cited by 4 (0 self)

An adaptive performance modeling tool for GPU architectures

by Sara S. Baghsorkhi, Matthieu Delahaye, Sanjay J. Patel, Wen-mei W. Hwu - In PPoPP, 2010
"... This paper presents an analytical model to predict the performance of general-purpose applications on a GPU architecture. The model is designed to provide performance information to an auto-tuning compiler and assist it in narrowing down the search to the more promising implementations. It can also ..."
Cited by 54 (1 self)

REFACTORING AND AUTOMATED PERFORMANCE TUNING OF COMPUTATIONAL CHEMISTRY APPLICATION CODES

by C. Laroque, J. Himmelspach, R. Pasupathy, O. Rose, A. M. Uhrmacher, Shirley Moore
"... Computational chemistry codes such as GAMESS and MPQC have been under development for several years and are constantly evolving to include new science and adapt to new high performance computing (HPC) systems. Our work with these codes has given rise to two needs. One is to refactor the codes so tha ..."

Tuning the M-coder to improve Dirac’s Entropy Coding

by Hendrik Eeckhaut, Benjamin Schrauwen, Mark Christiaens, Jan Van Campenhout
"... Abstract: The Dirac codec is a new prototype video coding algorithm from BBC R&D based on wavelet technology. Compression-wise the algorithm is broadly competitive with the state-of-the-art video codecs but computational-wise the execution time is currently poor. One of the largest bottlenecks i ..."
Cited by 3 (0 self)
... wisely profiled Dirac’s entropy coding statistics and tuned the initialisation and adaptation parameters the impact on the compression is very limited. Key-Words: Dirac video codec, arithmetic coding, M-coder

Automatic Tuning Matrix Multiplication Performance on Graphics Hardware

by Changhao Jiang, Marc Snir - In Proceedings of the Fourteenth International Conference on Parallel Architecture and Compilation Techniques (PACT), 2005
"... In order to utilize the tremendous computing power of graphics hardware and to automatically adapt to the fast and frequent changes in its architecture and performance characteristics, this paper implements an automatic tuning system to generate high-performance matrix-multiplication implementation ..."
Cited by 21 (0 self)

AUTOMATIC PERFORMANCE TUNING FOR FAST FOURIER TRANSFORMS

by Dragan Mirković, Lennart Johnsson
"... In this paper we discuss architecture-specific performance tuning for fast Fourier transforms (FFTs) implemented in the UHFFT library. The UHFFT library is an adaptive and portable software library for FFTs developed by the authors. We present the optimization methods used at different levels, start ..."
Cited by 1 (0 self)

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University